Rabina Lama
@rabinalama1
Senior Data Engineer modernizing cloud-native lakehouses and streaming platforms to deliver governed, near real-time analytics.
What I'm looking for
I’m a Senior Data Engineer with 8+ years of experience designing, modernizing, and operating enterprise-scale cloud-native data platforms across insurance, healthcare, and financial services. I specialize in scalable Lakehouse architectures and distributed data processing frameworks using AWS, Azure, and GCP.
In my roles, I’ve built high-performance Spark and PySpark pipelines for large-scale batch and real-time ingestion, processing multi-terabyte datasets with optimized partitioning and workload tuning. I’ve also architected secure, governed data lakes and warehouses using Redshift, Synapse Analytics, BigQuery, and Delta Lake to deliver analytics-ready datasets for underwriting, actuarial modeling, clinical research, regulatory compliance, and executive reporting.
I bring hands-on experience implementing CDC-driven streaming pipelines with Kafka and Event Hubs to reduce latency and enable near real-time decision-making. I’m also strong in ELT/ETL design, dimensional modeling, and transformation frameworks using Airflow orchestration and dbt to improve data quality, lineage, and governance.
I lead modernization and migration initiatives that reduce operational costs, improve reliability, and strengthen cross-functional collaboration with analytics, actuarial, compliance, and business stakeholders. I focus on infrastructure automation with Terraform, CI/CD with Jenkins and Azure DevOps, and monitoring with CloudWatch and Azure Monitor to detect failures proactively and speed up incident response.
Experience
Work history, roles, and key accomplishments
Led modernization of the data ecosystem by migrating underwriting, claims, and policy servicing workflows to AWS, reducing legacy infrastructure dependency by 80% and saving $1.2M annually. Built AWS lakehouse and CDC-driven ingestion pipelines with Spark and Kafka, cutting reporting latency from 48 hours to under 3 hours and improving analytics and reporting performance by 35%.
Architected an Azure lakehouse platform using ADLS Gen2, Databricks, and Synapse to deliver secure clinical and pharmaceutical datasets at scale. Optimized Delta Lake storage with partitioning and Z-ORDER, reducing query latency by 45%, and orchestrated batch/stream pipelines with Azure Data Factory and Airflow to cut operational overhead by 30%.
Designed scalable healthcare data pipelines for claims processing, pharmacy benefits, and population health analytics across millions of member records. Built Databricks and Spark ingestion pipelines with Event Hubs, improved Synapse performance by 30%, reduced data discrepancies by 25%, and lowered manual processing effort by 35% through orchestrated batch/stream workflows.
Built Python backend services and REST APIs to automate ingestion of policy, billing, and claims data into centralized analytics platforms, improving accessibility for reporting teams. Optimized SQL extraction and ETL logic to reduce batch processing latency by 30% and improved pipeline stability by reducing recurring data load failures by 20%.
Education
Degrees, certifications, and relevant coursework
University of Houston-Downtown
Bachelor's in Computer Science