Sabina Gurung
@sabinagurung
Senior Data Engineer building cloud-native lakehouse platforms and AI-enabled data pipelines that drive scalable, reliable analytics.
What I'm looking for
I’m a Senior Data Engineer with 7+ years of experience designing, developing, and optimizing enterprise-scale data platforms across insurance, healthcare, retail, and financial services. I build cloud-native solutions across Microsoft Azure, Google Cloud Platform (GCP), and AWS, with deep hands-on expertise in Databricks, Apache Spark, PySpark, Snowflake, BigQuery, Microsoft Fabric, and modern Lakehouse architectures.
In my recent role, I architected a GCP-native enterprise data platform, migrated 50+ legacy AWS ETL pipelines (reducing infrastructure costs by 25%), and delivered scalable ELT pipelines processing 8TB+ daily (reducing processing times by 35%). I also implement Data Quality Frameworks and Data Observability to improve accuracy and cut incident detection/resolution time by 35%, and I integrate Generative AI-enabled workflows using Azure OpenAI, Vertex AI, and RAG for automation that reduces manual effort by 50%+.
Experience
Work history, roles, and key accomplishments
Architected a GCP-native enterprise data platform with BigQuery, Dataflow, Dataproc, Pub/Sub, and Cloud Storage, supporting insurance policy, claims, underwriting, and actuarial analytics. Migrated 50+ legacy AWS ETL pipelines to GCP, cutting infrastructure costs by 25%, reducing daily processing time by 35%, and improving data accuracy by 40% through data quality and observability initiatives.
Designed and built Azure-based batch and near real-time data pipelines for clinical research, patient safety, manufacturing, and commercial datasets. Reduced duplicate processing effort via a Medallion lakehouse architecture and improved performance by 40% while cutting manual orchestration work by 50%, with automated data quality and observability reducing production incidents by 30%.
Developed and maintained enterprise ETL pipelines with Informatica PowerCenter to integrate high-volume retail transaction, inventory, merchandising, and supply chain data. Improved batch runtimes by 40% through SQL performance tuning and reduced manual intervention by 30% using Unix/Linux shell automation, while enhancing reliability via workflow recovery and production support.
Built and maintained backend RESTful APIs using Python and Flask for customer account management, transaction processing, and internal banking applications. Improved data accuracy by 25% through validation/reconciliation modules and reduced API response times by 35% via database query optimization, while supporting reliable deployments through CI/CD with Jenkins and Git.
Education
Degrees, certifications, and relevant coursework
Gannon University
Master's in Business Analytics
Earned a Master's in Business Analytics at Gannon University in Erie, PA.
Tech stack
Software and tools used professionally
Amazon Redshift
Azure Synapse
Apache Spark
AWS Glue
AWS IAM
Microsoft Azure
Google Cloud Platform
Amazon S3
Jenkins
NumPy
Pandas
PySpark
dbt
MySQL
PostgreSQL
MongoDB
IBM DB2
Gmail
Databricks
Azure DevOps
Jira
Java
JSON
Kafka
Azure Monitor
Linux
Oracle PL/SQL
AWS Lambda
Azure SQL Database
OAuth2
Airflow
Root Cause
Amazon Web Services (AWS)
Amazon EMR
SQL
LangChain
Delta Lake
Bash
Transform
Microsoft Fabric
Unity Catalog
Factory
