Sam Naqvi
@samnaqvi2
Staff Data Engineer specializing in multi-cloud distributed lakehouse and real-time streaming platforms.
What I'm looking for
I’m a Staff Data Engineer with 10+ years of experience architecting and scaling multi-cloud, distributed data platforms for petabyte-scale analytics, real-time streaming, and production ML and GenAI workloads. I focus on lakehouse architecture, CDC ingestion, streaming reliability, data governance, and cost-optimized infrastructure across AWS, Azure, and GCP.
In my most recent role, I built a multi-cloud Snowflake and Databricks lakehouse processing 4B+ clinical and claims records monthly, delivering sub-minute data freshness and 99.95% reliability. I’ve implemented enterprise data contracts and policy-as-code governance to achieve HIPAA and SOC 2 compliance, reduced cloud spend by up to 35%, and mentored teams to improve standards, incident response, and cross-team adoption.
Experience
Work history, roles, and key accomplishments
Architected a multi-cloud Snowflake and Databricks lakehouse processing over 4B clinical and claims records monthly, enabling sub-minute data freshness. Built governed ML data infrastructure and implemented policy-as-code (HIPAA/SOC 2), reducing incident frequency by 48% and cloud spend by 35% while maintaining 99.95% reliability.
Designed real-time fraud detection infrastructure processing 50M transactions daily using Kafka, Spark, Snowflake, and BigQuery. Delivered exactly-once streaming that reduced fraud response latency by 55%, improved model accuracy by 42% via offline/online feature consistency, and cut deployment cycle time by 60% with automated CI/CD provisioning.
Migrated a legacy Oracle warehouse to a Snowflake and Azure Synapse lakehouse, improving query performance 8x. Built ingestion pipelines for 2TB/day of telemetry with Kafka and Spark, reducing analytics latency by 70% and infrastructure cost by 28% through compute optimization.
Junior Data Engineer
Costco
Jul 2015 - May 2018 (2 years 10 months)
Built ETL pipelines integrating POS, inventory, and logistics data into Redshift and BigQuery, improving data availability SLAs by 40%. Developed demand forecasting datasets that improved inventory turnover by 18%, and supported the migration from on-prem SQL Server to AWS analytics.
Tech stack
Software and tools used professionally
Azure Synapse
Apache Spark
AWS Glue
GitHub
Kubernetes
GitHub Actions
PySpark
Debezium
dbt
MySQL
PostgreSQL
MongoDB
Cassandra
Hadoop
Gmail
Databricks
Terraform
MLflow
Kafka
Datadog
Kafka Streams
Airflow
Root Cause
SQL
Dagster
Apache Iceberg
Monte Carlo
Delta Lake
Great Expectations
Trino
Collibra
Bash
Microsoft Fabric
Dynamic
Column
Unity Catalog
Factory
