Open to opportunities

Sam Naqvi

@samnaqvi2

Staff Data Engineer specializing in multi-cloud distributed lakehouse and real-time streaming platforms.

United States

What I'm looking for

I want to build and scale high-availability, cost-optimized multi-cloud data platforms (lakehouses, CDC ingestion, and real-time streaming) paired with strong governance and SLO-driven reliability, so teams can ship ML/GenAI with confidence.

I’m a Staff Data Engineer with 10+ years architecting and scaling multi-cloud, distributed data platforms for petabyte-scale analytics, real-time streaming, and production ML and GenAI workloads. I focus on lakehouse architecture, CDC ingestion, streaming reliability, data governance, and cost-optimized infrastructure across AWS, Azure, and GCP.

In my most recent role, I built a multi-cloud Snowflake and Databricks lakehouse processing 4B+ clinical and claims records monthly, delivering sub-minute data freshness and 99.95% reliability. I’ve implemented enterprise data contracts and policy-as-code governance to achieve HIPAA and SOC 2 compliance, reduced cloud spend by up to 35%, and mentored teams to improve standards, incident response, and cross-team adoption.

Experience

Work history, roles, and key accomplishments

Health Catalyst
Current

Staff Data Engineer

Mar 2023 - Present (3 years 3 months)

Architected a multi-cloud Snowflake and Databricks lakehouse processing over 4B clinical and claims records monthly, enabling sub-minute data freshness. Built governed ML data infrastructure and implemented policy-as-code (HIPAA/SOC 2), reducing incident frequency by 48% and cloud spend by 35% while maintaining 99.95% reliability.

Stripe

Senior Data Engineer

Jan 2021 - Feb 2023 (2 years 1 month)

Designed real-time fraud detection infrastructure processing 50M transactions daily using Kafka, Spark, Snowflake, and BigQuery. Delivered exactly-once streaming that reduced fraud response latency by 55%, improved model accuracy by 42% via offline/online feature consistency, and cut deployment cycle time by 60% with automated CI/CD provisioning.

Datadog

Data Engineer

Jun 2018 - Dec 2020 (2 years 6 months)

Migrated a legacy Oracle warehouse to a Snowflake and Azure Synapse lakehouse, improving query performance 8x. Built ingestion pipelines for 2TB/day of telemetry with Kafka and Spark, reducing analytics latency by 70% and infrastructure cost by 28% through compute optimization.

Education

Degrees, certifications, and relevant coursework

Sam hasn't added their education
