Open to opportunities

Samuel Shrestha

@samuelshrestha

Senior Data Engineer building scalable batch and streaming platforms for AI, analytics, and cloud ecosystems.

United States

What I'm looking for

I’m looking for a senior data engineering role where I can build resilient batch and streaming pipelines for AI and analytics, especially privacy-safe clean rooms, vector search/RAG, and self-serve data products, while mentoring teams.

I’m a Senior Data Engineer with 8+ years building scalable batch and streaming data platforms across AI, analytics, and cloud ecosystems. I bring deep expertise in distributed systems, data modeling, and modern ETL architectures, delivering large-scale solutions using Python, SQL, Spark, Kafka, Airflow, dbt, Snowflake, and AWS.

At Salesforce, I delivered Data 360 capabilities spanning AI retrieval, zero-copy clean rooms, vector search, and personalization, building JSON intent contracts, embeddings, model-serving APIs, and Spark Streaming pipelines. At Instacart and Komodo Health, I modernized platforms with dbt and Airflow orchestration, tuned Snowflake and Spark for reliability and cost observability, and built healthcare ETL workflows that processed vast datasets.

I’m motivated by privacy-safe, observable pipelines and clear collaboration across product, ML, security, and infrastructure teams.

Experience

Work history, roles, and key accomplishments

Salesforce
Current

Senior Data Engineer

Mar 2023 - Present (3 years 3 months)

Delivered Salesforce Data 360 capabilities for AI retrieval, zero-copy clean rooms, vector search, and personalization using Python, SQL, Spark, Kafka, and AWS, enabling sub-second personalization at enterprise scale. Built SQL-validation and lineage-based collaboration components, expanded connector coverage beyond 100 connectors with 4x throughput, and reduced dialect delivery time from 40 to 10 days.

Instacart

Data Engineer

Jan 2020 - Feb 2023 (3 years 1 month)

Modernized Instacart’s data platform by building self-serve batch and streaming pipelines with Snowflake, dbt, Airflow, Kafka, Flink, and Kubernetes, improving reliability across 10+ PB and 5M+ tables. Migrated legacy transformations to modular dbt models, scaled orchestration toward 400 DAGs and 5,000 tasks, and reduced cold-start waste costs by 20–40%.

Education

Degrees, certifications, and relevant coursework

University of California, Berkeley

Bachelor of Science, Electrical Engineering and Computer Sciences

2014 - 2018

