David Cheng
@davidcheng
Senior Data Engineer specializing in zero-copy Snowflake lakehouse pipelines and governed AI data workflows.
What I'm looking for
I’m a Senior Data Engineer with 8+ years of experience building enterprise-scale data platforms at Snowflake and Salesforce. I focus on turning complex data ecosystems into governed, high-performance data products for analytics, CRM, and AI use cases.
At Salesforce, I architected a zero-copy activation layer that surfaces Snowflake data directly in Salesforce Data 360, eliminating duplicate ETL pipelines and delivering real-time data freshness. I also led performance tuning and cost intelligence for federated queries, achieving sub-second latency and a significant reduction in monthly compute costs, and retired legacy pipelines through secure data sharing and semantic harmonization patterns.
I’ve delivered medallion lakehouse migrations (Bronze/Silver/Gold) that improved query performance by 80% and reduced infrastructure and compute costs. I build real-time CDC and streaming pipelines that handle 100M+ events/day at 99.99% uptime. I also extend governance into AI workflows: I create Agentforce grounding and audit corpora with Snowflake Cortex AI and Snowpark, and build evaluation harnesses that use VARIANT to support production-ready, Trust Layer-compliant outcomes.
Experience
Work history, roles, and key accomplishments
Architected a zero-copy activation layer integrating Snowflake data directly into Salesforce Data 360, eliminating duplicate ETL and enabling real-time freshness across Sales, Service, Marketing, and Agentforce use cases. Improved federated query performance to sub-second latency while reducing monthly compute costs, and built governed AI grounding/orchestration workflows using Snowflake Cortex AI
Delivered enterprise Snowflake Data Cloud migrations using a medallion lakehouse architecture, improving query performance by 80% while significantly reducing infrastructure and compute costs. Built automated pipeline orchestration and real-time Kafka/Snowpipe streaming pipelines handling 100M+ events/day at 99.99% uptime, and optimized workloads with Snowpark (Python) to achieve up to 90x faster
Education
Degrees, certifications, and relevant coursework
University of California, Berkeley
Bachelor of Science, Electrical Engineering and Computer Science
2013 - 2017
Earned a Bachelor of Science in Electrical Engineering and Computer Science from UC Berkeley (2013–2017).
Tech stack
Software and tools used professionally
Splunk
Apache Spark
Superset
Metabase
GitHub
GitLab
Kubernetes
Jenkins
CircleCI
GitHub Actions
GitLab CI
Salesforce
PySpark
Debezium
dbt
Django
Spring Boot
Databricks
Terraform
Azure DevOps
Java
MLflow
Kafka
FastAPI
Grafana
Prometheus
OpenTelemetry
Datadog
OpenSearch
Airflow
Apache Beam
SQL
Buildkite
Apache Iceberg
Hex
Delta Lake
Harness
Lightdash
Bash
Dive
Agentic
OpenLineage
Dynamic
Unity Catalog
Agentforce
Lakeflow
