Lu Zheng
@luzheng
Senior data engineer building distributed streaming and compliant data platforms at billion-event scale.
What I'm looking for
I am a Senior Data Engineer with 10+ years of experience building scalable data platforms, distributed systems, and cloud-based architectures.
My work has focused on designing and operating large-scale data pipelines, ETL/ELT workflows, and real-time streaming systems using tools like Kafka, Spark, Airflow, and dbt. I’ve worked across AWS, GCP, and Azure environments, building data systems that support analytics, machine learning, and production decision-making at scale.
I have experience working with very large datasets in production environments, including data ingestion, transformation, and modeling for domains such as healthcare, advertising, and enterprise analytics. I care a lot about data reliability, system performance, and making sure data is actually usable for downstream teams.
Most of my recent work has involved solving real production challenges like data quality issues, schema evolution, pipeline stability, and optimizing performance of large distributed systems.
I enjoy working close to architecture and engineering teams to design systems that are simple, scalable, and reliable in the real world.
Experience
Work history, roles, and key accomplishments
Kafka-based exactly-once deletion platform processing 10M+ requests/month with a 99.95% SLA. Built Flink stream processing, Airflow pipelines, and multi-region Kubernetes (EKS) infrastructure with Terraform and CI/CD. Designed ML feature pipelines, vector databases, and monitoring systems. Improved database performance, implemented security and compliance controls, and mentored engineers to accelerate delivery.
Kafka and Spark Structured Streaming with Avro and exactly-once processing improved reliability to 99%+ for 5M+ events/day. Optimized PostgreSQL and Redis, reducing p95 latency by 30% and surfacing slow queries. Built OpenTelemetry tracing, Kafka monitoring, and DLQ alerts, cutting incident response time from 4 hours to 90 minutes and improving system reliability and observability.
Built an ML feature store supporting 10+ teams with Kafka, Spark, PostgreSQL CDC, and Redis caching, handling 500M+ events/day with point-in-time correct features and versioning. Built an Airflow/Spark framework adopted by 30+ engineers, cutting pipeline delivery time by 60%. Improved Kubernetes CI/CD with Terraform and Jenkins, reducing on-call load from 15 to 4 hours/week.
Led a Redshift-to-Snowflake migration using Terraform, improving query speed by 40% and reducing cost by 25% with clustering and materialized views. Built an ingestion platform processing 50M+ records/day from multiple sources, with idempotent pipelines powering real-time bidding. Implemented Datadog observability with SLOs, error budgets, and data quality monitoring.
Built a claims ETL pipeline processing 170M+ insurance claims at 2x throughput using partition-aware incremental loading, parallel HDFS ingestion, and data validation. Designed HIPAA-compliant star schema models with AES-256 encryption, RBAC, and optimized Vertica projections, enabling sub-second queries and scalable analytics across large healthcare datasets.
Education
Degrees, certifications, and relevant coursework
Stanford University
Master of Science in Computer Science
2012 - 2014
Completed an M.S. in Computer Science at Stanford University from 2012 to 2014.
Stanford University
Bachelor of Science in Computer Science
2008 - 2012
Completed a B.S. in Computer Science at Stanford University from 2008 to 2012.
Tech stack
Software and tools used professionally
Quantcast
Apache Spark
Apache Flink
GitHub
Kubernetes
Cloudflare
Jenkins
GitHub Actions
Salesforce
PySpark
dbt
MySQL
PostgreSQL
Vertica
Gmail
Rollout
Node.js
Google Analytics
Redis
Terraform
React
JavaScript
Python
Java
Kafka
PagerDuty
Grafana
Prometheus
OpenTelemetry
Datadog
OpenSearch
Avro
TypeScript
OAuth2
Docker
Airflow
SQL
Apache Iceberg
Delta Lake
PgBouncer
Burn
Bash
pgvector
Faiss
Column
Task
Namespace
Movement
