Open to opportunities

Peter Wong

@peterwong

Senior Data Engineer building governed lakehouse and real-time streaming platforms for regulated healthcare.

United States

What I'm looking for

I’m looking for a senior data engineering role where I can build governed lakehouse and real-time streaming platforms, apply agentic data quality/observability, and deliver measurable performance and cost wins for analytics and ML teams.

I’m a Senior Data Engineer with 12+ years of experience building hyperscale data platforms across Google, Databricks, and healthcare at Optum. I focus on designing governed, AI-ready systems that improve performance, reduce cost, and accelerate time-to-insight in high-stakes environments.

At Optum, I architected HIPAA-compliant real-time streaming pipelines using Apache Kafka, Debezium CDC, and Databricks Spark Structured Streaming to process 5M+ patient events daily with sub-5-minute latency. I also designed and deployed medallion lakehouse architectures on Delta Lake with Unity Catalog governance and Apache Iceberg compatibility, cutting query latency by 75% and lowering storage costs while supporting multimodal RAG and AI use cases.

I lead agentic AI-assisted data quality and observability with zero-ETL integrations, eliminating 90% of manual validation and achieving 99.99% data freshness SLAs across 200+ consumers. I optimize ELT orchestration with dbt and Apache Airflow for petabyte-scale datasets, and I’ve built production feature stores and contract-first ingestion capabilities that accelerate ML deployment cycles by 10x. This work builds on earlier lakehouse and streaming platform migrations at Databricks and foundational hyperscale pipelines on Google Cloud.

Experience

Work history, roles, and key accomplishments

Optum
Current

Senior Data Engineer

May 2021 - Present (5 years 1 month)

Architected HIPAA-compliant real-time streaming pipelines with Kafka/Debezium and Databricks Spark, processing 5M+ patient events daily with sub-5-minute latency and improving predictive readmission accuracy by 22%. Designed a Delta Lake medallion lakehouse with Unity Catalog governance, cutting query latency by 75% and lowering storage costs while delivering a 99.99% data freshness SLA across 200+ consumers.

Databricks

Senior Data Engineer

May 2017 - Apr 2021 (3 years 11 months)

Delivered enterprise lakehouse migrations using Delta Lake, Apache Iceberg, and Unity Catalog, improving query performance by 80% and reducing infrastructure costs for Fortune 500 customers. Built real-time CDC and streaming pipelines handling 100M+ events/day at 99.99% uptime, and developed reusable Databricks workflow patterns that cut pipeline development time by 70%.

Google

Data Engineer

Sep 2015 - May 2017 (1 year 8 months)

Designed and scaled production data pipelines with Google Cloud (Dataflow/Apache Beam, BigQuery) to process multi-petabyte datasets with sub-second latency. Led migration of legacy Hadoop workloads to cloud-native Pub/Sub + Dataflow + BigQuery, reducing operational overhead by 60% and enabling real-time analytics, while implementing governance controls achieving 99.9% data reliability.

Education

Degrees, certifications, and relevant coursework

The University of Texas at Austin

Bachelor of Science, Computer Science

2011 - 2015

Earned a Bachelor of Science in Computer Science from The University of Texas at Austin (2011–2015).
