Open to opportunities

John Peng

@johnpeng

Senior AI Data Engineer with 10+ years building lakehouse platforms and Generative AI systems for real-time enterprise intelligence.

United States

What I'm looking for

I’m looking for a role where I can lead secure, governed enterprise data platforms and Generative AI (RAG and vector search) solutions that combine streaming, MLOps, and observability to deliver measurable outcomes, ideally in regulated domains.

I’m a Senior AI Data Engineer with 10+ years of experience designing large-scale data platforms and AI/ML systems across healthcare, digital marketplaces, and e-commerce. I build cloud-native analytics and scalable data products that turn complex data into measurable outcomes at enterprise scale.

At Abbott, I architected the next-generation AI and analytics platform for the FreeStyle Libre CGM ecosystem, processing billions of daily healthcare events with a medallion lakehouse architecture powered by Databricks, Delta Lake, Apache Iceberg, and Unity Catalog. I built AI-ready streaming and batch pipelines with Python, SQL, PySpark, Kafka, Debezium CDC, Spark Structured Streaming, Airflow, and AWS, enabling real-time analytics, machine learning, Generative AI, and governed healthcare intelligence.

I also lead advanced AI engineering, including agentic data engineering with LangGraph, OpenAI, Claude, and MLflow, plus enterprise RAG and vector search using LlamaIndex, Pinecone, and pgvector. Earlier, I scaled Airbnb’s petabyte-scale ETL and modernized its data warehouse ecosystem, and at Shopify I delivered SQL-based reporting and Tableau dashboards that improved decision-making and reduced manual effort.

Experience

Work history, roles, and key accomplishments

Abbott
Current

Senior AI Data Engineer

Nov 2020 - Present (5 years 7 months)

Architected Abbott's next-generation AI and analytics platform for the FreeStyle Libre CGM ecosystem, building a Medallion Lakehouse with real-time streaming pipelines and enterprise RAG/vector search capabilities. Established data governance, observability, and healthcare-compliance frameworks while delivering AI-ready datasets and BI/ML reporting.

Airbnb

Data Engineer

Aug 2016 - Nov 2020 (4 years 3 months)

Architected and scaled Airbnb's enterprise data platform for large-scale analytics, experimentation, and business intelligence. Built batch and near-real-time ETL pipelines, modernized the data warehouse ecosystem, and implemented data quality/governance while optimizing distributed query performance and reducing infrastructure costs.

Education

Degrees, certifications, and relevant coursework

University of the District of Columbia

Bachelor's Degree, Electrical and Electronics Engineering

2011 - 2015

Earned a Bachelor's degree in Electrical and Electronics Engineering from the University of the District of Columbia, completed in 2015.
