Open to opportunities

Henry Phan

@henryphan

Senior data engineer building cloud-native platforms, reducing latency, and improving data quality.

United States

What I'm looking for

I’m looking for a Senior Data Engineer role where I can build cloud-native data platforms, own scalable ETL/ELT and streaming pipelines, and drive observability, cost optimization, and automated testing that help teams move faster.

I’m a data engineer with 7 years building cloud-native data platforms and analytics solutions across energy and software, focused on scalable pipelines, measurable performance, and trustworthy data. I’ve used Python, SQL, ETL/ELT, Airflow, dbt, Spark, and Kafka to reduce ETL latency, improve data quality, and enable self-serve analytics.

Most recently, I designed a GCP data lake on BigQuery, Cloud Storage, and Dataflow that reduced query time by 60%, and delivered near-real-time order processing with Kafka and Dataflow that cut order-to-visibility latency from minutes to seconds. I also cut BigQuery costs by 30%, added Prometheus and Grafana observability that improved incident response times by 45%, mentored engineers, and built CI/CD for safer dbt and Airflow deployments.

Experience

Work history, roles, and key accomplishments


Data Engineer

Umbrage

Aug 2023 - May 2025 (1 year 9 months)

Led migration of legacy ETL jobs from on-prem Hadoop to Dataproc and Cloud Storage, making processing 3x faster and cutting infrastructure costs by 25%. Built event-driven ingestion and modular Airflow pipelines, reducing monthly manual intervention during failures by 70%.


Data Engineer

Bluware

Jan 2021 - Sep 2023 (2 years 8 months)

Built scalable Spark workflows for seismic and well log data, making processing 4x faster and accelerating interpretation cycles. Developed ETL pipelines and dbt/SQL transformations to improve data integrity and downstream ML feature accuracy.


Data Engineer

EnergyMakers Advisory Group

Jan 2019 - Jan 2021 (2 years)

Designed a centralized data platform consolidating SCADA, meter, and market data, improving asset-optimization visibility and reducing retrieval times by 50%. Built ETL pipelines and scheduled Spark/Airflow jobs to normalize time-series data and cut manual reconciliation effort by 65%.

Education

Degrees, certifications, and relevant coursework


Rice University

Master of Science in Subsurface Data Science

2018 - 2019

Completed a Master of Science in Subsurface Data Science at Rice University from 2018 to 2019.


Texas A&M University

Bachelor of Science in Geology/Earth Science

2015 - 2017

Completed a Bachelor of Science in Geology/Earth Science at Texas A&M University from 2015 to 2017.
