Open to opportunities

Henry Phan

@henryphan

Senior data engineer building cloud-native platforms, reducing latency, and improving data quality.

United States

What I'm looking for

I’m looking for a Senior Data Engineer role where I can build cloud-native data platforms, own scalable ETL/ELT and streaming pipelines, and drive observability, cost optimization, and automated testing that help teams move faster.

I’m a data engineer with 7 years of experience building cloud-native data platforms and analytics solutions across energy and software, focused on scalable pipelines, measurable performance, and trustworthy data. I’ve used Python, SQL, Airflow, dbt, Spark, and Kafka to build ETL/ELT pipelines that reduce latency, improve data quality, and enable self-serve analytics.

Most recently, I designed a GCP data lake with BigQuery, Cloud Storage, and Dataflow, reducing query time by 60%, and delivered near-real-time order processing with Kafka and Dataflow, cutting order-to-visibility latency from minutes to seconds. I also cut BigQuery costs by 30% and introduced observability with Prometheus and Grafana, improving incident response times by 45%, while mentoring engineers and building CI/CD for safer dbt and Airflow deployments.

Experience

Work history, roles, and key accomplishments


Data Engineer

Umbrage

Aug 2023 - May 2025 (1 year 9 months)

Led the migration of legacy ETL jobs from on-prem Hadoop to Dataproc and Cloud Storage, making processing 3x faster and reducing infrastructure costs by 25%. Built event-driven ingestion and modular Airflow pipelines, cutting monthly manual interventions during failures by 70%.


Data Engineer

Bluware

Jan 2021 - Sep 2023 (2 years 8 months)

Built scalable Spark workflows for seismic and well log data, making processing 4x faster and accelerating interpretation cycles. Developed ETL pipelines and dbt/SQL transformations to improve data integrity and downstream ML feature accuracy.


Data Engineer

EnergyMakers Advisory Group

Jan 2019 - Jan 2021 (2 years)

Designed a centralized data platform consolidating SCADA, meter, and market data, improving asset-optimization visibility and reducing retrieval times by 50%. Built ETL pipelines and scheduled Spark/Airflow jobs to normalize time-series data and cut manual reconciliation effort by 65%.

Education

Degrees, certifications, and relevant coursework


Rice University

Master of Science in Subsurface Data Science

2018 - 2019

Completed a Master of Science in Subsurface Data Science at Rice University from 2018 to 2019.


Texas A&M University

Bachelor of Science in Geology/Earth Science

2015 - 2017

Completed a Bachelor of Science in Geology/Earth Science at Texas A&M University from 2015 to 2017.
