Open to opportunities

Bob H

@bobh

Senior data engineer specializing in large-scale ML data platforms, pipelines, and research enablement.

United States

What I'm looking for

I seek roles where I can build reliable, scalable ML and research data platforms, partner with researchers, and improve data quality and observability.

I am a senior data engineer with deep experience building and operating large-scale data platforms for ML and research. I have supported high-throughput training, evaluation, and production analytics for conversational models and helped scale data systems for mRNA vaccine research.

At OpenAI I worked on the data systems behind ChatGPT, owning pipelines that ingest multi-terabyte daily language datasets and building dataset versioning, validation, and lineage to ensure trustworthy training data. I partnered closely with researchers and ML teams to deliver feature-ready datasets, embedding and vector pipelines, and near-real-time streaming signals for safety and performance monitoring.

At Moderna I helped integrate experimental, genomics, and clinical data to enable reproducible research and predictive modeling in a regulated environment. I implemented data lineage, versioning, and auditability, and optimized availability and processing performance to accelerate analysis for vaccine R&D.

Earlier, I built enterprise healthcare data lakes and ETL pipelines with strong data quality and HIPAA-compliant controls while helping migrate systems toward cloud platforms. I focus on reliability, observability, cost efficiency, and enabling teams to trust and act on their data.

Experience

Work history, roles, and key accomplishments

OpenAI
Current

Senior Data Engineer

Jan 2024 - Present (2 years 2 months)

Built and maintained large-scale data systems for ChatGPT training, evaluation, and production analytics, owning multi-terabyte ingestion pipelines and improving dataset trust through versioning and validation.

Moderna

Senior Data Engineer

Mar 2020 - Dec 2023 (3 years 9 months)

Developed data platforms and pipelines for COVID-19 mRNA vaccine R&D and clinical analytics, enabling reproducible research through lineage, versioning, and auditability in a regulated environment.

Data Engineer

Sentara Health

Aug 2017 - Mar 2020 (2 years 7 months)

Built enterprise data lake and ETL pipelines integrating EHR, claims, and operational data, improving data quality and implementing HIPAA-compliant access controls while migrating toward Azure.

Education

Degrees, certifications, and relevant coursework

Florida International University

Bachelor of Science, Computer Science

2013 - 2017

Completed a Bachelor of Science in Computer Science, which provided the foundation for roles in data engineering and analytics.

