Bryan Schaefer
@bryanschaefer1
Lead Data Engineer building scalable data platforms for analytics and machine learning.
What I'm looking for
I am a Lead Data Engineer specializing in architecting and operating scalable data platforms that power analytics, reporting, and machine learning workloads. I design and implement both batch and streaming pipelines, cloud-native architectures, and distributed processing frameworks using modern data stack technologies.
At Flatiron Health, I directed development of 30 batch and streaming pipelines processing 5TB daily and integrated 12 healthcare data sources to deliver curated analytics datasets that support BI and ML teams. I established data reliability and observability frameworks across 50 production pipelines, reducing failure rates by 35%, and delivered feature-ready datasets for 20 ML models.
Previously, I engineered ETL and streaming pipelines at Flare and Uber, optimizing multi-terabyte workflows, improving query performance and pipeline throughput, and strengthening data quality monitoring to reduce incidents. I mentor engineers, have introduced CI/CD and modular pipeline standards, and translate complex business and clinical requirements into production-grade, maintainable data solutions.
I bring strong foundations in data modeling, orchestration, observability, and automated data quality controls, and I collaborate closely with engineering, analytics, and data science teams to ensure platform scalability, reliability, and long-term sustainability.
Experience
Work history, roles, and key accomplishments
Lead Data Engineer
Flatiron Health
Sep 2021 - Present (4 years 6 months)
Directed development of 30 batch and streaming pipelines processing 5TB daily to enable analytics and ML, established observability and reliability frameworks that reduced pipeline failures by 35%, and mentored a team of 6 engineers while improving deployment efficiency by 40%.
Senior Data Engineer
Flare
Dec 2018 - Aug 2021 (2 years 8 months)
Engineered 25 ETL and streaming pipelines ingesting 3TB daily from 10 systems, standardized transformation frameworks to reduce duplication by 30%, and optimized queries to cut average runtimes by 40%.
Data Engineer
Uber
Feb 2018 - Oct 2018 (8 months)
Built high-throughput Spark pipelines processing billions of ride and event records, improved pipeline throughput by 25% via partitioning strategies, and maintained reliability across 20 production workflows supporting operational analytics.
Data Analyst Intern
MindEase
Jan 2017 - Jan 2018 (1 year)
Analyzed operational datasets with SQL and Python to support 10 reporting dashboards, implemented validation checks that improved reporting accuracy by 20%, and produced BI dashboards tracking 15 KPIs.
Education
Degrees, certifications, and relevant coursework
Texas Tech University
Bachelor of Science, Computer Science
2013 - 2016
Grade: 3.8
Completed a Bachelor of Science in Computer Science with coursework in algorithms, data structures, distributed systems, and database systems; applied Python, Java, and SQL in academic projects.
