Looking for a job

Michael Li

@michaelli1

Senior data engineer building reliable Spark/Databricks pipelines to power analytics and ML products.

United States

What I'm looking for

I’m looking to build end-to-end data platforms and ELT/ETL systems with modern tooling, partner closely with product and ML teams, and deliver measurable gains in data latency, reliability, and decision-making with clean, reusable models.

I’m a Senior Data Engineer who designs and ships production-grade ELT/ETL and real-time data pipelines that make analytics and machine learning possible at scale. I’ve built Spark-based pipelines in Databricks and used streaming feature pipelines to improve ETA prediction and location ranking for search and routing systems.

At Lyft, I designed Medallion architecture and dbt models in Snowflake, built feature pipelines to support ML teams, and translated data/ML requirements into scalable architectures with product, mapping, and data science partners. I improved pickup/dropoff accuracy and reduced data latency by 17% for routing and search systems, and helped reduce deployment friction across ML teams.

I also focus heavily on reliability and data quality. I implemented Airflow workflows and monitoring for core mapping and mobility datasets, reducing incidents in critical operational datasets by 19%, and delivered semantic layers and curated datasets for pickup/dropoff funnel and driver supply-demand metrics, which cut average pickup time by 7% across major markets.

Previously at DigitalOcean and as a Data Analyst, I scaled batch and real-time pipelines with Airflow, Spark, and Kafka, re-architected transformations into bronze/silver/gold layers with dbt, and established data quality checks to reduce data incidents and discrepancies. I’m known for building simple, dependable systems that create measurable business impact while enabling self-service analytics.

Experience

Work history, roles, and key accomplishments

Lyft
Current

Senior Data Engineer

Jan 2023 - Present (3 years 3 months)

Built Spark-based ELT/ETL pipelines on Databricks to ingest ride, GPS, and map signal data, improving pickup/dropoff accuracy and reducing data latency by 17% for routing and search. Designed Medallion architecture and dbt models in Snowflake, and developed streaming feature pipelines for ML ETA prediction and location ranking to improve model consistency and reduce deployment friction.

DigitalOcean

Senior Data Engineer

Mar 2019 - Dec 2022 (3 years 9 months)

Designed and scaled batch and real-time data pipelines with Airflow, Spark, and Kafka, improving data availability latency by 27% for growth marketing and analytics teams. Re-architected transformations into dbt bronze/silver/gold layers, reducing downstream inconsistencies by 25%, and implemented dbt tests and Airflow monitoring to cut data incidents by 30%.

University of Florida

Research Assistant

Apr 2017 - Dec 2017 (8 months)

Developed SQL/Python scripts to extract and transform data from multiple sources, improving data accessibility and reducing manual analysis effort for research teams. Collaborated with faculty and researchers to turn research questions into data-driven analyses supporting publications and project deliverables.

Education

Degrees, certifications, and relevant coursework

University of Florida

Master of Science, Computer Science

2015 - 2017

Grade: 3.8

Earned a Master of Science in Computer Science at the University of Florida from 2015 to 2017.

University of Florida

Bachelor of Science, Computer Science

2011 - 2015

Earned a Bachelor of Science in Computer Science at the University of Florida from 2011 to 2015.
