Open to opportunities

Justin Feng

@justinfeng

I’m a data engineer building scalable batch and streaming analytics platforms that deliver near-real-time insights.

United States

What I'm looking for

I’m looking to build product-driven data platforms—owning batch + streaming pipelines, data quality/governance, and analytics performance—while partnering with product teams and mentoring engineers to deliver fast, trustworthy insights.

I’m a data engineer focused on building scalable data platforms and analytics systems that drive product decisions. As a Staff Software Engineer in Data Engineering, I helped design and deliver a Snowflake-backed merchant analytics platform ingesting and transforming 10B+ daily commerce events via Kafka and Apache Spark.

I build analytics-grade data models and transformations with dbt and SQL, delivering <15-minute data freshness across dashboards and APIs. I also lead hybrid streaming + batch pipeline architecture using Spark Structured Streaming, Airflow, and Snowflake ingestion patterns—reducing end-to-end latency by ~40% and improving reliability to 99.9%+ SLA.

I’m equally committed to correctness and operational excellence: I set data quality, observability, and governance standards to cut customer-visible incidents by ~50% and accelerate experimentation. From an event-driven insurance platform to batch warehouse ETL with Airflow, Python, Spark, and Redshift, I consistently translate business requirements into feature-ready datasets while optimizing cost/performance and mentoring engineers on modern cloud data architecture.

Experience

Work history, roles, and key accomplishments

Shopify
Current

Staff Software Engineer

Jul 2025 - Present (11 months)

Designed and delivered a Snowflake-backed merchant analytics platform ingesting and transforming 10B+ daily commerce events via Kafka and Apache Spark, enabling near-real-time customer analytics with <15-minute data freshness. Built hybrid streaming + batch pipelines (Spark Structured Streaming, Airflow) to cut end-to-end latency ~40% and improve reliability to 99.9%+ SLA while leading data qualit

Senior Data Engineer

Homesite Insurance

Feb 2021 - Jul 2025 (4 years 5 months)

Designed an event-driven data platform for the full policy lifecycle, ingesting 300M+ daily events into AWS S3 and transforming them in Databricks (Spark) for downstream analytics and product features. Built real-time underwriting analytics with <30-minute quote-level signals, implemented a Snowflake analytics warehouse with dbt powering 100+ dashboards, and improved operational outcomes including

Data Engineer

Engage3

May 2019 - Feb 2021 (1 year 9 months)

Designed and implemented scalable batch data pipelines on AWS to ingest and normalize multi-channel customer engagement data (email, web, and campaign events) using Python, Apache Spark, and Amazon S3. Built and maintained a Redshift-based analytical warehouse with Airflow-orchestrated ETL, improving pipeline reliability and reducing end-user report latency ~40% through Spark tuning and Redshift q

Education

Degrees, certifications, and relevant coursework

Georgia Institute of Technology

Master of Science, Computer Science

2019 - 2022

Master of Science in Computer Science at Georgia Institute of Technology (2019–2022).

California State University

Bachelor's Degree, Business Information Systems

2013 - 2017

Bachelor's degree in Business Information Systems at California State University (2013–2017).
