Open to opportunities

Jerry Su

@jerrysu

Senior data engineer specializing in scalable ETL/ELT, data warehousing, and real-time analytics pipelines on AWS and Spark.

United States

What I'm looking for

I’m looking for a role where I can build scalable ETL/ELT and data warehousing platforms, improve reliability and SLAs, and deliver real-time analytics that help analytics, reporting, and machine learning teams make faster, data-driven decisions.

I’m a Senior Data Engineer with extensive experience designing and building scalable data platforms, ETL/ELT pipelines, and real-time data processing systems. I focus on modern data architectures—data lakes, lakehouse patterns, and cloud data warehouses—delivering reliable analytics foundations for business and machine learning use cases.

At CVS Health, I built end-to-end healthcare claims and pharmacy analytics pipelines using Python, SQL, Airflow, AWS Glue, and Databricks. I implemented a Medallion architecture with Delta Lake on Amazon S3, optimized Spark workloads for efficiency, and developed Snowflake warehouse models with dbt to reduce errors and accelerate deployments. I also enabled near real-time ingestion via Kinesis and improved data quality with Great Expectations.

Previously at Stripe and Amazon, I engineered low-latency event streaming with Apache Kafka and Spark Structured Streaming for fraud detection and risk scoring, and delivered batch + streaming financial analytics pipelines. I also built metadata-driven reporting in Redshift, automated data quality across 100+ datasets, and partnered with cross-functional teams to turn complex data into trustworthy, actionable insights.

Experience

Work history, roles, and key accomplishments

Current

Senior Data Engineer

CVS Health

Apr 2023 - Present (3 years 2 months)

Designed and built healthcare claims and pharmacy analytics ETL/ELT pipelines processing millions of records using Python, SQL, Spark, and AWS Glue/Databricks. Implemented Medallion (Bronze/Silver/Gold) on Delta Lake, built Snowflake/dbt models, and reduced data latency via Kinesis-based near real-time ingestion while improving data quality through validation frameworks and monitoring.

Stripe

Data Engineer

Mar 2019 - Mar 2023 (4 years)

Enhanced real-time payment event streaming pipelines with Kafka and Kafka Streams, enabling low-latency processing of millions of transactions daily for fraud detection and risk scoring. Built batch and streaming ETL/ELT workflows with Spark, EMR, S3, and Snowflake, and migrated selected batch jobs to near real-time using Structured Streaming to improve timeliness.

Education

Degrees, certifications, and relevant coursework

University of California, Merced

Bachelor of Science, Computer Science

2009 - 2013

Earned a Bachelor of Science in Computer Science at the University of California, Merced.
