Open to opportunities

Derek Dai

@derekdai

Senior Data Analytics Engineer building scalable ETL/ELT and real-time analytics on AWS.

United States

What I'm looking for

I’m looking to build and scale end-to-end analytics pipelines and real-time streaming platforms—partnering with Product and ML to deliver trusted, governed data products, automate CI/CD, and optimize performance for measurable cost and reliability gains.

I’m a Senior Data Analytics Engineer with 10+ years of experience building scalable ETL/ELT pipelines, real-time analytics platforms, and cloud-based data solutions. I specialize in Python, SQL, PySpark, and streaming architectures, with a strong focus on data modeling, performance optimization, and data governance.

At Whatnot, I designed and scaled end-to-end analytics pipelines for the live-commerce marketplace using Apache Kafka, dbt, Snowflake, and AWS. I built high-performance ELT workflows and dimensional data models that reduced dashboard latency by 60% and improved data reliability for Product, Growth, and Executive teams. I also developed a real-time streaming analytics architecture with Kafka Streams, Spark Structured Streaming, and AWS S3 that reduced incident detection time by 45% and improved operational monitoring.

At Amazon, I delivered scalable ETL/ELT pipelines using Python, PySpark, AWS Glue, Apache Airflow, and Amazon S3—enabling near real-time analytics and reducing manual reporting efforts by 70%. I optimized large-scale SQL workloads in Amazon Redshift, Athena, and Hive to improve dashboard response times by 45%. I also engineered streaming analytics pipelines with Amazon Kinesis, Kafka, Spark Streaming, and AWS Lambda to improve delivery accuracy and reduce incident response time by 40%, while strengthening reliability through automated validation and Git-based CI/CD workflows.

Earlier roles strengthened my breadth: I automated market-data pipelines and reporting workflows, and built centralized analytics marts and dashboards that improved operational decision-making. Across my career, I’ve created trusted semantic layers and self-service analytics by standardizing KPIs and taxonomies, implementing robust data quality frameworks, and optimizing compute costs (including a 30% reduction in Snowflake). I bring an engineering mindset to analytics so that teams can move faster with governed, trustworthy data that powers product and ML outcomes.

Experience

Work history, roles, and key accomplishments

Senior Data Analytics Engineer

Whatnot

Jun 2024 - May 2026 (1 year 11 months)

Designed and scaled end-to-end analytics pipelines for Whatnot’s live-commerce marketplace using Kafka, dbt, Snowflake, and AWS to deliver near real-time engagement and GMV visibility. Reduced dashboard latency by 60% and incident detection time by 45% while strengthening data reliability and governance for product and executive teams.

Data Analytics Engineer

Amazon

Sep 2022 - May 2024 (1 year 8 months)

Built scalable ETL/ELT pipelines with Python, PySpark, AWS Glue, Airflow, and S3 to enable near real-time analytics and reduce manual reporting by 70% for operations and business teams. Improved dashboard response times by 45% and reduced incident response time by 40% through Kinesis/Lambda streaming analytics and reliability optimizations across AWS services.

Data Engineer

Clutter

Feb 2020 - Aug 2022 (2 years 6 months)

Developed scalable ETL pipelines with Python, PySpark, Airflow, S3, Glue, and Redshift to process logistics, warehouse, and customer data while improving reliability and reducing manual reporting effort. Implemented near real-time streaming with Kafka/Kinesis and automated CI/CD data-quality workflows (Terraform, Docker, Kubernetes, Jenkins) to reduce pipeline failures and improve stability during

Data Analyst

Wag Labs

Jun 2018 - Jan 2020 (1 year 7 months)

Built marketplace dashboards in SQL, Python, Redshift, S3, and Tableau to track bookings, cancellations, utilization, and regional demand. Analyzed customer and product behavior with Mixpanel, Firebase, Amplitude, Looker, and Optimizely to improve onboarding funnels, retention, and A/B test outcomes.

Data Acquisition Services Analyst

Nisa Investment Advisors, LLC

Apr 2017 - May 2018 (1 year 1 month)

Automated market-data pipelines with SQL, Python, SSIS, Bloomberg API, and Refinitiv to improve pricing accuracy and reduce overnight processing failures by 35%. Built reporting and reconciliation workflows using Tableau, Excel VBA, SSRS, Control-M, and SQL Server to reduce manual work and improve SLA performance.

Consultant

Revenue Solutions

Jun 2014 - Mar 2017 (2 years 9 months)

Built tax compliance dashboards with Oracle, SQL Server, Tableau, Informatica, and Cognos to automate reporting, improve audit visibility, and reduce manual effort for revenue agencies. Supported legacy tax system migration by developing ETL pipelines and validating financial data using Python, PL/SQL, SSIS, Hadoop, AWS EC2/S3, and JIRA.

Education

Degrees, certifications, and relevant coursework

Washington University in St. Louis

Bachelor of Computer Science, Computer Science

Earned a Bachelor of Computer Science degree at Washington University in St. Louis.
