Open to opportunities

Van Duyen Mai

@vanduyenmai

I’m a data engineer building end-to-end pipelines and AI-driven analytics on cloud platforms.

Vietnam

What I'm looking for

I’m looking for a data engineering role where I can own end-to-end pipelines and data quality, build AI-ready services, and deliver measurable performance gains using AWS/Azure and modern tooling like dbt and Dagster.

I’m a Data Engineer who partners cross-functionally to turn business requirements into AI-driven analytics. At Corsair, I built and operated end-to-end data pipelines using Docker, Dagster, dbt/Spark, and Snowflake/Fabric on AWS EC2—following Kimball dimensional modeling—with Power BI as the reporting layer.

I also develop AI-ready infrastructure, including backend services with FastAPI and MCP servers with FastMCP. I’ve implemented monitoring and data quality checks with Grafana and Elementary, cut pipeline runtime by ~70% (from 6h to <2h), and delivered NLP projects such as sentiment analysis and an AI chatbot built with LangGraph, FastAPI, and Postgres.

Beyond pipeline work, I’ve replaced third-party ingestion with internal pipelines (Shopify, PayPal, Google Analytics) to reduce platform cost by ~$20K/year, and built a CDP MCP server connecting Claude to 10+ live data sources. I design for governance, security, and reliability, integrating semantic-layer architecture and securing access via Azure Entra ID, Key Vault, and Managed Identity.

Experience

Work history, roles, and key accomplishments

Current

Data Engineer

Corsair

Sep 2023 - Present (2 years 9 months)

Built and operated end-to-end AWS data pipelines and AI services (dbt/Spark, Dagster, Snowflake/Fabric, FastAPI) using Kimball modeling, reducing processing time ~70% (6h to <2h). Replaced third-party ingestion and forecasting components to cut platform cost by ~$20K/year and save ~$100K/year, and built LLM workflows and an MCP server connecting Claude to 10+ live data sources secured with Azure E

Data Engineer

PTSC

Sep 2025 - Dec 2025 (3 months)

Developed full ETL (Landing→Bronze→Silver→Gold) and ran a POC with parallel dbt projects (Trino/MinIO and Spark/Fabric/OneLake). Implemented data validation using Elementary, dbt tests, and Dagster scheduling, and designed data governance workflows with Purview and OpenMetadata.

Education

Degrees, certifications, and relevant coursework

Ho Chi Minh City University of Technology (HCMUT)

Bachelor of Engineering, Computer Science

2018 - 2022

Activities and societies: Thesis: developed a mobile application to manage online classroom and tutoring; coursework included statistics and machine learning.

Bachelor of Engineering in Computer Science at HCMUT from August 2018 to November 2022.
