Open to opportunities

Nathan Leung

@nathanleung

Staff Full-Stack ML Engineer specializing in GenAI platforms, MLOps, and scalable production AI systems.

United States

What I'm looking for

I seek hands-on engineering roles building scalable GenAI/MLOps platforms where I can lead architecture, drive CI/CD and IaC, and enable data science self-service.

I am a Staff Full-Stack ML Engineer with 14+ years building production systems at scale, from ML training and deployment infrastructure to GenAI platforms. I design and deliver end-to-end solutions including RAG pipelines, LLM fine-tuning, multi-tenant AI SaaS, and event-driven services on AWS.

At AllCloud I architected a multi-tenant GenAI platform using Bedrock, OpenSearch Serverless, EKS, and Terraform, and led fine-tuning and deployment of custom models. Previously at Twitch and Amazon I built large-scale ML training pipelines, model registries, feature pipelines, low-latency model serving, and production ad and payment systems.

I bring strong hands-on expertise across cloud infrastructure, Kubernetes, IaC, SageMaker, PyTorch/TensorFlow, and distributed systems, paired with leadership in architecture, GitOps-driven delivery, and enabling data scientists to self-serve ML workflows.

Experience

Work history, roles, and key accomplishments

Current

Staff Full-Stack ML Engineer

AllCloud

Sep 2023 - Present (2 years 9 months)

Architected a multi-tenant GenAI SaaS platform and end-to-end RAG pipelines using Amazon Bedrock and OpenSearch Serverless, enabling contextual recommendations and tenant-isolated, scalable inference. Delivered fine-tuned LLMs, Terraform IaC, and GitOps CI/CD to production, reducing operational complexity and standardizing LLM integrations.

Twitch

Senior ML Platform Engineer

Jul 2021 - Aug 2023 (2 years 1 month)

Built end-to-end SageMaker training and deployment pipelines, model registry, and feature pipelines ingesting billions of events, reducing model deployment time from days to hours and enabling reproducible, scalable ad-serving ML systems. Implemented real-time serving, canary rollouts, and A/B testing for production models.

Twitch

Senior Software Engineer

Apr 2016 - Jun 2021 (5 years 2 months)

Designed and launched interactive ad formats and the Bounty Board marketplace using Go microservices and DynamoDB, increasing ad engagement and enabling thousands of brand-streamer partnerships; led migrations from monolith to microservices and built A/B experimentation infrastructure.

Amazon

Software Development Engineer II

Jul 2011 - Apr 2016 (4 years 9 months)

Architected client-side and backend flows for Amazon Appstore IAP, built resilient transaction handling and server-side receipt verification, and developed search/ranking and promotional systems that supported large-scale app discovery and high-volume downloads.

Education

Degrees, certifications, and relevant coursework

University of Waterloo

Bachelor of Computer Science, Computer Science

2006 - 2011

Completed a Bachelor of Computer Science program focusing on software engineering and systems between 2006 and 2011.
