Open to opportunities

Nathan Leung

@nathanleung

Staff Full-Stack ML Engineer specializing in GenAI platforms, MLOps, and scalable production AI systems.

United States

What I'm looking for

I seek hands-on engineering roles building scalable GenAI/MLOps platforms where I can lead architecture, drive CI/CD and IaC, and enable data science self-service.

I am a Staff Full-Stack ML Engineer with 14+ years building production systems at scale, from ML training and deployment infrastructure to GenAI platforms. I design and deliver end-to-end solutions including RAG pipelines, LLM fine-tuning, multi-tenant AI SaaS, and event-driven services on AWS.

At AllCloud, I architected a multi-tenant GenAI platform using Bedrock, OpenSearch Serverless, EKS, and Terraform, and led fine-tuning and deployment of custom models. Previously, at Twitch and Amazon, I built large-scale ML training pipelines, model registries, feature pipelines, low-latency model serving, and production ad and payment systems.

I bring strong hands-on expertise across cloud infrastructure, Kubernetes, IaC, SageMaker, PyTorch/TensorFlow, and distributed systems, paired with leadership in architecture, GitOps-driven delivery, and enabling data scientists to self-serve ML workflows.

Experience

Work history, roles, and key accomplishments

Current

Staff Full-Stack ML Engineer

AllCloud

Sep 2023 - Present (2 years 5 months)

Architected a multi-tenant GenAI SaaS platform and end-to-end RAG pipelines using Amazon Bedrock and OpenSearch Serverless, enabling contextual recommendations and tenant-isolated, scalable inference. Delivered fine-tuned LLMs, Terraform IaC, and GitOps CI/CD to production, reducing operational complexity and standardizing LLM integrations.

Twitch

Senior ML Platform Engineer

Jul 2021 - Aug 2023 (2 years 1 month)

Built end-to-end SageMaker training and deployment pipelines, model registry, and feature pipelines ingesting billions of events, reducing model deployment time from days to hours and enabling reproducible, scalable ad-serving ML systems. Implemented real-time serving, canary rollouts, and A/B testing for production models.

Twitch

Senior Software Engineer

Apr 2016 - Jun 2021 (5 years 2 months)

Designed and launched interactive ad formats and the Bounty Board marketplace using Go microservices and DynamoDB, increasing ad engagement and enabling thousands of brand-streamer partnerships; led migrations from monolith to microservices and built A/B experimentation infrastructure.

Amazon

Software Development Engineer II

Jul 2011 - Apr 2016 (4 years 9 months)

Architected client-side and backend flows for Amazon Appstore IAP, built resilient transaction handling and server-side receipt verification, and developed search/ranking and promotional systems that supported large-scale app discovery and high-volume downloads.

Education

Degrees, certifications, and relevant coursework


University of Waterloo

Bachelor of Computer Science, Computer Science

2006 - 2011

Completed a Bachelor of Computer Science program focusing on software engineering and systems between 2006 and 2011.
