Open to opportunities

Brian Zhou

@brianzhou

Staff ML infrastructure and backend engineer specializing in scalable AI platforms and production ML.

United States

What I'm looking for

I seek senior roles building scalable ML infrastructure or platform services where I can drive reliability, developer productivity, and production ML deployments at scale.

I am a Staff Machine Learning Infrastructure Engineer with over 12 years building scalable backend systems and AI platforms across startups and enterprise environments. I focus on distributed training systems, ML platform architecture, large-scale data pipelines, and reliable production deployment.

At Databricks I architected distributed ML training orchestration services supporting thousands of daily jobs, built Spark pipelines processing 5+ TB daily, and improved training pipeline performance by 40% through caching and optimization.

Previously at Cisco I designed distributed microservices, data pipelines processing billions of telemetry events, and ML monitoring systems that detect drift and performance degradation, and I increased inference throughput by 30%.

I combine deep engineering practice in backend and distributed systems with strong MLOps and cloud experience (AWS, GCP, Kubernetes) to deliver developer productivity, robust CI/CD, and scalable model lifecycle management for production ML.

Experience

Work history, roles, and key accomplishments

Databricks
Current

Staff ML Infrastructure Engineer

Jan 2021 - Present (5 years 2 months)

Architected distributed ML training orchestration and backend model lifecycle services supporting thousands of daily jobs and 5+ TB daily pipelines; improved training pipeline performance by 40% and reduced deployment time by 60%.

Cisco Systems

Senior ML Platform Engineer

Jan 2017 - Jan 2021 (4 years)

Designed distributed microservices and data pipelines processing billions of telemetry events daily, implemented ML monitoring to detect data drift, and increased inference throughput by 30%.

Education

Degrees, certifications, and relevant coursework

University of California, San Diego

Master of Science, Computer Science

Master of Science in Computer Science focused on advanced topics relevant to machine learning and systems engineering.

Shanghai Jiao Tong University

Bachelor of Engineering, Computer Science

Bachelor of Engineering in Computer Science with coursework supporting backend systems and large-scale data processing.
