Brian Zhou
@brianzhou
Staff ML infrastructure and backend engineer specializing in scalable AI platforms and production ML.
What I'm looking for
I am a Staff Machine Learning Infrastructure Engineer with over 12 years of experience building scalable backend systems and AI platforms across startups and enterprise environments. I focus on distributed training systems, ML platform architecture, large-scale data pipelines, and reliable production deployment.
At Databricks I architected distributed ML training orchestration services supporting thousands of daily jobs, built Spark pipelines processing 5+ TB daily, and improved training pipeline performance by 40% through caching and optimization.
Previously, at Cisco, I designed distributed microservices, data pipelines processing billions of telemetry events, and ML monitoring systems that detect drift and performance degradation. I also increased inference throughput by 30%.
I combine deep backend and distributed-systems engineering practice with strong MLOps and cloud experience (AWS, GCP, Kubernetes) to improve developer productivity and to deliver robust CI/CD and scalable model lifecycle management for production ML.
Experience
Work history, roles, and key accomplishments
Databricks
Architected distributed ML training orchestration and backend model lifecycle services supporting thousands of daily jobs, along with pipelines processing 5+ TB daily; improved training pipeline performance by 40% and reduced deployment time by 60%.
Cisco
Designed distributed microservices and data pipelines processing billions of telemetry events daily, implemented ML monitoring to detect data drift, and increased inference throughput by 30%.
Machine Learning Engineer
NovaAI
Jan 2014 - Jan 2017 (3 years)
Designed scalable data pipelines and backend services for training, evaluation, and model rollout, improving inference latency and enabling experimentation for conversational intelligence features.
Backend Software Engineer
BrightCommerce
Jan 2011 - Jan 2014 (3 years)
Built scalable APIs, asynchronous backend services, and ETL pipelines for product catalog and order processing, optimizing database queries to improve response times by 25%.
Education
Degrees, certifications, and relevant coursework
University of California, San Diego
Master of Science, Computer Science
Focused on advanced topics relevant to machine learning and systems engineering.
Shanghai Jiao Tong University
Bachelor of Engineering, Computer Science
Coursework supporting backend systems and large-scale data processing.
