Stephen Koo
@stephenkoo1
Senior Software Engineer specializing in scalable AI and LLM infrastructure.
What I'm looking for
I am a Senior Software Engineer with 8+ years of experience building scalable backend systems, AI inference platforms, and production LLM services across multi-region cloud environments. I design and operate Kubernetes-based ML serving, optimize inference pipelines for low latency and high availability, and integrate generative AI using Vertex AI and Gemini to deliver user-facing AI features.
I have led the architecture and implementation of inference orchestration services, improved p95 latency through batching and caching, and scaled platforms on Cloud Spanner, Pub/Sub, and GKE to sustain 99.99% availability. I collaborate cross-functionally, mentor engineers, and build tooling and control-plane UIs that support experimentation, monitoring, and reliable rollouts of AI capabilities.
Experience
Work history, roles, and key accomplishments
Architected and built AI inference orchestration and ML inference platforms for Gmail and Google Chat, reducing p95 latency by up to 30%, increasing throughput 3×, and sustaining 99.99% multi-region availability for production LLM services.
Education
Degrees, certifications, and relevant coursework
Stanford University
Master of Science, Computer Science (Artificial Intelligence)
2015 - 2017
Completed a Master of Science in Computer Science with a focus on Artificial Intelligence, covering advanced AI topics and research-oriented coursework.
Stanford University
Bachelor of Science, Computer Science (Artificial Intelligence)
2011 - 2015
Completed a Bachelor of Science in Computer Science with a focus on Artificial Intelligence, including foundational coursework in algorithms, systems, and AI.
Tech stack
Software and tools used professionally
JDA
GitHub
GitLab
Kubernetes
GitHub Actions
Salesforce
MySQL
PostgreSQL
MongoDB
Shopify
Gmail
Node.js
Spring Boot
Next.js
Tailwind CSS
Web Components
Lightning Web Components
Neo4j
Redis
Terraform
Webpack
JavaScript
Java
HAProxy
TensorFlow
PyTorch
Kafka
RabbitMQ
Prometheus
Crossplane
Linux
Windows
Datadog
Google Workspace
Gemini
Elasticsearch
AWS Lambda
Webflow
JUnit
Airflow
SQL
Google Kubernetes Engine
LangChain
Plane
Cursor
GitHub Copilot
Increase
Remote
