Kevin Wang
@kevinwang8
Senior software engineer delivering scalable AI inference platforms, distributed systems, and RAG-powered experiences at global scale.
What I'm looking for
I’m a Senior Software Engineer with 10+ years of experience building scalable backend systems, AI infrastructure, and production LLM platforms. I focus on distributed systems, Kubernetes-based ML serving, and generative AI integration to deliver real product impact.
At Google Workspace AI Core Platform, I architected and built an AI Inference Orchestration Service in Go that routes requests between Workspace surfaces and ML models. I improved inference throughput by optimizing request batching and adding in-memory caching, which reduced P95 latency by 30%. I also designed routing and caching that sustained 99.99% availability across 5 global regions.
I’ve built the end-to-end experience layer as well: creating a Node.js gateway on Cloud Run to aggregate responses for AI prompt construction, and delivering UI capabilities with React.js and Next.js, including a control plane for model configuration, A/B testing, and real-time performance monitoring. I enabled low-latency interactions using SSR, streaming responses, and Next.js API routes.
I also lead work on secure, observable ML infrastructure: scaling distributed ML inference on GKE with Cloud Spanner and Pub/Sub, implementing mTLS, network policies, and VPC Service Controls, and integrating RAG components to improve contextual grounding and reduce hallucinations. I’ve mentored 5 engineers, driving 3 promotions and supporting 2 successful transitions into ML engineering teams.
Experience
Work history, roles, and key accomplishments
Senior Software Engineer
Oct 2017 - Present (8 years 8 months)
Architected and built an AI inference orchestration service for Google Workspace (Gmail/Chat), improving inference throughput and reducing P95 latency by 30% via Go batching and in-memory caching. Designed ML inference routing on GKE across 5 global regions with 99.99% availability, implemented RAG with semantic embeddings to reduce hallucinations, and integrated Vertex AI/Gemini for AI-assisted d
Education
Degrees, certifications, and relevant coursework
Stanford University
Master of Science, Computer Science (Artificial Intelligence)
2011 - 2017
Earned an M.S. in Computer Science (Artificial Intelligence) at Stanford University between 2011 and 2017.
Tech stack
Software and tools used professionally
OpenAPI
RAML
Apache Flink
JDA
Kubernetes
Salesforce
MySQL
PostgreSQL
MongoDB
Shopify
Gmail
Node.js
Spring Boot
.NET Core
Next.js
.NET
Tailwind CSS
Neo4j
Redis
Terraform
Webpack
JavaScript
Java
TensorFlow
PyTorch
Kafka
RabbitMQ
Prometheus
Crossplane
Datadog
Google Workspace
Gemini
Elasticsearch
AWS Lambda
Webflow
JUnit
Airflow
SQL
Google Kubernetes Engine
LangChain
Plane
LangGraph
Increase
