Suki Xiao
@suqixiao
Staff AI software engineer building scalable LLM/RAG cloud platforms and improving cost, latency, and reliability.
What I'm looking for
I’m a Staff Software Engineer with 10+ years of experience delivering production-grade AI systems and cloud-native platforms. I build scalable distributed systems and AI-powered products that translate complex requirements into reliable, measurable outcomes in performance, cost, and speed to ship.
At AWS, I developed a multi-tenant LLM orchestration platform powering enterprise AI copilots for 1K+ concurrent users with sub-300ms response latency. I designed and owned an end-to-end RAG pipeline (OpenSearch + FAISS) that improved answer accuracy by 35% and reduced hallucinations, and I optimized the LLM request lifecycle to cut token costs by 28% and increase throughput under peak load.
I also led platform observability and reliability improvements using OpenTelemetry and CloudWatch, reducing MTTR by 40%, and established secure multi-tenant isolation patterns (IAM, KMS, role-based access) for enterprise governance. Earlier, at Google, I engineered high-throughput pipelines for large-scale ML training and personalization, modernized frontend architecture with React and TypeScript, and applied ML-based anomaly detection to reduce production incidents by 25%, all while mentoring teams on distributed systems and AI system design.
Experience
Work history, roles, and key accomplishments
Developed a multi-tenant LLM orchestration platform for enterprise AI copilots, supporting 1K+ concurrent users with sub-300ms response latency. Designed and operated an end-to-end RAG pipeline (OpenSearch + FAISS) that improved answer accuracy by 35%, and reduced token costs by 28% through caching, prompt deduplication, and async batching.
Built the backend and led system design for a large-scale intelligent search platform, improving search relevance and user engagement across multi-billion-record datasets. Engineered data pipelines (Kafka + Apache Beam) and low-latency distributed services (Java/Go) for real-time recommendations, and reduced production incidents by 25% with ML-based anomaly detection and proactive monitoring.
Designed and implemented machine learning and optimization models, improving computational efficiency by ~20% for large-scale forecasting and decision systems. Built reusable ETL pipelines and applied statistical modeling and simulation to evaluate performance under uncertainty and varying constraints.
Education
Degrees, certifications, and relevant coursework
UC Berkeley
Master of Science, Industrial Engineering
2012 - 2013
Completed an M.S. in Industrial Engineering at UC Berkeley from 2012 to 2013.
Beijing University of Posts and Telecommunications
Bachelor of Science, Telecommunication Engineering and Management
2008 - 2012
Earned a B.S. in Telecommunication Engineering and Management from 2008 to 2012.
Queen Mary University of London
Bachelor of Science, Telecommunication Engineering and Management
2008 - 2012
Earned a B.S. in Telecommunication Engineering and Management from 2008 to 2012.
