Sarthak Singh
@sarthaksingh3
AI Engineer building multi-agent runtimes, ML infrastructure, and production backends.
What I'm looking for
I’m an AI Engineer who builds reliable multi-agent systems end-to-end, with a focus on dependable execution and context stability. I developed a custom agent harness engine with per-agent sandboxed execution, tool-call routing, and output parsing to complete tasks across heterogeneous agent types without shared state leakage.
In my recent work, I designed an event-driven agent runtime that captures user interaction patterns and session history to adapt behavior across multi-turn conversations. I also implemented a 3-tier context management system (short-term, working, long-term memory), reducing context-related failures by ~40%.
Prior to that, I strengthened my ML and production foundations through infrastructure and backend roles. As an ML Infrastructure Engineer intern, I refactored a monolith to CQRS/Event Sourcing with Apache Kafka (improving throughput by ~4x), implemented GPUDirect Storage to cut diffusion cold-start time from 45s to 3s, and used CUDA streams to raise GPU utilization to ~85% and throughput by ~6x on Kubernetes with Triton Inference Server.
I’ve also shipped backend systems on AWS with strong observability and automation. I built event-driven FastAPI services with RAG using LangChain, migrated a React monolith to Next.js with Turborepo (cutting build time by ~60%), and developed a robotics portfolio in ROS/ROS2 using OpenVLA/SAM2 and Jetson-based vision-language-action, achieving 78% task success on pick-and-place and 94% collision-free trajectories in simulation.
Experience
Work history, roles, and key accomplishments
AI Engineer
Fairquanta
Jan 2026 - May 2026 (4 months)
Built a custom multi-agent harness with per-agent sandboxing, tool-call routing, and output parsing to enable reliable heterogeneous agent task completion. Designed an event-driven agent runtime and 3-tier context management that reduced context-related failures by ~40%, and owned AWS (EKS/ECS/VPC) infrastructure with CI/CD and autoscaling.
ML Infrastructure Engineer
Image Pipeline
Jan 2025 - Aug 2025 (7 months)
Refactored a monolith to CQRS/event sourcing with Apache Kafka, improving horizontal throughput by ~4x under peak load. Implemented NVIDIA GPUDirect Storage to cut diffusion model cold-start time from 45s to 3s, and built dynamic CUDA stream batching to raise GPU utilization from ~40% to ~85% and throughput by ~6x on Kubernetes with Triton.
Backend Engineer (Contract)
RealList.ai
Dec 2024 - Apr 2025 (4 months)
Deployed a production app on AWS using VPC, ECS, EC2, and Auto Scaling, configuring secure multi-environment networking and load balancing. Built an observability stack (Loki/Grafana/Tempo/Mimir) and used k6 load tests to identify bottlenecks around ~300 concurrent users, tuning autoscaling thresholds ahead of launch.
Backend Engineer (Part-time)
Openpolitica
Jul 2022 - Oct 2024 (2 years 3 months)
Built a FastAPI event-driven workflow engine to trigger automated AI actions on incoming political events, routing processing pipelines by event type and user context. Developed a LangChain-based RAG system for context-aware summarization and personalized responses, and migrated the frontend from React to Next.js with Turborepo, cutting build time by ~60%.
Education
Degrees, certifications, and relevant coursework
Guru Gobind Singh Indraprastha University
B.Tech, Automation & Robotics
2021 - 2025
