Open to opportunities

Akash Singh

@akashsingh11

AI Technical Lead specializing in real-time voice agents, multilingual speech AI, and on-device inference.

India

What I'm looking for

I’m looking for a product-focused role building real-time, multilingual speech AI end to end, with an emphasis on reliable production performance, hallucination mitigation, and on-device/edge inference optimization for conversational agents.

I’m an AI Technical Lead focused on shipping production-grade speech AI, especially real-time, full-duplex voice-to-voice experiences. At Paytm, I designed and shipped a WebRTC pipeline that connects streaming ASR → LLM reasoning → TTS for travel booking.

I also architected a multi-agent conversational system with LangGraph StateGraph and GPT-4, coordinating 10+ agents for search, filtering, and booking, achieving 95% intent recognition accuracy. I integrated live travel APIs using async clients with circuit breaker patterns and maintained 99.5% system uptime in production.

Previously, as a Research Scientist at Saarthi.AI, I led end-to-end TTS research across Tacotron, FastSpeech, and HiFi-GAN for single-speaker, multi-speaker, and multilingual settings across 11 Indian languages at 5M calls/day. I built and deployed streaming ASR systems (DeepSpeech, Whisper, Kaldi), developed a full NLU pipeline from data creation to Azure/AWS deployment, and led cross-functional teams from research through deployment, including running models on-device on Android for real-time recommendation inside a keyboard product.

Experience

Work history, roles, and key accomplishments

Paytm
Current

Technical Lead - AI

Jan 2025 - Present (1 year 5 months)

Designed and shipped a real-time voice-to-voice agent over WebRTC, integrating streaming ASR → LLM reasoning → TTS into a full-duplex production pipeline for travel booking. Architected a multi-agent system coordinating 10+ agents and achieved 95% intent recognition accuracy while maintaining 99.5% uptime via resilient API integration and hallucination mitigation.

Research Scientist (TTS/ASR)

Saarthi.AI

Aug 2021 - Jan 2025 (3 years 5 months)

Led end-to-end TTS research across Tacotron, FastSpeech, and HiFi-GAN for single-speaker, multi-speaker, and multilingual settings across 11 Indian languages at 5M calls/day. Built and deployed streaming ASR systems and an end-to-end NLU pipeline on Azure/AWS, and drove on-device model distillation for Android real-time recommendation.

Deep Learning Engineer (Speech/NLP)

Saarthi.AI

Dec 2018 - Jul 2021 (2 years 7 months)

Trained ELMo and ULMFiT language models from scratch in 9 Indian languages and applied them to entity tagging, text classification, semantic role labeling, and POS tagging. Built speaker recognition pipelines (X-vector/D-vector) achieving 95%+ accuracy and developed keyword spotting and dialog policy using TensorFlow.js and deep RL for browser deployment.

Education

Degrees, certifications, and relevant coursework

Institute of Engineering and Technology, Lucknow

Bachelor of Technology, Computer Science & Engineering

Completed a B.Tech in Computer Science and Engineering at IET Lucknow in 2018.
