Looking for a job

Mandava Kashyapsai

@mandavakashyapsai

AI/ML Engineer focused on Generative AI, LLMs, and low-latency Voice AI pipelines with RAG.

India
What I'm looking for

I want to build production-grade LLM and Voice AI systems—self-hosted inference, RAG, and real-time streaming—where I can optimize latency/cost, scale pipelines, and use strong observability to ship measurable improvements.

I’m an AI/ML Engineer building production-scale Generative AI solutions centered on LLM pipelines, RAG, and real-time Voice AI. At Vocab AI, I automated customer call QA for enterprise clients (CRED, Lenovo, Libas), processing 4,000+ calls/day across 35–40 evaluation parameters.

I’ve driven measurable cost and performance improvements by replacing Gemini-based evaluation with self-hosted Qwen2.5-7B (llama.cpp), cutting monthly AI costs from INR 100K+ to INR 60K while maintaining throughput. I also built streaming conversational pipelines (ASR → LLM → TTS) for outbound voice campaigns, supporting 1,000+ calls/day with sub-2s response latency.

I enjoy engineering self-hosted inference systems and scaling them with batching, Kafka, and Nginx load balancing, using containerized Docker deployments tuned for GPU utilization. My work also includes a LangGraph/RAG banking Voice AI assistant (90% intent resolution, sub-2s latency), and I earned an IEEE Best Paper Award for real-time MRI synthesis research, so I bring both applied engineering and rigorous problem-solving to every rollout.

Experience

Work history, roles, and key accomplishments

Current

AI/ML Engineer

Vocab AI

Jun 2025 - Present (1 year)

Automated enterprise customer call QA using production LLM pipelines, processing 4,000+ calls/day across 35–40 evaluation parameters with advanced prompt engineering. Replaced Gemini-based evaluation with self-hosted Qwen2.5-7B (llama.cpp), cutting monthly AI costs from INR 100K+ to INR 60K, and built streaming ASR→LLM→TTS voice pipelines supporting 1,000+ calls/day with sub-2s latency while scali

Education

Degrees, certifications, and relevant coursework

Indian Institute of Information Technology Dharwad

Bachelor of Engineering, Computer Science & Engineering

2021 - 2025

Grade: CGPA 7.92 / 10

Activities and societies: IEEE author; Best Paper Award, IEEE ICEI 2024 (“Real-Time MRI Synthesis from Text Using a Two-Stage Pipeline for Pronunciation Training”).

Completed a B.E. in Computer Science & Engineering at IIIT Dharwad (Aug 2021–May 2025), achieving a CGPA of 7.92/10.
