Srujan Teja User
@srujantejauser
AI/ML Engineer focused on Agentic AI, Voice AI, RAG, and MLOps—building low-latency, production-ready systems.
What I'm looking for
I’m an AI/ML Engineer building production-grade systems across Agentic AI, Voice AI, RAG, GenAI, and MLOps, with a strong bias toward low-latency inference, scalable deployment, and reliable memory architectures. I focus on taking research ideas to production by designing systems end-to-end—from orchestration and routing to observability and evaluation.
In agentic systems, I architected a multi-agent orchestration runtime with LangGraph featuring dynamic task decomposition, tool-use routing, and self-correction loops, plus shared-state coordination using Redis pub/sub and durable task memory in PostgreSQL. For voice, I built a multilingual real-time Voice AI platform supporting 10+ Indian languages with Whisper + INT8 ONNX STT, VITS2 fine-tuned TTS, and sub-150ms end-to-end latency on CPU, served via Triton with async WebSocket streaming.
On the retrieval and generation side, I engineered an enterprise semantic search and NLP intelligence engine with hybrid dense-sparse retrieval (BGE-M3 + BM25), HyDE expansion, and cross-encoder reranking achieving 91% top-3 retrieval precision, while cutting irrelevant retrievals by 38% and reducing P99 latency from 420ms to 55ms at scale. I also delivered a multimodal GenAI document intelligence platform with GPT-4o/Claude routing that reduced inference cost by 42% and supported 500+ concurrent sessions, plus an AI memory architecture for conversational agents that reduced prompt token overhead by 60% while improving multi-session coherence.
Experience
Work history, roles, and key accomplishments
Production Multi-Agent System
Independent Project
Mar 2026 - Present (3 months)
Architected a multi-agent runtime with dynamic task decomposition, tool-use routing, and self-correction loops; agents autonomously plan, execute, and recover across 10+ tool integrations. Built shared-state coordination using Redis pub/sub and PostgreSQL for durable task memory to enable concurrent execution without cross-agent context collisions.
Multilingual Real-Time Voice AI
Independent Project
Nov 2025 - Present (7 months)
Built a multilingual voice AI platform for 10+ Indian languages with real-time STT and neural TTS, achieving sub-150ms end-to-end latency on CPU. Implemented zero-shot voice cloning from 5s of audio and served the system on Triton with dynamic batching and async WebSocket streaming.
GenAI Document Intelligence Platform
Independent Project
Feb 2026 - Present (4 months)
Built a multimodal GenAI platform ingesting PDFs, spreadsheets, and images via Unstructured.io to generate audit-ready document summaries and executable code from natural-language specs. Designed a multi-LLM routing layer (GPT-4o vs Claude) based on task type, cost, and latency SLAs, reducing inference cost by 42% without degrading output quality.
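As a rough illustration of routing on task type, cost, and latency SLA (a minimal sketch, not the platform's actual code), the snippet below picks the cheapest model that supports the task and meets the latency budget. The `ModelProfile` fields, the `route` function, and any model names or numbers passed in are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    # All fields are illustrative assumptions, not measured values.
    name: str
    task_types: frozenset
    cost_per_1k_tokens: float
    p95_latency_ms: float


def route(task_type, latency_sla_ms, models):
    """Return the cheapest model that supports the task within the latency SLA."""
    eligible = [
        m for m in models
        if task_type in m.task_types and m.p95_latency_ms <= latency_sla_ms
    ]
    if not eligible:
        raise LookupError(f"no model satisfies {task_type!r} within {latency_sla_ms} ms")
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)
```

A real router would likely also weigh output quality per task type; this sketch treats every eligible model as equally good and optimizes cost alone.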
Enterprise Semantic Search Engine
Independent Project
Jan 2026 - Present (5 months)
Engineered a RAG search engine over 100k+ documents using hybrid dense-sparse retrieval (BGE-M3 + BM25), HyDE query expansion, and cross-encoder reranking to reach 91% top-3 retrieval precision. Added NER, intent, and coreference layers for query understanding and reduced irrelevant retrievals by 38% with vector filtering. With Airflow/Kafka ingestion and Redis caching, P99 latency dropped from 420ms to 55ms.
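The entry above does not say how the dense and sparse result lists are merged. One common option is reciprocal rank fusion (RRF), shown here as an assumed stand-in rather than the project's actual method. The function name and the constant `k=60` are illustrative choices, and the inputs are plain lists of document IDs already ranked by each retriever.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in (rank starts at 1);
    scores are summed across lists. Ties are broken by document ID for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Documents that appear near the top of both lists rise to the front, and the fused list would then be handed to the cross-encoder reranker.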
AI Memory Architecture for Agents
Independent Project
Jan 2026 - Present (5 months)
Engineered a three-tier conversational memory system (working, episodic, semantic) with recency decay and frequency-weighted retrieval to improve multi-session coherence and prevent contradictory context hallucinations. Compressed conversations into rolling semantic summaries with a learned decay function, cutting prompt token overhead by 60% while preserving long-range context.
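A minimal sketch of how recency decay and frequency weighting could combine into a single retrieval score, assuming an exponential half-life decay and a logarithmic frequency boost. The entry describes a learned decay function, which this fixed formula does not reproduce; `MemoryItem`, `memory_score`, `retrieve`, and the default parameters are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    similarity: float   # assumed query-to-memory similarity in [0, 1]
    age_seconds: float  # time since the memory was last written or touched
    access_count: int   # how often the memory has been retrieved before


def memory_score(item, half_life_seconds=3600.0, frequency_weight=0.1):
    """Similarity scaled by exponential recency decay and a log frequency boost."""
    recency = 0.5 ** (item.age_seconds / half_life_seconds)
    frequency = 1.0 + frequency_weight * math.log1p(item.access_count)
    return item.similarity * recency * frequency


def retrieve(items, k=3):
    """Return the k highest-scoring memories."""
    return sorted(items, key=memory_score, reverse=True)[:k]
```

With these defaults, a memory loses half its weight every hour unless repeated access offsets the decay.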
KAN Benchmarking Research
Independent Project
Feb 2025 - Present (1 year 4 months)
Rebuilt Kolmogorov–Arnold Networks (KAN) from scratch using kernel-based learnable activations on edges with fine-grained spline parameterization control. Benchmarked against MLP baselines and achieved 20% lower MSE with 1.5× faster convergence, validated via ablation studies over grid resolution and spline order.
Education
Degrees, certifications, and relevant coursework
Bennett University
Bachelor of Technology (B.Tech), Computer Science Engineering
2023 - 2027
Grade: 7.01/10 CGPA
Activities and societies: Leadership Secretary, Game Development Club (Bennett University) and Secretary, RPA Club (Bennett University).
B.Tech in Computer Science Engineering at Bennett University (2023–2027), maintaining a CGPA of 7.01/10.
