Open to opportunities

Soham Chaudhari

@sohamchaudhari2004

AI/ML engineer specializing in generative AI, NLP, and scalable production AI/ML systems.

India

What I'm looking for

I seek roles building production-grade AI/ML systems, with a focus on generative AI, NLP, and multimodal solutions and with opportunities for technical ownership and scalable impact.

I am an AI/ML engineer with hands-on experience building high-accuracy models and deploying scalable solutions across recommendation systems, multimodal video platforms, and generative AI scaffolding. I have implemented production REST APIs, engineered efficient ML pipelines, and delivered measurable improvements such as 95% recommendation-system accuracy and 40% faster API response times.

My projects span image/video upscaling with GANs, multimodal semantic video search using CLIP/FAISS/ChromaDB, and CLI-driven AI project scaffolding that reduces setup time dramatically. I focus on reliable, well-architected systems, practical evaluation (confusion matrices, ROC curves, hyperparameter tuning), and integrating LLMs, TTS/STT, and agentic tooling to solve real-world problems.

Experience

Work history, roles, and key accomplishments

Current

AI Engineer Intern

Flip (PaybyFlip)

Feb 2026 - Present (5 months)

Engineered an 11-node LangGraph orchestration pipeline and a 4-step hybrid retrieval engine to power an AI credit card recommendation system over 9.7K+ offer entries. Built PostgreSQL semantic memory and production APIs, limiting memory injection to the top 1–2 items and reducing token usage by 60–70%. Deployed the backend on AWS EC2 with Docker Compose and rate limiting at 100 req/min.

LangGraph Retrieval Augmented Generation (RAG) Semantic Search PostgreSQL AWS EC2 Docker Compose

Current

AI Intern

FLIP

Feb 2025 - Present (1 year 5 months)

Built an AI credit card recommendation platform using a LangGraph workflow and hybrid retrieval (semantic search with a web search fallback) over 9.7K+ offers. Developed PostgreSQL-based semantic memory with session isolation and feedback APIs, cutting token usage by 60–70%. Deployed on EC2 with Docker, connection pooling, health checks, and rate limiting for production-grade reliability.

AI Developer Intern

UnLawC

Sep 2025 - Feb 2026 (5 months)

Deployed end-to-end Legal Operations AI agents using LLMs, reducing manual document review time by 50%. Improved LLM contextual accuracy by 31% via prompt engineering and architected RAG pipelines with Mistral AI, LangChain, and Pinecone to increase retrieval accuracy by 60% and eliminate hallucinations.

AI Agents Prompt Engineering Retrieval Augmented Generation (RAG) Mistral AI LLMs LangChain Pinecone Vector Search

SDE Intern

SR Counselling

Dec 2024 - Oct 2025 (10 months)

Implemented and deployed a recommendation system achieving 95% accuracy, led development of an AI-based visa training system using TTS, STT, and LLMs, and engineered REST APIs (Node.js, Express.js) that reduced response times by 40%.

Python Node TTS AI STT LLM Machine Learning REST APIs

GenAI Developer

BCGX

May 2025 (under 1 month)

Built a financial chatbot with 95% accuracy that reduced response time by 60%, evaluated 10-K/10-Q datasets for major tech firms, and developed an NLP pipeline to extract financial KPIs for faster forecasting.

Python Pandas NumPy NLP Chatbot Financial Analysis Data Extraction Model Evaluation