Subhadip Mondal
@subhadipmondal1
I build production AI systems: RAG, LLM tuning, and MLOps for real-world decisions.
What I'm looking for
I’m an AI/ML engineer who turns LLM ideas into production systems: RAG pipelines, LLM fine-tuning, and reliable inference. I specialize in model evaluation and practical retrieval design to deliver accurate, fast answers.
In my current Business Analyst role, I automate workflows and apply GPT-based document intelligence to reduce manual processing while improving accuracy. As an AI/ML Freelance Engineer, I built a voice-enabled financial assistant using RAG (500+ earnings reports) and a Gmail–Slack assistant with multi-modal processing (Whisper, Google TTS), handling 200+ queries/day.
I also ship end-to-end projects like DeciScope and RepoMind, combining hybrid retrieval (Pinecone + PostgreSQL/pgvector) with AWS serverless infrastructure to cut latency and track production metrics. Recently, I fine-tuned Llama-3.2-3B with QLoRA and deployed optimized inference with vLLM, improving ROUGE-L (0.42→0.56) and BERTScore (0.71→0.91) over the base model while reducing VRAM needs from 24GB to 8GB.
Experience
Work history, roles, and key accomplishments
Business Analyst
Enverus
Jul 2025 - Present (10 months)
Automated supplier onboarding workflows using Python and Salesforce API, reducing manual processing time by 40% while handling 150+ vendor records/month with 98% data accuracy. Built a GPT-4-based internal document classification system that processed 200+ legal documents/week with 91% classification accuracy and automated metadata extraction.
AI & ML Freelance Engineer
Independent Contractor
Jan 2025 - Present (1 year 4 months)
Built a voice-enabled financial assistant using GPT-4 RAG over 500+ earnings reports, achieving 89% answer accuracy with real-time stock data APIs. Developed a Gmail-Slack multi-modal assistant (Whisper + Google TTS) handling 200+ queries/day and engineered a hybrid Pinecone + PostgreSQL retrieval system to reduce query latency from 3.2s to 840ms.
AI GitHub Intelligence Assistant
RepoMind
Built a production RAG system for 10K+ file codebases achieving 0.84 answer relevance using LangChain + Gemini 1.5 Pro with AST-based parsing and semantic chunking. Optimized pgvector search queries and connection pooling to reduce p95 latency from 2.1s to 780ms while handling 1K+ webhook events/hour, and implemented incremental indexing with deduplication to cut embedding costs by 73%.
Decision Intelligence Platform
DeciScope
Architected real-time decision monitoring integrating Slack API, processing 500+ messages/day with 87% extraction accuracy using GPT-4 and custom prompt chains. Designed hybrid memory with Pinecone vector search + DynamoDB metadata (0.81 context precision) and built serverless AWS infrastructure with sub-200ms latency, reducing decision reversal rate by 23% in pilot deployment.
Code Documentation SLM
DocForge
Fine-tuned Llama-3.2-3B on 8.5K+ code documentation examples using QLoRA (rank=32, alpha=64), reducing VRAM from 24GB to 8GB. Improved ROUGE-L (0.42→0.56) and BERTScore (0.71→0.91) versus the base model, and deployed with vLLM for sub-1.2s generation at 512 tokens, publishing the model on HuggingFace with documentation.
Education
Degrees, certifications, and relevant coursework
Chandigarh University
Bachelor of Engineering, Electronics and Communication Engineering
2021 - 2025
