Abhinav Singh
@abhinavsingh25
AI Engineer building production RAG and agentic systems on AWS Bedrock. Cut query latency from 6 minutes to under 1 second for 5,000+ users.
What I'm looking for
I'm a full-stack AI/ML engineer who builds production LLM, RAG, and agentic systems end-to-end, from backend services to frontend. I currently work on a customer-facing agentic chatbot on AWS Bedrock AgentCore, targeting 150,000+ monthly users, and have shipped two other AI platforms serving 5,000+ daily users across 3 countries.
On DanskeAssist, a RAG knowledge assistant, I architected a multi-region Elasticsearch setup with kNN semantic search and per-tenant index isolation, serving multiple regions from a single API. I reduced query resolution from 6 minutes to under 1 second for 5,000+ daily users, validating the system against 611 evaluated Q&A pairs before launch, and later led a zero-downtime embedding migration from ada-002 to text-embedding-3-large after measuring retrieval quality degradation on multilingual queries.
On GRASP, a multi-agent compliance platform, I built a 27-endpoint FastAPI backend orchestrating 6 specialized agents through a five-phase conversation state machine, with parallel function calling and DynamoDB session persistence. As sole developer, I took it from scratch to production, cutting a 3-month manual process down to 15-20 minutes for 400+ users.
On the AgentCore chatbot, I built the full ingestion pipeline (S3, Bedrock Knowledge Base, OpenSearch Serverless) with Terraform and GitHub Actions CI/CD, cut response latency from 6-8 seconds toward a sub-3-second target, and built an LLM evaluation framework on Phoenix and OpenTelemetry to track quality, latency, and cost. My core stack is Python, FastAPI, LangChain/LangGraph, AWS Bedrock, OpenSearch, DynamoDB, and Terraform.
Experience
Work history, roles, and key accomplishments
Building a customer-facing agentic chatbot on AWS Bedrock AgentCore Runtime using the Strands Agents SDK. Implemented the ingestion and retrieval pipeline, streaming/latency improvements, evaluation tooling, retrieval tools, session management, and content-safety guardrails.
Full-stack AI/ML engineer building production LLM, RAG, and agentic systems end-to-end, including backend services and frontend integrations. Working on features spanning search, embeddings, multi-agent orchestration, and evaluation.
Multi-Agent Workflow Platform
GRASP
Mar 2025 - Jan 2026 (10 months)
Developed a multi-agent workflow platform orchestrating multiple specialized agents via a conversation state machine and persisted sessions in DynamoDB. Built an end-to-end agentic workflow that reduced a manual 3-month process to 15–20 minutes for users across multiple regions.
RAG Knowledge Assistant
DanskeAssist
Mar 2024 - Jan 2025 (10 months)
Built and optimized a RAG knowledge assistant using a multi-region Elasticsearch setup with semantic kNN search and tenant isolation. Led ingestion, retrieval performance improvements, and embedding model migration for multilingual queries.
Education
Degrees, certifications, and relevant coursework
Jaypee University of Engineering and Technology
B.Tech. in Computer Science and Engineering (Minor: AI & ML)
2019 - 2023
Grade: CGPA: 8.6/10
Studied from Aug 2019 to Jun 2023, earning a B.Tech. in Computer Science and Engineering with a minor in AI & ML (CGPA 8.6/10).
Website and portfolio
abhinav-builds.vercel.app
