Gaurav Thakur

I’m an AI Engineer specializing in architecting deterministic systems that bridge the gap between stochastic LLMs and production-grade reliability.

I design resilient, stateful workflows and evaluation frameworks to ensure autonomous agents operate with the precision required for enterprise and mission-critical environments.

EXPERIENCE

AI Engineer (Contract) at Arthik ERP July 2025 - Present
Developing the Bulwark project (listed below) as the core AI layer for this system. https://github.com/gauravxthakur/Bulwark
● Building the core conversational AI layer for an early-stage ERP startup in stealth, using LangGraph to orchestrate stateful, multi-turn interactions for complex accounting and data entry tasks.
● Engineered the initial agentic modules for automated invoice processing, implementing Pydantic-driven extraction and Human-in-the-Loop validation to ensure 100% data integrity before database commits.
● Engineering production-ready agentic workflows with asynchronous error handling, persistent state management, and systematic evaluations that benchmark agent accuracy and minimize regressions in decision-making logic.

Freelance AI Engineer Oct 2025 - Present
● Lead Gen Pipeline: Architected an async Python system to scrape LinkedIn leads, cutting service and API costs by 95% (from $100+ to under $5 per month) by refactoring n8n workflows and benchmarking 15+ providers for a high-reliability, pay-as-you-go architecture.
● Catalog Automation: Engineered a LangGraph vision agent to automate Shopify data entry, using multimodal LLMs to extract and sync catalog data.

OPEN SOURCE CONTRIBUTIONS

EvalView - AI Agent Regression Testing Framework github.com/hidai25/eval-view
● Engineered an asynchronous health-check adapter for Ollama, implementing a robust connectivity verification layer using the /api/tags endpoint and httpx to maintain architectural consistency across the framework.
● Developed comprehensive pytest unit tests for model adapters, covering successful server responses, 500-series failures, and unreachable-host scenarios to ensure framework reliability in local LLM environments.
● Standardized internal documentation by implementing Google-style docstrings across core reporting classes (ConsoleReporter), improving IDE autocompletion and decreasing onboarding friction for new contributors.