This is a remote position.
We're building agentic AI for Fortune 500 enterprises — the kind that survives contact with messy real-world data.
We're hiring a Forward Deployed Engineer with deep production RAG expertise to embed with our largest enterprise customers, architect their retrieval and grounding systems end-to-end, and lead a distributed team shipping AI that runs on real traffic.
This is not a "build a POC with LangChain in two weeks" role. This is "your code is in the customer's production environment on Monday, and you own the faithfulness, latency, and citation numbers."
What you'll do
- Design and ship production RAG systems for enterprise customers — chunking strategy, embeddings, hybrid retrieval, reranking, query rewriting, citation grounding, the works
- Own retrieval evaluation end-to-end — define golden sets, build eval pipelines (Ragas, TruLens, LangSmith, or custom), and drive measurable quality wins (think context precision 0.65 → 0.90, not "looks good to me")
- Architect agentic systems layered on top of RAG — multi-step retrieval, tool use, MCP integrations, guardrails, output validation
- Embed directly with customer engineering teams — pair on whiteboards, write code in their repos, debug their data, present to their CTO
- Technically lead a distributed team of FDEs across time zones — set the quality bar, review architecture, mentor engineers, own customer outcomes
- Close the loop with our product org — what breaks in the field shapes what we build next
Requirements
You should have
- 7+ years engineering experience, with at least 2 years shipping production RAG (not POCs) at real scale
- Demonstrated depth on the parts of RAG that actually matter:
  - Chunking strategies and the ablations you ran to choose one
  - Vector DB trade-offs from real production use: Pinecone, Qdrant, Weaviate, pgvector, Vespa, OpenSearch (pick your 2–3 and have opinions)
  - Hybrid search, reranking (Cohere, cross-encoders, ColBERT), HyDE, query rewriting, context compression
  - Retrieval and generation evaluation, with real numbers you can quote from real systems you've owned
  - The boring-but-critical stuff: access control, citation enforcement, freshness, multi-tenancy, cost-at-scale
- Strong Python; production experience on at least one cloud (AWS / Azure / GCP)
- Hands-on with at least one agent framework — LangGraph, AutoGen, CrewAI, Semantic Kernel, or equivalent
- Customer-facing presence — you can walk into a Fortune 500 architecture review and earn trust in 30 minutes
- Track record of leading engineers, not just being one
Nice to have
- .NET / C# and the broader Microsoft stack — Azure OpenAI, Azure AI Search, Semantic Kernel, Microsoft Agent Framework, Copilot Studio, Cosmos DB; comfort building enterprise-grade services in ASP.NET Core
- Voice AI / contact-center experience (latency-constrained retrieval is a different beast)
- OSS contributions to retrieval frameworks, published technical writing, or conference talks
- Domain depth in regulated industries — healthcare, financial services, legal
Who you are
- Hands-on first. You build the system yourself, both the architecture and the code.
- Customer-obsessed. You measure your work by what's running in production for the customer on Friday, not by what's in the design doc.
- Eval-driven. You don't trust vibes; you trust numbers from a golden set.
- An owner. No hand-offs. You take problems from discovery to production and stay until they're solved.
Why this role
- Hardest applied GenAI problems in the market — Fortune 500-scale RAG, not consumer chatbots
- Real influence on product direction — field signal shapes the roadmap
- Top-decile compensation for this profile
- Autonomy to embed, ship, and own outcomes end-to-end
Benefits
- Competitive salary commensurate with experience.
- Opportunities for career advancement and professional development.
- Experience collaborating with a diverse, global team within a remote work setting.
