
Senior Agentic (AI) Engineer

Worth AI is a UK-based consultancy specializing in AI literacy, training, and automation solutions to help businesses build digital workforces.

Worth AI

Employee count: 1-10

United States only

Worth AI is hiring a Senior Agentic AI Engineer to design and ship production agent systems that automate KYB, underwriting, and risk decisions on regulated financial data. You’ll own agents end-to-end (architecture, retrieval, tools, evals, and production deployment) and partner closely with our Chief AI Officer, applied scientists, and platform teams.

Responsibilities

  • Design and ship multi-step agentic systems (planner/executor, tool-using, multi-agent, human-in-the-loop) for onboarding, underwriting, case review, and continuous monitoring.
  • Architect agent graphs in LangGraph (or comparable — CrewAI, AutoGen, Claude Agent SDK) with explicit state, durable execution, retries, and safe fallbacks.
  • Build the retrieval layer powering our agents — chunking, hybrid search, reranking, and grounded citation.
  • Own the eval stack: golden sets, offline regression suites, LLM-as-judge, online A/B and shadow evals, and red-teaming for jailbreaks, prompt injection, and PII leakage.
  • Expose agents to production systems via well-typed tools and MCP servers. Treat tool surface area as a product.
  • Drive production MLOps: deployment, versioning, traffic shaping, cost/latency budgets, tracing, and on-call playbooks for agent incidents.
  • Partner with security and compliance to keep agents inside SOC 2, GDPR, CCPA, and fair-lending posture — auditability and explainability built in, not bolted on.
  • Mentor engineers on agent patterns, prompt hygiene, eval discipline, and LLM failure modes.

Technology Stack

  • Languages: Python, Node.js, TypeScript
  • Agent / LLM frameworks: LangGraph, LangChain, Claude Agent SDK, MCP, OpenAI SDK
  • Models: Anthropic Claude, OpenAI, open-weight where appropriate
  • Retrieval & Data: PostgreSQL, pgvector, OpenSearch, Kafka, Redshift, Redis
  • Infra: AWS, Kubernetes (EKS), ArgoCD, Terraform
  • Evals & Observability: LangSmith / Langfuse / Braintrust-style tooling, DataDog

Requirements

  • 5+ years of software engineering experience, with 2+ years building production LLM or agentic systems (not just notebooks or demos).
  • Hands-on experience with a modern agent framework (LangGraph strongly preferred) and a track record of shipping agents that run, fail gracefully, and recover.
  • Strong RAG fundamentals (chunking, embeddings, hybrid retrieval, reranking, grounding) and judgment about when RAG isn’t the right answer.
  • Real eval experience: golden sets and offline and online evaluations, used to make ship/no-ship calls.
  • Production MLOps fluency: deployed LLM workloads under real latency, cost, and reliability constraints.
  • Strong Python; comfortable in TypeScript / Node.js.
  • Solid systems engineering instincts: APIs, async patterns, queues, databases, and distributed-system failure modes.
  • Calibrated communicator; thrives in ambiguous, fast-moving environments.
  • Prior experience in fintech, lending, payments, KYB/KYC, fraud, or AML.
  • Experience building MCP servers or other structured tool interfaces for LLMs.
  • Background in classical ML (ranking, scoring, calibration).
  • Experience designing explainable / auditable AI workflows for regulated environments.
  • Open-source contributions to agent frameworks, eval tooling, or retrieval libraries.
  • AWS depth (EKS, MSK, RDS, S3, Lambda) and IaC with Terraform.

Success Metrics

  • Agent Quality: Measurable improvements in task success rate, grounding accuracy, and hallucination rate on our eval suites.
  • Production Reliability: Agents you own meet defined SLOs for latency (P90/P99), tool-call success, and cost per task.
  • Velocity: New agent capabilities go from prototype to production in weeks, without skipping evals or guardrails.
  • Risk Posture: Zero material incidents tied to prompt injection, PII leakage, or unsafe tool use on agents you own.
  • Force Multiplier: Patterns, tools, and eval scaffolding you build get adopted across engineering.

All remote hires will be required to travel to Orlando, Florida, at least twice per year for town halls and team collaboration, in addition to orientation in Orlando.

Benefits

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401k, IRA)
  • Life Insurance
  • Flexible Paid Time Off
  • 9 paid Holidays
  • Family Leave
  • Remote
  • Hybrid work (for Orlando Associates)
  • Free Food & Snacks (Orlando)
  • Wellness Resources

About the job

Job type: Full Time
Experience: 5 years minimum
Hiring timezones: United States +/- 0 hours

About Worth AI

At the heart of Worth AI is a mission to demystify artificial intelligence for businesses, transforming it from a complex buzzword into a practical engine for growth. We believe that the true power of AI lies not just in the technology itself, but in how it empowers the people who use it. Our culture is built on 'keeping things human'—stripping away the jargon to provide honest, hands-on training and custom solutions that seamlessly integrate into real-world workflows.

We are dedicated to guiding business owners from a state of confusion to absolute clarity and capability. Whether it is by building 'AI teammates' to automate repetitive administrative tasks or deploying sophisticated voice agents for 24/7 customer support, our focus remains steadfast on delivering tangible, operational results. By combining deep technical expertise with an approachable, personalized coaching style, we ensure that technology serves your team, fostering a workplace where innovation and human potential thrive together.
