The Senior Software Engineer will be responsible for building and deploying cutting-edge AI solutions into real-time user applications and high-volume pipelines. The team runs on AWS using a mix of serverless and container-based services, with infrastructure fully defined as code and continuously integrated and deployed.
Responsibilities
- Design and build production-grade AI/LLM processing pipelines (document ingestion, extraction, classification, summarization, RAG) with a focus on reliability, throughput, and unit economics.
- Own observability for AI workloads end-to-end: structured logs, metrics, traces, prompt/response capture, token and cost attribution, and quality evals, all integrated with Datadog.
- Build and maintain the deployment and runtime infrastructure for these pipelines on AWS (EKS, Lambda, Step Functions, SQS/Kinesis, S3) using Infrastructure-as-Code (Terraform / CDK) with reusable, reviewable modules.
- Establish CI/CD for AI services and models: automated testing, regression evals, safe rollouts (canary / blue-green), and rollback paths for both code and prompt/model changes.
- Drive engineering excellence in the AI team: pipeline architecture patterns, versioning of prompts/models/datasets, reproducibility, and separation of batch vs. real-time paths.
- Partner with SRE, Platform, and Product Engineering to harden shared services (secrets, networking, identity, data access) and ensure AI workloads meet security and compliance requirements.
- Mentor engineers on building maintainable, testable AI systems; raise the bar on code review, design documents, and operational readiness reviews.
- Embed SOC 2 security and compliance practices into design and implementation decisions, proactively raising gaps in access control, logging, and data protection.
