This engagement focuses on building an internal AI platform that enables developers to ship AI-powered services efficiently. Scope includes model connectivity, prompt testing and evaluation, monitoring/observability, and the underlying AI infrastructure layer.
The objective is to improve developer experience (DevEx) and reduce time-to-market for AI features.
Tasks
- Build and operate the AI platform infrastructure enabling developers to ship LLM-based services faster.
- Implement and maintain Kubernetes-based runtime environments (incl. AKS) for AI workloads.
- Manage infrastructure as code with Terraform (modules, environments, CI/CD automation).
- Support LLM workflows: RAG, agents, prompt experimentation, evaluations, and deployment patterns.
- Integrate and operate tooling such as Azure AI Foundry, LiteLLM, Langfuse, MLflow.
- Orchestrate pipelines using Kubeflow Pipelines and/or Argo Workflows (build, deploy, evaluate).
- Improve platform reliability and observability (monitoring, logging, tracing, cost and performance signals).
- Collaborate closely with developers to streamline the developer experience (APIs, templates, docs, golden paths, automation).
Requirements
- Strong hands-on experience with Kubernetes in production (preferably AKS).
- Solid Terraform expertise (IaC best practices, multi-env setups).
- Practical experience supporting ML/LLM workloads in a platform or DevOps/MLOps context.
- Proficiency in Python for automation, scripting, and supporting APIs/evaluation tooling.
- Understanding of CI/CD, release processes, and production-grade operations.
- Ability to work under tight timelines and deliver pragmatically.
Nice to Have
- Experience building internal developer platforms or “paved roads” for engineering teams.
- Familiarity with LLM evaluation frameworks, prompt testing workflows, and LLM observability.
- Exposure to RAG architectures, vector databases, and agentic patterns.
- Experience with Kubeflow, Argo, and ML lifecycle tooling.
Engagement Type
- Long-term B2B contract.
Team
- You will join a team of 5, with 3 AI Platform Engineers being added.
Location / Timezone
- Remote within Europe (preferred: Croatia, Poland, Portugal, Serbia).
- European working hours.
- Occasional availability for meetings until 10:00 AM PST (US overlap).
