We're building a dedicated AI Compute and Infrastructure team to power the next generation of model training, inference, evaluation, and experimentation across the exchange. This team sits within engineering leadership and owns the infrastructure layer that lets Kraken run AI workloads with control, speed, reliability, and cost discipline.
Requirements
- 5+ years of infrastructure engineering experience
- Hands-on experience operating GPU clusters or accelerator-backed infrastructure in production or production-like environments
- Strong systems engineering fundamentals across Linux, networking, storage, containers, Kubernetes, distributed runtimes, and production debugging
- Experience with ML serving frameworks such as vLLM, Triton Inference Server, TensorRT, TorchServe, KServe, Ray Serve, or equivalent systems
- Proficiency in Python for infrastructure automation, tooling, debugging, integration, and operational workflows
- Practical understanding of performance tradeoffs across batching, concurrency, memory usage, GPU utilization, model size, latency, throughput, availability, and cost
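To make the last requirement concrete, here is a toy back-of-envelope model of the batching tradeoff: larger batches raise GPU throughput but add queueing latency while the batch fills. This is an illustrative sketch only, not Kraken's actual serving stack; the function names, the fixed step time, and all numbers are hypothetical.

```python
# Toy model of the batch-size tradeoff: throughput grows with batch size,
# but requests wait longer in the queue for the batch to fill.
# All parameters (step_ms, arrival_rps) are hypothetical examples.

def throughput_rps(batch_size: int, step_ms: float = 50.0) -> float:
    """Requests/sec if the GPU completes one batch every `step_ms`."""
    return batch_size * 1000.0 / step_ms

def median_latency_ms(batch_size: int, arrival_rps: float, step_ms: float = 50.0) -> float:
    """Rough median latency: time for half a batch to arrive, plus one GPU step."""
    fill_ms = (batch_size / 2) / arrival_rps * 1000.0
    return fill_ms + step_ms

if __name__ == "__main__":
    # At 100 req/s of incoming traffic, compare small vs large batches.
    for bs in (1, 8, 32):
        print(f"batch={bs:>2}  "
              f"throughput={throughput_rps(bs):>6.0f} req/s  "
              f"median latency={median_latency_ms(bs, arrival_rps=100.0):>5.0f} ms")
```

The sketch shows why batch size, latency, throughput, and cost can't all be optimized at once: a batch of 32 delivers far more requests per second per GPU than a batch of 1, but each request pays extra queueing time, which is the kind of tradeoff this role reasons about daily.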
Benefits
- Competitive salary
- Stock options
- Medical, dental, and vision insurance
- Retirement plan
- Generous parental leave
- Flexible PTO
- Professional development opportunities
- Free snacks and drinks
- Global team with diverse perspectives
