Aleph Alpha is hiring a Senior AI Engineer for pre-training evaluation. The role involves owning benchmarks, building evaluation infrastructure, designing aggregation and reporting, and correlating signals. The ideal candidate has experience with LLM evaluation, benchmark design, and statistical methods.
Requirements
- Experience with LLM evaluation, benchmark design, and evaluation dataset curation.
- Familiarity with statistical methods for evaluation and experimental design.
- Track record of shipping impactful technical work.
- Strong Python skills and comfort with ML tooling (PyTorch, evaluation frameworks, distributed systems).
- Ability to reason about what an evaluation measures and whether it matters.
- Ownership mentality: you see problems through from diagnosis to solution to deployment.
- Willingness to relocate to Heidelberg or travel there regularly.
Benefits
- 30 days of paid vacation
- Access to a variety of fitness & wellness offerings via Wellhub
- Mental health support through nilo.health
- Substantially subsidized company pension plan
- Subsidized Germany-wide transportation ticket
- Budget for additional technical equipment
- Flexible working hours
- Virtual Stock Option Plan
- JobRad Bike Lease
