We're building a dataset to evaluate AI coding agents and need a Freelance Agent Evaluation Engineer to create challenging tasks and evaluation criteria within realistic simulated environments. This is a part-time, project-based opportunity to work on testing, evaluating, and improving AI systems.
Requirements
- Degree in Computer Science, Software Engineering, or a related field
- 5+ years of software development experience, primarily in Python
- Background in full-stack development, with experience building React-based interfaces and robust back-end systems
- Experience writing tests and familiarity with Docker containers and CI/CD tools
- English proficiency at B2 level
Benefits
- Up to $17 per hour equivalent
- Flexible work schedule
