Freelance opportunity for an AI Evaluation Engineer to create challenging coding test cases for AI systems, review and refine realistic coding tasks, and analyze AI failures. The role is part-time and non-permanent, with an estimated 20 hours of effort per task.
Requirements
- Degree in Computer Science, Software Engineering, or a related field
- 5+ years in software development, primarily Python (pytest, async/await, subprocess, file operations)
- Background in full-stack development, with equal focus on building React-based interfaces and robust back-end systems
- Experience writing tests (functional and integration), not just running them
- Experience with Docker (running evaluations locally in containers)
- Understanding of CI/CD (GitHub Actions as a user: triggers, labels, reading results)
- English proficiency: B2
Benefits
- Competitive hourly rate (up to $50 per hour equivalent)
- Flexibility to choose when and how to work
- Opportunity to work on a variety of projects of differing scope and complexity
