Join the team redefining how the world experiences design. As the Design Generation Evaluation owner, you'll build the infrastructure that enables quality monitoring across Design Generation. You're the expert who both understands evaluation methodologies deeply and builds the systems to scale them across the organization.
Requirements
- Strong ML engineering fundamentals with experience building and maintaining production ML systems at scale
- Proven ability to build robust, scalable infrastructure (not just models) - you're a platform engineer who speaks ML
- Deep understanding of distributed systems, observability patterns, and monitoring best practices
- Python proficiency with production-quality coding standards, code reviews, and testing practices
- Experience with data pipelines, time-series data, and statistical analysis for detecting anomalies
- SQL fluency for querying and analyzing large datasets across data warehouse and analytics systems
- Track record of building self-service platforms or developer tooling that gets adoption
- Excellent collaboration skills - this role requires working across teams to understand needs and deliver solutions
- Experience evaluating Gen AI systems at scale (even better if those systems produce creative outputs!)
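To give a flavour of the statistical anomaly detection mentioned above, here is a minimal sketch of flagging outliers in a quality-metric time series by z-score. The function name, threshold, and data are illustrative assumptions, not part of this role's actual stack; production monitoring would use rolling windows and seasonality-aware models.

```python
import statistics

def detect_anomalies(values, threshold=2.5):
    """Return indices of points whose z-score exceeds the threshold.

    Illustrative only: a global-mean z-score check over a metric
    series, the simplest form of statistical anomaly detection.
    """
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)
    if stdev == 0:
        return []
    return [i for i, v in enumerate(values)
            if abs(v - mean) / stdev > threshold]

# Example: a quality metric that drops sharply at index 5
metric = [0.91, 0.92, 0.90, 0.93, 0.91, 0.40, 0.92, 0.91, 0.90, 0.92]
print(detect_anomalies(metric))  # → [5]
```

In practice this kind of check runs over pipeline metrics in a data warehouse (hence the SQL and data-pipeline requirements), with alerting layered on top.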
Benefits
- Equity packages
- Inclusive parental leave policy
- Annual Vibe & Thrive allowance
- Flexible leave options
