Zefr is hiring a Manager of Machine Learning Operations to lead the ML Ops team and drive the infrastructure, tooling, and processes that enable machine learning systems to operate at scale.
Responsibilities
- Lead, mentor, and grow a team of Machine Learning Engineers
- Design and implement scalable ML infrastructure for model training, deployment, and serving
- Establish and enforce best practices for ML model lifecycle management
- Develop and maintain CI/CD pipelines for machine learning workflows
- Optimize model inference performance and reduce latency and cost across production systems
- Collaborate with ML Engineers and Data Scientists to productionize models efficiently
- Implement robust monitoring, alerting, and observability solutions for ML systems
- Drive technical decisions on ML Ops tooling, infrastructure, and architecture
- Ensure high availability and reliability of ML services at scale
- Manage project timelines, priorities, and resource allocation for the ML Ops team
Benefits
- Flexible PTO
- Medical, dental, and vision insurance with FSA options
- Company-paid life insurance
- Paid parental leave
- 401(k) with company match
- Professional development opportunities
- 14 paid holidays
- Flexible hybrid work schedule
- Summer Fridays
- In-office lunches and lots of free food
- Optional in-person and virtual events
