We're looking for a Site Reliability Engineer to own service health, incident response, and infrastructure monitoring, and to make sure we're not blindly burning cloud budget. The role covers proactive monitoring and continuous improvement of platform reliability across a cloud-native stack.
Requirements
- 2+ years in a Site Reliability, DevOps, or Cloud Infrastructure role in a production environment
- Bachelor's degree in Computer Science, Engineering, or related field, or equivalent hands-on experience
- Practical experience with GCP, in particular Cloud Run, API Gateway, and BigQuery
- Experience with monitoring and observability tooling (Cloud Monitoring, Datadog, or similar)
- Solid grasp of cloud security fundamentals: IAM, network controls, access management
- Proficiency with Git and version control in a team setting
