Netflix is seeking a Site Reliability Engineer (SRE) to ensure the reliability, scalability, and operational excellence of its compute platforms, including EC2, Titus, and managed capacity. The role involves building automation and tooling, driving observability, and collaborating with engineering teams. The goals are to modernize the stack and improve launch latency, and the role includes participation in on-call rotations.
Requirements
- 5+ years of experience operating and scaling large-scale, high-performance cloud infrastructure.
- Deep knowledge of Kubernetes, container runtimes, and cloud-native tools.
- Deep expertise in Linux/Unix systems, networking fundamentals, and cloud platforms.
- Proficiency in Go, Python, Rust, or Java.
- Familiarity with autoscaling, fleet management, and capacity planning at scale.
 
Benefits
- Health Plans
- Mental Health support
- 401(k) Retirement Plan
- Stock Option Program
- Disability Programs
- Health Savings and Flexible Spending Accounts
- Family-forming benefits
- Life and Serious Injury Benefits
- Paid leave of absence
 
