Senior Site Reliability Engineer, Managed Kubernetes - Europe

Jobgether
United Kingdom only

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer, Managed Kubernetes in Europe.

Join an engineering team responsible for building and scaling large Kubernetes platforms that power AI and machine learning workloads. As a Senior Site Reliability Engineer, you will ensure the reliability, performance, and scalability of cloud infrastructure while contributing to automation, monitoring, and platform improvements. You will work closely with engineering, HPC operations, and data center teams to solve complex technical challenges, provide operational support, and improve service quality. This is a high-impact role for those passionate about distributed systems, automation, and delivering reliable services at scale.

Accountabilities

  • Operate and maintain production Kubernetes clusters at scale, handling incidents, recovery, and cluster lifecycle management.
  • Build and maintain control plane services, custom controllers, and operators to enhance cluster reliability.
  • Automate deployment, upgrades, patching, and validation of Kubernetes workloads and platform components.
  • Collaborate with HPC Ops, Datacenter Ops, and engineering teams on cross-functional issues and incident resolution.
  • Define, implement, and monitor SLOs and SLIs to maintain high platform reliability and performance.
  • Assist customers with workload integration, authentication, and storage-related questions.
  • Contribute to tooling, observability, and platform quality improvements using Python, Go, and CI/CD pipelines.

Requirements

  • 6+ years of experience in SRE, operations engineering, or similar roles managing Linux clusters and systems.
  • Strong programming skills in Go and Python, with experience in GitOps, Helm, and Kubernetes operators.
  • Proven experience running Kubernetes clusters in production, including EKS, GKE, on-prem, or hybrid environments.
  • Familiarity with observability and monitoring tools such as Prometheus, Grafana, and FluentBit.
  • Experience provisioning Kubernetes using kubeadm, Cluster API, or similar tools.
  • Ability to work independently and collaboratively, managing customer interactions during incidents.
  • Nice-to-have: Deep Kubernetes expertise (CRDs, CSI, CNI), experience with HPC or GPU clusters, multi-cloud environments, and contributions to CNCF or Kubernetes SIGs.

Benefits

  • Competitive salary with market-based compensation range depending on location.
  • Equity options and performance-based incentives.
  • Health, dental, and vision coverage for you and your dependents.
  • Wellness and commuter stipends where applicable.
  • Flexible paid time off and hybrid work arrangements.
  • Opportunity to mentor and grow within a fast-paced, technology-driven environment.
  • Work with a team at the forefront of AI/ML cloud infrastructure and large-scale Kubernetes operations.

Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.

When you apply, your profile goes through our AI-powered screening process designed to identify top talent efficiently and fairly.

πŸ” Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience, and achievements.
πŸ“Š It compares your profile to the job’s core requirements and past success factors to determine your match score.
🎯 Based on this analysis, we automatically shortlist the three candidates with the highest match to the role.
🧠 When necessary, our human team may perform an additional manual review to ensure no strong profile is missed.

The process is transparent, skills-based, and free of bias β€” focusing solely on your fit for the role.
Once the shortlist is completed, we share it directly with the company that owns the job opening. The final decision and next steps (such as interviews or additional assessments) are then made by their internal hiring team.

Thank you for your interest!

About the job

Job type: Full Time
Experience level: Senior
Hiring timezones: United Kingdom +/- 0 hours