Lead Site Reliability Engineer - Infrastructure

Milestone Systems A/S is a leading company in video technology software, offering innovative solutions like XProtect VMS, tailored to meet diverse industry needs.

Milestone Systems

Employee count: 501-1000

Salary: 160k-180k USD

United States only

We are seeking a Lead Site Reliability Engineer (Infrastructure) to join our fast-moving VSaaS engineering organization. This role carries responsibility for technical leadership and operational execution of the Infrastructure SRE team. You will own the reliability, scalability, and operability of our shared platform and production systems, while shaping how reliability engineering and SRE practices are applied across the organization and mentoring senior and staff engineers.

You will work closely with product engineering and platform teams to ensure a seamless developer experience, while setting standards, driving priorities, and leading by example during incidents and high-impact operational work. This role requires a strong technical background in cloud infrastructure, distributed systems, CI/CD, and GitOps, along with hands-on development experience in Golang and/or Python, to improve developer workflows, automation, and long-term system reliability.

This is a remote role in the United States.

Role Overview

Site Reliability Engineer - Infrastructure

The Infrastructure team provides leadership, direction, and accountability for platform architecture, system design, and end-to-end implementation to meet and exceed product non-functional requirements, including quality, security, reliability, availability, and performance. Site Reliability Engineers enable Product Development teams to ship features with reliable velocity by owning the stability, scalability, and operability of the underlying infrastructure and shared services.

What You Will Do:

As a Lead Site Reliability Engineer, you will:

  • Operate and evolve large-scale distributed systems, anticipating failure modes and proactively mitigating risks across production environments, while owning day-to-day production operations, including monitoring, alert triage, incident response, post-incident analysis, and critical incident coordination and documentation.

  • Lead the design, build, and implementation of automation, orchestration, and operational tooling to improve efficiency, reliability, and signal-to-noise ratio, and to reduce recurring issues and minimize service-impacting events.

  • Set technical direction and influence platform strategy by defining platform architecture, system design, and documentation to guide development, testing, deployment, and long-term maintenance of complex distributed systems.

  • Establish and enforce standards, operational rigor, and best practices for deploying, monitoring, managing, and operating cloud-native and distributed infrastructure environments.

  • Lead the adoption and execution of modern CI/CD, GitOps, and cloud-native infrastructure practices, ensuring reliable, scalable, and traceable software and infrastructure releases.

  • Mentor and develop senior and staff engineers, reinforcing SRE principles, DevOps practices, accountability, and operational excellence across the Infrastructure SRE team.

  • Collaborate closely with product and engineering stakeholders, advocating for an SRE mindset and system-level thinking to maximize reliability, performance, availability, security, and scalability across shared platforms and services.

Other duties as assigned are absorbed into the above ownership and operational responsibilities.

What You Have:

  • 10+ years of experience in site reliability engineering, infrastructure, or systems engineering, with deep ownership of large-scale production systems and demonstrated leadership of SRE or infrastructure teams, including setting technical direction and mentoring senior engineers.

  • Strong hands-on experience designing and building automation and operational tooling using Golang and/or Python, with expert-level proficiency in Linux/Unix systems, shell scripting, and production troubleshooting.

  • Advanced expertise in cloud-native and IaaS architectures, distributed systems, and container orchestration in production environments, including compliance, security, and network considerations.

  • Expertise in architecting modular Terraform frameworks and Infrastructure-as-code (IaC) design patterns.

  • Deep understanding of SRE and DevOps principles, including incident management, SLA/SLO ownership, automation, and reliability engineering practices, along with experience leading incident response with post-incident analysis and preventive improvements.

  • Strong experience with CI/CD pipelines, GitOps workflows, release tooling, and modern cloud-native infrastructure practices, ensuring reliable and traceable software and infrastructure changes.

  • Hands-on experience operating Docker and Kubernetes environments, observability platforms (logging, monitoring, alerting), and SQL/NoSQL databases (e.g., Postgres, MongoDB, Graph DB), including performance tuning and operational troubleshooting.

Skills / Training Desired

  • Subject matter expertise in Google Cloud preferred; experience with other public cloud providers is also valuable.

  • Demonstrated expertise in microservices lifecycle management, including integration, testing, deployment, and operational best practices, supported by advanced knowledge of software release tooling and CI/CD platforms such as GitLab, Jenkins, Cloud Build, ArgoCD, and Spinnaker.

  • Deep understanding of the Docker and Kubernetes ecosystem, including orchestration, cluster management, and image lifecycle optimization.

  • Strong experience with observability, logging, and monitoring tools such as ELK Stack, Prometheus, Stackdriver, Datadog, New Relic, or Dynatrace.

  • Hands-on experience with algorithms, data structures, complexity analysis, and software/system design for large-scale distributed environments.

  • Experience driving automation for operational efficiency, alert noise reduction, recurring issue mitigation, performance testing, capacity planning, and system optimization in production environments.

  • Experience implementing security best practices and compliance considerations in infrastructure and platform design, along with the ability to influence cross-functional teams, evangelize SRE and DevOps practices, and foster a culture of reliability and operational excellence.

Why Milestone?

Milestone offers not only great benefits but also a great culture. Employees here have flexible work environments, opportunities for further education, and the ability to effect change in our organization directly.

The annual salary for this position ranges from $160,000 to $180,000. Pay is based on the level, location, complexity, responsibility, and job duties of the specific position and is just one component of Milestone’s total compensation package. Additionally, we offer an attractive benefits package that includes medical/dental benefits, FSA or HSA, 401k with 6% Safe Harbor employer match, paid parental leave, generous PTO (20 days of vacation, 10 days of paid sick time, and 12 company holidays), fully paid Short Term disability policy, fully paid Long Term disability policy, and Life Insurance. If you are selected for an interview, please feel welcome to speak to our Talent Partner about our compensation philosophy.

All employees must complete a background check. Employees in fiscal roles are also required to undergo a credit check. All information obtained during these checks is handled confidentially and shared only with authorized personnel.

Milestone is committed to creating a diverse and inclusive workplace and is proud to be an equal opportunity employer.

Contact and application

Please apply at our website: www.milestonesys.com

We look forward to receiving your application.

About the job

Job type: Full Time

Experience level: Senior, Manager

Hiring timezones: United States +/- 0 hours

About Milestone Systems

Milestone Systems A/S is a global leader in video technology software, dedicated to empowering people, businesses, and societies with data-driven video solutions. Established in Denmark in 1998, Milestone has revolutionized the way video data is utilized across various sectors. The company’s flagship product, the award-winning XProtect® video management software (VMS), alongside other innovative products like BriefCam analytics platform and Arcules video surveillance as a service (VSaaS), positions Milestone as a front-runner in the industry.

With over 500,000 installations worldwide, from local stores to critical infrastructure, Milestone’s technology is embedded in a wide range of applications. The firm prides itself on a customer-centric approach that focuses not only on delivering cutting-edge technology but also on user experience and ease of use. Milestone’s commitment to responsible technology is evident as it continually adapts to the fast-changing digital landscape and technological advancements. Having grown to more than 1,000 employees globally, Milestone cultivates a collaborative environment, championing innovation and strategic partnerships that enhance its comprehensive video solutions portfolio.
