Director, Site Reliability Engineering

Cybersecurity is constantly changing. Time favors the adversary.

SentinelOne

Employee count: 1001-5000

Salary: 195k-293k USD

United States only

About Us:

SentinelOne is defining the future of cybersecurity through our XDR platform that automatically prevents, detects, and responds to threats in real time. Singularity XDR ingests data and leverages our patented AI models to deliver autonomous protection. With SentinelOne, organizations gain full transparency into everything happening across the network at machine speed – to defeat every attack, at every stage of the threat lifecycle.

We are a values-driven team where names are known, results are rewarded, and friendships are formed. Trust, accountability, relentlessness, ingenuity, and OneSentinel define the pillars of our collaborative and unified global culture. We're looking for people who will drive team success and collaboration across SentinelOne. If you’re enthusiastic about innovative approaches to problem-solving, we would love to speak with you about joining our team!

What are we looking for?

We are seeking an experienced engineering and operational director to lead our Site Reliability Engineering (SRE) team at SentinelOne. As the Director of SRE, you will manage a team of SRE professionals responsible for ensuring the reliability and scalability of our products and production services, focusing on the experience our customers have in production every day. You will work closely with other engineering teams to identify and address availability, performance, and capacity issues, and you’ll be a key partner for our externally facing teams including Support, Customer Success, and Sales Engineering. This is a highly visible role within S1 with frequent executive communication opportunities, and is a great opportunity to do good work with good people all around the world.

As a team we value

Thinking from first principles, understanding second order impacts
Curiosity to understand new systems, their operating principles and limitations
Strong operational ownership and a desire to reduce toil via automation
A drive to learn, especially from prior failures
Courage to take risks and make things happen
Empathy and humility to collaborate effectively with peers and across teams

What will you do?

Grow and lead a team of SRE professionals, including setting performance goals and measuring deliverables against key metrics, while evolving those metrics as S1 grows and needs develop
Invest in data-driven deep triage on recurring issues, collaborating with other engineering teams to identify and address issues related to reliability, performance, and capacity
Develop, improve, and implement processes for the full incident lifecycle including incident management, post-incident analysis, and learning from incidents
Lead incident response efforts, including coordinating with other teams to investigate and resolve customer-impacting incidents
Design support model for SRE regarding service maturity and service ownership, including monitoring and alerting improvements and SLI / SLO design and implementation
Analyze production metrics and signals to identify areas for improvement and take proactive steps to mitigate issues
Develop and implement best practices and standards for Site Reliability Engineering, from day to day operations to hiring and planning
Communicate effectively with cross-functional teams to ensure alignment on objectives and priorities. Deliver outcomes, not just stories and tasks.

What skills and knowledge should you bring?

10+ years of engineering experience, with at least 5 years in a management role
Demonstrated experience leading technical and operational teams at various stages of maturity
Excellent analytical and problem-solving skills
Familiarity with modern software development methodologies, tools, and techniques including CI/CD
Experience working with cloud-native applications and large scale distributed systems including a working knowledge of technologies such as Kubernetes and Terraform/IaC and cloud providers such as AWS or GCP
Experience with various monitoring and alerting techniques and tools, including frameworks and concepts such as SLOs, OTel and Golden Signals as well as tooling such as Prometheus and Grafana
Extensive experience with incident response and management at various layers of the stack across different business needs and applications, including both hands on experience leading incidents/post-incident analysis and experience driving broader incident management initiatives
Ability to thrive in a fast-paced, dynamic environment
Driven by curiosity and humility - complex distributed systems are complex, so ask the “silly” question and seek out answers

Why us?

You will be joining a cutting-edge company where you will tackle extraordinary challenges and work with the very best in the industry.

Medical, Vision, Dental, 401(k), Commuter, Health and Dependent FSA
Unlimited PTO
Industry-leading gender-neutral parental leave
Paid Company Holidays
Paid Sick Time
Employee stock purchase program
Disability and life insurance
Employee assistance program
Gym membership reimbursement
Cell phone reimbursement
Numerous company-sponsored events, including regular happy hours and team-building events

This U.S. role has a base pay range that will vary based on the location of the candidate. For some locations, a different pay range may apply. If so, this range will be provided to you during the recruiting process. You can also reach out to the recruiter with any questions.

Base Salary Range: $195,000-$293,000 USD

SentinelOne is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

SentinelOne participates in the E-Verify Program for all U.S. based roles.

About the job

Apply before: May 25, 2024
Posted on: Mar 26, 2024
Job type: Full Time
Experience level: Executive
Salary: 195k-293k USD
Location requirements: United States
Hiring timezones: United States +/- 0 hours
Job categories: Senior Site Reliability Engineer
Skills: Site Reliability Engineering, SRE, Incident Management, CI/CD Concepts, Kubernetes, Terraform, AWS, GCP, Prometheus, Grafana, Incident Response, Cloud Native, Monitoring and Alerting, SLOs

About SentinelOne

Cybersecurity is constantly changing. Time favors the adversary. Today’s challenges are nothing like tomorrow’s. Threats are becoming more and more advanced leveraging the power of automation. Some wait and react. At SentinelOne, we innovate. Our mission is to defeat every attack, every second, of every day. Our Singularity Platform instantly defends against cyberattacks – performing at a faster speed, greater scale, and higher accuracy than possible from any single human or even a crowd.

So, if our tech seems like something from the future, good — that’s exactly what it is.

Who We Are

We are defenders. It is why we exist. Born from hustle, we’ve spent decades sharpening ourselves to make things better for our customers. How? With our autonomous technology, we create sustainable advantage, not momentary edge. Through relentless innovation, we give ourselves the power to challenge the accepted standards of today. By putting our customers first, we turn traditional customer relationships into true partnerships.

Tech stack

JavaScript

Python

HTML5

Java

CSS 3

GraphQL

gRPC

TypeScript

Optimize

Facebook Ads

Employee benefits

Healthcare benefits: Medical, dental, and vision insurance.
Retirement benefits: 401(k) to help you invest in your future.
Life insurance: Life insurance so your family is protected.
Disability insurance: We'll cover your disability insurance so you don't have to worry.
