Software Engineer - Site Reliability

We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale (trillions of data points per day), providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.

The team:

The Site Reliability teams at Datadog are responsible for ensuring that our high-volume, low-latency environments continue to perform around the clock. These teams collaborate closely with our product engineers so that Datadog can monitor millions of servers and containers and our customers always have dependable, actionable data at their fingertips. You’ll be responsible for shaping the infrastructure of our data-intensive, real-time services as we continue to grow at petabyte scale.

You will:

  • Keep our service reliable, available and fast
  • Respond to, investigate and fix service issues, whether they be deep in the OS kernel or in the application code
  • Design, build and maintain the infrastructure we need to support orders of magnitude more customers

Requirements:


  • You have a track record working with large-scale distributed systems, preferably in the cloud, or you have a BS/MS/PhD in a scientific field or equivalent experience
  • You value correctness and efficiency; you leave no stone unturned when diagnosing production issues
  • You handle infrastructure with code because automation lets you focus on the more difficult and rewarding problems
  • You have production experience with distributed compute/storage tools, e.g. ZooKeeper, Cassandra, Postgres, Kafka, Elasticsearch, Redis

Bonus points:

  • You have submitted bug fixes to the aforementioned projects
  • You are fully fluent in Python, Ruby and Go

Is this you? Tell us why, and apply now. Include links to your GitHub, Stack Overflow or other online projects.

About Datadog

Modern monitoring & analytics. See inside any stack, any app, at any scale, anywhere.

Datadog is a monitoring and analytics platform for large-scale applications and infrastructure. Combining metrics, traces, and logs from servers, databases, and applications, Datadog delivers sophisticated, actionable alerts and provides real-time visibility into your entire stack. Datadog includes 350+ vendor-supported, pre-built integrations, and monitors hundreds of thousands of hosts.

Olivier Pomel

United States
Ireland
France
Singapore
Japan
Australia
Netherlands
