Software Reliability Engineer (Remote - US)

Jobgether
United States only

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Software Reliability Engineer in the United States.

As a Software Reliability Engineer, you’ll play a crucial role in ensuring system stability, rapid issue resolution, and platform resilience across large-scale distributed systems. This role goes beyond traditional DevOps or infrastructure tasks — you’ll dive directly into live systems, diagnosing complex incidents, identifying root causes, and preventing recurrences. Working closely with cross-functional teams, you will strengthen reliability across applications that impact millions of users. This position offers a highly collaborative environment, where your technical insight and problem-solving skills directly protect uptime, customer trust, and overall business continuity.

Accountabilities

  • Lead incident response efforts to quickly diagnose and resolve issues in distributed production environments.
  • Use observability and monitoring tools (such as Dynatrace and Azure Application Insights) to identify root causes and validate resolutions.
  • Collaborate with engineers across APIs, microservices, and data layers to stabilize live systems and prevent future disruptions.
  • Write and run targeted automated tests using tools like Jest, Cypress, or Playwright to confirm issue resolution and improve reliability.
  • Communicate root causes and fixes effectively to both technical and non-technical stakeholders.
  • Partner with platform and DevOps teams to enhance monitoring, alerting, and deployment workflows.
  • Participate in on-call rotations for high-priority production incidents and contribute to continuous improvement of reliability practices.

Requirements

  • Minimum 2 years of experience in software engineering, production support, or incident response.
  • Strong proficiency in JavaScript/TypeScript with the ability to debug live applications and services.
  • Solid understanding of SQL and NoSQL databases for tracing and troubleshooting data issues.
  • Experience working within Azure or GCP cloud environments.
  • Proven success stabilizing distributed or microservice-based architectures.
  • Excellent communication and problem-solving skills, with the ability to clearly articulate findings.
  • Preferred: experience managing P0/P1 incidents, knowledge of observability tools (Dynatrace, Datadog, or OpenTelemetry), and familiarity with event-driven architectures or message queues.

Benefits

  • Competitive salary and comprehensive health coverage (medical, dental, vision).
  • Flexible Paid Time Off (PTO) and 13 company holidays.
  • 401(k) with company match and paid parental leave, including adoption assistance.
  • Remote-first work model within the U.S., with occasional travel for team gatherings.
  • Free Fitbit and a fun, mission-driven, and collaborative culture focused on improving lives.

Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.
When you apply, your profile goes through our AI-powered screening process designed to identify top talent efficiently and fairly.

🔍 Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience, and achievements.
📊 It compares your profile to the job’s core requirements and past success factors to determine your match score.
🎯 Based on this analysis, we automatically shortlist the 3 candidates with the highest match to the role.
🧠 When necessary, our human team may perform an additional manual review to ensure no strong profile is missed.

The process is transparent, skills-based, and free of bias — focusing solely on your fit for the role. Once the shortlist is completed, we share it directly with the company that owns the job opening. The final decision and next steps (such as interviews or additional assessments) are then made by their internal hiring team.

Thank you for your interest!

About the job

  • Job type: Full Time
  • Experience level: Mid-level
  • Hiring timezones: United States +/- 0 hours