Site Reliability Engineer (Data)

Zapier (YC S12) gives people internet superpowers by letting them easily connect and automate the apps they use.

Employee count: 501-1000

Salary: 141k-185k USD

About Zapier

We're humans who simply think computers should do more work.

At Zapier, we’re not just making software—we’re building a platform to help millions of businesses globally scale with automation and AI. Our mission is to make automation work for everyone by delivering products that delight our customers. You’ll collaborate with brilliant people, use the latest tools, and leverage the flexibility of remote work. Your work will directly fuel our customers’ success, and as they grow, so will you.

Job Posted: May 9, 2025

Location: Americas

Hi there! 👋 Are you passionate about building reliable systems that help data teams thrive at scale?

Zapier is looking for a Site Reliability Engineer to join our Data Platforms team. In this role, you’ll work alongside our existing SRE to level up the reliability, observability, and operational maturity of the modern data stack that powers internal and customer-facing products across Zapier. From orchestrating workflows in Databricks to tuning performance in our data infrastructure, you’ll play a key role in keeping our data ecosystem healthy, scalable, and developer-friendly.

About You

You’re experienced, but still growing. You have 4+ years of experience in Site Reliability Engineering roles. You’ve worked in production environments, solved real incidents, and shipped platform improvements—but you’re also eager to learn and grow alongside a thoughtful, distributed team.

You know the cloud—and how to keep it healthy. You’re familiar with cloud-native architecture and services (we use AWS). You’ve helped teams build and maintain reliable workflows using tools like Terraform and you understand the tradeoffs behind infrastructure decisions.

You’re observability- and incident-driven. You know how to detect issues before customers feel them. You believe in rich metrics, structured logs, and smart alerting. You’ve contributed to incident response processes and helped teams learn from failure.

You bring an automation- and AI-first mindset. You’re not afraid to write code (Python, TypeScript, or Bash are all great) and believe deeply in Infrastructure as Code. You lean into tools, automation, and AI to reduce toil, improve deployment confidence, and free up teams to focus on meaningful work. You’re open to experimenting with AI tools to decrease toil and increase your impact.

You’re a strong communicator in a remote-first world. You can clearly describe problems, propose solutions, and write clean documentation others can follow. You’re comfortable collaborating asynchronously with cross-functional teams and support partners.

Things You’ll Do

  • Level up reliability for our modern data stack – Help support and evolve our data platforms (including Databricks, Airflow, and our LLMOps tooling) with reliability best practices and clear operational standards.

  • Improve observability and alerting – Partner with engineering teams to implement monitoring and alerting that supports ownership, reduces noise, and improves incident response metrics like MTTD (Mean Time to Detect) and MTTR (Mean Time to Resolve).

  • Automate and optimize operations – Build and maintain infrastructure-as-code, job orchestration logic, and internal tooling that reduce manual intervention and improve system resilience.

  • Participate in on-call and incident response – Share in our on-call rotation (~one week per quarter) and work alongside others to improve postmortems, retrospectives, and mitigation strategies.

  • Contribute to security and compliance readiness – Help evolve our access controls, auditability, and deployment practices in support of growing needs like sensitive data security compliance.

  • Be a partner, not a gatekeeper – Work closely with Data Engineers, ML Engineers, and Backend Engineers to ensure platforms are reliable and empowering to use.

Bonus Points

(Not required, but nice to have!)

  • Experience with tools like Airflow, Databricks, or Kubernetes

  • Experience with Databricks administration, cost governance, or workspace security

  • Familiarity with data lake architecture (e.g., Delta Lake, Unity Catalog)

  • Exposure to compliance-driven environments (HIPAA, SOC 2, etc.)

  • Demonstrated AI fluency—whether it’s applying AI for troubleshooting, documentation, automation, or infrastructure tooling

How to Apply

At Zapier, we believe that diverse perspectives and experiences make us better, which is why we have a non-standard application process designed to promote inclusion and equity. We're looking for the best fit for each of our roles, regardless of the type of companies in your background, so we encourage you to apply even if your skills and experiences don’t exactly match the job description. All we ask is that you answer a few in-depth questions in our application that would typically be asked at the start of an interview process. This helps speed things up by letting us get to know you and your skillset a bit better right out of the gate. Please be sure to answer each question; the resume and CV fields are optional.

Education is not a requirement for our roles; however, if you receive an offer, you will need to include your most recent educational experience as part of our background check process.

After you apply, you are going to hear back from us—even if we don’t see an immediate fit with our team. In fact, throughout the process, we strive to never go more than seven days without letting you know the status of your application. We know we’ll make mistakes from time to time, so if you ever have questions about where you stand or about the process, just ask your recruiter!

Zapier is an equal-opportunity employer and we're excited to work with talented and empathetic people of all identities. Zapier does not discriminate based on someone's identity in any aspect of hiring or employment as required by law and in line with our commitment to Diversity, Inclusion, Belonging and Equity. Our code of conduct provides a beacon for the kind of company we strive to be, and we celebrate our differences because those differences are what allow us to make a product that serves a global user base. Zapier will consider all qualified applicants, including those with criminal histories, consistent with applicable laws.

Zapier prioritizes the security of our customers' information and is dedicated to adhering to all applicable data privacy laws. You can review our privacy policy here.

Zapier is committed to inclusion. As part of this commitment, Zapier welcomes applications from individuals with disabilities and will work to provide reasonable accommodations. If reasonable accommodations are needed to participate in the job application or interview process, please contact [email protected].

Application Deadline:

The anticipated application window is 30 days from the date the job is posted, unless the number of applicants requires it to close sooner or later, or if the position is filled.

Even though we’re an all-remote company, we still need to be thoughtful about where we have Zapiens working. Check out this resource for a list of countries where we currently cannot have Zapiens permanently working.

About the job

Job type: Full Time

Experience level: Mid-level, Senior

About Zapier

Zapier (YC S12) gives people internet superpowers by letting them easily connect and automate the apps they use. Partners, including Salesforce, Intuit, Google, and Dropbox, utilize Zapier to offer their customers integrations with 1,000+ apps. The Zapier Developer Platform enables developers to add APIs for private or public use.

Our growing, remote team has members around the world. We are on a mission to make work easier. We face formidable technical hurdles, unique marketing challenges, and exciting brand and design opportunities that come with serving a vast multi-sided audience. We are hiring.

We're a 100% distributed team helping people across the world automate the boring and tedious parts of their job. We do that by helping everyone connect the web applications they already use and love.

We believe that there are jobs a computer is best at doing and that there are jobs a human is best at doing. We want to empower businesses to create processes and systems that let computers do what they are best at doing and let humans do what they are best at doing.

Employee benefits

2 annual company retreats

Company retreats to awesome places!

Profit sharing

Profit-sharing program for 100% of Zapiens.

Retirement plan

We offer a 4% company match for US, UK, and Canadian employees.

Paid parental leave

14 weeks paid leave for new parents of biological or adopted children.
