Senior Customer Reliability Engineer (CRE)

Arista Networks is a leading provider of software-driven cloud networking solutions for large data center, campus and routing environments.

Arista Networks

Employee count: 1001-5000

United States only

The Opportunity

This is not a traditional operations role. You will inherit a set of critical, manual, hands-on operational responsibilities essential to our customers' success. We need you to help lead the effort to systematically dismantle this operational burden through automation, tooling, and systems. You will work with a collaborative team of excellent engineers and a counterpart in your role on both the manual toil and the systems we need to engineer.

The short-term needs are manual deployments, reactive troubleshooting, and on-call escalations. But we need you to help us build a system in which programmatic solutions replace human intervention. You must have the pragmatism to manage the current reality and the systematic impatience to build its replacement.

Success in this role requires a dual mindset. You must be a skilled incident leader who can stabilize a crisis and a deliberate systems architect who can prevent the next one. You will work closely with our internal tools, platform, and product engineering teams to channel your direct operational knowledge into durable, long-term solutions.

What You’ll Do

Your work will follow a deliberate trajectory from reactive execution to proactive design.

Phase 1: Stabilize and Map (First 3-6 Months). You will embed with the team, taking ownership of the existing operational workload alongside the product engineers and the other customer SRE, who covers the India time zone. This includes customer deployments, upgrades, and incident response. Your initial goal is to achieve stability while mapping the landscape of our operational toil.

Phase 2: Automate and Influence (Months 6-18). Armed with your map of toil, you will begin to automate. You will write code, build tooling, and deploy declarative infrastructure to eliminate the most critical operational burdens. For larger projects, you will act as a primary stakeholder, providing clear requirements to our internal tooling and platform teams and ensuring their solutions meet the operational need. Your success will be measured by a demonstrable reduction in overall support effort: fewer pages, fewer support escalations, and fewer manual tasks.

Phase 3: Architect and Evangelize (Year 2+). With the most acute operational pains addressed, your focus will shift to architectural concerns. You will define and implement Service Level Objectives (SLOs), influence the design of new products for operability, and help instill SRE principles throughout the engineering organization.

  • DevOps and SRE Proficiency
    • You must have a strong background in Site Reliability Engineering or a closely related DevOps function. You must also have a strong command of Linux systems administration and an understanding of networking fundamentals (TCP/IP, DNS, routing).
  • Customer-Facing Experience
    • You must have experience working directly with external customers to solve difficult technical problems. Your communication must be clear, empathetic, and precise.
  • Cloud Infrastructure Expertise
    • You need production experience with a major cloud provider, preferably AWS. You should be proficient in its core concepts and services (VPC, EC2, IAM, S3) and have experience building and managing infrastructure as code with tools like Terraform.
  • Monitoring and Observability
    • You will be responsible for both building and using our observability stack. This requires hands-on experience instrumenting applications and managing the telemetry pipelines for metrics, logs, and traces.
    • A core part of the role is then applying this data to debug complex production incidents, understand system behavior, and define SLOs.
  • Automation and Software Development
    • You must be proficient in writing code to automate operational tasks. Expertise in a high-level language like Python or Go is required, as are strong shell scripting skills (e.g., Bash). We have a diverse tech stack including Python, Scala, C++, Haskell, Rust, PureScript, etc., so you need experience monitoring and debugging a complex system using system tools, command-line utilities, and networking debug tools, and filtering complex logs.

Preferred Skills

  • Proficiency with Kafka, Postgres, nginx, systemd, etc. is a plus
    • We use this software extensively in the product in customer environments. Experience with it is not required.
  • Proficiency in Nix and NixOS is a plus
    • We use Nix/NixOS extensively so knowing them helps, but they will not play a large role in your initial responsibilities. We'll train you on the job if you've never used Nix before.
  • Exposure to or proficiency in functional programming languages and paradigms is a plus
    • We value functional programming-oriented principles (compositionality, immutability, etc.). You are not required to know functional languages, but some exposure is a plus, as is a willingness to learn.

Values

  • We value compassion
    • We believe our mission is one of service to others, whether that is protecting our customers from harm or empowering other developers to do work they are proud of.
  • We value humility
    • Humility matters to everyone on the engineering team in Arista NDR. We accept the sobering reality that we as humans make mistakes, forget things easily, and have over-inflated confidence in our grasp of complexity. We value humility because we think it leads to better solutions (social or technological) and better understanding.
  • We value reliability
    • We believe that software (any form of automation, really) should free people to do their most creative work. Therefore, we value low-maintenance software and the technology that empowers us to write it.

This is a hybrid work environment where office presence may be required 1-2 days a week.

Arista Networks is an equal opportunity employer. Arista makes all hiring and employment-related decisions in a non-discriminatory manner without regard to race, color, religion, sex, sexual orientation, gender identity, national origin or any other factor determined to be unlawful under applicable federal, state, or local law. All your information will be kept confidential according to EEO guidelines.

Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. What sets us apart is our relentless pursuit of innovation. We leverage the latest advancements in cloud computing, artificial intelligence, and software-defined networking to provide our clients with a competitive edge in an increasingly interconnected world. Our solutions are designed to not only meet the current demands of the digital landscape but to also anticipate and adapt to future challenges.

At Arista we value the diversity of thought and perspectives that each employee brings to the table. We believe that fostering an inclusive environment, where individuals from various backgrounds and experiences feel welcome, is essential for driving creativity and innovation.

Our commitment to excellence has earned us several prestigious awards, such as Best Engineering Team, Best Company for Diversity, Compensation, and Work-Life Balance. At Arista, we take pride in our track record of success and strive to maintain the highest standards of quality and performance in everything we do.

About the job

Job type: Full Time

Experience level: Senior

Hiring timezones: United States +/- 0 hours

About Arista Networks

At Arista Networks, we are at the forefront of technological innovation, pioneering software-driven cloud networking solutions that are fundamentally transforming the architecture of large-scale data center, campus, and routing environments. Through our groundbreaking Extensible Operating System (EOS™), we deliver a revolutionary approach to network infrastructure, providing unparalleled availability, agility, automation, analytics, and security. EOS is a highly modular, Linux-based network operating system, uniquely designed with a multi-process state-sharing architecture. This innovative design separates state information and packet forwarding from protocol processing and application logic, enabling a level of resiliency and programmability previously unattainable in the networking industry. Our platforms, which support a wide range of Ethernet speeds from 10 to 800 gigabits per second, are engineered to redefine scalability and resilience, empowering our customers to build robust, high-performance networks that can seamlessly handle the explosive growth of data and new application demands.

Our commitment to innovation extends beyond our core operating system. We are a key player in the advancement of AI networking, developing intelligent solutions that optimize workload performance and drive efficiency. Through strategic acquisitions, we have expanded our capabilities to include network detection and response (NDR), cognitive unified edge (CUE) for branch networking, and advanced monitoring fabrics. These integrations allow us to offer a comprehensive, data-driven cognitive cloud networking portfolio that provides end-to-end visibility and control. By championing open standards and fostering a culture of continuous innovation, Arista Networks is not just responding to the needs of the modern digital world; we are actively shaping the future of networking. Our solutions empower the world's leading cloud titans, financial services firms, enterprises, and service providers to build the next generation of IT infrastructure, capable of supporting the most demanding applications and services with unmatched performance and reliability.

Employee benefits

401k Plan

A 401(k) retirement savings plan.

Maternity / Paternity Leave

Offers paid leave for new parents.

PTO & Paid holidays

Paid time off and paid holidays for employees.

Secured Bike Lockers

Secure lockers for employees who bike to work.
