Director of Production Engineering

CoreWeave is a specialized AI cloud provider delivering a massive scale of GPU compute resources on the industry's fastest and most flexible infrastructure, purpose-built for AI, machine learning, and VFX rendering workloads.

CoreWeave

Employee count: 501-1000

Salary: 230k-275k USD

United States only

CoreWeave is the AI Hyperscaler™, delivering a cloud platform of cutting-edge services powering the next wave of AI. Our technology provides enterprises and leading AI labs with the most performant, efficient, and resilient solutions for accelerated computing. Since 2017, CoreWeave has operated a growing footprint of data centers covering every region of the US and across Europe. CoreWeave was ranked as one of the TIME100 most influential companies of 2024.

As the leader in the industry, we thrive in an environment where adaptability and resilience are key. Our culture offers career-defining opportunities for those who excel amid change and challenge. If you’re someone who thrives in a dynamic environment, enjoys solving complex problems, and is eager to make a significant impact, CoreWeave is the place for you. Join us, and be part of a team solving some of the most exciting challenges in the industry.

CoreWeave powers the creation and delivery of the intelligence that drives innovation.

About the Role

As we continue to scale the CoreWeave Cloud Platform, ensuring reliability, performance, and operational efficiency in a live production environment is mission-critical. We’re seeking a Director of Production Engineering to lead and expand our SRE team and practices. This leader will play a key role in shaping the resilience and reliability of our platform, partnering closely with engineering and product teams to build systems and services that are secure, scalable, and highly efficient.

You will champion a culture of operational excellence, automation, and ownership, empowering our cloud platform to scale confidently and operate with agility.

What You’ll Do

  • Define and execute the SRE vision, strategy, and roadmap for a large-scale, distributed cloud infrastructure.
  • Lead and mentor a high-performing team of SREs, promoting a culture of ownership, collaboration, and continuous learning.
  • Champion automation-first practices, leveraging tools like Terraform, Kubernetes, and Infrastructure-as-Code to minimize toil and manual interventions.
  • Establish and evolve best practices in observability, monitoring, and alerting, ensuring the platform is proactive, not reactive.
  • Drive initiatives for incident management, postmortem culture, root cause analysis, and system hardening.
  • Collaborate with engineering, product, and customer support teams to build scalable, resilient, and self-healing systems.
  • Evolve our on-call strategy and processes to support a 24x7, globally distributed platform with minimal disruptions.

Who You Are

We’re looking for a thoughtful leader who blends technical depth with strategic vision, and thrives in fast-moving, high-growth environments. If you value clarity over complexity, mentorship over management, and resilience over rigidity, you’ll fit right in.

Minimum Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • 10+ years in engineering leadership roles within SRE, DevOps, or cloud infrastructure.
  • 5+ years managing large-scale infrastructure-as-a-service in a geographically distributed, always-on environment.
  • Proven success leading 24x7 operations teams and delivering high-availability services at scale.
  • Deep expertise in automation, monitoring/observability, and incident response frameworks.
  • Familiarity with cloud-native architectures purpose-built for AI, CI/CD systems, and performance tuning.

Additional Qualifications

  • Hands-on experience with Python, Go, Java, or Ruby for operational tooling and automation.
  • Strong track record of hiring, mentoring, and developing top-tier SRE talent in high-growth companies.
  • Comfortable navigating cross-functional dynamics and influencing leadership across engineering, product, and support.
  • Experience leading DevOps and reliability transformation projects, improving developer velocity and platform resilience.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $230,000-$275,000. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience.

What We Offer

The range we’ve posted represents the typical compensation range for this role. To determine actual compensation, we review the market rate for each candidate, which can depend on a variety of factors, including qualifications, experience, interview performance, and location.

In addition to a competitive salary, we offer a variety of benefits to support your needs, including:

  • Medical, dental, and vision insurance - 100% paid for by CoreWeave
  • Company-paid Life Insurance
  • Voluntary supplemental life insurance
  • Short- and long-term disability insurance
  • Flexible Spending Account
  • Health Savings Account
  • Tuition Reimbursement
  • Mental Wellness Benefits through Spring Health
  • Family-Forming support provided by Carrot
  • Paid Parental Leave
  • Flexible, full-service childcare support with Kinside
  • 401(k) with a generous employer match
  • Flexible PTO
  • Catered lunch each day in our office and data center locations
  • A casual work environment
  • A work culture focused on innovative disruption

Our Workplace

At CoreWeave, we are committed to operating as a hybrid workplace, offering employees flexibility in how they structure their time between in-office and remote work. We recognize the significance of fostering connections, collaboration, and creativity within our office culture and its positive impact on our business. Our philosophy of operating as a hybrid workplace underscores our dedication to enabling employees to tailor work-life balance to their individual preferences.

For those who do not live within 30 miles of one of our offices, we are open to considering remote work for candidates whose skills and experience strongly align with the role. While we prioritize a hybrid work environment for most roles, we understand the importance of flexibility and are open to remote work for specific positions and specialized skill sets. Onboarding is essential to your success. New employees not based out of an office will be invited to attend onboarding training at one of our hubs within their first month of employment. We continue to foster a collaborative environment by bringing teams together quarterly.

California Consumer Privacy Act - California applicants only

CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.

As part of this commitment and consistent with the Americans with Disabilities Act (ADA), CoreWeave will ensure that qualified applicants and candidates with disabilities are provided reasonable accommodations for the hiring process, unless such accommodation would cause an undue hardship. If reasonable accommodation is needed, please contact: [email protected].

About the job

Job type: Full Time

Experience level: Director

Salary: 230k-275k USD

Hiring timezones: United States +/- 0 hours

About CoreWeave

We are CoreWeave, the AI Hyperscaler™, and we're on a mission to revolutionize the way large-scale GPU-accelerated workloads are handled in the cloud computing industry. Since our founding in 2017, initially as Atlantic Crypto focused on Ethereum mining, we've pivoted and dedicated ourselves to building a cloud platform specifically designed for the demanding needs of AI and machine learning. We recognized early on the transformative potential of Generative AI and the immense computational power it would require. This foresight led us to repurpose our GPU capacity for high-performance computing, a decision that has positioned us at the forefront of the AI revolution.

Our CoreWeave Cloud Platform is engineered from the ground up, offering cutting-edge software and cloud services that deliver the automation and efficiency necessary to manage complex AI infrastructure at scale. We provide access to a massive scale of NVIDIA GPUs, including the latest H100 and Blackwell architectures, across our growing footprint of data centers in the United States and Europe. We're not just about providing hardware; we're about delivering a comprehensive suite of services, including GPU and CPU compute, high-performance storage, and robust networking solutions. Our Kubernetes-native architecture is designed to support large-scale, GPU-intensive tasks, making it easier for AI labs, enterprises, and innovators to train, fine-tune, and deploy their models faster and more cost-effectively. We pride ourselves on tackling the hard problems in AI infrastructure, working closely with our customers to push the boundaries of what's possible. We're committed to continuous learning and innovation, empowering our employees to take ownership and drive progress as we build the cloud for the AI era.

Employee benefits

Vision Insurance

Offered through VSP.

Company equity

Offered by CoreWeave.

Childcare benefits

Offered by CoreWeave.

Generous parental leave

Offered by CoreWeave.

CoreWeave hiring Director of Production Engineering • Remote (Work from Home) | Himalayas