Manager, Cloud Operations Engineering

CoreWeave
United States only

CoreWeave is a specialized cloud provider, delivering a massive scale of GPU compute resources on top of the industry’s fastest and most flexible infrastructure. CoreWeave builds cloud solutions for compute-intensive use cases (VFX and rendering, machine learning and AI, batch processing, and Pixel Streaming) that are up to 35 times faster and 80% less expensive than the large, generalized public clouds. Learn more at www.coreweave.com.

About the role:

The Cloud Operations Team is the heart of CoreWeave’s operational practice. This team responds to performance and availability issues across the CoreWeave cloud, bridging the gap between Customer Support and internal Service Owning teams. Working in shifts to ensure 24x7 coverage, the team develops proactive health monitoring, triages alerts and incidents, serves in the commander role during Priority Incident events, and participates in ongoing analysis and reliability improvement practices.

Collaborating across development and engineering, this team operates horizontally and vertically within the CoreWeave ecosystem to root out problems, initiate and coordinate responses, and drive lower MTTR and MTTD scores.

The newly formed team is staffed with resources who have broad technology and troubleshooting skills and are actively expanding their knowledge in critical areas such as networking, storage, Kubernetes, automation, and observability. You will bootstrap the team’s processes and procedures and be their direct Manager.

As the people leader for this team of 8 Operations Engineers, you will facilitate and empower their success. Drawing on your experience in Cloud Operations, you understand deeply the importance of process, documentation and automation. You strive for continual improvement. You will maintain a close working relationship with each of your team members through regular 1:1s focusing on the ‘whole engineer’, guiding them in their skills and career development at CoreWeave. Resources on your team are likely to mature into strong individual contributors to peer engineering teams across the organization, and you will help them prepare while simultaneously providing exceptional support to those same teams.

As Manager of the Cloud Operations Team you will:

  • Grow, change, invest in your teammates, be invested-in, share your ideas, listen to others, be curious, have fun, and above all, be yourself.
  • Learn and navigate the tools, systems and processes that enable the AI cloud.
  • Bootstrap the team’s operational processes and road map key project work and tooling requirements for the team’s success.
  • Own staffing, scheduling and HR responsibilities.
  • Develop and lead team cadence and planning sessions in conjunction with our Technical Project Manager.
  • Develop internal processes, procedures, and documentation to ensure efficient management of the team’s workload.
  • Track and report on key metrics that represent the team’s improvement and impact.
  • Act as the Sr. Incident Commander, and develop the team’s ability to efficiently operate Major Incidents.
  • Participate as a key member of the enterprise ITSM cadence, reporting on incident trends, durations (MTTR, MTTD etc.), problems, and Incident Reviews.
  • Own the Post Incident Review process.
  • Continually improve our incident response process with the goal of iteratively reducing MTTR through all reasonable methods (tooling, process, automation etc.).
  • Partner with service owners, SRE, and Customer Support to ensure process alignment, knowledge sharing and shared responsibility regarding Incident Management, Post Incident Reviews, Production Readiness Assessments etc.

Wondering if you’re a good fit? We believe in investing in our people, and value candidates who can bring their own diversified experiences to our teams – even if you aren't a 100% skill or experience match. Here are some qualities we’ve found compatible with our team. If a portion of this resonates with you, we’d love to talk.

  • You come with your own philosophies and strategies, are adaptable to new information, freely provide feedback and coaching, and actively participate in improving how the team functions.
  • You have experience with business process development and can see where communication breakdowns are likely to occur.
  • You are committed to understanding the needs of others, and how you can effectively leverage your own talents to ensure collective success.
  • You are comfortable using observability data to visualize service health and triangulate the proximate cause of performance and availability issues.
  • You are comfortable making sense of complex environments and leading others through troubleshooting without actively fixing things yourself.
  • You can lead when there’s ambiguity, and follow when engineers lead.
  • You have experience in a support capacity and/or a broad understanding of modern applications and infrastructure.
  • You are comfortable managing communication and coordinating multiple engineers during an incident.
  • You have a desire to learn or have experience with process automation.
  • You have a customer-first mindset and bring empathy for the customer as well as the engineering team that is tasked with solving complex problems.
  • You’re excited to join a team with diverse perspectives and backgrounds that believes in tackling challenges, growing hand in hand, and winning together.

Hybrid Workplace

If you reside within a 30-mile radius of our New Jersey, New York, or Philadelphia offices, we're excited for you to join us at the office at least three times a week, recognizing the significance we place on fostering connections, collaboration, and creativity within our office culture. Our commitment to operating as a hybrid workplace underscores our dedication to enabling our employees to tailor their work-life balance to their individual preferences.

Why CoreWeave?

At CoreWeave, we work hard, have fun, and move fast! We’re in an exciting stage of hyper-growth that you will not want to miss out on. We’re not afraid of a little chaos, and we’re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:

  • Be Curious at your Core
  • Act like an Owner
  • Empower Employees
  • Deliver Best-in-Class Client Experience
  • Achieve More Together

We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems. As we get set for takeoff, the growth opportunities within the organization are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too. Come join us!

Benefits

We offer a competitive salary and benefits, including:

  • Medical, dental and vision insurance - 100% paid for the employee
  • Company paid Life Insurance
  • Voluntary supplemental life insurance
  • Short and long-term disability insurance
  • Flexible Spending Account
  • Tuition Reimbursement
  • Mental Wellness Benefits through Spring Health
  • Family-Forming support provided by Carrot
  • Paid Parental Leave
  • Flexible, full-service childcare support with Kinside
  • 401(k) with a generous employer match
  • Flexible PTO
  • Catered lunch each day in our offices
  • Weekly massages in NJ office
  • A casual work environment
  • Work culture focused on innovative disruption

California Consumer Privacy Act - California applicants only

CoreWeave is an equal opportunity employer, committed to diversity and inclusiveness. We will consider all qualified applicants without regard to race, color, nationality, gender, gender identity or expression, sexual orientation, religion, disability or age.

About the job

Apply before: Jul 21, 2024

Posted on: May 22, 2024

Job type: Full Time

Experience level: Mid-level

Location requirements

Hiring timezones: United States +/- 0 hours