IntelliPro Group Inc. hiring Staff Site Reliability Engineer • Remote (Work from Home) | Himalayas

Staff Site Reliability Engineer

IntelliPro Group is a comprehensive talent acquisition firm that offers a full suite of recruiting services along with outsourced business capabilities for a diverse range of clients.

IntelliPro Group Inc.

Employee count: 501-1000

Salary: 200k-250k USD

United States only

Staff Site Reliability Engineer (Remote, US)

Compensation: $200K–$250K + Equity
Full-Time | Remote | Infrastructure Team

We’re hiring a Staff Site Reliability Engineer to help scale and maintain the massive GPU infrastructure that powers our cutting-edge AI systems. If you’re passionate about building robust, scalable systems and solving deep infrastructure challenges, this role is for you.

What You’ll Be Doing

  • Work closely with engineers and researchers to define and meet system performance, availability, and efficiency requirements.

  • Operate and manage thousands of GPUs distributed across multiple cloud providers and clusters.

  • Design scalable solutions to support rapid growth in compute demands for AI model training, data processing, and inference.

  • Build resilient, fault-tolerant systems to ensure continuous uptime and seamless performance.

  • Develop automation tools to eliminate toil and streamline infrastructure operations.

  • Set up and maintain monitoring systems to proactively detect issues and drive performance improvements.

  • Define and track SLOs and SLIs that uphold system reliability standards.

  • Participate in an on-call rotation to ensure 24/7 system availability.

Qualifications

  • 7+ years of proven experience as a reliability engineer, infrastructure engineer, or production engineer in fast-paced, high-growth environments.

  • Deep knowledge of GPU infrastructure, including scheduling, scaling, cloud networking, storage, and security.

  • Proficiency in one or more scripting or programming languages.

  • Strong experience with Kubernetes or similar container orchestration systems.

  • Familiarity with Infrastructure-as-Code tools like Terraform or CloudFormation.

  • Experience working with observability tools like Prometheus, Grafana, DataDog, ELK, or Splunk.

  • Excellent troubleshooting, debugging, and systems-thinking skills.

  • Strong communication skills and a collaborative mindset.

  • Bonus: experience with AI/ML infrastructure or with managing large-scale GPU clusters.

What We're Building

We're developing highly complex infrastructure to support advanced AI research and production systems running on thousands of GPUs. This is an opportunity to work at scale on some of the most demanding reliability and performance challenges in tech today. You’ll have a direct impact on how infrastructure supports foundation model development and deployment.

Compensation & Benefits

Base Salary: $200K–$250K/year

Competitive equity package (stock options)

Comprehensive health benefits

Generous PTO and flexible work policies

Support for ongoing professional development

About the job

Job type: Full Time

Experience level: Senior

Hiring timezones: United States +/- 0 hours

About IntelliPro Group Inc.

IntelliPro Group is a comprehensive talent acquisition firm that offers a full suite of recruiting services along with outsourced business capabilities for a diverse range of clients. With headquarters in Silicon Valley, IntelliPro aims to provide innovative talent solutions globally. The organization has gained substantial trust from top corporations around the world by merging high-tech talent sourcing methodologies with a personalized approach that prioritizes candidate experience.

The company uses AI-powered talent-matching software to streamline talent acquisition, helping it identify the most skilled candidates for engineering, marketing, management, and administrative roles. Founded in 2009, IntelliPro has grown into one of the leading talent acquisition agencies in both the United States and the Asia-Pacific region, with a commitment to high-quality service delivery that continues to strengthen its reputation as a go-to partner in recruitment and human resources outsourcing.
