Senior DevOps Engineer, Infrastructure & Reliability

Worth AI is a UK-based consultancy specializing in AI literacy, training, and automation solutions to help businesses build digital workforces.

Worth AI

Employee count: 1-10

United States only

Worth AI, a leader in the computer software industry, is looking for a Senior DevOps Engineer to join our Infrastructure team with a singular mission: to make our systems faster, more reliable, and more resilient while making life dramatically easier for engineers shipping software. In this role, you won’t just manage infrastructure; you will design and evolve the foundation that every product and engineer depends on.

You will act as a force multiplier by eliminating operational friction, automating repetitive processes, strengthening system reliability, and building scalable infrastructure patterns that allow teams to deploy confidently and recover quickly. You are part architect, part reliability engineer, and part automation evangelist.

Responsibilities

Conduct regular interviews with engineering teams to identify operational pain points in CI/CD, deployments, observability, and cloud environments, and proactively eliminate them.
Design and implement scalable Infrastructure-as-Code patterns using tools like Terraform to standardize cloud provisioning and reduce configuration drift.
Own and evolve our Kubernetes platform (EKS or self-managed), ensuring workloads are secure, scalable, and resilient by default.
Architect and optimize CI/CD pipelines to improve deployment frequency, reduce lead time, and increase confidence in releases.
Lead systemic reliability initiatives, including incident response improvements, root cause analysis practices, and postmortem frameworks.
Design and enforce secure networking, IAM, and secrets management strategies across environments.
Improve observability by refining metrics, logs, and tracing using tools like Datadog, ensuring actionable insight into system health.
Optimize cloud cost efficiency through rightsizing, autoscaling strategies, and architectural improvements.
Own disaster recovery planning, backup strategies, and multi-region resilience initiatives.
Refactor brittle or manually managed infrastructure into automated, testable, and reproducible systems.
Introduce new infrastructure tooling or architectural shifts and drive adoption through documentation, workshops, and hands-on support.
Lead by example in incident management, risk mitigation, and operational excellence.
Communicate technical trade-offs clearly across engineering and product stakeholders, balancing speed with safety.

Technology Stack

Cloud & Infrastructure: AWS (EKS, RDS, MSK, S3, Lambda, IAM, VPC)
Containerization & Orchestration: Kubernetes, ArgoCD
Infrastructure-as-Code: Terraform
CI/CD: GitHub Actions (or equivalent)
Monitoring & Observability: Datadog
Data & Messaging: PostgreSQL, Kafka, Redis
Languages (as needed): Bash, Python, TypeScript

Requirements

8+ years of experience in DevOps, SRE, or Infrastructure Engineering roles.
Proven experience designing and operating production Kubernetes environments at scale.
Deep hands-on expertise with AWS infrastructure and cloud networking.
Strong experience building and maintaining Terraform modules across large cloud environments.
Demonstrated ownership of CI/CD systems and measurable improvement of DORA metrics.
Experience leading incident response processes and driving meaningful postmortem outcomes.
Strong understanding of distributed systems, event-driven architectures (Kafka), and database performance (PostgreSQL).
Proven ability to modernize legacy infrastructure and eliminate manual operational toil.
Experience navigating high-ambiguity environments and translating operational friction into prioritized infrastructure roadmaps.
Demonstrated ability to build trust across teams while raising the reliability bar.

Success Metrics

DORA Metrics Improvement:

Drive measurable improvements in Deployment Frequency, Lead Time for Changes, Change Failure Rate, and Mean Time to Recovery (MTTR).

System Reliability:

Maintain or exceed defined SLO/SLA targets with reduced incident frequency and duration.

Infrastructure Stability:

Reduce production incidents caused by misconfiguration, manual processes, or infrastructure drift.

Operational Efficiency:

Increase percentage of infrastructure managed through code and automation.

Cost Optimization:

Improve cloud cost efficiency without sacrificing reliability or performance.

Bonus Points (Nice to Have)

Experience operating high-throughput Kafka clusters (MSK or self-managed).
Strong background in database performance tuning (PostgreSQL, Redis).
Experience implementing autoscaling strategies for high-traffic systems.
Familiarity with service mesh technologies.
Experience building internal developer platforms (IDP).
Background in security best practices (zero-trust networking, policy-as-code).
Experience with multi-region or globally distributed systems.
Proficiency in Python for automation and tooling development.
Experience introducing platform-wide reliability frameworks (SLOs, error budgets, chaos testing).

All remote hires will be required to travel to Orlando, Florida at least twice per year for town halls and team collaboration, in addition to orientation in Orlando.

Benefits

Health Care Plan (Medical, Dental & Vision)
Retirement Plan (401k, IRA)
Life Insurance
Flexible Vacation
Work From Home
Free Food & Snacks (in office)
Orlando, Florida (Hybrid)
Wellness Resources

About the job

Apply before: Apr 21, 2026
Posted on: Feb 20, 2026
Job type: Full Time
Experience level: Senior
Location requirements: United States
Hiring timezones: United States +/- 0 hours

About Worth AI

At the heart of Worth AI is a mission to demystify artificial intelligence for businesses, transforming it from a complex buzzword into a practical engine for growth. We believe that the true power of AI lies not just in the technology itself, but in how it empowers the people who use it. Our culture is built on 'keeping things human'—stripping away the jargon to provide honest, hands-on training and custom solutions that seamlessly integrate into real-world workflows.

We are dedicated to guiding business owners from a state of confusion to absolute clarity and capability. Whether it is by building 'AI teammates' to automate repetitive administrative tasks or deploying sophisticated voice agents for 24/7 customer support, our focus remains steadfast on delivering tangible, operational results. By combining deep technical expertise with an approachable, personalized coaching style, we ensure that technology serves your team, fostering a workplace where innovation and human potential thrive together.

Tech stack

Gemini

Worth AI

Company size: 1-10 employees
Founded in: 2021
Chief executive officer: Alexander Worth
Markets: Artificial Intelligence, AI Consulting, Business Automation, Customer Support Automation, Voice Technology, Workflow Automation, Digital Transformation, Enterprise Software, Small Business Solutions, AI Training And Education
Employees live in: United Kingdom
