XTN-EF5F239 | SENIOR DEVOPS ENGINEER - CLOUD & HPC INFRASTRUCTURE

KMC Solutions Inc
United States only

Position Overview

We are seeking a highly experienced Senior DevOps Engineer to lead the design, deployment, automation, and operational excellence of our AWS-based cloud infrastructure and high-performance computing (HPC) environments. This role requires deep expertise in AWS architecture, Linux systems administration, server deployment, containerization, virtualization, license server management, and cloud networking.

The ideal candidate is hands-on, automation-driven, security-focused, and comfortable operating in complex hybrid environments supporting research, engineering, and compute-intensive workloads.

Benefits

  • Health Insurance/HMO
  • Unlimited MadMax Coffee
  • Diverse learning & growth opportunities
  • Accessible Cloud HR platform (Sprout)
  • Above-standard leave

Key Responsibilities

Cloud Infrastructure & AWS Architecture

  • Design, deploy, and manage scalable, secure AWS infrastructure.
  • Architect and maintain VPCs, subnets, route tables, NAT gateways, transit gateways, and peering.
  • Manage AWS networking components including Route 53, Load Balancers (ALB/NLB), CloudFront, and PrivateLink.
  • Implement infrastructure-as-code (IaC) using Terraform, CloudFormation, or similar.
  • Optimize cloud cost, performance, and resource utilization.
  • Implement AWS best practices for security, resilience, and high availability.

Server Deployment & Systems Engineering

  • Architect and automate server provisioning across cloud and hybrid environments.
  • Deploy and manage EC2, Auto Scaling Groups, Launch Templates, and AMIs.
  • Build hardened Linux server images (CIS benchmarks preferred).
  • Implement configuration management using tools such as Ansible, Puppet, or Chef.
  • Manage patching, lifecycle management, and OS hardening strategies.

Expert Linux Administration

  • Advanced administration of RHEL, Rocky, Ubuntu, or similar distributions.
  • Kernel tuning and performance optimization for compute-intensive workloads.
  • Troubleshooting system-level performance (CPU, memory, I/O, networking).
  • Manage system services, storage, RAID, LVM, NFS, and distributed filesystems.
  • Shell scripting and automation (Bash, Python).

Containerization & Virtualization

  • Design and manage containerized workloads using Docker.
  • Deploy and maintain Kubernetes (EKS preferred).
  • Implement CI/CD pipelines for container-based applications.
  • Manage virtualization platforms (VMware, KVM, or similar).
  • Optimize container orchestration for HPC and compute workloads.

HPC Infrastructure Management

  • Deploy and maintain High Performance Computing clusters.
  • Manage job schedulers (Slurm, PBS, or similar).
  • Optimize cluster performance, storage throughput, and node scaling.
  • Integrate HPC workloads with AWS services (e.g., ParallelCluster).
  • Manage high-speed networking (InfiniBand or equivalent, where applicable).
  • Support GPU-based workloads where applicable.

License Server Administration

  • Deploy and manage FlexLM or similar license servers.
  • Ensure high availability and redundancy for engineering license services.
  • Monitor license usage and optimize allocation.
  • Troubleshoot license connectivity and performance issues.

Cloud Networking & Security

  • Deep understanding of TCP/IP, DNS, routing protocols, and firewall design.
  • Implement secure connectivity (VPN, Direct Connect, site-to-site).
  • Manage security groups, NACLs, IAM roles, and zero-trust principles.
  • Implement logging, monitoring, and alerting (CloudWatch, Prometheus, Grafana).
  • Support compliance frameworks and infrastructure security controls.

Automation & CI/CD

  • Build and maintain CI/CD pipelines (GitHub Actions, GitLab CI/CD, Jenkins, etc.).
  • Automate infrastructure deployments and configuration management.
  • Implement DevSecOps best practices.
  • Develop reusable infrastructure modules and standards.

Monitoring & Observability

  • Implement centralized logging solutions.
  • Configure performance monitoring and alerting systems.
  • Perform root cause analysis and incident response.
  • Develop dashboards and operational metrics.

Required Qualifications

  • 7+ years of experience in DevOps, Infrastructure Engineering, or Systems Engineering.
  • 5+ years of hands-on AWS architecture experience.
  • Deep expertise in Linux systems administration.
  • Strong experience with containerization and Kubernetes.
  • Proven experience managing HPC environments.
  • Experience managing enterprise license servers.
  • Strong scripting skills (Bash, Python).
  • Experience with Infrastructure as Code (Terraform preferred).
  • Strong understanding of networking fundamentals and cloud networking.

Preferred Qualifications

  • AWS Certified Solutions Architect - Professional or AWS Certified DevOps Engineer - Professional certification.
  • Experience with AWS ParallelCluster.
  • Experience with GPU workloads and AI/ML infrastructure.
  • Experience with enterprise storage solutions (NetApp, Isilon, etc.).
  • Experience supporting research or engineering compute environments.

Soft Skills

  • Strong troubleshooting and analytical skills.
  • Ability to work independently in high-complexity environments.
  • Clear documentation and communication skills.
  • Experience collaborating across engineering, security, and research teams.
  • Strategic mindset with hands-on execution capability.

What Success Looks Like

  • Highly available, secure, and automated AWS & HPC infrastructure.
  • Optimized cloud costs and compute performance.
  • Reliable license server infrastructure with minimal downtime.
  • Fully automated server deployments.
  • Secure, scalable cloud networking architecture.
  • Improved deployment velocity through CI/CD automation.

About the job

Job type: Full Time
Experience: 7 years minimum
Hiring timezones: United States +/- 0 hours
