Open to opportunities

Nikhil Kandi

@nikhilkandi

Site Reliability Engineer ensuring 24/7 clinical system availability, automation, and observability.

United States

What I'm looking for

I seek SRE/DevOps roles where I can ensure high availability, automate operations, improve observability, mentor teams, and work within compliant healthcare or enterprise environments.

I am a results-driven Site Reliability Engineer with deep experience maintaining 24/7 clinical systems, cloud infrastructure, and monitoring solutions for healthcare platforms. I focus on uptime, compliance, security, and seamless patient care while collaborating closely with cross-functional and clinical teams.

I have implemented observability stacks (Splunk, Dynatrace, Grafana, Honeycomb), deployed applications on Kubernetes with Docker and Helm, and automated operational workflows using Azure Functions, PowerShell, and Ansible. I have led on-call rotations, conducted root cause analysis during critical outages, and overseen clinical system upgrades while maintaining regulatory compliance and business continuity.

I continuously improve reliability through CI/CD pipelines, infrastructure automation, cost optimization in AWS/Azure, and runbook/process documentation to reduce incident resolution time and upskill teams.

Experience

Work history, roles, and key accomplishments

Current

Site Reliability Engineer

Data Solutions Inc

Dec 2022 - Present (2 years 10 months)

Maintained mission-critical clinical systems (EHR, CDSS, claims) with 24/7 availability, implemented observability (Splunk, Dynatrace, Grafana) and automated recovery processes to reduce outage impact and improve JVM performance.

AWS DevOps Engineer

Data Solutions Inc

Sep 2020 - Dec 2022 (2 years 3 months)

Managed AWS infrastructure across multiple environments, automated deployments and cost optimizations, and designed high-availability EC2 architectures with secure S3/Glacier backups to improve reliability.

Process Associate

Amazon India

Aug 2015 - Nov 2015 (3 months)

Provided backend operational support and stakeholder service, resolving process inquiries and ensuring SLA adherence through timely escalation and cross-functional coordination.

Education

Degrees, certifications, and relevant coursework

Southern Arkansas University

Master of Science, Computer Science

2016 - 2017

Completed a Master of Science in Computer Science with coursework and projects in advanced computing and systems administration.

Jawaharlal Nehru Technological University, Hyderabad

Bachelor of Science, Computer Science

2010 - 2014

Earned a Bachelor's degree in Computer Science focusing on software development and foundational computing concepts.
