Open to opportunities

Gaurav Gogoi

@gauravgogoi

Site Reliability Engineer improving AWS distributed systems through observability, automation, and incident response.

India

What I'm looking for

I’m looking for a role where I can own reliability for AWS distributed systems: building SLI/SLO-driven observability, automating incident response, and reducing MTTR through clear runbooks, a better alerting signal-to-noise ratio, and continuous delivery improvements.

I’m a Site Reliability Engineer/DevOps Engineer with 3+ years of experience building and operating distributed systems on AWS. I focus on practical reliability work: turning production signals into clear SLIs and SLOs, tight feedback loops, and faster incident recovery.

At Zuora, I owned on-call incident response and cross-team coordination using PagerDuty, Grafana, and Kibana, reducing MTTR by 50% through structured runbook-driven triage. I also deployed event-driven autoscaling with KEDA and HPA on Kubernetes (AWS EKS), reducing workload-related production incidents by 70%.

I built latency monitoring for 30+ Tier-1 services with Prometheus and Grafana, defining SLIs and aligning alerting with SLOs and error budgets. This cut SLA breaches and customer escalations by 60% and reduced alert noise, improving signal-to-noise during incidents.

I further strengthened observability and delivery by deploying a self-hosted Prometheus Blackbox Exporter for 1000+ endpoints across 100+ services, improving upstream dependency visibility and reducing customer-reported issues by 80%. I’ve also driven observability and deployment modernization: migrating to Grafana (saving $400K annually) and implementing a GitOps-driven continuous deployment platform with Terraform and FluxCD for AWS ECS, reducing deployment time by 10x.

Experience

Work history, roles, and key accomplishments

Zuora

Software Engineer

Jul 2023 - Oct 2025 (2 years 3 months)

Defined SLIs and SLOs across 40+ critical services and implemented log-based alerting, reducing customer-reported incidents by 50%. Built an AI-powered incident management system and self-healing automation with PagerDuty integration, cutting mitigation and response times by 80% and 70%, respectively.

Education

Degrees, certifications, and relevant coursework

NIT Silchar

Bachelor of Technology, Electrical, Electronics and Communications Engineering

2019 - 2023

Grade: CGPA: 8.37

