Open to opportunities

Gaurav Gogoi

@gauravgogoi

Site Reliability Engineer improving AWS distributed systems through observability, automation, and incident response.

India

What I'm looking for

I’m looking for a role where I can own reliability for distributed systems on AWS: building SLI/SLO-driven observability, automating incident response, and reducing MTTR through clear runbooks, higher-signal alerting, and continuous delivery improvements.

I’m a Site Reliability Engineer/DevOps Engineer with 3+ years of experience building and operating distributed systems on AWS. I focus on practical reliability work: turning production signals into clear SLIs and SLOs, tight feedback loops, and faster incident recovery.

At Zuora, I owned on-call incident response and cross-team coordination using PagerDuty, Grafana, and Kibana, reducing MTTR by 50% through structured runbook-driven triage. I also deployed event-driven autoscaling with KEDA and HPA on Kubernetes (AWS EKS), reducing workload-related production incidents by 70%.

I built latency monitoring for 30+ Tier-1 services with Prometheus and Grafana, defining SLIs and aligning alerting with SLOs and error budgets. This cut SLA breaches and customer escalations by 60% and reduced alert noise during incidents.

I further strengthened observability by deploying a self-hosted Prometheus Blackbox Exporter for 1000+ endpoints across 100+ services, improving upstream dependency visibility and reducing customer-reported issues by 80%. I’ve also driven observability and deployment modernization: migrating to Grafana (saving $400K annually) and implementing a GitOps-driven continuous deployment platform with Terraform and FluxCD for AWS ECS, making deployments 10x faster.

Experience

Work history, roles, and key accomplishments

Zuora

Software Engineer

Jul 2023 - Oct 2025 (2 years 3 months)

Defined SLIs and SLOs across 40+ critical services and implemented log-based alerting, reducing customer-reported incidents by 50%. Built an AI-powered incident management system and self-healing automation with PagerDuty integration, cutting mitigation and response times by 80% and 70%, respectively.

Education

Degrees, certifications, and relevant coursework

NIT Silchar

Bachelor of Technology, Electrical, Electronics and Communications Engineering

2019 - 2023

CGPA: 8.37

B.Tech in Electrical, Electronics and Communications Engineering (ECE) at NIT Silchar (2019–2023), achieving a CGPA of 8.37.
