Suman Pathak - Site Reliability Engineering Manager - Xebia IT Architects | Himalayas
Looking for a job

Suman Pathak

@sumanpathak

Site Reliability Specialist focused on performance and scalability engineering.

India

What I'm looking for

I seek a challenging role in a dynamic environment that fosters innovation and growth.

I am a Site Reliability Specialist with 15 years of expertise in performance engineering, scalability, and reliability of mission-critical systems. My focus is on ensuring robust SLIs, SLOs, and SLA compliance for high-throughput web and cloud-native applications. I have a proven track record of implementing chaos engineering practices and designing monitoring solutions with tools like Prometheus, Grafana, and the ELK Stack.

As a Cloud and Automation Expert, I excel in managing cloud infrastructures using IaC tools like Terraform and Ansible. I am adept at automating performance test frameworks and integrating containerization/orchestration technologies like Docker and Kubernetes into CI/CD pipelines. My experience includes leading teams, defining service level objectives, and overseeing incident response efforts to ensure timely resolution of production issues.

Experience

Work history, roles, and key accomplishments

Current

Site Reliability Engineering Manager

Xebia IT Architects

Led and mentored a team of SRE engineers, fostering a high-performance environment. Defined and enforced Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs) for critical systems, while overseeing the design and maintenance of robust monitoring and observability solutions. Coordinated incident response efforts, ensuring timely and effective resoluti

SRE Engineer – High-Throughput Trading Platform

Trading Technologies

Jun 2023 - May 2025 (1 year 11 months)

Architected and executed stress and load tests using custom Python frameworks and Apache JMeter, while automating health checks for Kubernetes clusters to reduce mean time to repair. Developed disaster recovery strategies, enhancing data safety and reducing downtime risks, and implemented scalable monitoring solutions with Prometheus and Grafana, cutting incident response times. Spearheaded the ad

SRE Engineer – Helix Core (Distributed Version Control Software)

Perforce

May 2022 - Jun 2023 (1 year 1 month)

Designed and implemented an end-to-end test automation framework integrating Ansible, Python, and JMeter. Automated Kubernetes cluster provisioning with Terraform, integrated into CI/CD pipelines, and conducted stress tests using Python frameworks, revealing system performance boundaries. Championed chaos engineering practices, enhancing system resilience and robustness, and secured critical custo

SRE and Performance Engineer – School Information Management System

Capita SIMS (India) Pvt. Ltd.

Mar 2021 - Apr 2022 (1 year 1 month)

Migrated performance testing frameworks to JMeter and integrated with Azure DevOps pipelines. Built Azure-based monitoring dashboards, significantly improving issue detection, and optimized SQL queries, enhancing transaction processing speed and reducing CPU load. Transitioned on-prem VMs to Azure Kubernetes Service, cutting infrastructure costs, and implemented advanced capacity planning techniqu

Performance Test Engineer

Nuance India Pvt. Ltd.

May 2016 - Oct 2016 (5 months)

Designed integrated load tests for complex VMCS products using JMeter. Streamlined monitoring processes, reducing performance bottlenecks.

Lead Performance Engineer – DVSA CI Project

Capita Software

Jan 2019 - Feb 2021 (2 years 1 month)

Successfully migrated a web application from on-premise to AWS infrastructure. Automated performance testing workflows, continuously identifying and mitigating regressions, and improved disaster recovery readiness, reducing recovery time objectives. Designed and implemented load balancing strategies using NGINX, optimizing latency.

Performance Test Engineer – SmartIT/DWP

BMC Software

Jan 2016 - Jan 2019 (3 years)

Created and executed performance test strategies, workload models, and test scenarios. Conducted bottleneck analysis and provided actionable optimization recommendations. Monitored MongoDB replica performance and executed endurance tests.

Software Engineer

Persistent Systems Ltd.

Aug 2010 - May 2016 (5 years 9 months)

Conducted API performance testing for enterprise applications using JMeter. Optimized database performance and profiled Java/.NET code for production-level systems. Developed web applications using ASP.NET, C#, and SQL Server.

Education

Degrees, certifications, and relevant coursework

University of Pune

Bachelor of Engineering, Information Technology

Grade: First Class with Distinction

Graduated First Class with Distinction in Information Technology, gaining foundational knowledge across computing.
