Pratik Khairnar
@pratikkhairnar
Systems Software Engineer specializing in AI infrastructure, SRE reliability, and cloud-native distributed systems.
What I'm looking for
I’m a Systems Software Engineer with 7.1 years of experience building and operating large-scale AI infrastructure, distributed systems, and cloud-native SRE platforms. I focus on reliability and operability—using SLOs/SLAs and error budgets, blameless incident management, and structured incident response to continuously improve system resilience and team efficiency.
In my current role as Delivery Module Lead, I architect FastAPI-based microservices for large-scale data ingestion and telemetry, design database sharding/partitioning for horizontal scalability, and build Control-M–driven orchestration frameworks for automated data extraction pipelines. I also implement end-to-end observability with Splunk, Grafana, and Prometheus, automate CI/CD with GitLab CI, and ship Infrastructure as Code with Terraform and CloudFormation—cutting provisioning time by 60% and reducing manual operational overhead by 40%. I’ve led on-prem to cloud migration, containerization, and performance tuning, while running GPU-backed distributed data pipelines on Kubernetes to support high-throughput AI training and inference with sub-second observability.
Experience
Work history, roles, and key accomplishments
Delivery Module Lead
Mphasis Limited
Jan 2024 - Present (2 years 5 months)
Architected FastAPI-based microservices for large-scale data ingestion and telemetry workloads, improving scalability via database sharding/partitioning. Built event-based extraction orchestration (Control-M), implemented SRE practices (SLOs/SLAs/error budgets), and reduced provisioning time by 60% using Terraform/CloudFormation while automating CI/CD with GitLab.
Associate
Blackbuck Insights
Jul 2022 - Dec 2023 (1 year 5 months)
Designed and deployed cloud-native Python ETL pipelines on AWS Glue and Docker, processing 10M+ records daily using Pandas and PySpark. Improved resilience and reliability by strengthening logging/failure handling, reduced MTTR by 45% through SRE-aligned incident/change control, and automated infrastructure operations with Jenkins and Boto3.
Technical Analyst
Allianz Technology SE
Jun 2020 - Jul 2022 (2 years 1 month)
Developed Django-based cloud security platforms integrated with Illumio and ServiceNow, and deployed/monitored containerized workloads using Docker, Kubernetes, ELK, and Prometheus. Authored Python automation for security policy enforcement and vulnerability scanning across hybrid clouds, reducing manual audit efforts by 35%, and built health monitoring frameworks using Prometheus/ELK for real-time…
Software Engineer
Nihilent Limited
Mar 2019 - Jun 2020 (1 year 3 months)
Automated legacy systems and incident remediation workflows using Python and StackStorm, reducing MTTR by over 50% via event-driven runbooks. Deployed microservices on AWS and Azure with Docker/Kubernetes and implemented infrastructure-as-code using Terraform, alongside centralized logging/monitoring with ELK.
Education
Degrees, certifications, and relevant coursework
AISSMS Institute of Information Technology
Bachelor of Engineering, Computer Engineering
Completed a B.E. in Computer Engineering at AISSMS Institute of Information Technology in 2017.
