I'm looking to tackle the new and difficult challenges in the AI field.
R Narendran
@rnarendran
Experienced Full-Lifecycle Architect specializing in multi-cloud Kubernetes and AI/ML/LLM infrastructure.
What I'm looking for
I'm a passionate and experienced Site Reliability Engineer and Full-Lifecycle Architect who loves building and running reliable systems from the ground up. I specialize in the exciting world of multi-cloud Kubernetes and ensuring that cutting-edge AI, ML, and LLM platforms are robust and scalable. I enjoy bridging the gap between development and operations, using tools like GitOps and Terraform to deliver excellent and reliable software. I'm always looking forward to collaborating and tackling new technical challenges!
Experience
Work history, roles, and key accomplishments
Site Reliability Engineer
civo
May 2021 - Present (4 years 5 months)
Architected and managed the high-performance GPU infrastructure supporting the proprietary LLM product (relax.ai), ensuring the scalability and high availability of the multi-tiered conversational AI/ML platform. Orchestrated operators and releases across all active regions, enforcing immutability and governance through ArgoCD-powered GitOps pipelines.
Streamlined scalable multi-tenant AWS EKS platforms using Terraform, embedding security best practices to achieve wiz.io compliance, cutting cloud costs by 60% and delivering audit-ready infrastructure.
Implemented a unified observability stack (Prometheus/Grafana), customizing alerting pipelines to reduce Mean Time To Detect (MTTD) for critical infrastructure and application anomalies by 40%.
Site Reliability Engineer Consultant
lineten
Jul 2021 - Jan 2022 (6 months)
Designed multi-region Kubernetes clusters using Terraform and custom Golang controllers, optimizing production resources by 20% and ensuring high availability.
Implemented a centralized observability platform (Prometheus, Grafana, Thanos) for multi-region logs and metrics, reducing critical incident detection time by 50%.
Site Reliability Engineer Consultant
aicrowd
May 2021 - Aug 2021 (3 months)
Transformed the AI/ML workload environment on AWS EKS by implementing GitOps for deployment automation and enhancing reliability with GPU metric monitoring, and slashed storage costs by 73% to 78% via strategic migration.
Delivered 10+ multiplatform, full-stack applications and streamlined CI/CD using Docker containerization and Git best practices for deployment on AWS, accelerating delivery and raising deployment frequency by 63%.
Education
Degrees, certifications, and relevant coursework
The Linux Foundation
CKS: Certified Kubernetes Security Specialist
2025 -
The Linux Foundation
KCNA: Kubernetes and Cloud Native Associate
2025 -
The Linux Foundation
CKAD: Certified Kubernetes Application Developer
2024 -
The Linux Foundation
CKA: Certified Kubernetes Administrator
2023 -
