Shaharmeer Basharat
@shaharmeerbasharat
ML infrastructure and backend engineer building scalable, observable production AI systems.
What I'm looking for
I’m a Principal ML Infrastructure & Backend Engineer specializing in production-grade AI systems, including self-hosted LLM deployment and enterprise MLOps infrastructure. I focus on building scalable, fault-tolerant pipelines that stay observable under real traffic.
Most recently, I engineered an autonomous RAG pipeline with DB-tenant-isolated vector namespaces that eliminated cross-tenant data leakage and passed rigorous enterprise security audits. I also architected a Kafka-driven embedding pipeline that held P99 retrieval latency under 120ms at 1M+ active users, and I delivered security refactors with zero downtime.
In prior roles, I led zero-downtime ML serving upgrades by replacing naive FastAPI wrappers with NVIDIA Triton Inference Server, implementing concurrent model execution and dynamic GPU memory isolation to prevent OOM crashes. I built KServe canary deployments with automated statistical A/B evaluation to safely promote or roll back models based on latency SLAs, and I deployed model drift detection integrated with Prometheus and Grafana.
Before that, I designed a distributed entity resolution engine for millions of business records, decomposed monolithic ingestion into microservices, and implemented exactly-once Kafka stream processing with idempotent consumers to ensure data integrity. I also strengthened deployment reliability through CI/CD automation, regression testing workflows, and load/stress-testing to identify backend bottlenecks early.
Experience
Work history, roles, and key accomplishments
Lead Backend & MLOps
NDA
Feb 2026 - Present (4 months)
Engineered an autonomous RAG pipeline using DB-tenant-isolated vector namespaces in Weaviate, eliminating cross-tenant data leakage and passing enterprise security audits. Built a Kafka-driven embedding pipeline that kept P99 retrieval latency under 120ms for a 1M+ active user base. Hardened security for SOC 2 readiness through a zero-downtime migration, PostgreSQL row-level security (RLS), and immutable audit logs.
Senior Software Engineer
Techlio PVT Limited
May 2025 - Feb 2026 (9 months)
Architected a zero-downtime ML serving layer by replacing FastAPI model wrappers with NVIDIA Triton Inference Server, enabling concurrent model execution with dynamic GPU memory isolation to prevent OOM crashes. Built a KServe canary deployment system with automated statistical A/B evaluation and deployed model drift detection integrated with Prometheus and Grafana for safer releases.
AI / ML Infrastructure Engineer
Paklogics
Oct 2024 - May 2025 (7 months)
Engineered a distributed entity resolution engine to process, match, and persist millions of heterogeneous business records. Rebuilt monolithic ingestion into independently scalable microservices with Kafka and Celery, implemented exactly-once processing with idempotent consumers, and added an ML bi-encoder and pgvector (HNSW) layer that reduced the duplicate rate to under 0.4%.
Infrastructure & Automation Engineer
CodeAutomation.ai LLC
Apr 2023 - Sep 2024 (1 year 5 months)
Built and maintained CI/CD pipelines using GitHub Actions and Jenkins to enable automated high-velocity deployment workflows. Implemented load and stress testing to validate scalability, and automated regression/functional testing integrated into the deployment pipeline for zero-downtime, fault-tolerant releases.
Backend Python Developer
Enterprise Cube
Feb 2021 - Mar 2023 (2 years 1 month)
Developed and maintained scalable Python backend architectures using FastAPI and Django for high-availability enterprise applications. Designed end-to-end data pipelines and backend routing logic, and optimized PostgreSQL queries to minimize server response times and improve reliability.
Education
Degrees, certifications, and relevant coursework
National University of Modern Languages
Bachelor of Science, Computer Science
Grade: 3.32/4.0
Earned a B.S. in Computer Science from the National University of Modern Languages (Islamabad), with a CGPA of 3.32/4.0.
Tech stack
Software and tools used professionally
GitHub
Kubernetes
Jenkins
GitHub Actions
DB
PostgreSQL
MongoDB
Gmail
Django
Neo4j
Redis
Terraform
MLflow
Streamlit
Kafka
FastAPI
Grafana
Prometheus
OpenTelemetry
GraphQL
gRPC
Milvus
Airflow
LangChain
Weaviate
Weights & Biases
Evidently AI
Pinecone
Ray
KServe
vLLM
NVIDIA Triton Inference Server
Stable Diffusion
Ada
Odoo
pgvector
Dynamic
Stack AI
