Open to opportunities

Shaharmeer Basharat

@shaharmeerbasharat

ML infrastructure and backend engineer building scalable, observable production AI systems.

Pakistan

What I'm looking for

I’m looking to build production AI systems with strong reliability and observability: zero-downtime deployments, secure multi-tenant data handling, and scalable serving. I want to work with teams that value performance metrics, fault tolerance, and practical MLOps.

I’m a Principal ML Infrastructure & Backend Engineer specializing in production-grade AI systems, including self-hosted LLM deployment and enterprise MLOps infrastructure. I focus on building scalable, fault-tolerant pipelines that stay observable under real traffic.

Most recently, I engineered an autonomous RAG pipeline with DB-tenant-isolated vector namespaces to eliminate cross-tenant data leakage, passing rigorous enterprise security audits. I also architected a Kafka-driven embedding pipeline that maintained P99 retrieval latency under 120ms at 1M+ active users, while delivering zero-downtime security refactors.

In prior roles, I led zero-downtime ML serving upgrades by replacing naive FastAPI wrappers with NVIDIA Triton Inference Server, implementing concurrent model execution and dynamic GPU memory isolation to prevent OOM crashes. I built KServe canary deployments with automated statistical A/B evaluation to safely promote or roll back models based on SLA latency, and I deployed drift detection integrated with Prometheus and Grafana.

Before that, I designed a distributed entity resolution engine for millions of business records, decomposed monolithic ingestion into microservices, and implemented exactly-once Kafka stream processing with idempotent consumers to ensure data integrity. I also strengthened deployment reliability through CI/CD automation, regression testing workflows, and load/stress-testing to identify backend bottlenecks early.

Experience

Work history, roles, and key accomplishments

Current

Lead Backend & MLOps

NDA

Feb 2026 - Present (5 months)

Engineered an autonomous RAG pipeline using DB-tenant-isolated vector namespaces in Weaviate, eliminating cross-tenant data leakage and passing enterprise security audits. Built a Kafka-driven embedding pipeline to keep P99 retrieval latency under 120ms for a 1M+ active user base, and hardened security through a zero-downtime migration, PostgreSQL row-level security (RLS), and immutable audit logs for SOC 2 readiness.

Weaviate, RAG, Data Pipelines, Kafka, Secure Cookie Authentication, Enterprise Security, Performance Optimization

Senior Software Engineer

Techlio PVT Limited

May 2025 - Feb 2026 (9 months)

Architected a zero-downtime ML serving layer by replacing FastAPI model wrappers with NVIDIA Triton Inference Server, enabling concurrent model execution with dynamic GPU memory isolation to prevent OOM crashes. Built a KServe canary deployment system with automated statistical A/B evaluation and deployed model drift detection integrated with Prometheus and Grafana for safer releases.

NVIDIA Triton Inference Server, KServe, FastAPI, GPU Memory Isolation, Canary Deployment, Prometheus, Grafana, Concurrency

AI / ML Infrastructure Engineer

Paklogics

Oct 2024 - May 2025 (7 months)

Engineered a distributed entity resolution engine to process, match, and persist millions of heterogeneous business records. Rebuilt monolithic ingestion into independently scalable microservices with Kafka and Celery, implemented exactly-once processing with idempotent consumers, and added an ML bi-encoder + pgvector (HNSW) layer to reduce duplicate rates to under 0.4%.

Exactly-Once Semantics, Celery, Microservices, Bi-Encoder Models, pgvector, Entity Resolution, PostgreSQL

Infrastructure & Automation Engineer

CodeAutomation.ai LLC

Apr 2023 - Sep 2024 (1 year 5 months)

Built and maintained CI/CD pipelines using GitHub Actions and Jenkins to enable automated high-velocity deployment workflows. Implemented load and stress testing to validate scalability, and automated regression/functional testing integrated into the deployment pipeline for zero-downtime, fault-tolerant releases.

GitHub Actions, Jenkins, CI/CD, Load Testing, Regression Testing, Functional Testing, CI Pipelines

Backend Python Developer

Enterprise Cube

Feb 2021 - Mar 2023 (2 years 1 month)

Developed and maintained scalable Python backend architectures using FastAPI and Django for high-availability enterprise applications. Designed end-to-end data pipelines and backend routing logic, and optimized PostgreSQL queries to minimize server response times and improve reliability.