Looking for a job

Hafeez Iqbal

@hafeeziqbal1

Senior AI/ML Engineer turning research models into reliable, cost-effective production systems.

United States

What I'm looking for

I want to build and scale reliable ML/LLM systems in production—designing MLOps and automated pipelines, delivering RAG and quantization optimizations, and maintaining observability while controlling cloud cost and latency.

I’m a Senior AI/ML Engineer with over 8 years of experience moving complex machine learning models from research environments into high-scale production. I focus on designing MLOps frameworks, automated pipelines, and distributed infrastructure that keep systems reliable and cost-effective.

I specialize in the Generative AI transition—optimizing Large Language Models (LLMs) with retrieval-augmented generation (RAG) and model quantization. In my recent role, I built an automated MLOps framework with MLflow that streamlined model promotion from R&D experimentation to production deployment.

I also deliver performance at scale. I developed a high-performance RAG pipeline using Pinecone and LLMs to improve accuracy for complex domain queries, optimized large-scale transformer models for real-time use with TensorRT and quantization to significantly reduce inference latency, and scaled training to billion-parameter models with distributed training on multi-GPU clusters using DeepSpeed.

To keep models healthy long-term, I establish rigorous monitoring and observability standards to detect data drift and maintain production integrity. Previously, I architected automated ingestion and augmentation pipelines, built internal performance dashboards with Flask and Streamlit, and operationalized systems with Docker, CI/CD, Kubernetes, and large-scale Spark workflows—from training to deployment to iteration.

Experience

Work history, roles, and key accomplishments

Current

Senior AI/ML Engineer

Nimer Tech

Aug 2022 - Present (3 years 10 months)

Owned the end-to-end ML lifecycle for global predictive services by designing scalable PyTorch and AWS SageMaker architectures. Built an automated MLflow-based MLOps framework, developed a Pinecone-based RAG pipeline to improve domain response accuracy, and reduced inference latency using TensorRT and model quantization.

DataRobot

Machine Learning Engineer

Feb 2018 - Apr 2020 (2 years 2 months)

Productionized ensemble-based anomaly detection systems for real-time industrial monitoring, improving availability and reliability for mission-critical workloads. Built terabyte-scale feature engineering workflows with Spark and deployed BERT/spaCy NLP microservices in a high-availability Kubernetes environment with full CI/CD and automated hyperparameter sweeps.

Education

Degrees, certifications, and relevant coursework

Hafeez hasn't added their education
