Looking for a job

Hafeez Iqbal

@hafeeziqbal1

Senior AI/ML Engineer turning research models into reliable, cost-effective production systems.

United States

What I'm looking for

I want to build and scale reliable ML/LLM systems in production—designing MLOps and automated pipelines, delivering RAG and quantization optimizations, and maintaining observability while controlling cloud cost and latency.

I’m a Senior AI/ML Engineer with over 8 years of experience moving complex machine learning models from research environments into high-scale production. I focus on designing MLOps frameworks, automated pipelines, and distributed infrastructure that keep systems reliable and cost-effective.

I specialize in the Generative AI transition—optimizing Large Language Models (LLMs) with retrieval-augmented generation (RAG) and model quantization. In my recent role, I built an automated MLOps framework with MLflow that streamlined model promotion from R&D experimentation to production deployment.

I also deliver performance at scale: I developed a high-performance RAG pipeline using Pinecone and LLMs to improve accuracy for complex domain queries. I optimized large-scale transformer models for real-time use with TensorRT and quantization to significantly reduce inference latency, and I scaled training to billion-parameter models on multi-GPU clusters using distributed training with DeepSpeed.

To keep models healthy long-term, I establish rigorous monitoring and observability standards to detect data drift and maintain production integrity. Previously, I architected automated ingestion and augmentation pipelines, built internal performance dashboards with Flask and Streamlit, and operationalized systems with Docker, CI/CD, Kubernetes, and large-scale Spark workflows—from training to deployment to iteration.

Experience

Work history, roles, and key accomplishments

Current

Senior AI/ML Engineer

Nimer Tech

Aug 2022 - Present (3 years 7 months)

Owned the end-to-end ML lifecycle for global predictive services by designing scalable PyTorch and AWS SageMaker architectures. Built an automated MLflow-based MLOps framework, developed a Pinecone-based RAG pipeline to improve domain response accuracy, and reduced inference latency using TensorRT and model quantization.

Machine Learning Engineer

DataRobot

Feb 2018 - Apr 2020 (2 years 2 months)

Productionized ensemble-based anomaly detection systems for real-time industrial monitoring, improving availability and reliability for mission-critical workloads. Built terabyte-scale feature engineering workflows with Spark and deployed BERT/spaCy NLP microservices in a high-availability Kubernetes environment with full CI/CD and automated hyperparameter sweeps.

Education

Degrees, certifications, and relevant coursework

Hafeez hasn't added their education
