Open to opportunities

Moh Iqbal

@mohiqbal

Seasoned machine learning engineer specializing in LLMs, MLOps, and low-latency production AI.

United States

What I'm looking for

I’m looking for a team where I can build reliable, scalable ML/LLM systems end-to-end—optimizing accuracy and latency, strengthening MLOps and monitoring, and applying responsible AI practices on AWS/GCP.

I’m a seasoned Machine Learning Engineer with 12 years of hands-on experience building and optimizing ML pipelines, deploying scalable AI systems, and leading end-to-end model lifecycle projects. My focus is delivering reliable, scalable, high-performance machine learning solutions that drive measurable business impact.

I’ve designed and deployed large-scale LLM systems using RAG, embeddings, and transformers, improving response accuracy and reasoning by 30%+. I’ve also built distributed training pipelines with PyTorch, Ray, and GPU/TPU clusters, reducing training time by 40–60%.

I specialize in production deployment and optimization, engineering low-latency inference systems with FastAPI, Kafka, and microservices that serve millions of daily requests. I’ve applied ONNX and TensorRT to reduce latency by up to 45%, while implementing robust MLOps automation with MLflow, Kubeflow, and CI/CD for reproducibility and faster releases.

I bring strong responsible AI fundamentals—integrating explainability, fairness, and monitoring using SHAP and drift detection to support compliance and trust. From streaming feature pipelines and model monitoring to evaluation and explainability metrics, I aim for dependable AI platforms that perform well in the real world.

Experience

Work history, roles, and key accomplishments

Current

Lead Machine Learning Architect

Anysphere

Jul 2021 - Present (4 years 8 months)

Designed and deployed large-scale LLM/RAG systems, improving response accuracy and reasoning by 30%+. Built scalable distributed training and low-latency real-time inference pipelines, and reduced inference latency by up to 45% while serving millions of daily requests.

Senior Machine Learning Engineer

Heavy AI

Dec 2017 - Jun 2021 (3 years 6 months)

Built and deployed production AI models for NLP, computer vision, and time-series forecasting, improving prediction accuracy by 25%+. Implemented streaming feature pipelines and reduced inference latency using ONNX and TensorRT while adding drift monitoring and performance tracking.

Education

Degrees, certifications, and relevant coursework

University of Houston

Bachelor of Computer Science, Computer Science

2010 - 2014

Earned a Bachelor of Computer Science from the University of Houston from 2010 to 2014.
