Open to opportunities

Moh Iqbal

@mohiqbal

Seasoned machine learning engineer specializing in LLMs, MLOps, and low-latency production AI.

United States

What I'm looking for

I’m looking for a team where I can build reliable, scalable ML/LLM systems end-to-end—optimizing accuracy and latency, strengthening MLOps and monitoring, and applying responsible AI practices on AWS/GCP.

I’m a seasoned Machine Learning Engineer with 12 years of hands-on experience building and optimizing ML pipelines, deploying scalable AI systems, and leading end-to-end model lifecycle projects. My focus is delivering reliable, scalable, high-performance machine learning solutions that drive measurable business impact.

I’ve designed and deployed large-scale LLM systems using RAG, embeddings, and transformers, improving response accuracy and reasoning by 30%+. I’ve also built distributed training pipelines with PyTorch, Ray, and GPU/TPU clusters, reducing training time by 40–60%.

I specialize in production deployment and optimization, engineering low-latency inference systems with FastAPI, Kafka, and microservices that serve millions of daily requests. I’ve applied ONNX and TensorRT to reduce latency by up to 45%, while implementing robust MLOps automation with MLflow, Kubeflow, and CI/CD for reproducibility and faster releases.

I bring strong responsible AI fundamentals—integrating explainability, fairness, and monitoring using SHAP and drift detection to support compliance and trust. From streaming feature pipelines and model monitoring to evaluation and explainability metrics, I aim for dependable AI platforms that perform well in the real world.

Experience

Work history, roles, and key accomplishments

Current

Lead Machine Learning Architect

Anysphere

Jul 2021 - Present (5 years)

Designed and deployed large-scale LLM/RAG systems, improving response accuracy and reasoning by 30%+. Built scalable distributed training and low-latency real-time inference pipelines, and reduced inference latency by up to 45% while serving millions of daily requests.

RAG, PyTorch, Ray, FastAPI, Kafka, MLflow, Kubeflow, ONNX, TensorRT

Senior Machine Learning Engineer

Heavy AI

Dec 2017 - Jun 2021 (3 years 6 months)

Built and deployed production AI models for NLP, computer vision, and time-series forecasting, improving prediction accuracy by 25%+. Implemented streaming feature pipelines and reduced inference latency using ONNX and TensorRT while adding drift monitoring and performance tracking.

NLP, Computer Vision, Time Series, Kafka, Spark, Kubernetes, Docker, ONNX, TensorRT, Model Monitoring

Machine Learning Engineer

Clarifai

May 2015 - Nov 2017 (2 years 6 months)

Led end-to-end ML initiatives, developing regression/classification/clustering models and modern NLP systems for text processing and sentiment analysis. Built scalable data processing and ML pipelines, improving model accuracy and reliability through rigorous evaluation and validation.

Predictive Modeling, Machine Learning Pipelines, NLP, Sentiment Analysis, Data Cleaning, Model Evaluation, Python, SQL

Data Scientist

Eyeris Technologies

Mar 2014 - Apr 2015 (1 year 1 month)

Delivered data-driven insights by extracting actionable findings from structured and unstructured data and building statistical and ML models for prediction and segmentation. Created dashboards, led A/B testing and hypothesis validation, and engineered features to improve model performance.

Python, SQL, Statistical Modeling, Machine Learning, Data Analysis, A/B Testing, Data Visualization, Experimentation