Muhammad Murtaza
@muhammadmurtaza1
Senior AI Engineer specializing in multi-agent LLM systems, RAG, and production-grade automation.
What I'm looking for
Senior Python AI Engineer with 4+ years of production experience building backend systems and APIs for AI-powered applications. I specialize in LLM orchestration (LangChain, LangGraph, LlamaIndex), RAG pipeline design, and multi-agent architectures that automate real business workflows at scale.
I've shipped systems serving 50K+ monthly users, pipelines processing 2M+ documents, and APIs handling 100K+ daily requests at sub-200ms latency, reducing inference time by 25% and cloud costs by 30% through model quantization and infrastructure optimization.
My RAG work spans Pinecone, FAISS, ChromaDB, Weaviate, and Milvus, using hybrid semantic search that reaches 92% retrieval accuracy, evaluated with Ragas and DeepEval. I've fine-tuned LLMs, including Llama 3, using LoRA and QLoRA via Hugging Face, achieving a 35% accuracy improvement over baseline. I integrate and orchestrate foundation models across AWS Bedrock, OpenAI GPT-4, Anthropic Claude, Google Gemini, and Mistral, selecting a model for each workload based on cost, latency, and task fit.
On the backend, I build scalable FastAPI microservices with async patterns, structured JSON/function-calling outputs, and security best practices including HIPAA-compliant data handling and JWT authentication. I prototype rapidly with Streamlit and Gradio before productionizing, deploy with Docker, Kubernetes, and CI/CD pipelines for zero-downtime releases, and track experiments with MLflow.
I work independently, communicate clearly across technical and non-technical teams, and ship reliable AI products people use every day.
Experience
Work history, roles, and key accomplishments
Generative AI Engineer
Vision Byte Technologies
Jan 2023 - Aug 2025 (2 years 7 months)
Designed and deployed production multi-agent AI systems using LangChain/LangGraph, improving workflow efficiency by 40% through parallel orchestration and tool integration. Built RAG pipelines for 2M+ documents (92% retrieval accuracy) and served 100K+ daily API requests with sub-200ms latency while reducing cloud costs 30% and inference time 25%.
ML & Deep Learning Engineer
Dot Coder
Jan 2021 - Nov 2022 (1 year 10 months)
Built and deployed production ML/NLP systems for text classification and sentiment analysis, reaching 89% accuracy on real-world datasets. Developed ML pipelines with automated monitoring and REST APIs for reliable model serving, and optimized deployments using Docker for consistent CI/CD integration.
Education
Degrees, certifications, and relevant coursework
Kohat University of Science & Technology
Bachelor of Science, Information Technology
2021 - 2025
Bachelor of Science in Information Technology at Kohat University of Science and Technology (Oct 2021–Jun 2025), covering machine learning, natural language processing, deep learning, data science, and healthcare informatics.
Tech stack
Software and tools used professionally
Kubernetes
PostgreSQL
MongoDB
Gmail
Node.js
Django
Next.js
NestJS
ImageKit
Redis
JSON
TensorFlow
PyTorch
MLflow
scikit-learn
Streamlit
Gradio
FastAPI
Gemini
Elasticsearch
Milvus
Serverless
Deepgram
Twilio
Docker
Hugging Face
LangChain
LlamaIndex
Convex
Weaviate
ChromaDB
Neon
Pinecone
ElevenLabs
Clerk
Groq
DeepEval
Inngest
Coder
Ragas
Framer Motion
Agentic
FAISS
LangGraph
LangSmith
Loops
PEFT
Unsloth
Dynamic
Stack AI
Jan
