Sammy Naqi
@sammynaqi
Senior AI Engineer building production LLM, RAG, and agentic systems with low-latency, scalable LLMOps.
What I'm looking for
I’m a Senior AI Engineer with 8+ years of experience building and deploying production-grade machine learning and generative AI systems across healthcare, fintech, and enterprise platforms. I specialize in Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and agentic AI systems, with strong expertise in LLMOps, distributed systems, and low-latency inference.
I’ve architected end-to-end LLM pipelines that convert clinician-patient conversations into structured medical documentation, improving provider efficiency by 40%. I’ve also deployed RAG systems that combine hybrid retrieval, FAISS-indexed embeddings, and reranking, improving accuracy and contextual relevance by 30%, and built real-time vector search with Pinecone to improve access to grounded medical knowledge.
Across roles, I’ve scaled AI systems to millions of users while optimizing performance and cost using cloud-native architectures. I’m deeply focused on AI governance and responsible AI: I implement compliance controls aligned with HIPAA, SOC 2, GDPR, and data privacy best practices, and I mentor teams to strengthen MLOps with MLflow, Kubeflow, and CI/CD.
Experience
Work history, roles, and key accomplishments
Senior AI Engineer
Abridge
Jan 2024 - Present (2 years 5 months)
Architected and deployed end-to-end LLM pipelines to convert clinician-patient conversations into structured medical documentation, improving provider efficiency by 40%. Designed and deployed RAG and low-latency inference systems that increased accuracy by 30% and cut operational costs by 30%, while implementing HIPAA/SOC 2-aligned AI governance.
Developed and deployed credit risk models using XGBoost, LightGBM, and deep learning (LSTM/CNN), improving model accuracy by 20%. Built Spark/Airflow data pipelines for 50M+ records and implemented CI/CD and low-latency inference on AWS, reducing deployment times by 40% while using SHAP/LIME for explainability.
Built conversational AI for customer support automation using transformer-based NLP models (e.g., BERT) and sequence-to-sequence dialogue generation. Developed scalable semantic search and ML pipelines using TensorFlow/Keras and AWS to support intent classification, information retrieval, and dialogue management.
Tech stack
Software and tools used professionally
Apache Spark
GitHub
GitLab
Kubernetes
GitHub Actions
GitLab CI
PySpark
Gmail
Databricks
OpenCV
Terraform
Java
TensorFlow
PyTorch
MLflow
scikit-learn
Keras
Kubeflow
Kafka
FastAPI
Grafana
Prometheus
Airflow
Netomi
SQL
XGBoost
Hugging Face
LightGBM
LangChain
Weaviate
Weights & Biases
Pinecone
vLLM
OpenAI API
ArgoCD
Abridge
Bash
pgvector
Agentic
Enhance
Faiss
Optuna
Dynamic
