Larry Honrada
@larryhonrada
Senior AI Engineer specializing in production-grade LLM/RAG systems and low-latency distributed inference.
What I'm looking for
I’m a Senior AI Engineer with 10+ years of experience building production-grade machine learning systems and distributed data platforms. I focus on LLM-powered applications, retrieval systems, and low-latency inference, and I have delivered systems that process millions of documents and operate reliably under real-world workloads.
At Luxoft, I led development of a production-grade RAG platform that enables natural language querying across millions of insurance documents, reducing analysis time from hours to seconds. I designed the end-to-end pipeline (ingestion, chunking, embeddings, vector indexing, retrieval, and LLM inference), optimized vector search and retrieval to reach ~38 ms query latency, and built hybrid retrieval and reranking to improve relevance. I shipped backend REST APIs (FastAPI, microservices) on AWS/Azure with Docker and Kubernetes, owned MLOps workflows with CI/CD and MLflow, integrated observability (Prometheus, Grafana, logging/alerting), and led a team of 4–6 engineers.
Experience
Work history, roles, and key accomplishments
Led development of a production-grade RAG platform for natural language querying across millions of insurance documents, reducing document analysis time from hours to seconds. Built end-to-end ingestion-to-inference pipelines and REST services, achieving ~38 ms query latency with hybrid retrieval and reranking.
Built distributed ML infrastructure for real-time event detection across global data streams, supporting billions of daily inputs. Improved throughput by ~30%, reduced model deployment time from weeks to under one week, and cut incident detection time by ~50% via monitoring and alerting.
Built telecom customer churn prediction models (logistic regression, random forest, gradient boosting) and generated risk scores for millions of users to support retention strategies. Improved model performance using feature engineering and evaluation (ROC-AUC, precision/recall) and deployed batch scoring pipelines.
Education
Degrees, certifications, and relevant coursework
University of the Philippines
Bachelor’s Degree, Computer Science
2008 - 2013
Activities and societies: Built distributed ML systems for real-time event detection; designed multi-model inference for real-time decision pipelines; created AI image generation pipelines; developed an LLM-powered voice AI agent for inquiries and scheduling.
Earned a Bachelor’s Degree in Computer Science. Completed related projects including distributed event detection, multi-model inference, AI image generation, and an LLM-powered voice AI agent.
