Open to opportunities

Eric Redondo

@ericredondo

I’m a senior AI/ML engineer building production GenAI, RAG, and MLOps systems at scale.

United States

What I'm looking for

I’m looking for a role where I can lead production GenAI/RAG and MLOps work: building evaluation, monitoring, and cost-aware deployment on Kubernetes and AWS to deliver reliable, measurable outcomes.

I’m a Senior AI/ML Engineer with 10+ years building production-grade machine learning and GenAI systems at scale. I specialize in LLM-powered applications, retrieval-augmented generation (RAG), and MLOps that keep models reliable in high-availability environments.

In my recent work, I designed and deployed production RAG for enterprise knowledge search serving 50K+ internal users with low-latency (p95 <1.2s) responses. I built end-to-end evaluation and monitoring, including prompt regression testing in CI, LLM-as-judge scoring, retrieval confidence scoring for hallucination mitigation, and embedding drift detection based on shifts in the cosine similarity distribution and the Population Stability Index (PSI).

I focus on measurable impact and operational efficiency. I reduced inference cost by 27% using dynamic context window trimming, response caching, and multi-model routing, and cut deployment cycles from weeks to days through CI/CD automation with the MLflow Model Registry and Git-based workflows.

Earlier, I built scalable ML systems and data infrastructure: cloud-native pipelines on SageMaker, EKS, and Lambda, plus distributed ETL and real-time streaming with Spark, Kafka, and Kinesis. I’ve worked across NLP, computer vision, and analytics use cases, partnering with scientists to productionize models and improving performance and reliability through monitoring, drift detection, and governance standards.

Experience

Work history, roles, and key accomplishments

Senior AI Engineer / MLOps Architect

Snorkel AI

Sep 2022 - Mar 2026 (3 years 6 months)

Designed and deployed production RAG knowledge search for enterprise customers serving 50K+ internal users with low-latency (p95 <1.2s) responses. Built weak-supervision data labeling and LLM evaluation/monitoring frameworks, reducing manual labeling costs by 60% and inference costs by 27% while improving governance and reliability.

RAG, Weak Supervision, Snorkel Flow, LLM Evaluation, MLflow, CI/CD, Dynamic Context Trimming

Senior Machine Learning Engineer

Kohl's Careers

Nov 2020 - Aug 2022 (1 year 9 months)

Architected and deployed production ML systems on AWS (SageMaker, EKS, Lambda), including automated training and deployment pipelines built with SageMaker Pipelines and Step Functions. Reduced training infrastructure costs by 28% using spot and distributed training, and improved monitoring with CloudWatch- and Prometheus-based drift detection.

Amazon SageMaker, Amazon EKS, AWS Lambda, Step Functions, Spot Training, Distributed Training, CloudWatch, Prometheus

Data Engineer / Machine Learning Engineer

Linedata

Sep 2019 - Oct 2020 (1 year 1 month)

Built distributed ETL pipelines with Apache Spark for petabyte-scale datasets and developed real-time streaming pipelines using Kafka and Kinesis. Reduced analytics query latency by 35% through storage optimization and indexing, supporting ML experimentation and A/B testing.