Open to opportunities

Benjamin Sprayberry

@benjaminsprayberry

Senior AI Engineer building low-latency, scalable AWS inference and LLM automation.

United States

What I'm looking for

I’m looking to lead end-to-end AI infrastructure on AWS—shipping reliable inference, LLM automation, and streaming systems—while partnering cross-functionally and growing teams through architecture reviews and technical mentorship.

I’m a Senior AI Engineer focused on building cloud-native AI platforms that scale in production. I’ve architected large-scale inference systems on AWS, delivering measurable performance gains such as cutting latency by roughly 40% across high-volume workloads.

I design distributed microservices and streaming pipelines using Python and Kubernetes to support real-time, high-throughput applications. At Amazon, I’ve implemented fault-tolerant architectures and observability with Prometheus and Grafana, reducing downtime by 22% and improving throughput by 18% while keeping service reliability strong under peak traffic.

I also integrate LLMs and ML models into enterprise APIs—improving workflow automation accuracy by 30% and boosting internal document processing efficiency by 35% using OpenAI APIs, vector databases, and RAG-style pipelines.

Previously, I delivered AI/ML products in real-world environments: building computer vision models with PyTorch and OpenCV (27% detection accuracy lift), deploying RESTful FastAPI services that handle over 2M monthly requests, and strengthening data processing with Spark and Airflow (40% throughput improvement). I bring this full-stack perspective (back-end, and front-end where needed) along with MLOps and DevOps discipline (Terraform, GitHub Actions, Kubernetes) to ship faster, deploy more consistently, and mentor engineers through architecture reviews.

Experience

Work history, roles, and key accomplishments

Current

Senior AI Engineer

Amazon

Nov 2021 - Present (4 years 8 months)

Architected large-scale AI inference platforms on AWS, processing over 15M daily transactions with consistent low-latency performance. Reduced infrastructure costs by 28% and cut inference latency by 42% by optimizing model serving, microservices, and resource allocation.

AWS Python Kubernetes TorchServe Kafka Kinesis Terraform Prometheus Grafana

Senior AI/ML Engineer

Motorola Solutions

Jun 2019 - Oct 2021 (2 years 4 months)

Developed computer vision and NLP models for real-time surveillance and incident reporting, improving object detection accuracy by 27% and classification accuracy by 32%. Built and operationalized scalable AWS ML pipelines and low-latency APIs serving over 2M monthly requests.

PyTorch OpenCV AWS SageMaker FastAPI Kubernetes Kafka Spark Airflow Jenkins scikit-learn

AI Full Stack Engineer

Collins Aerospace

Jan 2010 - Jun 2019 (9 years 5 months)

Built data ingestion and predictive analytics systems for aerospace telemetry, processing over 5TB of data daily and reducing equipment failure rates by 26%. Improved performance and delivery by optimizing databases and ETL workflows (reduced query latency by 35% and data availability delays by 40%).