Open to opportunities

Tyler User

@tyleruser11

Senior applied AI and full-stack engineer building production-grade LLM and cloud-native platforms end to end.

United States

What I'm looking for

I’m looking for a remote-first, outcome-driven team where I can build scalable, reliable AI systems end to end: improving retrieval accuracy, reducing inference cost, and shipping cloud-native RAG and MLOps platforms in close collaboration with product and data teams.

I’m an applied AI engineer with 13 years of experience building production-grade LLM systems, RAG pipelines, and cloud-native ML platforms across risk analytics, enterprise cloud, computer vision, and AI automation. I’m especially driven by improving retrieval accuracy while reducing inference cost and increasing system throughput.

At PromptLoop, I built a multi-tenant RAG Automation Engine that improved retrieval accuracy by 40–55% using hybrid ranking and metadata-aware search. I reduced inference cost by ~40% through quantization (GGUF/ONNX), batch scheduling, and aggressive caching, while scaling ingestion throughput by 4× using async FastAPI + Redis Streams.

I also focused on production reliability and deployment velocity—automating the RAG engine lifecycle with MLflow + GitHub Actions and reducing regression-related rollbacks by 70%. By creating reusable “AI Automation Blocks,” I helped reduce customer onboarding time from 5 days to <24 hours.

Earlier roles reinforced my “platform-first” mindset: I shipped a GPU-accelerated EdgeVision Safety Platform (tripling inference throughput with PyTorch + TensorRT + DeepStream) and rewrote event-processing microservices in Go and Rust to cut alert latency by 68%. Across teams, I take a remote-first, outcome-driven approach to delivering scalable, reliable AI systems end to end.

Experience

Work history, roles, and key accomplishments

Senior Applied AI Engineer

PromptLoop

May 2023 - Jan 2026 (2 years 8 months)

Built a multi-tenant RAG Automation Engine that improved retrieval accuracy by 40–55% using hybrid ranking and metadata-aware search. Reduced inference cost by ~40% via quantization (GGUF/ONNX), batch scheduling, and caching, while scaling ingestion throughput 4× with async FastAPI + Redis Streams.

RAG Hybrid Search FastAPI Redis Streams MLflow GitHub Actions Quantization (GGUF/ONNX) Kubernetes Observability

AI / Platform Engineer

Loko AI

Mar 2020 - Apr 2023 (3 years 1 month)

Developed a GPU-accelerated EdgeVision Safety Platform that tripled real-time inference throughput using PyTorch + TensorRT + DeepStream. Reduced end-to-end alert latency by 68% by rewriting event-processing microservices in Go and Rust, and improved hazard recognition accuracy by 22% through model and labeling changes.

Computer Vision PyTorch TensorRT Go Rust Kafka Microservices Prometheus Grafana

Full Stack Engineer

Oracle Florida

Jun 2017 - Feb 2020 (2 years 8 months)

Delivered a Workforce Insights Predictive Engine with ML-driven forecasting APIs used across enterprise dashboards. Reduced ETL latency by 60% with Airflow and optimized SQL, deployed models as Kubernetes microservices, which improved uptime and reduced failure rates by 35%, and cut API latency by 30% with async serving.

Python Go Java Airflow SQL Optimization Kubernetes REST APIs gRPC

Software Engineer

LexisNexis Risk Solutions

Feb 2012 - May 2017 (5 years 3 months)

Developed ML models for a Fraud & Risk Scoring Pipeline, raising risk-detection accuracy by ~18% via improved feature engineering and tuning. Reduced data-processing time by 45% by building ETL/ELT flows in Python and Scala, and improved scoring API response times from 600 ms to 320 ms through backend optimization and caching.

Machine Learning Python Scala ETL SQL Optimization Docker Microservices Caching