Open to opportunities

Julian Thomas

@julianthomas2

Senior MLOps & AI/ML engineer delivering scalable, low-latency production ML systems across cloud environments.

United States

What I'm looking for

I seek senior roles delivering production ML/AI systems with low-latency inference, strong observability, cloud-native scaling, and a collaborative, delivery-focused engineering culture.

I am a Senior MLOps and AI/ML Engineer with 9+ years building production-grade ML systems, data pipelines, and AI-driven web experiences. I specialize in Python-based model development, low-latency inference, and scalable orchestration across AWS, Azure, and GCP.

In recent roles I accelerated LLM inference, architected RAG and vector-search solutions, and processed terabytes of data per day with Databricks, Spark, and Delta Lake. I consistently reduced latency, improved retrieval accuracy, and increased throughput for enterprise workloads.

My background includes secure, compliant ML workflows for regulated industries, automated lineage and audit trails, and resilient containerized deployments using Docker, Kubernetes, Helm, and Terraform. I pair backend ML systems with React/Next.js frontends to deliver real-time, user-facing AI features.

I hold cloud certifications (Azure Fundamentals, AWS Solutions Architect Associate) and emphasize observability, cost-efficient serverless patterns, and reproducible ML lifecycle practices to drive measurable business impact.

Experience

Work history, roles, and key accomplishments

AI & React Developer

Ntiva

Dec 2024 - Nov 2025 (11 months)

Cut LLM inference latency from 1.8 s to 900 ms and architected production-grade RAG and Databricks Spark pipelines processing terabytes per day, improving retrieval accuracy 2.4x and cutting feature-generation cycles from hours to minutes.

Python Databricks Spark Faiss React Next.js Docker Kubernetes

Senior Software Engineer

Insight Global

Mar 2023 - Dec 2024 (1 year 9 months)

Delivered transformer-based NLP workloads and enterprise RAG/vector DB systems supporting 20k+ contextual queries/hour while cutting ingestion failures by over 90% through unified SQL Server to Databricks pipelines.

Python GPT-4 BERT Databricks Pinecone Weaviate Kubernetes Prometheus Grafana

Machine Learning Engineer

Codoxo

Feb 2019 - Feb 2023 (4 years)

Built HIPAA- and SOC 2-aligned ML workflows, powered millions of real-time healthcare fraud checks daily, and increased FastAPI inference capacity from 200 to 1,200 QPS while maintaining 99.99% uptime.

Python Kafka Databricks Spark FastAPI Redis Prometheus Terraform Jenkins

Software Developer

Plego Technologies

Feb 2015 - Jan 2019 (3 years 11 months)

Deployed predictive recommendation engines for 100k+ users, improved Elasticsearch relevance, and built scalable FastAPI NLP microservices that handled 2x traffic while reducing incident resolution time by over 40 minutes.

Python FastAPI React Next.js Elasticsearch Docker Kubernetes AWS GCP

Education

Degrees, certifications, and relevant coursework

Siena College

Bachelor of Science, Computer Science

Completed a Bachelor of Science in Computer Science with coursework in data structures, databases, operating systems, networks, software engineering, and machine learning; graduated May 2014.