Julian Thomas
@julianthomas2
Senior MLOps & AI/ML engineer delivering scalable, low-latency production ML systems across cloud environments.
What I'm looking for
I am a Senior MLOps and AI/ML Engineer with 9+ years building production-grade ML systems, data pipelines, and AI-driven web experiences. I specialize in Python-based model development, low-latency inference, and scalable orchestration across AWS, Azure, and GCP.
In recent roles I accelerated LLM inference, architected RAG and vector-search solutions, and processed terabytes of data per day with Databricks, Spark, and Delta Lake. I consistently reduced latency, improved retrieval accuracy, and increased throughput for enterprise workloads.
My background includes secure, compliant ML workflows for regulated industries, automated lineage and audit trails, and resilient containerized deployments using Docker, Kubernetes, Helm, and Terraform. I pair backend ML systems with React/Next.js frontends to deliver real-time, user-facing AI features.
I hold cloud certifications (Azure Fundamentals, AWS Solutions Architect Associate) and emphasize observability, cost-efficient serverless patterns, and reproducible ML lifecycle practices to drive measurable business impact.
Experience
Work history, roles, and key accomplishments
AI & React Developer
Ntiva
Dec 2024 - Nov 2025 (11 months)
Reduced LLM inference latency from 1.8s to 900ms and architected production-grade RAG and Databricks Spark pipelines processing terabytes per day, improving retrieval accuracy 2.4x and cutting feature-generation cycles from hours to minutes.
Senior Software Engineer
Insight Global
Mar 2023 - Dec 2024 (1 year 9 months)
Delivered transformer-based NLP workloads and enterprise RAG/vector DB systems supporting 20k+ contextual queries/hour, and cut ingestion failures by over 90% through unified SQL Server-to-Databricks pipelines.
Machine Learning Engineer
Codoxo
Feb 2019 - Feb 2023 (4 years)
Built HIPAA/SOC2-aligned ML workflows, powered millions of real-time healthcare fraud checks daily, and increased FastAPI inference capacity from 200 to 1,200 QPS while maintaining 99.99% uptime.
Software Developer
Plego Technologies
Feb 2015 - Jan 2019 (3 years 11 months)
Deployed predictive recommendation engines for 100k+ users, improved Elasticsearch relevance, and built scalable FastAPI NLP microservices that handled 2x traffic while reducing incident resolution time by over 40 minutes.
Education
Degrees, certifications, and relevant coursework
Siena College
Bachelor of Science, Computer Science
Completed a Bachelor of Science in Computer Science with coursework in data structures, databases, operating systems, networks, software engineering, and machine learning; graduated May 2014.
Tech stack
Software and tools used professionally
GitHub
Kubernetes
Jenkins
NumPy
Pandas
DB
MySQL
PostgreSQL
MongoDB
Gmail
Rollout
Node.js
Next.js
Tailwind CSS
Databricks
Redis
Terraform
JavaScript
JSON
TensorFlow
PyTorch
NLTK
Kafka
FastAPI
Grafana
Prometheus
GraphQL
gRPC
Elasticsearch
Azure Cognitive Search
Serverless
Azure Functions
Uvicorn
s3-lambda
Redis Cloud
SQL
SciPy
Hugging Face
Weaviate
Pinecone
Delta Lake
Faiss
