Justin Stoecker
@justinstoecker
Staff Software Engineer specializing in hyperscale AI infrastructure and low-latency production systems.
What I'm looking for
I am a Staff Software Engineer with 14+ years of experience at Google, OpenAI, and Meta, focused on architecting hyperscale AI infrastructure, scaling ChatGPT, and optimizing Llama 4 inference for production. I bridge research and production, mentor teams, and deliver reliable, low-latency systems under extreme concurrency, with substantial performance and cost improvements.
My work includes designing production-grade inference stacks (Python, PyTorch, vLLM), building autoscaling and observability systems (Kubernetes, Ray, Prometheus/Grafana), and driving quantization and parallelism innovations that produced 2–4× efficiency gains. I prioritize robust deployment pipelines, fault tolerance, and measurable impact across multimodal, long-context, and high-traffic environments.
Experience
Work history, roles, and key accomplishments
Staff Software Engineer
Meta
May 2023 - Present (2 years 10 months)
Architected a production-grade inference serving stack for the Llama 4 family, achieving a 2–4× improvement in performance-to-cost and supporting billions of daily multimodal interactions while delivering sub-second generation with >99.9% uptime.
Engineered production deployment and inference infrastructure for ChatGPT, scaling to hundreds of millions of daily queries and reducing latency by up to 40% via optimized KV cache management, batching, and autoscaling on large GPU clusters.
Software Engineer
Keona Health
Jul 2011 - Dec 2019 (8 years 5 months)
Architected a HIPAA-compliant backend and multi-channel AI pipelines for CareDesk, enabling real-time EHR integrations and reducing call volume by 25–40% while maintaining deterministic workflows across millions of interactions.
Education
Degrees, certifications, and relevant coursework
University of Miami
Master of Science, Computer Science
2009 - 2011
Coursework focused on advanced topics in AI, machine learning, and systems.
Tech stack
Software and tools used professionally
Splunk
Apache Spark
Apache Flink
Microsoft Azure
Google Cloud Platform
GitHub
GitLab
Bitbucket
Kubernetes
Jenkins
CircleCI
Travis CI
GitHub Actions
GitLab CI
React Native
NumPy
Pandas
Dask
MySQL
PostgreSQL
MongoDB
SQLite
MariaDB
Memcached
Cassandra
Hadoop
CouchDB
Rollout
InVision
Node.js
Django
Spring Boot
Android SDK
Ruby on Rails
Next.js
.NET
Tailwind CSS
Nuxt.js
Databricks
Figma
Adobe XD
Zeplin
Microsoft Teams
OpenCV
Redis
Terraform
AWS CloudFormation
Pulumi
Azure DevOps
Jira
Gradle
React
Vue.js
jQuery
Svelte
Webpack
JavaScript
Python
HTML5
Java
ES6
CSS3
PHP
Kotlin
ASP.NET
Logstash
Apache Flume
TensorFlow
PyTorch
NLTK
Kafka
FastAPI
TypeORM
Grafana
Kibana
Prometheus
Firebase Realtime Database
Sequelize
Windows
New Relic
Datadog
Trello
ClickUp
GraphQL
Firebase
gRPC
Elasticsearch
WordPress
Ansible
AWS Lambda
Serverless
Google Cloud Functions
Azure Functions
monday.com
TypeScript
NGINX
Apache HTTP Server
Balsamiq
Root Cause
SQL
TeamCity
Hugging Face
Amazon ECR
Ray
vLLM
Caddy
Apigee
Zipkin
Scale AI
Dynamic
Task
MediaPipe
Joomla
ChatGPT
