Michael Kim
@michaelkim1
Senior software engineer building scalable AI/ML and LLM backends with low-latency, cloud-native MLOps.
What I'm looking for
I’m a Senior Software Engineer with 12+ years of experience building and scaling AI/ML systems at Datadog, Salesforce, and Amazon. I specialize in end-to-end ML platform engineering, from real-time inference and LLM observability to MLOps pipelines and multi-tenant AIOps.
I’m known for translating cutting-edge AI capabilities into reliable, low-latency production infrastructure that directly reduces operational burden for engineering and SRE teams. Across enterprise-scale, cloud-native environments, I’ve delivered systems that process trillions of data points daily.
At Datadog, I led backend architecture for the AIOps and LLM Observability platforms that power real-time anomaly detection, AI-driven root cause analysis, and generative AI monitoring across thousands of customers. I architected the AIOps backend that processes trillions of data points daily for Watchdog at sub-second latency, engineered event-driven pipelines for Bits AI that reduced customer MTTR by 20% through LLM-driven remediation, and built telemetry to monitor token usage and cost for OpenAI and Anthropic.
Before that, at Salesforce, I built the core MLOps and multi-tenant infrastructure behind Einstein AI, which trains over 900,000 customer-specific ML models per hour. I also engineered scalable workflows for the ML Lake on AWS S3 and Apache Iceberg, designed distributed scheduling with AWS Lambda to orchestrate hundreds of thousands of parallel training jobs, and optimized the real-time prediction serving infrastructure for low-latency predictions inside Salesforce applications.
Experience
Work history, roles, and key accomplishments
Senior Software Engineer
Datadog
Sep 2020 - Present (5 years 7 months)
Led backend architecture and development of Datadog’s AIOps and LLM Observability platforms, enabling sub-second unsupervised anomaly detection and generative AI monitoring across thousands of enterprise customers. Built event-driven LLM remediation pipelines that reduced customer MTTR by 20% and delivered zero-downtime A/B testing for new detection algorithms.
Senior Member of Technical Staff
Salesforce
Apr 2015 - Aug 2020 (5 years 4 months)
Built core MLOps and multi-tenant infrastructure for Salesforce Einstein AI, supporting training of 900,000+ customer-specific ML models per hour at enterprise scale. Developed secure LLM grounding in CRM data and scalable ML workflows using AWS S3, Apache Iceberg, and a Lambda-based distributed scheduler.
Software Development Engineer
Amazon Web Services
Sep 2012 - Jan 2015 (2 years 4 months)
Developed backend components for Amazon Fraud Detector, delivering fraud risk scoring in milliseconds for high-volume enterprise e-commerce and financial platforms. Built scalable microservices and ingestion pipelines with AWS Kinesis and S3, optimizing low-latency inference APIs and end-to-end ML lifecycle workflows.
Education
Degrees, certifications, and relevant coursework
University of Virginia
Bachelor of Science, Computer Science
Earned a Bachelor of Science in Computer Science from the University of Virginia in 2012.
