Dzmitry Shulhin
@shulhd
Staff software engineer specializing in distributed systems and AI performance engineering for reliable, high-load LLM inference.
What I'm looking for
I’m a Staff Software Engineer with a decade of experience in distributed systems, leading engineering from architecture to delivery. I focus on high-load reliability, technical excellence, and turning performance constraints into production-grade solutions.
In my recent work at VISA, I led cross-functional, platform-wide initiatives for high-load payment acceptance: upgrading TLS across microservices, enforcing PCI compliance, and migrating from embedded Hazelcast to a shared distributed cluster. I’ve also re-engineered agent availability with Kafka broadcasting, built Spark/Databricks ETL pipelines that cut processing runtimes by 30%, and developed Alexa TTS fine-tuning and AWS pipeline architectures that improved cache efficiency and enabled petabyte-scale processing.
Experience
Work history, roles, and key accomplishments
AI Performance Engineering Fellow
Nebius Academy
Mar 2026 - Present (3 months)
Specialize in LLM architecture and inference optimization using KV-caching, Mixture of Experts (MoE), and LoRA fine-tuning to maximize model efficiency. Build production-ready MLOps stacks with vLLM, Kubernetes, and MLflow for scalable, observable deployments.
Led cross-functional initiatives for high-load payment acceptance systems, orchestrating a platform TLS protocol upgrade while ensuring PCI compliance across microservices and maintaining high availability. Directed migration from embedded Hazelcast to a shared distributed cluster to improve resilience and observability while reducing infrastructure overhead and technical debt.
Re-engineered third-party agent availability tracking by replacing high-overhead MongoDB polling with Kafka broadcasting and local caching, improving real-time synchronization and reducing redundant database usage. Architected GPT-based agents and deployed ML-driven fraud prevention services, reducing false positives by 11%.
Software Engineer
Samba TV
Jun 2023 - Jan 2024 (7 months)
Worked on the reliability of high-throughput data ingestion and downstream analytics, implementing automated data cleaning and validation in Spark-based pipelines. Supported performance-focused ETL development to improve runtime efficiency for large-scale daily data processing.
Developed a Dockerized fine-tuning pipeline for Alexa custom voice generation using BERT and vocoders to improve speech synthesis quality. Improved AWS Polly in-memory cache hit rate by 18% and built Kinesis-to-S3 Lambda architecture pipelines for petabyte-scale data processing.
Software Engineer
Samba TV
Mar 2019 - Jan 2020 (10 months)
Engineered Spark-based ETL pipelines on Databricks, tuning orchestration and performance to reduce processing runtimes by 30%. Managed high-throughput, terabyte-scale daily ingestion with automated data cleaning and validation for reliable downstream analytics.
Education
Degrees, certifications, and relevant coursework
NEBIUS Academy
AI Performance Engineering Fellow, AI Performance Engineering
2026 -
Activities and societies: Building production-ready MLOps ecosystems using vLLM, Kubernetes, and MLflow for end-to-end AI lifecycle management with enterprise-grade observability and scale.
AI Performance Engineering Fellow specializing in LLM architecture and inference optimization, including KV-caching, Mixture of Experts (MoE), and LoRA fine-tuning.
Belarusian State University
Bachelor of Applied Science (BASc), Computer Science
Earned a Bachelor of Applied Science (BASc) in Computer Science at Belarusian State University.