Rauan Akylzhanov
@rauanakylzhanov
I'm a Senior ML/NLP Engineer specializing in LLM production and deployment.
What I'm looking for
I build and deploy production LLM-powered services focused on real-time reasoning, multi-GPU inference, and low-latency NLP pipelines. I design high-throughput vLLM inference stacks (RTX 4090), FastAPI-based serving architectures handling thousands of daily queries, and evaluation protocols to benchmark fine-tuned models, with measurable gains in throughput, latency and revenue.
I have improved model conversion and inference latency by porting Python algorithms to C++, led LLaMA/LangChain chatbot and CTR prediction projects on AWS, and implemented MLOps automation with Terraform, SageMaker, Kubeflow and Airflow. I hold a PhD in Mathematics and combine research rigor with production engineering to deliver scalable, reliable ML systems.
Experience
Work history, roles, and key accomplishments
Senior ML/NLP Engineer
HighSky
Jan 2024 - Present (1 year 7 months)
Developed and deployed high-performance LLM-powered web services for real-time reasoning, serving thousands of queries daily; optimized multi-GPU vLLM inference on RTX 4090 to maximize throughput and minimize latency. Ported semantic segmentation from Python to C++, reducing request latency from 15s to 1s and raising the subnet metric from 0.72 to 0.96, which improved revenue.
Architected and deployed a LLaMA 2-powered chatbot using LangChain, improving recommendation accuracy and raising user satisfaction by 12%. Improved conversion prediction accuracy by 20% with PyTorch transformer models, and automated SageMaker resource provisioning with Terraform for LLM experiments.
ML Engineer
Kcell
Jan 2019 - Jan 2022 (3 years)
Developed churn, CLV and segmentation models and designed a scalable data lake and MLOps pipelines, increasing model deployment efficiency by 20%. Fine-tuned TF-BERT for sentiment classification, boosting campaign response rates by 30%.
Research Associate
Imperial College London
Jan 2017 - Jan 2019 (2 years)
Taught introductory data science courses including High Performance Computing (M3C), delivering lectures and lab sessions and supporting student research projects. Contributed to academic instruction and course material development.
Faculty Teaching Assistant
Moscow State University
Jan 2012 - Jan 2014 (2 years)
Assisted in teaching data structures, algorithms and C/C++ programming; led labs and graded assignments for undergraduate computer science courses. Supported student learning and practical programming exercises on Linux.
Education
Degrees, certifications, and relevant coursework
Imperial College London
Doctor of Philosophy, Mathematics
2014 - 2018
Activities and societies: Member of the Imperial College Data Science Society and Machine Learning Society; participated in hackathons and industry engagement (2014–2017).
Conducted PhD research in non-commutative analysis at Imperial College London with EPSRC funding from 2014 to 2018.
Lomonosov Moscow State University
Specialist in Computer Science, Computer Science
2007 - 2012
Grade: With Honours
Completed a Specialist degree in Computer Science with honours, focusing on optimization theory, statistical inference, high-performance distributed computing, and numerical methods.
