
MLOps Engineer (LLM Infrastructure)

Kyivstar is the leading telecommunications provider in Ukraine, serving millions with a wide range of communication services and innovative technologies.

Employee count: 1001-5000

Ukraine only

We are hiring an MLOps Engineer specializing in Large Language Model (LLM) infrastructure to design and maintain the robust platform on which our AI models are developed, deployed, and monitored. As an MLOps Engineer, you will build the backbone of our machine learning operations – from scalable training pipelines to reliable deployment systems – ensuring that our NLP models (including LLMs) can be trained on large datasets and served to end-users efficiently. This role sits at the intersection of software engineering, DevOps, and machine learning, and is crucial for accelerating our R&D in the Ukrainian LLM project. You’ll work closely with data scientists and software engineers to implement best-in-class infrastructure and workflows for the continuous delivery of AI innovations.

About us

Kyivstar.Tech is a Ukrainian hybrid IT company and a resident of Diia.City.
We are a subsidiary of Kyivstar, one of Ukraine's largest telecom operators.
Our mission is to change lives in Ukraine and around the world by creating technological solutions and products that unleash the potential of businesses and meet users' needs.
More than 500 KS.Tech specialists work daily in various areas: mobile and web solutions, as well as design, development, support, and technical maintenance of high-performance systems and services.
We believe in innovations that bring real, qualitative change, and we constantly challenge conventional approaches and solutions. Each of us embraces an entrepreneurial culture that pushes us to keep evolving and creating something new.

What you will do

  • Design and implement modern, scalable ML infrastructure (cloud-native or on-premises) to support both experimentation and production deployment of NLP/LLM models. This includes setting up systems for distributed model training (leveraging GPUs or TPUs across multiple nodes) and high-throughput model serving (APIs, microservices); a minimal serving sketch follows this list.
  • Develop end-to-end pipelines for model training, validation, and deployment. Automate the ML workflow from data ingestion and feature processing to model training and evaluation, using technologies like Docker and CI/CD pipelines to ensure reproducibility and reliability.
  • Collaborate with Data Scientists and ML Engineers to design MLOps solutions that meet model performance and latency requirements. Architect deployment patterns (batch, real-time, or streaming inference) appropriate for various use cases (e.g., a real-time chatbot vs. offline analysis).
  • Implement and uphold best practices in MLOps, including automated testing of ML code, continuous integration/continuous deployment for model updates, and rigorous version control for code, data, and model artifacts. Ensure every model and dataset is properly versioned and reproducible.
  • Set up monitoring and alerting for deployed models and data pipelines. Use tools to track model performance (latency, throughput) and accuracy drift in production. Implement logging and observability frameworks to quickly detect anomalies or degradations in model outputs.
  • Manage and optimize our Kubernetes-based deployment environments. Containerize ML services and use orchestration (Kubernetes, Docker Swarm or similar) to scale model serving infrastructure. Handle cluster provisioning, health, and upgrades, possibly using Helm charts for managing LLM services.
  • Maintain infrastructure-as-code (e.g., Terraform, Ansible) for provisioning cloud resources and ML infrastructure, enabling reproducible and auditable changes to the environment. Ensure our infrastructure is scalable, cost-effective, and secure.
  • Perform code reviews and guide other engineers (both MLOps and ML developers) on building efficient and maintainable pipelines. Troubleshoot issues across the ML lifecycle, from data processing bottlenecks to model deployment failures, and continuously improve system robustness.
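
For a concrete, if simplified, picture of the serving side of this work, here is a minimal sketch of an LLM inference microservice of the kind you would containerize and run on Kubernetes. The endpoint path and the small placeholder model are assumptions for illustration only; a production service would add batching, authentication, rate limiting, and metrics.

```python
# Minimal sketch of an LLM-serving microservice (hypothetical endpoint, placeholder model).
# A production deployment would add request batching, auth, rate limiting, and metrics.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI(title="llm-inference")

# Assumption: a small causal LM is used so the sketch stays runnable on CPU.
generator = pipeline("text-generation", model="distilgpt2")

class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 64

class GenerateResponse(BaseModel):
    completion: str

@app.post("/v1/generate", response_model=GenerateResponse)
def generate(req: GenerateRequest) -> GenerateResponse:
    # The text-generation pipeline returns a list of dicts with "generated_text".
    outputs = generator(req.prompt, max_new_tokens=req.max_new_tokens, num_return_sequences=1)
    return GenerateResponse(completion=outputs[0]["generated_text"])

@app.get("/healthz")
def healthz() -> dict:
    # Liveness/readiness probe target for Kubernetes.
    return {"status": "ok"}
```

Packaged in a Docker image, a service like this would typically be run with uvicorn behind a Kubernetes Service or Ingress and scaled horizontally with an autoscaler.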

Qualifications and experience needed

Experience & Background:

  • 4+ years of experience in DevOps, MLOps, or ML Infrastructure roles
  • Strong foundation in software engineering and DevOps principles as they apply to machine learning
  • A Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field is preferred

Cloud & Infrastructure:

  • Extensive experience with cloud platforms (AWS, GCP, or Azure) and designing cloud-native applications for ML
  • Comfortable using cloud services for compute (EC2, GCP Compute, Azure VMs), storage (S3, Cloud Storage), container registry, and serverless components where appropriate
  • Experience managing infrastructure with Infrastructure-as-Code tools like Terraform or CloudFormation

Containerization & Orchestration:

  • Proficiency in container technologies (Docker) and orchestration with Kubernetes
  • Ability to deploy, scale, and manage complex applications on Kubernetes clusters; experience with tools like Helm for Kubernetes package management
  • Knowledge of container security and networking basics in distributed systems

CI/CD & Automation:

  • Strong experience implementing CI/CD pipelines for ML projects
  • Familiar with tools like Jenkins, GitLab CI, or GitHub Actions for automating testing and deployment of ML code and models
  • Experience with specialized ML CI/CD tooling (e.g., TensorFlow Extended (TFX), MLflow for model deployment) and GitOps workflows (Argo CD) is a plus

Programming & Scripting:

  • Strong coding skills in Python, with experience in writing pipelines or automation scripts related to ML tasks
  • Familiarity with shell scripting and one or more general-purpose languages (Go, Java, or C++) for infrastructure tooling
  • Ability to debug and optimize code for performance (both in data pipelines and in model inference code)

ML Pipeline Knowledge:

  • Solid understanding of the machine learning lifecycle and tools
  • Experience building or maintaining ML pipelines, possibly using frameworks like Kubeflow, Airflow, or custom solutions (a pipeline sketch follows this list)
  • Knowledge of model serving frameworks (TensorFlow Serving, TorchServe, NVIDIA Triton, or custom Flask/FastAPI servers for ML)
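
As an illustration of the pipeline work referenced above, here is a minimal Airflow 2.x-style DAG sketch; the DAG id, schedule, and task bodies are placeholder assumptions rather than a prescribed design.

```python
# Minimal sketch of a train-then-evaluate pipeline as an Airflow DAG (Airflow 2.x style).
# Task bodies are placeholders; a real pipeline would pull data, train, evaluate,
# and register the model in a registry such as MLflow.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest_data():
    print("pulling training data from the feature store")

def train_model():
    print("launching a (possibly distributed) training job")

def evaluate_model():
    print("computing offline metrics and comparing against the current production model")

with DAG(
    dag_id="llm_training_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@weekly",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest_data", python_callable=ingest_data)
    train = PythonOperator(task_id="train_model", python_callable=train_model)
    evaluate = PythonOperator(task_id="evaluate_model", python_callable=evaluate_model)

    # Linear dependency chain: ingest -> train -> evaluate.
    ingest >> train >> evaluate
```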

Monitoring & Reliability:

  • Experience setting up monitoring for applications and models (using Prometheus, Grafana, CloudWatch, or similar) and implementing alerting for anomalies; see the instrumentation sketch after this list
  • Understanding of model performance metrics and how to track them in production (e.g., accuracy on a validation stream, response latency)
  • Familiarity with concepts of A/B testing or canary deployments for model updates in production
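
A minimal sketch of the kind of instrumentation meant here, using the Prometheus Python client; the metric names and port are illustrative assumptions, and alerting rules on these series would live in Prometheus/Alertmanager or Grafana.

```python
# Minimal sketch of instrumenting an inference function with Prometheus metrics.
# Metric names and the exporter port are illustrative assumptions.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("inference_requests_total", "Total inference requests", ["status"])
LATENCY = Histogram("inference_latency_seconds", "Inference latency in seconds")

@LATENCY.time()
def predict(prompt: str) -> str:
    # Placeholder for a real model call.
    time.sleep(random.uniform(0.01, 0.05))
    return prompt[::-1]

if __name__ == "__main__":
    # Expose metrics on :9100/metrics for Prometheus to scrape.
    start_http_server(9100)
    while True:
        try:
            predict("hello")
            REQUESTS.labels(status="ok").inc()
        except Exception:
            REQUESTS.labels(status="error").inc()
        time.sleep(1)
```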

Security & Compliance:

  • Basic understanding of security best practices in ML deployments, including data encryption, access control, and dealing with sensitive data in compliance with regulations
  • Experience implementing authentication/authorization for model endpoints and ensuring infrastructure complies with organizational security policies

Team Collaboration:

  • Excellent collaboration skills to work with cross-functional teams
  • Experience interacting with data scientists to translate model requirements into scalable infrastructure
  • Strong documentation habits for outlining system designs, runbooks for operations, and lessons learned

A plus would be

LLM/AI Domain Experience:

  • Previous experience deploying or fine-tuning large language models or other large-scale deep learning models in production
  • Knowledge of specialized optimizations for LLMs (such as model parallelism, quantization techniques like 8-bit or 4-bit quantization, and use of libraries like DeepSpeed or Hugging Face Accelerate for efficient training) will be highly regarded
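
As a hedged illustration of the quantization point above, this sketch loads a causal LM in 4-bit precision via Hugging Face transformers and bitsandbytes. It assumes a CUDA GPU with the accelerate and bitsandbytes packages installed, and the model id is a placeholder.

```python
# Sketch of loading a causal LM in 4-bit precision with bitsandbytes via transformers.
# Assumes a CUDA GPU plus the bitsandbytes and accelerate packages; model id is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM checkpoint works

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place layers across available GPUs
)

inputs = tokenizer("Kyiv is", return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```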

Distributed Computing:

  • Experience with distributed computing frameworks such as Ray for scaling up model training across multiple nodes; a short sketch follows this list
  • Familiarity with big data processing (Spark, Hadoop) and streaming data (Kafka, Flink) to support feeding data into ML systems in real time
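
A minimal sketch of the Ray-style fan-out mentioned above; the task body is a placeholder, and a real deployment would connect to an existing cluster rather than starting a local one.

```python
# Minimal sketch of fanning out work across a Ray cluster.
# ray.init() with no arguments starts a local cluster; in production you would
# typically connect to an existing cluster instead (e.g., ray.init(address="auto")).
import ray

ray.init()

@ray.remote
def score_shard(shard_id: int) -> float:
    # Placeholder for per-shard work such as preprocessing or batch inference.
    return float(shard_id) * 0.5

# Launch tasks in parallel and gather the results.
futures = [score_shard.remote(i) for i in range(8)]
results = ray.get(futures)
print(sum(results))
```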

Data Engineering Tools:

  • Some experience with data pipelines and ETL processes
  • Knowledge of tools like Apache Airflow, Kafka, or dbt and how they integrate into ML pipelines
  • Understanding of data warehousing concepts (Snowflake, BigQuery) and how processed data is used for model training

Versioning & Experiment Tracking:

  • Experience with ML experiment tracking and model registry tools (e.g., MLflow, Weights & Biases, DVC); see the tracking sketch after this list
  • Ensuring that every model version and experiment is logged and reproducible for auditing and improvement cycles
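
For illustration, a minimal MLflow tracking sketch; the experiment name and logged values are placeholders. With no tracking URI set, runs are written to a local ./mlruns directory, whereas a shared setup would point at a central MLflow server and model registry.

```python
# Minimal sketch of logging an experiment run with MLflow.
# Experiment name and values are placeholders; runs go to ./mlruns by default.
import mlflow

# mlflow.set_tracking_uri("http://mlflow.internal:5000")  # placeholder for a central server
mlflow.set_experiment("llm-finetuning")

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("learning_rate", 2e-5)
    mlflow.log_param("epochs", 3)
    # ... training would happen here ...
    mlflow.log_metric("eval_loss", 1.23)
    # Store the resolved config alongside the run for reproducibility.
    mlflow.log_dict({"learning_rate": 2e-5, "epochs": 3}, "config.json")
```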

Vector Databases & Retrieval:

  • Familiarity with vector databases (Pinecone, Weaviate, FAISS) and retrieval systems used in conjunction with LLMs for retrieval-augmented generation
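
A minimal FAISS sketch of the retrieval side of retrieval-augmented generation; the embeddings here are random stand-ins, and in practice they would come from an embedding model, with the retrieved passages injected into the LLM prompt.

```python
# Minimal sketch of a FAISS index for retrieval-augmented generation.
# Embeddings are random stand-ins; real ones would come from an embedding model.
import faiss
import numpy as np

dim = 384  # typical sentence-embedding dimensionality
corpus_embeddings = np.random.rand(1000, dim).astype("float32")

index = faiss.IndexFlatL2(dim)  # exact L2 search; IVF/HNSW variants scale further
index.add(corpus_embeddings)

query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 5)
print(ids[0])  # indices of the 5 nearest passages
```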

High-Performance Computing:

  • Exposure to HPC environments or on-prem GPU clusters for training large models
  • Understanding of how to maximize GPU utilization, manage job scheduling (with tools like Slurm or Kubernetes operators for ML), and profile model performance to remove bottlenecks

Continuous Learning:

  • Up-to-date with the latest developments in MLOps and LLMOps (operations for large language models)
  • Active interest in new tools or frameworks in the MLOps ecosystem (e.g., model optimization libraries, new orchestration tools) and a drive to evaluate and introduce them to improve our processes

What we offer

  • Office or remote – it’s up to you. You can work from anywhere, and we will arrange your workplace
  • Remote onboarding
  • Performance bonuses
  • Learning and development through the company’s library, internal resources, and programs from partners
  • Health and life insurance
  • Wellbeing program and corporate psychologist
  • Reimbursement of expenses for Kyivstar mobile communication

About the job

Job type: Full Time
Experience level: Mid-level
Location requirements: Ukraine only
Hiring timezones: Ukraine +/- 0 hours

About Kyivstar

Kyivstar, the largest electronic communications operator in Ukraine, has been at the forefront of telecommunications since its founding in 1994. With a robust network coverage serving approximately 24 million mobile subscribers and over 1 million fixed-line internet customers, Kyivstar provides an extensive range of communication services, including mobile voice, SMS, and high-speed mobile internet. The company operates over 53,000 base stations, utilizing a comprehensive fiber-optic network that spans more than 44,000 km. As part of the VEON Group, Kyivstar adheres to international standards of service and quality, promoting innovation through new technologies such as 4G while expanding into 5G networks.

Since the onset of the full-scale military conflict in Ukraine, Kyivstar has demonstrated remarkable resilience and commitment to its customers by ensuring connectivity and access to essential services. The company has invested significantly in backup power systems to enhance the energy resilience of its infrastructure, allowing it to maintain operational capabilities even during power outages. Furthermore, Kyivstar actively participates in charity initiatives and social responsibility projects, contributing over UAH 2 billion to support military efforts, humanitarian aid, and the recovery of the digital infrastructure in Ukraine. The dedication of Kyivstar to enhancing the digital landscape of Ukraine and fostering communication continues to earn it recognition as a leading brand in the Ukrainian telecommunications sector.
