What You'll Do:
- Manage multiple orchestration platforms: Kubernetes in AWS (CloudFormation) and on-prem Kubernetes clusters
- Maintain Apache Flink infrastructure (managed in AWS or self-hosted in on-prem Kubernetes)
- Handle production support, incident response, and on-call rotations
- Perform regular patching activities and security vulnerability remediation
- Support and maintain workflow engine infrastructure
- Improve observability by utilizing Prometheus, Grafana, Splunk, Slack alerts, etc.
MLOps & Platform Development:
- Collaborate with the Senior MLOps Architect to build and maintain ML infrastructure
- Set up and configure MLflow for experiment tracking and model registry
- Build automated MLOps pipelines for model training, experimentation, and deployment (Champion-Challenger, shadow mode)
- Support feature calculation pipelines and ETL processes
- Enable model serving infrastructure for Python-based ML services
We're Looking For:
- 3-5+ years of professional experience in DevOps or infrastructure engineering
- Strong hands-on experience with AWS services (EKS, ECR, SQS, S3, Managed Kafka, Managed Prometheus)
- Deep experience with Kubernetes in production environments (multi-cluster management is a plus)
- Proficiency with infrastructure as code: AWS CloudFormation and CDK (AWS Cloud Development Kit)
- Experience with containerization (Docker) and container orchestration
- Knowledge of setting up and maintaining CI/CD pipelines (GitHub Actions, ArgoCD, Jenkins, etc.)
- Hands-on experience with observability tools: Prometheus, Grafana, Splunk
- Experience with production support, incident response, and on-call rotations
- Strong communication skills (English B2+)
- Ability to work collaboratively with cross-functional teams (MLOps engineers, data scientists, software engineers)
It would be a plus:
- Experience with Apache Flink, Kafka, or other stream processing frameworks
- Understanding of ML lifecycle: model training, evaluation, deployment patterns
- Experience with workflow engines or rule engines
- Knowledge of fraud prevention, fintech, or compliance domains
- Understanding of feature stores, ETL pipelines, and data engineering concepts
What We Offer:
- Remote work flexibility – work from anywhere
- B2B contract with competitive gross compensation in USD
- Top-tier hardware to support your productivity
- A challenging role in a team of skilled professionals, with the opportunity to grow into an MLOps specialization
- Direct collaboration with the Senior MLOps Architect to learn and contribute to ML platform development
- Continuous learning and career growth opportunities
- Coverage for professional development: training, seminars, and conferences
- Access to high-quality English lessons
- Impact: Your work will directly prevent fraud while enabling secure financial access globally
