- Provide support for cloud-based and on-premises services
- Provide 24/7 on-call support on a weekly rotating basis (Monday to Monday)
- Monitor, mitigate, resolve, and perform root cause analysis of production issues (network and firewall issues, Docker and K8s issues, and mTLS connection issues)
- Improve observability, dashboards, and alerting of the database stack to proactively identify issues before incidents occur
- 4+ years of experience as a Site Reliability Engineer or Software Engineer
- Experience setting up and improving SRE processes
- Experience managing cloud or on-premises infrastructure: Azure Cloud, Linux (SFTP, SSH, certificates), Windows servers, and Docker environments
- Experience managing and troubleshooting CI/CD pipeline issues (GitLab CI/CD)
- Experience troubleshooting issues on different network layers (HTTP, TCP, SSL)
- Experience troubleshooting common K8s and Docker issues
- Experience configuring monitoring tools
- Hands-on experience with K8s, Python, Terraform/Terragrunt, Ansible, and Bash scripting
- Excellent communication skills, with the ability to collaborate effectively with management and cross-functional engineering teams to resolve incidents and address performance regressions
- Strong engineering craftsmanship with a passion for raising the bar in quality and reliability
- Upper-Intermediate level of English
WOULD BE A PLUS
- Familiarity with NoSQL databases, providing versatility in managing different types of data storage systems
PERSONAL PROFILE
- Proactive, results-oriented person who takes initiative
- Strong teamwork skills combined with a strong sense of responsibility and reliability
Are you a Reliability Engineer interested in supporting scalable, secure infrastructure? Do you thrive in both autonomous work and team collaboration, taking ownership of your area of responsibility and delivering results?
If you enjoy working directly with Customers to deliver high-quality solutions that meet their business needs and exceed their expectations, we will be glad to have you on our team!
This is an excellent opportunity for you to work with a skilled team in a multinational environment at a world-leading telecommunications enterprise.
CUSTOMER
Our Customer is one of the world’s top-ranking telecommunications infrastructure companies, with nearly 100,000 employees. The company delivers products and services necessary for mobile and fixed-line communications, as well as radio networks and transmission networks. More than 40% of phone calls are made through its systems, and over 2 billion people use its network worldwide.
PROJECT
Our team helps the Customer (a telecom provider) migrate applications to cloud infrastructure and implement CI/CD, security, and administrative best practices using modern DevOps tools.