You will:
- Design, deploy, and manage our Kubernetes platform to support scalable and reliable application deployments. Monitor and maintain the platform's health, performance, and security
- Oversee the deployment of our Software-as-a-Service applications on the Kubernetes platform. Implement best practices for application scalability, high availability, and disaster recovery
- Implement robust monitoring, alerting, and logging systems to proactively identify and resolve potential issues. Ensure high system availability and quick incident response times
- Continuously optimize the Kubernetes infrastructure and SaaS applications to achieve maximum performance and efficiency. Conduct performance testing and tuning to meet or exceed service level objectives
- Participate in an on-call rotation to respond to incidents promptly and effectively
- Conduct thorough post-incident reviews to identify root causes and implement preventive measures
- Develop and maintain automation tools and scripts to streamline processes and improve the efficiency of operational tasks
- Implement security best practices for Kubernetes and SaaS applications
- Collaborate with the security team to ensure compliance with industry standards and regulations
- Work closely with cross-functional teams, including development, infrastructure, and product management, to provide expertise and support throughout the software development lifecycle
- Identify areas for improvement in the infrastructure, processes, and deployment methodologies. Propose and implement enhancements to increase system reliability and performance
Requirements
- 5+ years of professional experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role, with a strong focus on Kubernetes platform management and SaaS deployment
- Proficiency in managing Kubernetes clusters and related tooling (e.g., Helm, kubectl, operators). Experience with container orchestration, service mesh, and Kubernetes networking
- Knowledge of continuous integration and continuous deployment pipelines, preferably with tools like Jenkins, GitLab CI/CD, or Tekton
- Experience with monitoring solutions (e.g., Prometheus, Grafana) and centralized logging platforms (e.g., ELK stack)
- Familiarity with major cloud providers (e.g., AWS, Azure, GCP) and experience deploying and managing applications on cloud infrastructure
- Solid programming skills in languages such as Python or Go. Proficiency in scripting to automate tasks and develop tooling
- Understanding of networking concepts and security best practices in the context of Kubernetes and SaaS deployments
- Strong analytical and problem-solving abilities to diagnose and resolve complex technical issues
- Excellent teamwork and communication skills to collaborate effectively with various teams and stakeholders
Benefits
- A chance to be a part of a casual but professional environment where you will have a safe place to try, fail and learn
- Full ownership of your code
- Coaching from our tech leads to advance your soft and technical skills and set your own development path
- A defined and organized onboarding process for both the company and the project
- Competitive compensation depending on experience and skills
- Private pension and medical insurance for you and your family, plus 100% paid maternity and sick leave
- Sports clubs – from fishing to basketball, whatever floats your boat
- Awesome referral fees - because great people know great people
- Work-life balance – a company that genuinely supports your professional, family, and personal goals
- Freedom to decide how you want to work: partly or fully remote, or from our offices