At Cloudbeds, we're transforming hospitality by building AI-powered solutions that solve hoteliers' biggest challenges. As a Sr. Site Reliability Engineer, you'll be the guardian of our platform's reliability and performance, ensuring millions of hospitality transactions flow seamlessly across the globe.
Requirements
- Design and implement reliable and scalable AWS architecture
- Maintain and support heavily loaded Kubernetes (EKS) clusters and related infrastructure components
- Support the CI/CD process with ArgoCD and GitOps
- Automate platform deployments with Terraform infrastructure as code
- Develop and continuously improve product Observability and Monitoring systems
- Respond to incidents and participate in Incident Management and Root Cause Analysis
- Optimize system performance and troubleshoot issues
- Collaborate with development teams to establish monitoring best practices and ensure systems meet reliability targets
- Collaborate with security teams to implement and maintain security best practices
Benefits
- PTO in accordance with local labor requirements
- Monthly Wellness Fridays
- Fully Paid Parental Leave
- Home office stipend based on country of residency
- Professional development courses through Cloudbeds University
- Access to professional development, including manager training, upskilling and knowledge transfer
