Félix is building the financial ecosystem for Latin immigrants in the U.S. and is looking for a Site Reliability Engineer to join its Engineering Operations team. The role focuses on strengthening the reliability, scalability, and security of the infrastructure that powers its fintech platform.
Requirements
- Manage and optimize infrastructure on Google Cloud Platform (GCP) and Google Kubernetes Engine (GKE)
- Automate provisioning and configuration using Terraform, Helm, and scripting languages
- Build, maintain, and improve monitoring and alerting systems using Prometheus, Grafana, and centralized logging tools
- Participate in on-call rotations, incident response, and post-mortem analyses
- Define and track SLOs/SLIs and error budgets to monitor service health and performance
- Implement cloud security best practices to protect sensitive data and maintain system integrity
- Collaborate across Engineering, Security, and Product teams to embed reliability and automation in every phase of development and deployment
- Contribute to GKE cost optimization and resource management strategies to enhance efficiency and control operational spend
Benefits
- Competitive salary
- Initial stock options grant
- Annual performance bonus
- Health, dental, and vision plans
- Remote work environment
- Continuous learning opportunities
- Unlimited PTO
- Paid parental leave
