Air Apps is a family-founded company creating an AI-powered Personal & Entrepreneurial Resource Planner. We're looking for a Site Reliability Engineer to ensure the reliability, availability, and scalability of our systems.
Responsibilities
- Design and implement scalable, reliable, and fault-tolerant systems across cloud environments.
- Develop and maintain observability tools, including monitoring, logging, and alerting.
- Automate infrastructure provisioning, deployment, and incident response using Infrastructure as Code tools.
- Optimize system performance, scalability, and incident response workflows to improve uptime.
- Work closely with development and DevOps teams to improve system design for reliability.
- Conduct root cause analysis (RCA) and implement preventative measures to minimize failures.
- Ensure high availability by designing and maintaining load balancing, failover, and disaster recovery strategies.
- Improve CI/CD pipelines to enhance deployment speed while maintaining stability.
- Optimize cloud cost and resource utilization for AWS, Azure, or Google Cloud Platform (GCP).
- Participate in on-call rotations to quickly address system failures and minimize downtime.
Benefits
- Apple hardware ecosystem for work.
- Annual bonus.
- Top-tier Health and Life Insurance for peace of mind.
- Transportation Budget to support your commute needs.
- Coverflex benefits package for meal allowances, well-being, and more.
- Childcare support.
- Air Conference: an opportunity to meet the team, collaborate, and grow together.
- Pension Fund to support your long-term financial planning.
- Urban Sports Club membership to keep you active.
- Free meals at the hub.
