Responsibilities
- Collaborate closely with cross-functional teams to understand infrastructure requirements and develop scalable solutions.
- Design, implement, and manage robust, scalable, and highly available infrastructure on Google Cloud Platform (GCP).
- Deploy and manage Kubernetes clusters, adhering to best practices to ensure reliability and efficiency.
- Implement GitOps practices using tools such as ArgoCD for managing Kubernetes configurations.
- Develop and maintain CI/CD pipelines using GitHub Actions for automated testing, building, and deployment.
- Utilize Ansible and Terraform for infrastructure provisioning and configuration management.
- Manage Kafka for event streaming, and potentially build internal tooling around Kafka for enhanced functionality.
- Lead efforts in building and setting up observability and monitoring stacks using Prometheus, Grafana, and AlertManager for comprehensive visibility into infrastructure and application-level metrics.
- Ensure compliance with SOC2 Type 2 requirements and contribute to the ongoing improvement of compliance processes.
- Troubleshoot and resolve infrastructure issues and OpsGenie pages promptly, ensuring minimal downtime and optimal performance.
- Document infrastructure configurations, processes, and best practices.
Qualifications
- Minimum of 6 years of experience in infrastructure engineering, DevOps, or a similar role
- Strong proficiency in GCP services and Kubernetes administration, including building and maintaining Helm charts
- Experience with GitOps principles and tools, such as ArgoCD
- Proficiency in CI/CD pipelines using GitHub Actions
- Well versed in Terraform
- Hands-on experience with Ansible (bonus)
- Familiarity with Kafka, including setup and management (bonus)
- Expertise in building and setting up observability and monitoring stacks using Prometheus, Grafana, and AlertManager
- Experience working with multi-tenant infrastructure
- Knowledge of SOC2 Type 2 compliance requirements and experience in implementing compliance measures
- Excellent problem-solving skills and attention to detail
- Strong communication and collaboration skills, with the ability to work effectively in a small team environment