Job Title: Senior Site Reliability Engineer (SRE)
Experience: 5+ years
Location: Mexico/LATAM
Engagement Type: Full-Time/Contractual, Fully Remote
Job Description:
We are seeking a skilled Senior Site Reliability Engineer (SRE) to join our offshore team. In this role, you will be responsible for ensuring the reliability, performance, and scalability of our critical systems. You'll develop automation, build monitoring solutions, lead incident response, and work closely with engineering teams to implement infrastructure as code, CI/CD, and cloud-native tools.
Job Responsibilities:
Maintain the reliability, availability, and performance of critical systems
Develop and maintain automation scripts and tools to streamline operations
Develop and maintain monitoring dashboards and alerts
Lead incident response, conduct post-mortem analysis, and implement preventative measures
Optimize system performance and scalability
Implement and maintain security best practices
Create and maintain comprehensive system and process documentation
Participate in on-call rotations for 24/7 critical system support
Must Haves:
Kubernetes (hands-on experience) – managing and deploying workloads
AWS Cloud Platform – deep understanding and production experience
Infrastructure as Code (IaC) – using tools like Terraform (or CloudFormation/Ansible)
Scripting/Programming – Proficiency in Python or Go
Monitoring & Alerting – Experience with Prometheus and Grafana
CI/CD Pipelines – Jenkins, GitLab CI, or similar
Incident Management – Proven experience in responding to and analyzing outages
Linux Systems & Networking – Strong fundamentals
Good to Haves:
ArgoCD, Linkerd, Karpenter, or other Kubernetes-related tools
Logging tools – Loki, ELK Stack
Security best practices – Cloud and container security knowledge
Leadership/Mentorship – Experience guiding junior engineers
Post-mortem writing & RCA – Comfortable documenting incidents and learnings
Experience in distributed systems or high-availability architectures
Recruitment Process:
AI-based online screening test
Assignment
2 client interviews
CEO Discussion
Offer: Successful candidates will receive an offer to join the team.
Soft Skills:
Excellent verbal and written communication skills in English (required)
Strong problem-solving ability with a customer-first mindset
Accountability – Takes ownership of reliability and incident outcomes
Demonstrated ability to work independently in high-pressure, multitasking environments
Passion for supporting and helping others
About Us:
We at Think Future Technologies (TFT) provide technology services to our customers, enabling them to achieve superior business outcomes. We come in as a trusted partner that fully owns the technology piece. We brainstorm on our customers' business problems, arrive at the right solution framework, deploy the right blend of technical resources, and provide optimal delivery at every step of project implementation.