Site Reliability Engineer (Datadog)

At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology.

Your Role

  • Identify and diagnose issues and problems.
  • Categorize and record reported queries and provide solutions.
  • Monitor issues from start to resolution.
  • Escalate unresolved problems to a higher level of support when needed.
  • Receive and handle service requests, following agreed procedures.
  • Log incidents and service requests and maintain relevant records: identify and classify incident types and service interruptions, and record incidents, cataloging them by symptom and resolution.
  • Troubleshoot and resolve Azure-related issues.
  • Create and maintain documentation for Azure configurations, processes, and procedures.
  • Configure and customize Datadog dashboards to provide real-time visibility into system and application performance, and maintain Datadog alerts and integrations to meet the organization's monitoring requirements.
  • Utilize Datadog metrics and analytics to perform capacity planning and resource optimization.
  • Collaborate with clients to understand their monitoring needs and provide customized Datadog solutions.
  • Manage server configurations of Linux systems using Puppet. 

Your Profile
 

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • Excellent written and spoken communication skills in English.
  • Experience with tools such as JIRA, Confluence, ServiceNow, and monitoring tools.
  • Excellent problem-solving and analytical skills.
  • Strong ability to prioritize and delegate.
  • Experience working with Microsoft Azure Services.
  • Basic scripting and automation skills (PowerShell, Azure CLI).
  • Experience troubleshooting Azure services.
  • Understanding of networking concepts and protocols.
  • Hands-on knowledge of Datadog and good knowledge of at least one other monitoring tool.
  • Proficient in Datadog configurations, dashboards, and alerting.
  • Experience provisioning and managing Linux servers.
  • Experience troubleshooting and resolving incidents using Datadog insights.
  • Strong understanding of infrastructure components, networking, and cloud environments.
  • Good hands-on knowledge of configuration management and deployment tools like Puppet.
  • Experience running puppet-backup restore commands to restore Puppet Enterprise (PE) infrastructure.
  • Experience managing and working with Puppet servers.

 

WHAT YOU’LL LOVE ABOUT WORKING HERE
 

  • Capgemini Employer Promise: Learning + Flexibility + Team Spirit + Inclusion + Innovation.
  • Work from home: fully remote position.
  • Get competitive benefits beyond those required by law.
  • Build your future within a worldwide leader in ER&D projects.
  • Feel free to grow within different industries and choose your career path.
  • Be part of a great family of engineers and people from all over Mexico and the world.
     

At Capgemini Mexico, we aim to attract the best talent and are committed to creating a diverse and inclusive work environment, so there is no discrimination based on race, sex, sexual orientation, gender identity or expression, or any other characteristic of a person. All applications are welcome and will be considered on merit against the job requirements and/or experience for the position.

Capgemini is a global leader in partnering with companies to transform and manage their business by harnessing the power of technology. The Group is guided every day by its purpose of unleashing human energy through technology for an inclusive and sustainable future. It is a responsible and diverse organization of over 300,000 team members in nearly 50 countries. With its strong 50-year heritage and deep industry expertise, Capgemini is trusted by its clients to address the entire breadth of their business needs, from strategy and design to operations, fueled by the fast-evolving and innovative world of cloud, data, AI, connectivity, software, digital engineering and platforms.

Ref:  1794816
Date:  May 3, 2024
Experience level:  Professional experience
Job type:  Permanent contract - full time
Location:  Guadalajara, JAL, MX

Department:  Digital