Senior Site Resiliency Engineer, Assurant-GCC, India
The IT Resiliency Engineer is responsible for designing, implementing, and maintaining robust IT systems and processes that ensure the continuity and resilience of the organization’s critical technology infrastructure. This role involves monitoring our applications, planning for potential disruptions, and ensuring that systems can withstand and quickly recover from unexpected outages or failures. The IT Resiliency Engineer will collaborate with various teams to assess risks, identify vulnerabilities, and implement strategies to minimize downtime and data loss.
This position will be based in Hyderabad, at our India location.
What will be my duties and responsibilities in this job?
All primary job accountabilities/responsibilities for the Senior Resiliency Engineer span the Assurant enterprise.
100% - Operational
- Build and maintain application monitors to ensure our applications perform at a high level of reliability
- Monitor applications to ensure we quickly identify and remediate issues impacting our customers
- Maintain knowledge of overall distributed system environments, utilities and procedures
- Participate in on-call rotations
- Provide timely, concise communication of incident status to appropriate personnel
- Document incident occurrence, root cause analysis, and resolution(s) applied in the designated repositories
- Evaluate conditions and suggest possible strategies to minimize risk(s) of incident recurrence
- Consult with and direct other staff personnel as required for effective incident resolution
- Resolve development and support issues of high complexity or risk
What are the requirements needed for this position?
Education
- Bachelor’s Degree in Computer Science, Engineering, Information Technology or equivalent experience
- Technology Certification (Optional) – DR (e.g., CDRE), Business Continuity (e.g., MBCP), Network (e.g., CCNA), Microsoft (e.g., MCP) or equivalent experience
Previous Experience
- 8-10 years of experience in the field of Information Technology, Systems & Application Development/Support, Infrastructure Support, Disaster Recovery
- Minimum 3 years of experience in leading teams or project management
- Minimum 3 years of experience in advanced technology analysis and diagramming
Travel/Shift work
- On-call duty (24/7/365) in the event of business continuity incidents and disaster response
- Potential for long hours in the event of actual business continuity incidents and disaster response
- No travel required
What are the other preferred experience, skills, and knowledge?
Previous Experience
- Technology Management Experience
- Use of other business continuity software tools
- Reporting aptitude and capabilities (e.g., Excel, Power BI, Tableau)
Knowledge and Skills
- Good communication skills (English)
- 5 years’ experience in web application development (to troubleshoot issues by reading logs and resolve them)
- 5 years’ experience analyzing technical problems and delivering solutions of high risk
- 4 years’ experience in Azure, .NET, Angular, SQL, REST
- 4 years’ experience in APM Tools (Datadog, App Insights, Dynatrace)
- 4 years’ experience in Performance analysis
- Advanced knowledge of Datadog and Dynatrace dashboard creation (APM reporting, infrastructure health reporting)
- Familiarity with business operations, critical business processes, and interdependencies on systems and applications
- Familiarity with legal, regulatory, and industry security requirements and frameworks, including but not limited to the following:
  - International Organization for Standardization (ISO/IEC 27001)
  - Payment Card Industry Data Security Standard (PCI DSS)
  - Sarbanes-Oxley Act (SOX)
  - Information Technology Infrastructure Library (ITIL)