Senior Manager, Engineering Technology Operations Center (TOC)

Posted 2025-03-14
Remote, USA | Full-time | Immediate Start

About the position

GEICO is seeking a dynamic, highly motivated Senior Manager to join our Reliability Engineering organization and oversee our Technology Operations Center (TOC), a central point of communication and incident management across our TECH organization. In this role, you will be part of a team that facilitates incident calls and measures and improves production performance, availability, and reliability through sustainable engineering practices for our mission-critical systems. You will work closely with our Product, Platform, Security, and other Infrastructure teams to continuously automate and improve our products' availability to our customers.
As a Senior Manager, you will manage a team of TOC engineers with diverse technology expertise who are passionate about triaging issues and collaborating with Product groups across the organization to resolve them effectively and efficiently. The ideal candidate is an experienced leader capable of interacting effectively with all levels of GEICO Technology Solutions (GEICO Tech) and the business organizations. You will need to keep a pulse on GEICO's business strategy to ensure that the TOC is aligned with our business and with GEICO Tech. You must also develop and maintain a strategic roadmap for the TOC organization and ensure that team goals are aligned with it.
Your primary responsibilities will include leading a team of managers and engineers responsible for monitoring the availability and performance of our business applications, handling incidents efficiently, and coordinating the response to Major Incidents (MI). You will also lead the Incident Response Automation (IRA) initiatives across the enterprise. This role requires 24x7x365 on-call production support. You will work with TOC engineers and other stakeholders to run regular Incident Table Top exercises that identify deficiencies in our Incident Response plan, including its technical, planning, and procedural aspects.
You will oversee, lead, guide, and coach your team, assisting them when they encounter challenges and conflicting priorities. Developing an effective team through mentorship and motivation is crucial, as is providing the leadership needed to meet productivity goals and the response times set by service level agreements for daily operations. You will also be responsible for developing TOC policies and procedures, ensuring their correct execution by TOC personnel, managing coverage and staffing for all direct reports, and performing associate performance evaluations. Fostering an atmosphere of professionalism and collaboration across teams and business units is essential, as is promoting continuous learning to remain current on technology trends and changes in incident management.

Responsibilities
• Oversee the Technology Operations Center (TOC), the central point for incident communication and management.
• Facilitate incident calls and improve production performance, availability, and reliability.
• Manage a team of TOC engineers with diverse technology expertise.
• Develop and maintain a strategic roadmap for the TOC organization.
• Lead and supervise a team of managers and engineers responsible for monitoring business applications.
• Handle incidents efficiently and coordinate responses to Major Incidents (MI).
• Lead Incident Response Automation (IRA) initiatives across the enterprise.
• Conduct regular Incident Table Top exercises to identify deficiencies in the Incident Response plan.
• Provide mentorship and motivation to team members.
• Develop TOC policies and procedures and ensure their correct execution.
• Manage coverage and staffing for all direct reports.
• Perform associate performance evaluations and assist with team performance.

Requirements
• Bachelor's degree in Computer Science, Information Technology, or a related field.
• 10+ years of hands-on work experience supervising personnel in a technical environment.
• Excellent troubleshooting skills and a thorough understanding of Operations, Incident Management, systems engineering, and infrastructure support.
• Experience leading a team in a fast-paced environment.
• Excellent verbal and written communication skills, including technical writing skills.
• Overall understanding of cloud computing and internet technologies.
• Ability to facilitate resolution of multiple incidents simultaneously.
• Strong interpersonal and presentation skills.

Nice-to-haves
• Cloud certifications (preferably Azure or AWS).
• Experience in Agile methodologies such as Kanban and Scrum.
• Familiarity with observability tools such as Splunk, NetQoS, Dynatrace, Aternity, and Moogsoft.
• Knowledge of Windows and Linux operating systems.
• Understanding of networking technologies.
• Experience with full-stack engineering and development, from Java front-end services to back-end storage systems in both SQL and NoSQL contexts.

Benefits
• Premier Medical, Dental and Vision Insurance with no waiting period.
• Paid Vacation, Sick and Parental Leave.
• 401(k) Plan.
• Tuition Assistance including Direct Billing and Reimbursement payment plan options.
• Paid Training, Licensures, and Certificates.
