Manager Software Engineering - Site Reliability Engineering (SRE)

Mid / Senior

In Office

Meytier Premier Employer

Meytier Partner

About Role:

As the Manager of Site Reliability Engineering (SRE), you will play a critical role in ensuring the performance, reliability, and scalability of our systems. Leveraging the principles of Site Reliability Engineering pioneered by Google, you will lead a team of talented engineers in implementing best practices for application performance monitoring, toil reduction, and system stability. Your focus will extend to both complex cloud-based and on-premises applications, ensuring high system uptime and availability. Collaboration with other SRE teams, departments, and business units across the organization will be essential to achieving our goals.


Additionally, your role will involve deep-diving with technologists and discussing strategic, long-term goals to drive innovation and growth. Experience with AWS and Azure technologies, as well as proficiency in industry-standard tools, is crucial for success in this role.


Key Responsibilities:

  • Lead and mentor a team of Site Reliability Engineers, fostering a culture of collaboration, innovation, and excellence.
  • Develop and implement strategies for application performance monitoring to proactively identify and resolve performance bottlenecks.
  • Drive initiatives to reduce toil and automate repetitive tasks, allowing the team to focus on high-impact projects that improve system reliability and scalability.
  • Collaborate closely with cross-functional teams, including software engineering, infrastructure, and product management, to design, deploy, and maintain highly available and resilient systems.
  • Establish and enforce best practices for incident management, post-mortem analysis, and continuous improvement, ensuring that lessons learned are applied to prevent future outages.
  • Implement robust monitoring and alerting systems using tools like Datadog, ELK, and OpenTelemetry to track system uptime and availability for complex cloud and on-premises applications, with a focus on meeting or exceeding defined service level objectives (SLOs) and service level agreements (SLAs).
  • Foster collaboration and knowledge sharing with other SRE teams and departments across the organization, leveraging their expertise and resources to drive improvements in system reliability and performance.
  • Engage in deep discussions with technologists to understand the intricacies of our systems and discuss strategic, long-term goals to drive innovation and growth.
  • Utilize expertise in AWS and Azure technologies to architect, deploy, and optimize cloud-based solutions, ensuring scalability, reliability, and cost-effectiveness.

Desired Profile:

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • Proven experience leading a team of Site Reliability Engineers in a fast-paced and dynamic environment.
  • Deep understanding of application performance monitoring principles and tools, with hands-on experience in designing and implementing monitoring solutions.
  • Strong background in system architecture, infrastructure automation, and cloud technologies, with expertise in AWS and Azure.
  • Expertise in incident management, with the ability to effectively lead and coordinate response efforts during critical incidents.
  • Experience managing system uptime and availability for complex cloud-based and on-premises applications, with a track record of meeting or exceeding defined SLOs and SLAs.
  • Excellent communication and interpersonal skills, with the ability to collaborate effectively with cross-functional teams and influence decision-making at all levels of the organization.
  • Strong problem-solving skills and a passion for driving continuous improvement and innovation.

Compensation Range: 190-200K base



© 2024 Meytier - All Rights Reserved.