Engineering Manager, Site Reliability Engineering - Observability


March 13, 2021

Seattle, WA, US

Company Description
Twitter is what’s happening and what people are talking about right now. For us, life's not about a job, it's about purpose. We believe real change starts with conversation. Here, your voice matters. Come as you are and together we'll do what's right (not what's easy) to serve the public conversation.
Job Description
Twitter Site Reliability Engineers (SREs) are Software Engineers who focus on Availability, Reliability, Disaster Recovery, and other challenges of Scale. They possess a breadth and depth of knowledge about Twitter’s production environment that allows them to craft tools, processes and frameworks to guide colleagues through safely releasing production code, provide guidance and support for monitoring distributed systems, reduce operational overhead, and enable teams to achieve their desired reliability outcomes.
The Observability team ingests and serves petabytes of data from all the services and systems across Twitter’s entire infrastructure. This data is highly critical for Twitter’s production services and includes system- and service-level metrics, logging, and tracing. You’ll be focused on creating an environment where Observability SREs, who are embedded with the Observability Software Engineering teams, can improve Reliability and meet the challenges of operating at our continuously increasing scale.
We believe passion and personality matter; as such, we need leaders who can manage teams of diverse, smart, and driven engineers while balancing day-to-day people management with moving the business forward both technically and culturally.
Your responsibilities include, but are not limited to:
  • Actively help bring great SREs to Twitter. Source and hire talented SREs, grow a team with diverse perspectives, and help your peers do the same.
  • Mentor, grow, and empower your team by giving them the skills, confidence, space, and motivation to make decisions independently that lead to their personal and professional success, and enable them to become technical leaders. In other words, align the growth of the people around you with business impact.
  • Take an active role in contributing to the roadmap for the Observability organization. Bring perspectives from across the SRE org into the Observability org. You will be the key connector in this important and highly visible partnership.
  • Participate in deep technical design discussions within your team and across partner teams, and ensure that we’re building the right systems and keeping the quality high. Understand the Observability stack well enough to contribute meaningfully to architectural decisions.

Qualifications
  • You have 5+ years of software support, reliability, or operations engineering experience in a highly customer-focused environment.
  • You have 2+ years of experience successfully managing a team of 5-8 engineers on large-scale projects that included technical deep-dives and production troubleshooting in the areas of distributed systems, programming, configuration management, networking, storage, and operating systems.
  • You have production experience with multiple cloud vendors.
  • You endorse infrastructure as code.
  • You have a proven track record of managing diverse and distributed teams, ensuring all members can bring their best.
  • You possess strong leadership skills and the ability to motivate teams.
  • You will bring a collaborative partnership mindset, focused on business impact.

Additional Information
All of your information will be kept confidential according to EEO guidelines.