We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale—trillions of data points per day—providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.
Database Reliability Engineers (DRE) build production-ready solutions that our product engineers can use to build developer tools and automation that quickly scale up for use by thousands of customers. This team ensures that our databases are scalable and cost-efficient, giving our engineers the platform they need to operate in a high-volume, low-latency environment that is continuing to double in size. The DRE team works collaboratively with engineers across the company, using their deep systems understanding to respond to infrastructure failures and reduce operational toil for all.
As a Team Lead, you’ll be responsible for the people management of a small group of dedicated, pragmatic engineers working to build a world-class caching layer for petabyte-scale data. Team Leads are our first layer of management at Datadog: they are both technical leaders and people managers.
We are a globally distributed team with US offices in New York (HQ), Boston, and Denver and international offices in Paris, Dublin, London, Madrid, the Netherlands, and Singapore. About 33% of our engineering team is remote.
Datadog values people from all walks of life. We understand that not everyone will meet these requirements on day one. If you’re passionate about reliability engineering and want to grow these skills but don’t meet all of these qualifications, we encourage you to apply.
- Manage a team of 3-6 engineers.
- Guide projects to create automated datastore platforms by collaborating with other managers and leaders.
- Keep our datastores reliable, available, and fast by defining and setting priorities for your team and unblocking your direct reports when needed.
- Respond to, investigate, and fix issues, whether they lie deep in the database code or in the client application.
- Protect and ensure the consistency of customer data.
- 5+ years of experience in software engineering, 1+ year of management or technical leadership
- You have operated a large infrastructure as an engineer
- You value correctness and efficiency; you leave no stone unturned when diagnosing production issues
- You handle infrastructure with code because automation lets you focus on the more difficult and rewarding problems
- You have production experience with distributed datastores, e.g. Cassandra, Postgres, Kafka, Elasticsearch, Redis
- You’ve worked at a company with large-scale systems, handling large amounts of data
- You have created tooling for, or submitted contributions to, an open-source datastore
- You are fluent in Python or Golang
Equal Opportunity at Datadog:
Datadog is an Affirmative Action and Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.
For more information on how we maintain the privacy of the information you submit as part of your application, please refer to our Applicant and Candidate Privacy Notice.