Engineering Manager, Site Reliability Engineering
Togetherai
San Francisco · 22d ago
About the role
We're hiring an Engineering Manager for our Site Reliability Engineering organization to lead the team that keeps Together AI's production infrastructure running. SRE at Together is roughly 20 engineers organized into three function areas: bare-metal / day-0 / day-2 operations, our inference platform, and our virtual clusters platform. Each function area is led by a technical lead; you'll partner with them to manage and develop the engineers in your timezone.
This is a true player-coach role: roughly 50-60% management and 40-50% hands-on technical work. You'll code, participate in architectural discussions, lead incident response, and stay close enough to the systems to coach effectively. You'll also do the work that makes a team great over time: build trust, develop engineers, hire, and shape the operating rhythms that determine whether the team thrives or burns o
More at Togetherai
- Research Intern, Model Shaping (Fall 2026) · San Francisco
- Systems Research Engineer Intern - GPU Programming (Fall 2026) · San Francisco
- Research Intern, Inference (Fall 2026) · San Francisco
- Frontier Agents Intern (Fall 2026) · San Francisco
- Data Center Operations Coordinator · San Francisco · $150k – $200k
- Technical Account Manager (TAM), GPU Cluster · San Francisco