Reliability Engineer, Supercomputing
Thinkingmachines
San Francisco · $350k – $475k · 7d ago
About the role
Thinking Machines Lab's mission is to empower humanity through advancing collaborative general intelligence. We're building a future where everyone has access to the knowledge and tools to make AI work for their unique needs and goals.
We are scientists, engineers, and builders who’ve created some of the most widely used AI products, including ChatGPT and Character.ai, open-weights models like Mistral, and popular open-source projects like PyTorch, OpenAI Gym, Fairseq, and Segment Anything.
We're hiring an engineer to ensure the reliability of our GPU supercomputing fleet, owning the seam between hardware, firmware, and operating system. You will track the long tail of hardware issues: we are conducting frontier research in AI, and a single bad NIC, HBM, or kernel driver edge case can compr
More at Thinkingmachines
- Strategic Finance Director · San Francisco, CA · $275k – $325k
- Network Engineer, Supercomputing · San Francisco · $350k – $475k
- Reception & Workplace Experience Coordinator · San Francisco, CA · $75k – $95k
- Assistant Controller · San Francisco, CA · $325k – $400k
- Associate General Counsel, Corporate & Commercial · San Francisco · $350k – $425k
- Associate General Counsel, Advanced AI & Privacy · San Francisco · $350k – $425k