Software Engineer, Distributed Data Systems

Exa

San Francisco, California · Full-time · 139d ago

About the role

Exa is building a search engine from scratch to serve every AI agent. We build massive-scale infrastructure to crawl the web, train state-of-the-art embedding models to process it, and design high-performance vector databases in Rust to search over it. If you like compute, we also own a $5M H200 GPU cluster (soon to be 5x that) and regularly spin up batch jobs across tens of thousands of machines.

As a Data Engineer, you'll architect and build the data infrastructure that powers everything we do, from crawling billions of pages to training our embedding models to serving real-time search. You'll have enormous autonomy in designing systems that scale to hundreds of petabytes. If you've ever wanted to build data pipelines at a scale that most companies only dream about, this is your chance.

Who You Are

- Deep understanding of lakehouse architectures (Delta Lake, Iceberg, Hudi) and when to use them
- Experience building and operating large-scale distributed data processing
