AI Engineer
Windmill
Remote · Full-time · Posted 58d ago
About the role
Own Windmill's agentic coding and tool/system-building pipeline end-to-end - from the AI backend (planning, tool use, retrieval, self-correction) to the UX and developer experience that wraps it. The bar: an agent that reliably goes from a natural-language spec to a working, deployed workflow or app - and that developers actually enjoy using.
Benchmarking: build and maintain the eval harness, task corpus, scoring, and regression tracking. Every prompt / model / tool change is measured.
Agent loop: design and improve planning, tool use, self-correction, retrieval, execution feedback, multi-file editing, and test-driven iteration.
Integration & DX: own the full surface - UI flows, editor integration, feedback loops, error states - so the experience is polished end-to-end, not just the model calls.
Prompts & models: systematically optimize prompts; experiment with frontier models (Claude, GPT, Gemini, open-weights); apply fine-tuning / RL where it pays off.
Ship