
Senior Machine Learning Engineer - Model Evaluations, Public Sector

Scale AI

San Francisco, CA; St. Louis, MO; New York, NY; Washington, DC
18d ago

About the role

<p><strong>Senior Machine Learning Engineer - Model Evaluations, Public Sector</strong></p>
<p>The Public Sector ML team at Scale deploys advanced AI systems—including LLMs, agentic models, and multimodal pipelines—into mission-critical government environments. We build evaluation frameworks that ensure these models operate reliably, safely, and effectively under real-world constraints. As an ML Engineer, you will design, implement, and scale automated evaluation pipelines that help customers trust and operationalize advanced AI systems across defense, intelligence, and federal missions.</p>
<p><strong>You will:</strong></p>
<ul>
<li>Develop and maintain automated evaluation pipelines for ML models across functional, performance, robustness, and safety metrics, including LLM-judge–based evaluations.</li>
<li>Design test datasets and benchmarks to measure generalization, bias, explainability, and failure
