
Site Reliability Engineer - AI Agents
Role on the AI Infrastructure team to build and operate production infrastructure for AI agents, including compute, orchestration, model serving, observability, IaC (Terraform), CI/CD, and developer-facing APIs/SDKs. Requires 5+ years in SRE/platform/infrastructure roles, hands-on experience with ML infrastructure or model serving, Kubernetes/Docker, cloud (preferably AWS), scripting (bash) and a programming language (Python preferred).














