
Generative AI Inference Engineer
Stability AI is hiring a Generative AI Inference Engineer to lead the design and production of customer-facing multi-modal ML inference systems. Responsibilities include inference optimization, model tuning and deployment, working with cloud providers (AWS/GCP/Azure) and HPC resources, and bringing new Stability models and pipelines into production. Requirements: 7+ years of production ML experience; expert-level Python; PyTorch; Triton/TensorRT; GPU profiling (NVIDIA Nsight); Docker; Kubernetes; and experience with OpenCV and open-source ML tools (HuggingFace, W&B).











