
Systems Software Engineer, Kubernetes Scale - DGX Cloud
Work on the end-to-end performance, scalability, and reliability of the NVIDIA DGX Cloud software stack (Kubernetes control and data planes, GPU Operator, Network Operator, DCGM, NIM, and distributed inference). Develop automated tests, monitoring, and CI/CD frameworks; triage and root-cause performance issues; and engage with upstream open-source communities. Requires 2+ years of relevant experience, familiarity with public cloud providers, and proficiency in Golang/Python.








