
Senior AI and ML HPC Cluster Engineer
Technical leader for GPU-accelerated AI/HPC infrastructure: design and operate on-prem and cloud clusters, build automation, support researchers, perform performance analysis and root-cause investigations. Requires 5+ years designing and operating large-scale compute infrastructure and experience with schedulers (Slurm/K8s/PBS/LSF), Linux administration, configuration management (Ansible/Puppet/Salt), container technologies, Python and MPI.










