
Senior Software Engineer, CUDA Deep Learning Systems
This role researches, prototypes, and optimizes deep learning systems at the intersection of high-level frameworks and low-level CUDA, including custom kernel development, cluster-scale distributed system design, and profiling. It requires a BS, MS, or PhD (or equivalent), 8+ years of relevant experience, and proven skills in C++, Python, CUDA, kernel optimization, distributed computing, and deep learning (transformers). Base salary ranges are provided in USD per year.