
Senior Deep Learning Tools Engineer – CUDA Tile
Role focuses on building performance testing frameworks, automated CI/CD pipelines, benchmarking systems, and dashboards to track and analyze GPU and compiler performance across models and hardware. Requires 5+ years software engineering experience, strong Python (C++ a plus), familiarity with deep learning frameworks (PyTorch/TensorFlow/JAX/TensorRT), GPU/hardware-aware performance analysis, profiling, and debugging across software and hardware layers. Base salary range listed: 152,000 USD–241,500 USD.







