
Senior ML Engineer - Kimchi (LLM Inference Optimization)
Lead technical direction for LLM inference optimization (Kimchi): improving throughput, lowering latency, and maximizing KV cache efficiency across GPU SKUs and distributed topologies. Requires 5+ years of experience building ML systems/inference or training infrastructure, strong Python, hands-on experience with vLLM/SGLang/TensorRT-LLM, quantization fluency, distributed systems knowledge, and a measurement-first approach. Remote-first role, available in multiple European countries.



