1) Role Overview
Mercor is engaging advanced CUDA experts who specialize in GPU kernel optimization, performance... throughput while maintaining correctness and reproducibility.
2) Key Responsibilities
Develop, tune, and benchmark CUDA...
curve. Our work focuses on three pillars: high-performance, asynchronous, zero-copy tensor and optimizer-state-aware data... become experiments and products).
About the Role
As a Training Performance Engineer, you'll drive efficiency improvements...