Where does output diversity collapse in post-training? Paper • 2604.16027 • Published 15 days ago • 22
SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space Paper • 2511.20102 • Published Nov 25, 2025 • 28