grimjim/mistralai-Mistral-Nemo-Instruct-2407-12B-MPOA-v1 Text Generation • 12B • Updated about 16 hours ago • 1
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Paper • 2509.24006 • Published Sep 28 • 115