Theano: A Python framework for fast computation of mathematical expressions Paper • 1605.02688 • Published May 9, 2016
Critical Data Size of Language Models from a Grokking Perspective Paper • 2401.10463 • Published Jan 19, 2024
AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence Paper • 2502.13943 • Published Feb 19, 2025
FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension Paper • 2505.00570 • Published May 1, 2025
Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models Paper • 2508.09874 • Published Aug 13, 2025
MLP Memory: Language Modeling with Retriever-pretrained External Memory Paper • 2508.01832 • Published Aug 3, 2025
Fourier-VLM: Compressing Vision Tokens in the Frequency Domain for Large Vision-Language Models Paper • 2508.06038 • Published Aug 8, 2025