CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics Paper • 2508.18124 • Published Aug 25 • 48
view article Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks By nvidia and 4 others • Aug 11 • 75
AdaLomo: Low-memory Optimization with Adaptive Learning Rate Paper • 2310.10195 • Published Oct 16, 2023 • 3
IAG: Input-aware Backdoor Attack on VLMs for Visual Grounding Paper • 2508.09456 • Published Aug 13 • 8
Mol-R1: Towards Explicit Long-CoT Reasoning in Molecule Discovery Paper • 2508.08401 • Published Aug 11 • 41
VLM-R1 Collection Multimodal Reasoning Dataset for Large Scale Training with DeepSeek-R1 thoughts style • 18 items • Updated Apr 14 • 2
RefGPT Datasets Collection A large-scale dialogue dataset with references. • 6 items • Updated May 17, 2024 • 4
RLVR Collection Model and data for 'Expanding RL with Verifiable Rewards Across Diverse Domains' • 3 items • Updated Mar 31 • 13
view article Article Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub By nvidia and 11 others • Jun 27 • 29
AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs Paper • 2506.05328 • Published Jun 5 • 20
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30 • 139
MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback Paper • 2505.17873 • Published May 23 • 30
LLaVA-Critic: Learning to Evaluate Multimodal Models Paper • 2410.02712 • Published Oct 3, 2024 • 37
AlignRAG: An Adaptable Framework for Resolving Misalignments in Retrieval-Aware Reasoning of RAG Paper • 2504.14858 • Published Apr 21 • 4
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Paper • 2504.15271 • Published Apr 21 • 67
S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models Paper • 2504.10368 • Published Apr 14 • 21