Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6 • 485
The Majority is not always right: RL training for solution aggregation Paper • 2509.06870 • Published Sep 8 • 16
A Primer on the Inner Workings of Transformer-based Language Models Paper • 2405.00208 • Published Apr 30, 2024 • 12
Fantastic Pretraining Optimizers and Where to Find Them Paper • 2509.02046 • Published Sep 2 • 13
Deep Ignorance Collection This collection contains the model and data artifacts from O'Brien et al. (2025). https://deepignorance.ai • 43 items • Updated 14 days ago • 6
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 190
AirTrafficGen: Configurable Air Traffic Scenario Generation with Large Language Models Paper • 2508.02269 • Published Aug 4 • 1
Air Traffic Controller Task Demand via Graph Neural Networks: An Interpretable Approach to Airspace Complexity Paper • 2507.13423 • Published Jul 17 • 1
Article You could have designed state of the art positional encoding Nov 25, 2024 • 398
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published Nov 5, 2024 • 68