Tim Dingman

tdingman

https://timdingman.com/

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 3 days ago

ResearchRubrics: A Benchmark of Prompts and Rubrics For Evaluating Deep Research Agents

Paper • 2511.07685 • Published 10 days ago • 7

upvoted a paper 4 days ago

The Path Not Taken: RLVR Provably Learns Off the Principals

Paper • 2511.08567 • Published 9 days ago • 27

upvoted a paper 14 days ago

Robot Learning: A Tutorial

Paper • 2510.12403 • Published Oct 14 • 108

upvoted 4 papers 22 days ago

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published Oct 16 • 102

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 137

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 263

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published 23 days ago • 92

upvoted 5 papers about 2 months ago

Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 115

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18 • 113

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Paper • 2509.15221 • Published Sep 18 • 110

LIMI: Less is More for Agency

Paper • 2509.17567 • Published Sep 22 • 100

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2 • 123

upvoted 8 papers 3 months ago

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Paper • 2507.23726 • Published Jul 31 • 113

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6 • 127

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7 • 178

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2 • 236

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4 • 261

Reinforcement Learning with Rubric Anchors

Paper • 2508.12790 • Published Aug 18 • 13

Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation

Paper • 2508.06426 • Published Aug 8 • 10

MolmoAct: Action Reasoning Models that can Reason in Space

Paper • 2508.07917 • Published Aug 11 • 44