Kai Zhang

drogozhang

https://drogozhang.github.io

AI & ML interests

NLP

Recent Activity

authored a paper 17 days ago

Scaling Agent Learning via Experience Synthesis

upvoted a paper 17 days ago

Scaling Agent Learning via Experience Synthesis

Paper • 2511.03773 • Published 19 days ago • 77

upvoted a paper 20 days ago

ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation

Paper • 2511.01163 • Published 21 days ago • 31

upvoted 2 papers 26 days ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published 27 days ago • 14

VisCoder2: Building Multi-Language Visualization Coding Agents

Paper • 2510.23642 • Published about 1 month ago • 21

upvoted a paper about 1 month ago

R-WoM: Retrieval-augmented World Model For Computer-use Agents

Paper • 2510.11892 • Published Oct 13 • 21

upvoted 5 papers about 2 months ago

Large Reasoning Models Learn Better Alignment from Flawed Thinking

Paper • 2510.00938 • Published Oct 1 • 58

UniVideo: Unified Understanding, Generation, and Editing for Videos

Paper • 2510.08377 • Published Oct 9 • 70

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 264

Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks

Paper • 2510.02286 • Published Oct 2 • 28

The Era of Real-World Human Interaction: RL from User Conversations

Paper • 2509.25137 • Published Sep 29 • 18

upvoted 3 papers 3 months ago

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1 • 73

Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation

Paper • 2509.02040 • Published Sep 2 • 14

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31 • 83

upvoted 2 papers 5 months ago

AR-RAG: Autoregressive Retrieval Augmentation for Image Generation

Paper • 2506.06962 • Published Jun 8 • 28

AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy

Paper • 2506.13284 • Published Jun 16 • 26

upvoted 4 papers 6 months ago

VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation

Paper • 2506.03930 • Published Jun 4 • 26

PhyX: Does Your Model Have the "Wits" for Physical Reasoning?

Paper • 2505.15929 • Published May 21 • 49

ARM: Adaptive Reasoning Model

Paper • 2505.20258 • Published May 26 • 45

AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning

Paper • 2505.16400 • Published May 22 • 35

upvoted a collection 8 months ago

WebDreamer

Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents • 6 items • Updated Apr 14 • 6