Qiushi

QiushiSun

https://qiushisun.github.io/

AI & ML interests

Code Intelligence; Large Language Models; AI Agents

Recent Activity

upvoted a paper about 16 hours ago

ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models

upvoted a paper 4 days ago

InteractScience: Programmatic and Visually-Grounded Evaluation of Interactive Scientific Demonstration Code Generation

upvoted a paper 6 days ago

OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

Organizations

authored 3 papers 10 days ago

ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models

Paper • 2510.06014 • Published Oct 7 • 9

OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

Paper • 2510.24411 • Published 12 days ago • 70

JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence

Paper • 2510.23538 • Published 12 days ago • 95

authored a paper about 2 months ago

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Paper • 2509.15221 • Published Sep 18 • 109

authored 2 papers 2 months ago

OS-MAP: How Far Can Computer-Using Agents Go in Breadth and Depth?

Paper • 2507.19132 • Published Jul 25

CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

Paper • 2508.20096 • Published Aug 27 • 36

authored 2 papers 3 months ago

Dynamic and Generalizable Process Reward Modeling

Paper • 2507.17849 • Published Jul 23

CodeEvo: Interaction-Driven Synthesis of Code-centric Data through Hybrid and Iterative Feedback

Paper • 2507.22080 • Published Jul 25 • 9

authored a paper 5 months ago

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published May 26 • 104

authored 3 papers 7 months ago

Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis

Paper • 2407.12857 • Published Jul 9, 2024 • 1

Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning

Paper • 2504.08672 • Published Apr 11 • 55

Breaking the Data Barrier -- Building GUI Agents Through Task Generalization

Paper • 2504.10127 • Published Apr 14 • 17

authored a paper 8 months ago

CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era

Paper • 2503.12329 • Published Mar 16 • 27

authored a paper 10 months ago

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 87

authored 2 papers about 1 year ago

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Paper • 2410.23218 • Published Oct 30, 2024 • 49

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant

Paper • 2410.18603 • Published Oct 24, 2024 • 32

authored 4 papers over 1 year ago

TransCoder: Towards Unified Transferable Code Representation Learning Inspired by Human Skills

Paper • 2306.07285 • Published May 23, 2023 • 2

Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models

Paper • 2405.12939 • Published May 21, 2024 • 1

Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models

Paper • 2406.11736 • Published Jun 17, 2024 • 6

KS-Lottery: Finding Certified Lottery Tickets for Multilingual Language Models

Paper • 2402.02801 • Published Feb 5, 2024 • 1
