Xi Yang

xiyang99

AI & ML interests

None yet

Recent Activity

upvoted a paper 5 days ago

Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench

authored a paper about 1 month ago

CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning

authored a paper about 1 month ago

HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation

Organizations

Articles 1

Letting Large Models Debate: The First Multilingual LLM Debate Competition

Papers 13

arXiv:2509.17177

arXiv:2508.11252

arXiv:2508.10015

arXiv:2508.02178

Models 0

None public yet

Datasets 0

None public yet