AI & ML interests
None defined yet.
Recent Activity
csabakecskemeti posted an update 22 days ago
mbrack authored a paper 25 days ago
osanseviero authored a paper about 1 month ago
xiyang99 authored 13 papers about 1 month ago
CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning
Paper • 2401.14011 • Published
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation
Paper • 2406.07070 • Published
MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding
Paper • 2406.04264 • Published • 2
Emu3: Next-Token Prediction is All You Need
Paper • 2409.18869 • Published • 95
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition
Paper • 2502.18913 • Published
SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors
Paper • 2503.16578 • Published
Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs
Paper • 2505.11842 • Published • 1
EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations
Paper • 2505.23018 • Published
RoboBrain 2.0 Technical Report
Paper • 2507.02029 • Published • 33
Beyond Solving Math Quiz: Evaluating the Ability of Large Reasoning Models to Ask for Information
Paper • 2508.11252 • Published • 3
RealTalk-CN: A Realistic Chinese Speech-Text Dialogue Benchmark With Cross-Modal Interaction Analysis
Paper • 2508.10015 • Published
Reconsidering Overthinking: Penalizing Internal and External Redundancy in CoT Reasoning
Paper • 2508.02178 • Published
FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions
Paper • 2509.17177 • Published • 13
Post
6443
We're kick-starting the process of Transformers v5, with @ArthurZ and @cyrilvallez!
v5 should be significant: we're using it as a milestone for performance optimizations, saner defaults, and a much cleaner code base worthy of 2025.
Fun fact: v4.0.0-rc-1 came out on Nov 19, 2020, nearly five years ago!
Post
4311
Thread to gossip during the openai GPT-5 livestream: https://www.youtube.com/watch?v=0Uu_VJeVVfo. Feel free to post your impressions below!
JingzeShi authored a paper 3 months ago
wubingheng authored a paper 3 months ago