WeThink-Qwen2.5VL-7B / README.md

yangjie-cv

Update README.md

b1e50f1 verified 5 months ago

preview code

raw

history blame contribute delete

1.51 kB

metadata

license: apache-2.0
tags:
  - Reinforcement Learning
  - Visual-langauge Reasoning

Model Card for WeThink-Qwen2.5VL-7B

Repository: https://github.com/yangjie-cv/WeThink

Paper: https://arxiv.org/abs/2506.07905

🏆 Performance Highlights

WeThink-Qwen2.5VL-7B achieves:

🥇 1st place on OpenCompass Multimodal Reasoning Leaderboard
🏅 5th place on OpenCompass Multi-modal Academic Leaderboard
(As of May 30th, 2025)

🚀 Quick Start

Inference

git clone https://github.com/yangjie-cv/WeThink
cd WeThink
python inference.py

💡 Note: System prompt is required during inference.

📊 Evaluation

We have integrated WeThink-Qwen2.5VL-7B into the VLMEvalKit. Please follow its Quickstart guide to evaluate WeThink-Qwen2.5VL-7B on various benchmarks.

Citation

@misc{yang2025wethink,
      title={WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning}, 
      author={Jie Yang and Feipeng Ma and Zitian Wang and Dacheng Yin and Kang Rong and Fengyun Rao and Ruimao Zhang},
      year={2025},
      eprint={2506.07905},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.07905}, 
}