Zhaokai Wang

wzk1015

https://www.wzk.plus

AI & ML interests

Computer Vision, Music Generation, Multimodal Large Language Models

Recent Activity

liked a model 27 days ago

Zhenxin-Lei/MetaCaptioner

Updated 28 days ago • 7 • 1

upvoted 4 papers about 1 month ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17 • 87

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

Paper • 2510.08565 • Published Oct 9 • 19

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Paper • 2510.08540 • Published Oct 9 • 108

Factuality Matters: When Image Generation and Editing Meet Structured Visuals

Paper • 2510.05091 • Published Oct 6 • 18

updated a dataset about 1 month ago

OpenGVLab/GenExam

Updated Oct 6 • 212 • 3

authored a paper 2 months ago

GenExam: A Multidisciplinary Text-to-Image Exam

Paper • 2509.14232 • Published Sep 17 • 21

upvoted a paper 2 months ago

SAIL-VL2 Technical Report

Paper • 2509.14033 • Published Sep 17 • 44

liked a model 2 months ago

facebook/nllb-200-distilled-600M

Translation • Updated Feb 14, 2024 • 317k • 789

upvoted a paper 2 months ago

GenExam: A Multidisciplinary Text-to-Image Exam

Paper • 2509.14232 • Published Sep 17 • 21

liked a dataset 2 months ago

OpenGVLab/GenExam

Updated Oct 6 • 212 • 3

published a dataset 2 months ago

OpenGVLab/GenExam

Updated Oct 6 • 212 • 3

liked a dataset 2 months ago

PhoenixZ/RISEBench

Updated May 30 • 91 • 3

upvoted a paper 2 months ago

Does DINOv3 Set a New Medical Vision Standard?

Paper • 2509.06467 • Published Sep 8 • 36

liked 2 models 3 months ago

OpenGVLab/InternVL3_5-241B-A28B-HF

Image-Text-to-Text • 241B • Updated Sep 8 • 167 • 11

OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview

Image-Text-to-Text • 0.4B • Updated Aug 29 • 42.4k • 77

authored a paper 3 months ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 205

upvoted a paper 3 months ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 205

liked a model 3 months ago

OpenGVLab/InternVL3_5-241B-A28B

Image-Text-to-Text • 241B • Updated Aug 29 • 589 • 131

liked a Space 3 months ago

RISEBench Gallery

👀

A Gallery of Generation Results on RISEBench
