Update Hulu-Med-4B
README.md
CHANGED
|
@@ -27,12 +27,15 @@ library_name: transformers
|
|
| 27 |
[](https://modelscope.cn/models/Med-Team/Hulu-Med)
|
| 28 |
[](LICENSE)
|
| 29 |
[](https://github.com/ZJUI-AI4H/Hulu-Med)
|
|
|
|
| 30 |
|
| 31 |
-
[Paper](http://arxiv.org/abs/2510.08668) | [Hulu-Med-7B](https://huggingface.co/ZJU-AI4H/Hulu-Med-7B) | [Hulu-Med-14B](https://huggingface.co/ZJU-AI4H/Hulu-Med-14B) | [Hulu-Med-32B](https://huggingface.co/ZJU-AI4H/Hulu-Med-32B) | [ModelScope Models](https://modelscope.cn/models/Med-Team/Hulu-Med) | [Demo](#demo)
|
| 32 |
|
| 33 |
</div>
|
| 34 |
|
| 35 |
## News
|
|
|
|
|
|
|
| 36 |
- **[2025-11-01]** Releasing our new evaluation code, **MedUniEval**! Built on MedEvalKit, MedUniEval is designed for the comprehensive evaluation of medical visual-language models across various modalities, including text, 2D, 3D, and video. More benchmarks are coming soon.
|
| 37 |
|
| 38 |
- **[2025-10-15]** Hulu-Med now supports Transformers integration! Hugging Face-compatible models have been released with simplified loading and inference. Integration with vLLM is ongoing. *The HF models are now available in the **main branch** on Hugging Face*.
|
|
@@ -67,6 +70,8 @@ Our training corpus encompasses:
|
|
| 67 |
|
| 68 |
## Performance Highlights
|
| 69 |
|
|
|
|
|
|
|
| 70 |
### Medical Multimodal Benchmarks
|
| 71 |
|
| 72 |
Performance comparison on medical multimodal benchmarks (For the 'Medical VLM < 10B' subgroup, **bold** indicates the best method):
|
|
@@ -91,6 +96,7 @@ Performance comparison on medical multimodal benchmarks (For the 'Medical VLM <
|
|
| 91 |
| MedGemma-4B | 70.7 | 49.2 | 72.3 | 78.2 | 48.1 | 25.4 | 43.2 |
|
| 92 |
| HuatuoGPT-V-7B | 74.3 | 53.1 | 67.6 | 68.1 | 44.8 | 23.2 | 49.8 |
|
| 93 |
| Lingshu-7B | 82.9 | 56.3 | 67.9 | 83.1 | 61.9 | 26.7 | - |
|
|
|
|
| 94 |
| **Hulu-Med-7B** | **84.2** | **66.8** | **78.0** | **86.8** | **65.6** | **29.0** | **51.4** |
|
| 95 |
| **Medical VLMs > 10B** |
|
| 96 |
| HealthGPT-14B | 75.2 | 56.4 | 65.0 | 66.1 | 56.7 | 24.7 | 49.6 |
|
|
@@ -123,6 +129,7 @@ Performance comparison on medical text benchmarks (**bold** indicates the best m
|
|
| 123 |
| MedGemma-4B | 38.6 | 12.8 | 45.6 | 21.6 | 72.2 | 52.2 | 56.2 | 66.7 |
|
| 124 |
| HuatuoGPT-V-7B | 44.6 | 10.1 | 40.9 | 21.9 | 72.8 | 51.2 | 52.9 | 69.3 |
|
| 125 |
| Lingshu-7B | 50.4 | 16.5 | 56.2 | 26.3 | 76.6 | 55.9 | 63.3 | 74.5 |
|
|
|
|
| 126 |
| **Hulu-Med-7B** | **60.6** | **19.6** | **61.5** | **31.1** | **77.4** | **67.6** | **73.5** | **79.5** |
|
| 127 |
| **Medical VLMs > 10B** |
|
| 128 |
| HealthGPT-14B | 63.4 | 11.3 | 39.8 | 25.7 | 68.0 | 63.4 | 66.2 | 80.2 |
|
|
|
|
| 27 |
[](https://modelscope.cn/models/Med-Team/Hulu-Med)
|
| 28 |
[](LICENSE)
|
| 29 |
[](https://github.com/ZJUI-AI4H/Hulu-Med)
|
| 30 |
+
|
| 31 |
|
| 32 |
+
[Paper](http://arxiv.org/abs/2510.08668) | [Hulu-Med-4B](https://huggingface.co/ZJU-AI4H/Hulu-Med-4B) | [Hulu-Med-7B](https://huggingface.co/ZJU-AI4H/Hulu-Med-7B) | [Hulu-Med-14B](https://huggingface.co/ZJU-AI4H/Hulu-Med-14B) | [Hulu-Med-32B](https://huggingface.co/ZJU-AI4H/Hulu-Med-32B) | [ModelScope Models](https://modelscope.cn/models/Med-Team/Hulu-Med) | [Demo](#demo)
|
| 33 |
|
| 34 |
</div>
|
| 35 |
|
| 36 |
## News
|
| 37 |
+
- **[2025-11-18]** We released **Hulu-Med-4B**, a lightweight model with strong multimodal and text reasoning abilities that surpasses **MedGemma-4B** and **Lingshu-7B**!
|
| 38 |
+
|
| 39 |
- **[2025-11-01]** Releasing our new evaluation code, **MedUniEval**! Built on MedEvalKit, MedUniEval is designed for the comprehensive evaluation of medical visual-language models across various modalities, including text, 2D, 3D, and video. More benchmarks are coming soon.
|
| 40 |
|
| 41 |
- **[2025-10-15]** Hulu-Med now supports Transformers integration! Hugging Face-compatible models have been released with simplified loading and inference. Integration with vLLM is ongoing. *The HF models are now available in the **main branch** on Hugging Face*.
|
|
|
|
| 70 |
|
| 71 |
## Performance Highlights
|
| 72 |
|
| 73 |
+
## Performance Highlights
|
| 74 |
+
|
| 75 |
### Medical Multimodal Benchmarks
|
| 76 |
|
| 77 |
Performance comparison on medical multimodal benchmarks (For the 'Medical VLM < 10B' subgroup, **bold** indicates the best method):
|
|
|
|
| 96 |
| MedGemma-4B | 70.7 | 49.2 | 72.3 | 78.2 | 48.1 | 25.4 | 43.2 |
|
| 97 |
| HuatuoGPT-V-7B | 74.3 | 53.1 | 67.6 | 68.1 | 44.8 | 23.2 | 49.8 |
|
| 98 |
| Lingshu-7B | 82.9 | 56.3 | 67.9 | 83.1 | 61.9 | 26.7 | - |
|
| 99 |
+
| **Hulu-Med-4B** | **81.6** | **64.6** | **71.6** | **85.0** | **60.1** | **26.4** | **50.5** |
|
| 100 |
| **Hulu-Med-7B** | **84.2** | **66.8** | **78.0** | **86.8** | **65.6** | **29.0** | **51.4** |
|
| 101 |
| **Medical VLMs > 10B** |
|
| 102 |
| HealthGPT-14B | 75.2 | 56.4 | 65.0 | 66.1 | 56.7 | 24.7 | 49.6 |
|
|
|
|
| 129 |
| MedGemma-4B | 38.6 | 12.8 | 45.6 | 21.6 | 72.2 | 52.2 | 56.2 | 66.7 |
|
| 130 |
| HuatuoGPT-V-7B | 44.6 | 10.1 | 40.9 | 21.9 | 72.8 | 51.2 | 52.9 | 69.3 |
|
| 131 |
| Lingshu-7B | 50.4 | 16.5 | 56.2 | 26.3 | 76.6 | 55.9 | 63.3 | 74.5 |
|
| 132 |
+
| **Hulu-Med-4B** | **58.6** | **16.8** | **59.4** | **29.5** | **77.6** | **64.8** | **71.9** | **78.6** |
|
| 133 |
| **Hulu-Med-7B** | **60.6** | **19.6** | **61.5** | **31.1** | **77.4** | **67.6** | **73.5** | **79.5** |
|
| 134 |
| **Medical VLMs > 10B** |
|
| 135 |
| HealthGPT-14B | 63.4 | 11.3 | 39.8 | 25.7 | 68.0 | 63.4 | 66.2 | 80.2 |
|