Update README.md
update benchmarks
README.md
CHANGED
|
@@ -16,25 +16,25 @@ The model eliminates the need for OCR-based text extraction and related preproce
|
|
| 16 |
We evaluated granite-vision-embedding-3.3-2b alongside other top ColBERT-style multi-modal embedding models in the 1B-3B parameter range on two benchmarks, ViDoRe V2 and [Real-MM-RAG-Bench](https://arxiv.org/abs/2502.12342), both of which target retrieval over complex multi-modal documents.
|
| 17 |
|
| 18 |
## **NDCG@5 - ViDoRe V2**
|
| 19 |
-
| Collection \ Model | ColPali-v1.3 | ColQwen2.5-v0.2 | ColNomic-3b |
|
| 20 |
-
|
| 21 |
-
| ESG Restaurant Human | 51.10 | 68.40 | 65.80 | 60.00 |
|
| 22 |
-
| Economics Macro Multilingual | 49.90 | 56.50 | 55.40 | 50.13 |
|
| 23 |
-
| MIT Biomedical | 59.70 | 63.60 | 63.50 |
|
| 24 |
-
| ESG Restaurant Synthetic | 57.00 | 57.40 | 56.60 |
|
| 25 |
-
| ESG Restaurant Synthetic Multilingual | 55.70 | 57.40 | 57.20 |
|
| 26 |
-
| MIT Biomedical Multilingual | 56.50 | 61.10 | 62.50 | 54.00 |
|
| 27 |
-
| Economics Macro | 51.60 | 59.80 | 60.20 |
|
| 28 |
-
| **Avg (ViDoRe2)** | **54.50** | **60.60** | **60.17** | **55.20** |
|
| 29 |
|
| 30 |
## **NDCG@5 - REAL-MM-RAG**
|
| 31 |
-
| Collection \ Model | ColPali-v1.3 | ColQwen2.5-v0.2 | ColNomic-3b |
|
| 32 |
-
|----------------------------------------|--------------|------------------|-------------|--------------------------|
|
| 33 |
-
| FinReport | 0.55 | 0.66 | 0.78 |
|
| 34 |
-
| FinSlides | 0.68 | 0.79 | 0.81 |
|
| 35 |
-
| TechReport | 0.78 | 0.86 | 0.88 |
|
| 36 |
-
| TechSlides | 0.90 | 0.93 | 0.92 |
|
| 37 |
-
| **Avg (REAL-MM-RAG)** | **0.73** | **0.81** | **0.85** |
|
| 38 |
|
| 39 |
- **Release Date**: June 2025
|
| 40 |
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
|
|
|
|
| 16 |
We evaluated granite-vision-embedding-3.3-2b alongside other top ColBERT-style multi-modal embedding models in the 1B-3B parameter range on two benchmarks, ViDoRe V2 and [Real-MM-RAG-Bench](https://arxiv.org/abs/2502.12342), both of which target retrieval over complex multi-modal documents.
|
| 17 |
|
| 18 |
## **NDCG@5 - ViDoRe V2**
|
| 19 |
+
| Collection \ Model | ColPali-v1.3 | ColQwen2.5-v0.2 | ColNomic-3b | ColSmolvlm-v0.1 | GraniteVision emb-3.3-2b |
|
| 20 |
+
|----------------------------------------|--------------|------------------|-------------|-------------------|--------------------------|
|
| 21 |
+
| ESG Restaurant Human | 51.10 | 68.40 | 65.80 | 62.4 | 60.00 |
|
| 22 |
+
| Economics Macro Multilingual | 49.90 | 56.50 | 55.40 | 47.4 | 50.13 |
|
| 23 |
+
| MIT Biomedical | 59.70 | 63.60 | 63.50 | 58.1 | 60.00 |
|
| 24 |
+
| ESG Restaurant Synthetic | 57.00 | 57.40 | 56.60 | 51.1 | 54.00 |
|
| 25 |
+
| ESG Restaurant Synthetic Multilingual | 55.70 | 57.40 | 57.20 | 47.6 | 52.00 |
|
| 26 |
+
| MIT Biomedical Multilingual | 56.50 | 61.10 | 62.50 | 50.5 | 54.00 |
|
| 27 |
+
| Economics Macro | 51.60 | 59.80 | 60.20 | 60.9 | 57.00 |
|
| 28 |
+
| **Avg (ViDoRe2)** | **54.50** | **60.60** | **60.17** | **54.00** | **55.20** |
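The Avg row appears to be the unweighted mean of the seven per-collection scores. A quick check for the ColPali-v1.3 column, with values copied from the table (the variable names are illustrative, not taken from any evaluation code):

```python
from statistics import mean

# ColPali-v1.3 NDCG@5 per ViDoRe V2 collection, in table order.
colpali_vidore2 = [51.10, 49.90, 59.70, 57.00, 55.70, 56.50, 51.60]

# Unweighted (macro) average: every collection counts equally,
# regardless of how many queries it contains.
avg = mean(colpali_vidore2)
print(f"{avg:.2f}")
```

This is an assumption about how the row was produced; a query-weighted average would generally give a different number.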
|
| 29 |
|
| 30 |
## **NDCG@5 - REAL-MM-RAG**
|
| 31 |
+
| Collection \ Model | ColPali-v1.3 | ColQwen2.5-v0.2 | ColNomic-3b | ColSmolvlm-v0.1 | GraniteVision emb-3.3-2b |
|
| 32 |
+
|----------------------------------------|--------------|------------------|-------------|-------------------|--------------------------|
|
| 33 |
+
| FinReport | 0.55 | 0.66 | 0.78 | 0.65 | 0.60 |
|
| 34 |
+
| FinSlides | 0.68 | 0.79 | 0.81 | 0.55 | 0.72 |
|
| 35 |
+
| TechReport | 0.78 | 0.86 | 0.88 | 0.83 | 0.80 |
|
| 36 |
+
| TechSlides | 0.90 | 0.93 | 0.92 | 0.91 | 0.92 |
|
| 37 |
+
| **Avg (REAL-MM-RAG)** | **0.73** | **0.81** | **0.85** | **0.74** | **0.79** |
|
| 38 |
|
| 39 |
- **Release Date**: June 2025
|
| 40 |
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
|
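Both benchmark tables report NDCG@5. As a rough sketch of what that metric measures, the snippet below computes NDCG@k for a single query using linear gain and a log2 rank discount. The function names and the toy relevance lists are assumptions for illustration; the benchmarks' own evaluation code may differ, for example in how gains are defined or how unjudged pages are handled.

```python
import math


def dcg_at_k(relevances, k=5):
    """Discounted cumulative gain over the top-k items (linear gain)."""
    # enumerate() is 0-based, so the item at 1-based rank r is
    # discounted by log2(r + 1), which is log2(index + 2) here.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k=5):
    """NDCG@k for one query.

    ranked_relevances holds the relevance label of every judged document,
    ordered by the model's ranking (best first). The ideal ordering is
    obtained by sorting those same labels in descending order.
    """
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Toy example: the single relevant page is ranked first vs. second.
perfect = ndcg_at_k([1, 0, 0, 0, 0])
second = ndcg_at_k([0, 1, 0, 0, 0])
print(perfect, round(second, 4))
```

A benchmark score is then typically the mean of this per-query value over all queries in a collection, scaled to 0-100 in the ViDoRe V2 table and left in 0-1 in the REAL-MM-RAG table.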