Update README.md
README.md CHANGED

@@ -4,7 +4,7 @@ license: apache-2.0
 # MoH: Multi-Head Attention as Mixture-of-Head Attention
 
 **Paper or resources for more information:**
-[[Paper]()] [[Code](https://github.com/SkyworkAI/MoH)]
+[[Paper](https://huggingface.co/papers/2410.11842)] [[Code](https://github.com/SkyworkAI/MoH)]
 
 ## ⚡ Overview
 We propose Mixture-of-Head attention (MoH), a new architecture that treats attention heads as experts in the Mixture-of-Experts (MoE) mechanism. MoH has two significant advantages:
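For context on the overview sentence above, here is a minimal sketch of the idea it describes: a learned router scores the attention heads per token and only the top-k heads contribute, in the spirit of MoE expert selection. The `MoHAttention` class, the `top_k` parameter, and the plain linear router are assumptions made for this illustration only; they are not the authors' released implementation (see the linked code for that).

```python
# Minimal sketch of mixture-of-head attention with a simple top-k router
# over heads. Illustrative only; not the official MoH implementation.
import torch
import torch.nn as nn


class MoHAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8, top_k: int = 4):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.top_k = top_k
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # Router scores each head per token, as an MoE gate scores experts.
        self.router = nn.Linear(dim, num_heads)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)              # each: (B, H, N, d)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        out = attn.softmax(dim=-1) @ v                    # (B, H, N, d)

        # Keep the top-k heads per token, weight them by the normalized gate,
        # and zero out the remaining ("inactive") heads.
        scores = self.router(x)                           # (B, N, H)
        topk_val, topk_idx = scores.topk(self.top_k, dim=-1)
        gate = torch.zeros_like(scores).scatter(-1, topk_idx, topk_val.softmax(dim=-1))
        out = out * gate.permute(0, 2, 1).unsqueeze(-1)   # gate: (B, H, N, 1)

        out = out.transpose(1, 2).reshape(B, N, C)
        return self.proj(out)


if __name__ == "__main__":
    x = torch.randn(2, 16, 64)
    print(MoHAttention(64, num_heads=8, top_k=4)(x).shape)  # torch.Size([2, 16, 64])
```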