Update README.md
README.md
CHANGED
@@ -29,7 +29,43 @@ This is a small-sized model with 367M parameters. It is trained on 180k hours of
-
+### OWSM series
+
+#### Encoder-decoder OWSM
+
+| Name | Size | Hugging Face Repo |
+| :--- | ---: | :---------------- |
+| OWSM v3.1 base | 101M | https://huggingface.co/espnet/owsm_v3.1_ebf_base |
+| OWSM v3.1 small | 367M | https://huggingface.co/espnet/owsm_v3.1_ebf_small |
+| OWSM v3.1 medium | 1.02B | https://huggingface.co/espnet/owsm_v3.1_ebf |
+| OWSM v3.2 small | 367M | https://huggingface.co/espnet/owsm_v3.2 |
+| OWSM v4 base | 102M | https://huggingface.co/espnet/owsm_v4_base_102M |
+| OWSM v4 small | 370M | https://huggingface.co/espnet/owsm_v4_small_370M |
+| OWSM v4 medium | 1.02B | https://huggingface.co/espnet/owsm_v4_medium_1B |
+
+
+#### CTC-based OWSM
+
+| Name | Size | Hugging Face Repo |
+| :--- | ---: | :---------------- |
+| OWSM-CTC v3.1 medium | 1.01B | https://huggingface.co/espnet/owsm_ctc_v3.1_1B |
+| OWSM-CTC v3.2 medium | 1.01B | https://huggingface.co/espnet/owsm_ctc_v3.2_ft_1B |
+| OWSM-CTC v4 medium | 1.01B | https://huggingface.co/espnet/owsm_ctc_v4_1B |
+
+
+
+### Citations
+
+#### OWSM v4
+
+```BibTex
+@inproceedings{owsm-v4,
+title={{OWSM} v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning},
+author={Yifan Peng and Shakeel Muhammad and Yui Sudo and William Chen and Jinchuan Tian and Chyi-Jiunn Lin and Shinji Watanabe},
+booktitle={Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH) (accepted)},
+year={2025},
+}
+```
 
 #### OWSM-CTC
 