Update README.md
README.md
CHANGED
@@ -26,11 +26,11 @@ model-index:
 type: wer
 value: 9.914
 ---
-# Wav2vec 2.0 large VoxRex Swedish (
+# Wav2vec 2.0 large VoxRex Swedish (C)
 
 **Disclaimer:** This is a work in progress. See [VoxRex](https://huggingface.co/KBLab/wav2vec2-large-voxrex) for more details.
 
-Finetuned version of KBs [VoxRex large](https://huggingface.co/KBLab/wav2vec2-large-voxrex) model using Swedish radio broadcasts, NST and Common Voice data. Evalutation without a language model gives the following: WER for NST + Common Voice test set (2% of total sentences) is **
+Finetuned version of KBs [VoxRex large](https://huggingface.co/KBLab/wav2vec2-large-voxrex) model using Swedish radio broadcasts, NST and Common Voice data. Evalutation without a language model gives the following: WER for NST + Common Voice test set (2% of total sentences) is **2.5%**. WER for Common Voice test set is **8.49%** directly and **7.37%** with a 4-gram language model.
 
 When using this model, make sure that your speech input is sampled at 16kHz.
 
@@ -40,7 +40,7 @@ When using this model, make sure that your speech input is sampled at 16kHz.
 <center>*<i>Chart shows performance without the additional 20k steps of Common Voice fine-tuning</i></center>
 
 ## Training
-This model has been fine-tuned for 120000 updates on NST + CommonVoice and then for an additional 20000 updates on CommonVoice only. The additional fine-tuning on CommonVoice hurts performance on the NST+CommonVoice test set somewhat and, unsurprisingly, improves it on the CommonVoice test set. It seems to perform generally better though [citation needed]
+This model has been fine-tuned for 120000 updates on NST + CommonVoice<del> and then for an additional 20000 updates on CommonVoice only. The additional fine-tuning on CommonVoice hurts performance on the NST+CommonVoice test set somewhat and, unsurprisingly, improves it on the CommonVoice test set. It seems to perform generally better though [citation needed]</del>.
 
 
 
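The README text in this diff reports word error rate (WER) figures such as 2.5% and 8.49%. As a reminder of what that metric measures, here is a minimal sketch of the standard definition: word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the README does not say which tool or text normalization produced its numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction (0.025 corresponds to 2.5%).

    Illustrative only: splits on whitespace and applies no text
    normalization, which may differ from the evaluation behind the
    README's figures.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous_row[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous_row[j - 1] + (ref_word != hyp_word)
            deletion = previous_row[j] + 1
            insertion = current_row[j - 1] + 1
            current_row.append(min(substitution, deletion, insertion))
        previous_row = current_row

    return previous_row[-1] / len(ref_words)


if __name__ == "__main__":
    # One of five reference words is missing from the hypothesis.
    print(word_error_rate("det är en fin dag", "det är fin dag"))  # 0.2
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.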
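The README asks that speech input be sampled at 16kHz. A quick check before running inference might look like the sketch below, which uses only Python's standard `wave` module and therefore assumes uncompressed WAV input. The helper name `check_sample_rate` and the constant `EXPECTED_RATE_HZ` are hypothetical; the README does not prescribe any particular loading or resampling tool.

```python
import wave

# Sampling rate the README says the model expects.
EXPECTED_RATE_HZ = 16_000


def check_sample_rate(path: str, expected_hz: int = EXPECTED_RATE_HZ) -> int:
    """Return the WAV file's sampling rate, raising if it is not expected_hz."""
    with wave.open(path, "rb") as wav_file:
        actual_hz = wav_file.getframerate()
    if actual_hz != expected_hz:
        raise ValueError(
            f"{path} is sampled at {actual_hz} Hz; resample to {expected_hz} Hz first"
        )
    return actual_hz
```

A file that fails this check would need to be resampled with an audio library of your choice before being passed to the model.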