Update README.md
README.md
CHANGED
@@ -273,24 +273,23 @@ These are greedy WER numbers without external LM. More details on evaluation can

## Model Fairness Evaluation

As outlined in the paper "Towards Measuring Fairness in AI: the Casual Conversations Dataset", we assessed the parakeet-tdt-1.1b model for fairness. The model was evaluated on the CausalConversations-v1 dataset, and the results are reported as follows:

### Gender Bias:

| Gender | Male | Female | N/A | Other |
| :--- | :--- | :--- | :--- | :--- |
| Num utterances | 19325 | 24532 | 926 | 33 |
| % WER | 17.18 | 14.61 | 19.06 | 37.57 |

### Age Bias:

| Age Group | $(18-30)$ | $(31-45)$ | $(46-85)$ | $(1-100)$ |
| :--- | :--- | :--- | :--- | :--- |
| Num utterances | 15956 | 14585 | 13349 | 43890 |
| % WER | 15.83 | 15.89 | 15.46 | 15.74 |

(Error rates for fairness evaluation are determined by normalizing both the reference and predicted text, similar to the methods used in the evaluations found at https://github.com/huggingface/open_asr_leaderboard.)
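As a rough illustration of this normalize-then-score procedure, the sketch below lowercases and strips punctuation before computing word error rate. This is a minimal sketch, not the leaderboard's actual code: the `normalize` and `wer` helpers are hypothetical names, and the real leaderboard applies a much fuller Whisper-style English text normalizer.

```python
import re


def normalize(text: str) -> list[str]:
    # Hypothetical minimal normalizer: lowercase and drop punctuation.
    # Stand-in for the fuller normalization used by open_asr_leaderboard.
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def wer(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference word count."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    # Word-level Levenshtein distance via a rolling dynamic-programming row;
    # prev_row[j] holds the edit distance from ref[:i-1] to hyp[:j].
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        row = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            row.append(min(prev_row[j] + 1,          # deletion
                           row[j - 1] + 1,           # insertion
                           prev_row[j - 1] + cost))  # substitution / match
        prev_row = row
    return prev_row[-1] / max(len(ref), 1)


# Normalization makes casing/punctuation differences free:
# wer("Hello, world!", "hello world") -> 0.0
```

Multiplying the returned ratio by 100 gives the % WER figures reported in the tables above.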

## NVIDIA Riva: Deployment