---
language: en
license: cc-by-4.0
tags:
- automatic-speech-recognition
- nemo
- conformer
- entity_tagging
- intent
datasets:
- slurp
metrics:
- wer
- cer
model-index:
- name: 1step ASR-NL for Slurp dataset
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Slurp dataset
      type: slurp
    metrics:
    - name: Word Error Rate
      type: wer
      value:
      - Insert WER Value
    - name: Character Error Rate
      type: cer
      value:
      - Insert CER Value
---

# This speech tagger performs transcription, annotates entities, and predicts intent for the SLURP dataset

The model is suitable for voice AI applications.

## Model Details

- **Model type**: NeMo ASR
- **Architecture**: Conformer CTC
- **Language**: English
- **Training data**: Slurp dataset
- **Performance metrics**: [Metrics]
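
The card lists word error rate (WER) and character error rate (CER) as its metrics. As a rough illustration of what those numbers mean, the sketch below computes both with a plain Levenshtein edit distance. The function names `edit_distance`, `wer`, and `cer` are hypothetical helpers written for this example, not part of NeMo, and the sketch is not the evaluation script used for this model.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Illustrative strings only, not taken from the Slurp dataset.
print(wer("turn on the kitchen lights", "turn on kitchen light"))  # 0.4
```

In the example, one word is dropped and one is substituted against a five-word reference, which gives a WER of 2/5.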

## Usage

To use this model, install the NeMo toolkit with its ASR dependencies:

```bash
pip install "nemo_toolkit[asr]"
```

### How to run

```python
import nemo.collections.asr as nemo_asr

# Step 1: Load the ASR model from Hugging Face
model_name = 'WhissleAI/speech-tagger_en_slurp-iot'
asr_model = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name)

# Step 2: Provide the path to your audio file
audio_file_path = '/path/to/your/audio_file.wav'

# Step 3: Transcribe the audio
# The file list is passed positionally because the keyword name differs
# across NeMo releases (`paths2audio_files` in older ones, `audio` in newer ones).
transcription = asr_model.transcribe([audio_file_path])
print(f'Transcription: {transcription[0]}')
```
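
Because the model annotates entities and predicts intent, the returned string carries tags alongside the spoken words. This card does not document the tag format, so the sketch below rests entirely on an assumption: that an entity span opens with a token like `ENTITY_<TYPE>`, closes with `END`, and that the intent appears as a token like `INTENT_<NAME>`. The helper `parse_tagged_transcript` and the sample string are hypothetical; check real model output and adjust the prefixes before relying on it.

```python
def parse_tagged_transcript(text):
    """Split a tagged transcript into plain text, entities, and intent.

    ASSUMED tag format (not confirmed by the model card):
      ENTITY_<TYPE> word ... END   marks an entity span
      INTENT_<NAME>                marks the utterance intent
    """
    words, entities, intent = [], [], None
    current_type, current_words = None, []

    for token in text.split():
        if token.startswith("INTENT_"):
            intent = token[len("INTENT_"):]
        elif token.startswith("ENTITY_"):
            # Open a new entity span.
            current_type, current_words = token[len("ENTITY_"):], []
        elif token == "END" and current_type is not None:
            # Close the open entity span.
            entities.append((current_type, " ".join(current_words)))
            current_type = None
        else:
            # Ordinary spoken word; also record it if inside an entity span.
            words.append(token)
            if current_type is not None:
                current_words.append(token)

    return {"text": " ".join(words), "entities": entities, "intent": intent}


# Hypothetical tagged string, written by hand to match the assumed format.
sample = "wake me up at ENTITY_TIME six am END INTENT_ALARM_SET"
print(parse_tagged_transcript(sample))
```

Keeping the parsing in a separate function means the tag prefixes can be changed in one place once the actual output format is known.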