WhissleAI/stt_en_conformer_ctc_large_slurp

	---
	language: en
	license: cc-by-4.0
	tags:
	- automatic-speech-recognition
	- nemo
	- conformer
	- entity_tagging
	- intent
	datasets:
	- slurp
	metrics:
	- wer
	- cer
	model-index:
	- name: 1step ASR-NL for Slurp dataset
	  results:
	  - task:
	      name: Automatic Speech Recognition
	      type: automatic-speech-recognition
	    dataset:
	      name: Slurp dataset
	      type: slurp
	    metrics:
	    - name: Word Error Rate
	      type: wer
	      value:
	      - Insert WER Value
	    - name: Character Error Rate
	      type: cer
	      value:
	      - Insert CER Value
	---


	# This speech tagger performs transcription, annotates entities, and predicts intent for the SLURP dataset

	The model is suitable for voice AI applications.

	## Model Details

	- Model type: NeMo ASR
	- Architecture: Conformer CTC
	- Language: English
	- Training data: SLURP dataset
	- Performance metrics: WER and CER (values not reported in this card)
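
Since this card lists WER and CER as its metrics but does not report values, the following is a minimal sketch of how the two scores could be computed on your own test set. It is not the authors' evaluation script; the function names (`edit_distance`, `wer`, `cer`) are hypothetical, and it assumes the reference and the hypothesis are plain strings normalized the same way (casing, punctuation, any tag tokens).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `wer("turn on the kitchen lights", "turn on kitchen light")` counts one deleted word and one substituted word against five reference words. Corpus-level scores are usually obtained by summing edit distances and reference lengths over all utterances rather than averaging per-utterance rates.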

	## Usage

	To use this model, install the NeMo toolkit together with its ASR dependencies:

	```bash
	pip install "nemo_toolkit[asr]"
	```

	### How to run

	```python
	import nemo.collections.asr as nemo_asr

	# Step 1: Load the ASR model from Hugging Face
	model_name = 'WhissleAI/speech-tagger_en_slurp-iot'
	asr_model = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name)

	# Step 2: Provide the path to your audio file
	audio_file_path = '/path/to/your/audio_file.wav'

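	# Step 3: Transcribe the audio
	# The file list is passed positionally because the keyword name differs
	# across NeMo releases (paths2audio_files in older ones, audio in newer ones).
	transcription = asr_model.transcribe([audio_file_path])

	# Newer releases may return hypothesis objects rather than plain strings.
	first_result = transcription[0]
	print(f'Transcription: {getattr(first_result, "text", first_result)}')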
	```
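
Before passing a file to `transcribe`, it can help to check its format. The sketch below uses only Python's standard `wave` module, so it handles uncompressed PCM WAV files only. The 16 kHz mono target is an assumption based on common NeMo English ASR configurations, not something this card states, and the helper names `describe_wav` and `looks_compatible` are hypothetical.

```python
import wave


def describe_wav(path):
    """Return the sample rate, channel count, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        return {
            "sample_rate": sample_rate,
            "channels": wav_file.getnchannels(),
            "duration_s": wav_file.getnframes() / sample_rate,
        }


def looks_compatible(info, expected_rate=16000, expected_channels=1):
    """Check a describe_wav() result against an ASSUMED 16 kHz mono input format."""
    return (
        info["sample_rate"] == expected_rate
        and info["channels"] == expected_channels
    )
```

If `looks_compatible(describe_wav(audio_file_path))` returns `False`, resampling or downmixing the file with an audio tool before transcription may be worth trying.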