Breeze-ASR-25 CoreML

This repository provides a CoreML conversion of MediaTek-Research/Breeze-ASR-25, an automatic speech recognition (ASR) model. The conversion targets WhisperKit, so the model can run ASR inference on Apple Silicon devices.

Model Description

Breeze-ASR-25 is an automatic speech recognition model developed by MediaTek Research. This CoreML version runs inference on-device on Apple Silicon through WhisperKit.

Model Components

This repository contains three CoreML models:

  1. AudioEncoder.mlmodelc - Audio feature encoder
  2. MelSpectrogram.mlmodelc - Mel spectrogram processor
  3. TextDecoder.mlmodelc - Text decoder for transcription
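Before pointing a runtime at a downloaded copy of the repository, it can help to confirm that all three components are present. The following is a minimal sketch using only the Python standard library. The component names come from the list above. The function name `missing_components` and the folder path in the example are hypothetical, and the check assumes each compiled `.mlmodelc` bundle is a directory, which is how compiled Core ML models are laid out on disk.

```python
from pathlib import Path

# Component names taken from the list above.
REQUIRED_COMPONENTS = (
    "AudioEncoder.mlmodelc",
    "MelSpectrogram.mlmodelc",
    "TextDecoder.mlmodelc",
)


def missing_components(model_dir):
    """Return the required component names that are not directories in model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_COMPONENTS if not (root / name).is_dir()]


if __name__ == "__main__":
    # "Breeze-ASR-25_coreml" is a hypothetical local folder; adjust to your download location.
    missing = missing_components("Breeze-ASR-25_coreml")
    print("all components present" if not missing else f"missing: {missing}")
```

An empty list means the folder holds all three bundles. This only checks that the directories exist, not that their contents are valid.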

Usage

With WhisperKit

import whisperkit

# Load the model
model = whisperkit.load_model("your-username/Breeze-ASR-25_coreml")

# Transcribe audio
result = model.transcribe("path/to/audio.wav")
print(result.text)

Requirements

  • A Mac with Apple Silicon (M1 or later), or an iOS device
  • iOS 16.0+ or macOS 13.0+
  • WhisperKit framework
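The macOS side of these requirements can be checked from a script before attempting to load the model. This is a rough sketch using Python's standard `platform` module. The helper names `is_apple_silicon_mac` and `meets_macos_minimum` are made up for illustration, and the sketch does not cover iOS. It also assumes a native arm64 Python: an interpreter running under Rosetta reports `x86_64` even on Apple Silicon.

```python
import platform


def is_apple_silicon_mac(system=None, machine=None):
    """True when the reported OS is macOS ("Darwin") and the CPU architecture is arm64."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


def meets_macos_minimum(version, minimum=(13, 0)):
    """Compare a macOS version string such as "14.4.1" against a (major, minor) minimum."""
    if not version:
        # platform.mac_ver() returns an empty version string on non-macOS systems.
        return False
    parts = tuple(int(p) for p in version.split(".")[:2])
    parts = parts + (0,) * (2 - len(parts))  # pad "14" to (14, 0)
    return parts >= minimum


if __name__ == "__main__":
    macos_version = platform.mac_ver()[0]
    print("Apple Silicon Mac:", is_apple_silicon_mac())
    print("macOS 13.0 or later:", meets_macos_minimum(macos_version))
```

Both helpers accept explicit arguments so the logic can be exercised on any machine, not only on a Mac.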

Performance

  • Optimized for Apple Silicon devices
  • On-device inference (no internet required)
  • Low latency and low memory usage
  • High-accuracy speech recognition

License

This model is licensed under the Apache 2.0 License.

Citation

If you use this model, please cite the original Breeze-ASR-25 paper:

@article{breeze-asr-25,
  title={Breeze-ASR-25: Efficient Speech Recognition for Mobile Devices},
  author={MediaTek Research},
  journal={arXiv preprint},
  year={2024}
}