Echolancer-v0.1-zs

This is a TTS model trained on roughly 5-7k hours of private labeled data, finetuned from the base model and conditioned on SpeechBrain ECAPA speaker embeddings. It has 177M parameters. On a single AMD Instinct MI300X with the ROCm PyTorch Training v25.7 container, fine-tuning for 52k steps (almost one epoch) took a little under 4 hours. The model is capable of zero-shot voice cloning from a reference clip.

The training objective was standard next-token prediction on concatenated text-audio tokens.
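As a rough illustration of that objective, here is a minimal pure-Python sketch of next-token prediction over one concatenated text-audio sequence. The token layout is an assumption, not taken from the repository: audio ids are shifted past the text vocabulary, and the `bos_id`/`eos_id` values, the vocabulary sizes, and the helper names (`build_sequence`, `next_token_pairs`, `sequence_loss`) are all hypothetical. The actual model may order, mask, or weight tokens differently.

```python
import math

def build_sequence(text_ids, audio_ids, text_vocab_size, bos_id, eos_id):
    """Concatenate text and audio tokens into one stream (assumed layout)."""
    # Shift audio ids past the text vocabulary so the two id ranges never collide.
    shifted_audio = [a + text_vocab_size for a in audio_ids]
    return [bos_id] + list(text_ids) + shifted_audio + [eos_id]

def next_token_pairs(seq):
    """Inputs are every token but the last; targets are the same tokens shifted by one."""
    return seq[:-1], seq[1:]

def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits), computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def sequence_loss(logits_per_step, targets):
    """Mean cross-entropy over all positions of one sequence."""
    losses = [cross_entropy(l, t) for l, t in zip(logits_per_step, targets)]
    return sum(losses) / len(losses)

# Hypothetical sizes: 10 text ids, 5 audio ids, plus BOS and EOS.
TEXT_VOCAB, AUDIO_VOCAB = 10, 5
BOS, EOS = 15, 16
VOCAB = TEXT_VOCAB + AUDIO_VOCAB + 2

seq = build_sequence([1, 2], [0, 3], TEXT_VOCAB, BOS, EOS)
inputs, targets = next_token_pairs(seq)

# Stand-in for model output: uniform logits at every position.
uniform_logits = [[0.0] * VOCAB for _ in targets]
loss = sequence_loss(uniform_logits, targets)
```

With uniform logits the loss equals the log of the vocabulary size, which is the value an untrained model would be expected to start near. In a real training loop the logits would come from the transformer, and the loss might be restricted to audio positions; the model card does not say which.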

Code

For more information, including a Colab notebook, see the repository.
