Update README.md
README.md
CHANGED
@@ -18,7 +18,7 @@ MOSS-TTSD supports voice cloning and single-session speech generation of up to 1
 
 ## Highlights
 
-- **Highly Expressive Dialogue Speech**: Built on unified semantic-acoustic neural audio codec, a pre-trained large language model, millions of hours of TTS data
+- **Highly Expressive Dialogue Speech**: Built on unified semantic-acoustic neural audio codec, a pre-trained large language model, millions of hours of TTS data and conversational speech, MOSS-TTSD generates highly expressive, human-like dialogue speech with natural conversational prosody.
 - **Two-Speaker Voice Cloning**: MOSS-TTSD supports zero-shot two speakers voice cloning and can generate conversational speech with accurate speaker swithcing based on dialogue scripts.
 - **Chinese-English Bilingual Support**: MOSS-TTSD enables highly expressive speech generation in both Chinese and English.
 - **Long-Form Speech Generation (up to 1700 seconds)**: Thanks to low-bitrate codec and training framework optimization, MOSS-TTSD has been trained for long speech generation, enabling single-session speech generation of up to 1700 seconds.