Fish Audio Voice Cloning
- 2025.9.14 This site now supports the fish-tts model
- Many built-in cloned voices, such as Elon Musk, Zhao Benshan, Zhu Bajie, and Sun Wukong
- Supports Emotion Control
- Supports uploading your own audio files for cloning
- Fully compatible with the OpenAI TTS `/v1/audio/speech` interface
Pricing
- Priced at a 1:3 USD-to-CNY exchange rate
- Official site: $15.00 USD per million UTF-8 bytes
- Ours: ¥45.00 CNY per million UTF-8 bytes
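As a rough illustration of the per-byte billing above, the following sketch estimates the charge for a piece of text. It assumes the bill is simply the UTF-8 byte length of the input multiplied by the listed rate; whether emotion tags count toward the total, and how the service rounds, is not stated here. The name `estimate_cost_cny` is hypothetical.

```python
PRICE_CNY_PER_MILLION_BYTES = 45.00  # "Ours" rate from the pricing list above


def estimate_cost_cny(text: str) -> float:
    """Estimate the charge in CNY, assuming billing by raw UTF-8 byte count."""
    byte_count = len(text.encode("utf-8"))
    return byte_count * PRICE_CNY_PER_MILLION_BYTES / 1_000_000


# One million ASCII characters are one million bytes.
ascii_cost = estimate_cost_cny("a" * 1_000_000)

# Each of these two Chinese characters takes 3 bytes in UTF-8, so 6 bytes total.
chinese_cost = estimate_cost_cny("你好")
```

Most Chinese characters occupy 3 bytes in UTF-8, so under this assumption Chinese text costs roughly three times as much per character as ASCII text.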
API
- Convention: include `Authorization: Bearer hk-your-key` in the request header
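For readers calling the API from Python rather than curl, the header convention might look like the sketch below. It only builds a request object with the standard library and does not send it. The helper name `authorized_request` and the `API_BASE` constant are hypothetical; the base URL is the one used by the curl examples in this document.

```python
import urllib.request

API_BASE = "https://api.openai-hk.com"  # base URL taken from the curl examples


def authorized_request(path: str, api_key: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the Bearer header."""
    return urllib.request.Request(
        url=API_BASE + path,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder key and empty JSON body, for illustration only.
req = authorized_request("/v1/audio/speech", "hk-your-key", b"{}")
```

Passing `req` to `urllib.request.urlopen` would perform the actual call.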
1. Clone Voice
curl --request POST \
--url https://api.openai-hk.com/v1/audio/speech \
--header 'Authorization: Bearer hk-your-key' \
--header 'Content-Type: application/json' \
--data '{
"model": "fish-tts",
"input": "(高兴) 师傅,我想娶媳妇",
"voice": "d7900c21663f485ab63ebdb7e5905036"
}' \
--output speech.mp3
The output is an mp3 file, returned with the header `Content-Type: application/octet-stream`.
The response header contains `task-id: b7f228140bd949d7b466e1c33566a2fd`. With this task-id, the mp3 link is https://platform.r2.fish.audio/task/b7f228140bd949d7b466e1c33566a2fd.mp3
- `voice` is the voice clone source; `d7900c21663f485ab63ebdb7e5905036` is the Zhu Bajie voice. It can be any model id from Fish Audio. More voice sources can be retrieved via the API or created by uploading your own audio to clone.
- `input` is the text content; emotion control tags are supported in the content (see the Official Docs).
Return JSON format
Request body
{
"model": "fish-tts",
"input": "(高兴) 师傅,我想娶媳妇",
"voice": "d7900c21663f485ab63ebdb7e5905036",
"response_format": "url"
}
Parameter description
| Parameter | Type | Description |
|---|---|---|
| model | string | Voice clone model name |
| input | string | Text content |
| voice | string | Voice clone source, model id from Fish Audio |
| response_format | string | Response format; defaults to stream (raw mp3), set to url to get a JSON response containing the audio link |
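The two response modes can be sketched in Python as follows: one helper builds the JSON body from the parameters in the table, one rebuilds the mp3 link from the `task-id` header returned in stream mode, and one reads `audio_url` from the JSON returned in url mode. All function names are hypothetical, and the link pattern is inferred from the example links in this document rather than from a documented guarantee.

```python
import json

TASK_AUDIO_BASE = "https://platform.r2.fish.audio/task"  # inferred from the example links


def build_speech_body(text: str, voice: str, as_url: bool = False) -> str:
    """Serialize the JSON body for POST /v1/audio/speech."""
    body = {"model": "fish-tts", "input": text, "voice": voice}
    if as_url:
        # Request a JSON response with audio_url instead of a raw mp3 stream.
        body["response_format"] = "url"
    return json.dumps(body, ensure_ascii=False)


def audio_url_from_task_id(task_id: str) -> str:
    """Rebuild the mp3 link from the task-id response header (stream mode)."""
    return f"{TASK_AUDIO_BASE}/{task_id}.mp3"


def audio_url_from_json(response_text: str) -> str:
    """Extract audio_url from the JSON response (url mode)."""
    return json.loads(response_text)["audio_url"]
```

Omitting `response_format` entirely, rather than sending a default value, mirrors the first curl example, which sends only `model`, `input`, and `voice`.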
Response
{
"audio_url": "https://platform.r2.fish.audio/task/ad68621d26334758922403bd2d0a4bd4.mp3"
}
2. Create Your Own Voice Clone Source
curl --request POST \
--url https://api.openai-hk.com/fish/model \
--header 'Authorization: Bearer hk-your-key' \
--header 'Content-Type: application/json' \
--data '{
"title": "Teacher",
"description": "Description",
"voices": "https://platform.r2.fish.audio/task/604133d7b3c7430385382470f67770e8.mp3",
"cover_image": "https://www.open-hk.com/res/img/open.png",
"train_mode": "fast"
}'
To prevent abuse, each call deducts 50 credits.
Parameter description
| Parameter | Type | Description |
|---|---|---|
| title | string | Voice clone name |
| description | string | Voice clone description |
| voices | string | Audio file URL |
| cover_image | string | Cover image |
| train_mode | string | Voice clone training mode |
| is_save | bool | Whether to keep the model; default is false. If not saved, the model is deleted after 10 minutes |
Response body
The _id is used as the voice parameter
{
"_id": "d3645891f9e742108c313e66271394ac",
"type": "tts",
"title": "老师",
"description": "描述",
"cover_image": "coverimage/d3645891f9e742108c313e66271394ac",
"train_mode": "fast",
"state": "trained",
"tags": [],
"samples": [
{
"title": "Default Sample",
"text": "通过光合作用,植物能够将阳光转化为生命所需的能量。这个精密的生物化学过程不仅维持了植物的生长,更为地球上的氧气循环做出了重要贡献,展现了自然界的奇妙平衡。",
"task_id": "1ae43d4f956d4744b7ecb6c6d54e4437",
"audio": "task/1ae43d4f956d4744b7ecb6c6d54e4437.mp3"
}
],
"created_at": "2025-09-14T15:10:09.033883Z",
"updated_at": "2025-09-14T15:10:09.033485Z",
"languages": ["zh"],
"visibility": "public",
"lock_visibility": false,
"default_text": "通过光合作用,植物能够将阳光转化为生命所需的能量。这个精密的生物化学过程不仅维持了植物的生长,更为地球上的氧气循环做出了重要贡献,展现了自然界的奇妙平衡。",
"like_count": 0,
"mark_count": 0,
"shared_count": 0,
"task_count": 0,
"unliked": false,
"liked": false,
"marked": false
}
3. Official Voice Sources
Response body: the _id is used as the voice parameter
{
"total": 272820,
"items": [
{
"_id": "54a5170264694bfc8e9ad98df7bd89c3",
"type": "tts",
"title": "丁真",
"description": "",
"cover_image": "coverimage/54a5170264694bfc8e9ad98df7bd89c3",
"train_mode": "fast"
},
......,
{}
]
}
4. Emotion and Control Tags
Examples
- At the start of a sentence: (Angry) Is this how you repay me?
- Mid-sentence: I trusted you so much, (Angry) is this how you repay me?
- Speed up: Run! Someone is chasing us, (speed up) if we don't run now it'll be too late!
- Slow down: He spoke word by word, (slow down) as if each word weighed a thousand pounds.
- Lower volume: He leaned close to my ear, (lower volume) and quietly told me a secret.
- Raise volume: (Loud) What did you say? I can't hear you!
- Excited tone: This is unbelievable! (Excited) We actually succeeded!
- Laugh: Hearing the joke, he couldn't hold it anymore, (laugh) hahaha!
- Cry: She covered her face, (cry) sobbing, unable to say another word.
- Sigh: How did things end up like this... (sigh) Ah.
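The examples above all follow the same pattern: a parenthesized tag placed directly before the words it should affect. A minimal sketch of building such strings in Python is shown below. The helper `apply_tag` is hypothetical, and it does not check whether the service recognizes a given tag name; only the tags shown in the examples are known from this document.

```python
def apply_tag(text: str, tag: str, before: str = "") -> str:
    """Insert "(tag) " at the start of text, or right before the first
    occurrence of `before` when a mid-sentence position is wanted."""
    marker = f"({tag}) "
    if not before:
        return marker + text
    index = text.find(before)
    if index == -1:
        raise ValueError(f"{before!r} not found in text")
    return text[:index] + marker + text[index:]


# Tag at the start of a sentence.
start = apply_tag("Is this how you repay me?", "Angry")

# Tag in the middle of a sentence, placed before the clause it should color.
middle = apply_tag(
    "I trusted you so much, is this how you repay me?", "Angry", before="is this"
)
```

The resulting strings are what would be sent as `input` in the speech request.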
5. Official /v1/tts Interface
curl --request POST \
--url https://api.openai-hk.com/fish/v1/tts \
--header 'Authorization: Bearer hk-your-key' \
--header 'format: url' \
--header 'model: s2-pro' \
--header 'Content-Type: application/json' \
--data '{
"text": "[嘿嘿]你说什么?",
"reference_id": "5c353fdb312f4888836a9a5680099ef0",
"temperature": 0.7,
"top_p": 0.7,
"prosody": {
"speed": 1,
"volume": 0,
"normalize_loudness": true
},
"chunk_length": 300,
"normalize": true,
"format": "mp3",
"sample_rate": 44100,
"mp3_bitrate": 128,
"latency": "normal",
"max_new_tokens": 1024,
"repetition_penalty": 1.2,
"min_chunk_length": 50,
"condition_on_previous_chunks": true,
"early_stop_threshold": 1,
"references":[{
"audio_url":"http://cos.aitutu.cc/mp4/ru-user-voice.mp3",
"text":"Hello! Welcome to Fish Audio."
}]
}'
- `model` is placed in the header; supports `s2-pro` and `s1`
- To return a URL, add `format: url` to the header; the default returns a stream
Parameter description
| Parameter | Type | Description |
|---|---|---|
| text | string | Text content |
| reference_id | string | Voice timbre clone id |
| references | array of objects | Voice timbre clone references; ignored if reference_id has a value |
| references[0].audio_url | string | Voice timbre clone audio URL; duration between 10 s and 270 s; supports mp3 and wav |
| references[0].text | string | Reference sample text for the voice timbre clone |
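Because this interface splits its options between headers (`model`, `format`) and the JSON body, a small builder can make the split explicit. The sketch below is an assumption-laden illustration: `build_tts_request` is a hypothetical name, the `Content-Type` header is borrowed from the other examples in this document, and dropping `references` on the client side when `reference_id` is set simply mirrors the "ignored" rule in the table rather than anything the API requires.

```python
import json

SUPPORTED_MODELS = ("s2-pro", "s1")  # the two models named for this interface


def build_tts_request(text, api_key, model="s2-pro", reference_id=None,
                      references=None, want_url=False):
    """Return (headers, json_body) for POST /fish/v1/tts without sending it."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumption, see lead-in
        "model": model,                      # the model goes in a header, not the body
    }
    if want_url:
        headers["format"] = "url"            # otherwise the response is a stream
    body = {"text": text}
    if reference_id:
        # reference_id wins; references would be ignored by the service anyway.
        body["reference_id"] = reference_id
    elif references:
        body["references"] = references
    return headers, json.dumps(body, ensure_ascii=False)


headers, body = build_tts_request(
    "Hello!",
    "hk-your-key",
    references=[{"audio_url": "http://cos.aitutu.cc/mp4/ru-user-voice.mp3",
                 "text": "Hello! Welcome to Fish Audio."}],
    want_url=True,
)
```

The optional tuning fields from the curl example (`temperature`, `prosody`, and so on) are left out here; they could be merged into `body` before serialization.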