Fish Audio Voice Cloning

  • 2025.9.14: This site now supports the fish-tts model
  • Many built-in voices for cloning, such as Elon Musk, Zhao Benshan, Zhu Bajie, and Sun Wukong
  • Supports emotion control
  • Supports uploading your own audio files for cloning
  • Fully compatible with the OpenAI TTS /v1/audio/speech interface

Pricing

  • Prices are converted at an exchange rate of 1 USD : 3 CNY
  • Official site: $15.00 USD per million UTF-8 bytes
  • Ours: ¥45.00 CNY per million UTF-8 bytes
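
Because billing is per UTF-8 byte rather than per character, most Chinese characters (3 bytes each) cost three times as much as an ASCII character. The following is a minimal sketch of a cost estimate, assuming the ¥45.00 per million bytes rate applies linearly with no minimum charge. The function name is hypothetical, and how the service rounds the final amount is not stated here.

```python
PRICE_CNY_PER_MILLION_BYTES = 45.00  # rate quoted in the pricing list above


def estimate_cost_cny(text: str) -> float:
    """Estimate the charge for synthesizing `text`, billed per UTF-8 byte."""
    byte_count = len(text.encode("utf-8"))
    return byte_count * PRICE_CNY_PER_MILLION_BYTES / 1_000_000


# ASCII characters take 1 byte each; most Chinese characters take 3 bytes each.
print(len("abc".encode("utf-8")))          # 3
print(len("你好".encode("utf-8")))          # 6
print(estimate_cost_cny("a" * 1_000_000))  # 45.0
```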

API

  • Convention: include Authorization: Bearer hk-your-key in the request header

1. Clone Voice

post https://api.openai-hk.com/v1/audio/speech

shell
curl --request POST \
  --url https://api.openai-hk.com/v1/audio/speech \
  --header 'Authorization: Bearer hk-your-key' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "fish-tts",
    "input": "(高兴) 师傅,我想娶媳妇",
    "voice": "d7900c21663f485ab63ebdb7e5905036"
}'  \
  --output speech.mp3
  • The output is an mp3 stream; the response header is Content-Type: application/octet-stream

  • The response header contains task-id: b7f228140bd949d7b466e1c33566a2fd

  • With this task-id, the mp3 link is https://platform.r2.fish.audio/task/b7f228140bd949d7b466e1c33566a2fd.mp3

  • voice is the voice clone source and can be any model id from Fish Audio; d7900c21663f485ab63ebdb7e5905036 is the Zhu Bajie voice

  • More voice sources can be retrieved through the API (section 3), or you can upload your own audio to clone (section 2)

  • input is the text content; emotion control tags are supported in the text (see the official Fish Audio docs)

  • To receive a JSON response instead of a stream, set response_format to url in the request body

Request body

json
{
  "model": "fish-tts",
  "input": "(高兴) 师傅,我想娶媳妇",
  "voice": "d7900c21663f485ab63ebdb7e5905036",
  "response_format": "url"
}

Parameter Description

Parameter        Type    Description
model            string  Voice clone model name
input            string  Text content
voice            string  Voice clone source; a model id from Fish Audio
response_format  string  Response format; defaults to stream, can be set to url
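
The parameters above map directly onto a JSON body. As a rough sketch using only the Python standard library, the request could be assembled as follows. The function name is hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request

SPEECH_URL = "https://api.openai-hk.com/v1/audio/speech"


def build_speech_request(api_key, text, voice, response_format=None):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {"model": "fish-tts", "input": text, "voice": voice}
    if response_format is not None:
        # "url" asks for a JSON body with audio_url instead of the mp3 stream.
        payload["response_format"] = response_format
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_speech_request(
    "hk-your-key",
    "(高兴) 师傅,我想娶媳妇",
    "d7900c21663f485ab63ebdb7e5905036",
    response_format="url",
)
# To send it, pass the request to urllib.request.urlopen(req).
```

Leaving response_format unset keeps the default streaming behavior, so the same helper covers both request shapes shown in this section.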

Response

json
{
  "audio_url": "https://platform.r2.fish.audio/task/ad68621d26334758922403bd2d0a4bd4.mp3"
}
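
Either response form leads to the same kind of hosted mp3 link: the streaming response carries a task-id header, and the url form returns audio_url in JSON. A small sketch of both lookups follows. The helper names are hypothetical, and the link pattern is taken from the task-id example in this section.

```python
import json

TASK_URL_TEMPLATE = "https://platform.r2.fish.audio/task/{task_id}.mp3"


def mp3_url_from_task_id(task_id: str) -> str:
    """Build the hosted mp3 link from the task-id response header."""
    return TASK_URL_TEMPLATE.format(task_id=task_id)


def audio_url_from_json(body: bytes) -> str:
    """Extract audio_url from the JSON returned when response_format is url."""
    return json.loads(body)["audio_url"]


print(mp3_url_from_task_id("b7f228140bd949d7b466e1c33566a2fd"))
# https://platform.r2.fish.audio/task/b7f228140bd949d7b466e1c33566a2fd.mp3
```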

2. Create Your Own Voice Clone Source

post https://api.openai-hk.com/fish/model

shell
curl --request POST \
  --url https://api.openai-hk.com/fish/model \
  --header 'Authorization: Bearer hk-your-key' \
  --header 'Content-Type: application/json' \
  --data '{
    "title": "Teacher",
    "description": "Description",
    "voices": "https://platform.r2.fish.audio/task/604133d7b3c7430385382470f67770e8.mp3",
    "cover_image": "https://www.open-hk.com/res/img/open.png",
    "train_mode": "fast"
}'
  • To prevent abuse, each call deducts 50 credits

  • Parameter description

    Parameter    Type    Description
    title        string  Voice clone name
    description  string  Voice clone description
    voices       string  Audio file URL
    cover_image  string  Cover image
    train_mode   string  Voice clone training mode
    is_save      bool    Whether to keep the model; default is false, and an unsaved model is deleted after 10 minutes
  • Response body

The _id is used as the voice parameter

json
{
  "_id": "d3645891f9e742108c313e66271394ac",
  "type": "tts",
  "title": "老师",
  "description": "描述",
  "cover_image": "coverimage/d3645891f9e742108c313e66271394ac",
  "train_mode": "fast",
  "state": "trained",
  "tags": [],
  "samples": [
    {
      "title": "Default Sample",
      "text": "通过光合作用,植物能够将阳光转化为生命所需的能量。这个精密的生物化学过程不仅维持了植物的生长,更为地球上的氧气循环做出了重要贡献,展现了自然界的奇妙平衡。",
      "task_id": "1ae43d4f956d4744b7ecb6c6d54e4437",
      "audio": "task/1ae43d4f956d4744b7ecb6c6d54e4437.mp3"
    }
  ],
  "created_at": "2025-09-14T15:10:09.033883Z",
  "updated_at": "2025-09-14T15:10:09.033485Z",
  "languages": ["zh"],
  "visibility": "public",
  "lock_visibility": false,
  "default_text": "通过光合作用,植物能够将阳光转化为生命所需的能量。这个精密的生物化学过程不仅维持了植物的生长,更为地球上的氧气循环做出了重要贡献,展现了自然界的奇妙平衡。",
  "like_count": 0,
  "mark_count": 0,
  "shared_count": 0,
  "task_count": 0,
  "unliked": false,
  "liked": false,
  "marked": false
}
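
Creating a voice clone source and then reusing its _id as the voice parameter could be sketched as below. The helper names are hypothetical, treating every field as a required argument is an assumption, and the request is built without being sent because each real call deducts 50 credits.

```python
import json
import urllib.request

MODEL_URL = "https://api.openai-hk.com/fish/model"


def build_create_model_request(api_key, title, description, voices_url,
                               cover_image, train_mode="fast", is_save=False):
    """Build (but do not send) the POST request that creates a voice clone source."""
    payload = {
        "title": title,
        "description": description,
        "voices": voices_url,
        "cover_image": cover_image,
        "train_mode": train_mode,
        # False (the documented default) means the model is deleted after 10 minutes.
        "is_save": is_save,
    }
    return urllib.request.Request(
        MODEL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def voice_id_from_model(body: bytes) -> str:
    """Return the _id of a created model, for use as the voice parameter."""
    return json.loads(body)["_id"]
```

Passing is_save=True is the way to keep a model you intend to reuse across many speech requests.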

3. Official Voice Sources

get https://api.openai-hk.com/fish/q/model?page_size=10&page_number=1&title=&language=zh&sort_by=task_count

Response body — the _id is used as the voice parameter

json
{
  "total": 272820,
  "items": [
    {
      "_id": "54a5170264694bfc8e9ad98df7bd89c3",
      "type": "tts",
      "title": "丁真",
      "description": "",
      "cover_image": "coverimage/54a5170264694bfc8e9ad98df7bd89c3",
      "train_mode": "fast"
    },
    ......,
    {}
  ]
}
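
The query string and the _id lookup can be sketched as follows. The function names are hypothetical, the defaults simply mirror the example URL above, and the request still needs the Authorization header described in the API convention.

```python
import json
from urllib.parse import urlencode

LIST_URL = "https://api.openai-hk.com/fish/q/model"


def build_model_list_url(page_size=10, page_number=1, title="",
                         language="zh", sort_by="task_count"):
    """Build the GET URL for listing official voice sources."""
    query = urlencode({
        "page_size": page_size,
        "page_number": page_number,
        "title": title,
        "language": language,
        "sort_by": sort_by,
    })
    return f"{LIST_URL}?{query}"


def voice_ids(body: bytes) -> dict:
    """Map each listed title to its _id, skipping items that have no _id."""
    items = json.loads(body).get("items", [])
    return {item.get("title", ""): item["_id"] for item in items if "_id" in item}


print(build_model_list_url())
# https://api.openai-hk.com/fish/q/model?page_size=10&page_number=1&title=&language=zh&sort_by=task_count
```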

4. Emotion and Control Tags

Examples

  • At the start of a sentence: (Angry) Is this how you repay me?
  • Mid-sentence: I trusted you so much, (Angry) is this how you repay me?
  • Speed up: Run! Someone is chasing us, (speed up) if we don't run now it'll be too late!
  • Slow down: He spoke word by word, (slow down) as if each word weighed a thousand pounds.
  • Lower volume: He leaned close to my ear, (lower volume) and quietly told me a secret.
  • Raise volume: (Loud) What did you say? I can't hear you!
  • Excited tone: This is unbelievable! (Excited) We actually succeeded!
  • Laugh: Hearing the joke, he couldn't hold it anymore, (laugh) hahaha!
  • Cry: She covered her face, (cry) sobbing, unable to say another word.
  • Sigh: How did things end up like this... (sigh) Ah.

5. Official /v1/tts Interface

post https://api.openai-hk.com/fish/v1/tts

shell
curl --request POST \
  --url https://api.openai-hk.com/fish/v1/tts \
  --header 'Authorization: Bearer hk-your-key' \
  --header 'format: url' \
  --header 'model: s2-pro' \
  --header 'Content-Type: application/json' \
  --data '{
    "text": "[嘿嘿]你说什么?",
    "reference_id": "5c353fdb312f4888836a9a5680099ef0",
    "temperature": 0.7,
    "top_p": 0.7,
    "prosody": {
      "speed": 1,
      "volume": 0,
      "normalize_loudness": true
    },
    "chunk_length": 300,
    "normalize": true,
    "format": "mp3",
    "sample_rate": 44100,
    "mp3_bitrate": 128,
    "latency": "normal",
    "max_new_tokens": 1024,
    "repetition_penalty": 1.2,
    "min_chunk_length": 50,
    "condition_on_previous_chunks": true,
    "early_stop_threshold": 1,
    "references":[{
        "audio_url":"http://cos.aitutu.cc/mp4/ru-user-voice.mp3",
        "text":"Hello! Welcome to Fish Audio."
    }]
  }'
  • model is sent as a request header; s2-pro and s1 are supported

  • To receive a URL, add format: url as a request header; by default the audio is returned as a stream

  • Parameter description

Parameter                Type              Description
text                     string            Text content
reference_id             string            Voice timbre clone id
references               array of objects  Voice timbre clone references; ignored if reference_id has a value
references[0].audio_url  string            Voice timbre clone audio URL; duration 10 s to 270 s; supports mp3 and wav
references[0].text       string            Reference sample text for the voice timbre clone
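
Unlike the OpenAI-compatible endpoint, this interface takes model and format as headers rather than body fields. A minimal sketch of building such a request follows. The function name is hypothetical, the Content-Type header is an assumption, and only the text and reference fields are included; the remaining tuning parameters from the curl example are omitted for brevity.

```python
import json
import urllib.request

TTS_URL = "https://api.openai-hk.com/fish/v1/tts"


def build_tts_request(api_key, text, reference_id=None, references=None,
                      model="s2-pro", return_url=False):
    """Build (but do not send) a POST request for /fish/v1/tts."""
    payload = {"text": text}
    if reference_id:
        # references is ignored when reference_id has a value, so send only one.
        payload["reference_id"] = reference_id
    elif references:
        payload["references"] = references
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumption, not stated in this section
        "model": model,                      # s2-pro or s1
    }
    if return_url:
        headers["format"] = "url"            # omit to receive a stream
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_tts_request(
    "hk-your-key",
    "[嘿嘿]你说什么?",
    reference_id="5c353fdb312f4888836a9a5680099ef0",
    return_url=True,
)
```

urllib stores header names in capitalized form (Model, Format), which is harmless because HTTP header names are case-insensitive.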