POST /api/open-api/v1/soundCloning/clones
SoundClone - Create Preview Task
SoundClone tasks are asynchronous. The create response returns a task id. Poll Query SoundClone Task until the task completes to obtain the modelId and the preview audioUrl, then call Create Audio Task to generate production audio.

All responses use the envelope { "code": 20000, "msg": "ok", "data": { ... } }. Examples below show the data payload.

Overview

Submit a voice-cloning preview task from a source audio or video URL. When the task completes, you receive a preview audio URL and a modelId for formal audio generation.
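
The asynchronous flow described above can be sketched as a small polling helper. This is a minimal sketch, not the official client: fetch_task is a hypothetical callable standing in for a request to Query SoundClone Task (its URL is not given on this page), and the terminal status names "completed" and "failed" are assumptions, since only "queued" appears here. Treating any code other than 20000 as an error is also an assumption.

```python
import time
from typing import Any, Callable, Dict, Tuple

OK_CODE = 20000  # success code shown in the documented response envelope


def unwrap(envelope: Dict[str, Any]) -> Dict[str, Any]:
    """Return the data payload; raise if the envelope does not report success."""
    if envelope.get("code") != OK_CODE:
        raise RuntimeError(f"API error {envelope.get('code')}: {envelope.get('msg')}")
    return envelope["data"]


def poll_task(
    fetch_task: Callable[[str], Dict[str, Any]],  # hypothetical: wraps Query SoundClone Task
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    done_statuses: Tuple[str, ...] = ("completed",),  # ASSUMPTION: not confirmed by this page
    failed_statuses: Tuple[str, ...] = ("failed",),   # ASSUMPTION: not confirmed by this page
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task reaches a terminal status, then return its data payload."""
    deadline = time.monotonic() + timeout_s
    while True:
        data = unwrap(fetch_task(task_id))
        status = data.get("status")
        if status in done_statuses:
            return data  # expected to carry modelId and audioUrl
        if status in failed_statuses:
            raise RuntimeError(f"task {task_id} failed: {data.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still '{status}' after {timeout_s}s")
        sleep(interval_s)
```

The returned payload's modelId would then be passed to Create Audio Task. Adjust the status tuples once the real values are confirmed against Query SoundClone Task.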

Request body

Field | Type | Required | Description
fileUrl | string | Yes | Public URL of the source audio or video. No local paths or Chinese characters in the URL. Audio: mp3, ogg, wav, m4a, aac. Video: mp4, avi, mov, mkv, flv. Spoken content must be longer than 15 s and shorter than 60 s.
contentText | string | No | Preview script, max 270 characters. A default line is used when omitted.
soundVersion | string | No | v1 (24 languages) or v2 (40 languages). Default: v1.
language | string | No | Language of the preview. Default: auto. Examples: Chinese, English. Some languages require v2; see Create Audio Task.
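
The field constraints above can be checked on the client before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named build_clone_payload. The extension check is a heuristic based on the URL path, and the ASCII check is stricter than the documented "no Chinese characters" rule; both are assumptions. The 15 to 60 second duration limit cannot be verified without downloading the file, so it is not checked here.

```python
from typing import Dict, Optional
from urllib.parse import urlparse

AUDIO_EXTS = {"mp3", "ogg", "wav", "m4a", "aac"}
VIDEO_EXTS = {"mp4", "avi", "mov", "mkv", "flv"}
MAX_CONTENT_CHARS = 270


def build_clone_payload(
    file_url: str,
    content_text: Optional[str] = None,
    sound_version: str = "v1",
    language: str = "auto",
) -> Dict[str, str]:
    """Validate the documented constraints and return the JSON request body."""
    parsed = urlparse(file_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("fileUrl must be a public http(s) URL, not a local path")
    # ASSUMPTION: rejecting all non-ASCII is stricter than "no Chinese characters".
    if not file_url.isascii():
        raise ValueError("fileUrl must not contain non-ASCII characters")
    # Heuristic: infer the format from the last path segment's extension.
    ext = parsed.path.rsplit(".", 1)[-1].lower() if "." in parsed.path else ""
    if ext not in AUDIO_EXTS | VIDEO_EXTS:
        raise ValueError(f"unsupported file extension: {ext!r}")
    if content_text is not None and len(content_text) > MAX_CONTENT_CHARS:
        raise ValueError(f"contentText exceeds {MAX_CONTENT_CHARS} characters")
    if sound_version not in ("v1", "v2"):
        raise ValueError("soundVersion must be 'v1' or 'v2'")

    payload = {"fileUrl": file_url, "soundVersion": sound_version, "language": language}
    if content_text is not None:
        payload["contentText"] = content_text  # omitted: the service uses its default line
    return payload
```

Leaving contentText out of the payload, rather than sending an empty string, keeps the documented default-line behavior.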

Billing

Preview is billed by character count in units of 10,000 characters (price_mode: per_10k_char).

Model config | Description
sound-cloning-clone | Preview character fee; unit price is per 10k characters

  • Characters are counted as Unicode runes; <#x#> pause markers are excluded.
  • The default preview text counts toward billing when contentText is omitted.
  • Balance is checked before submission; failed tasks are refunded.
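
The counting rules above can be approximated locally to estimate a fee before submitting. This is a rough sketch under stated assumptions: the x in <#x#> is taken to be a number (for example <#0.5#>), the unit price is a hypothetical value you supply, and any rounding or minimum charge applied by the service is unknown and not modeled.

```python
import re

# ASSUMPTION: the "x" in <#x#> is an integer or decimal number.
PAUSE_MARKER = re.compile(r"<#\d+(?:\.\d+)?#>")


def billable_chars(text: str) -> int:
    """Count Unicode code points (runes), excluding pause markers."""
    return len(PAUSE_MARKER.sub("", text))


def estimate_fee(text: str, unit_price_per_10k: float) -> float:
    """Estimate the preview fee; unit_price_per_10k is a hypothetical price input."""
    return billable_chars(text) * unit_price_per_10k / 10_000
```

Python's len() on a str counts code points, which matches the rune-based counting described above. When contentText is omitted, the default preview text is billed instead, and its length is not stated on this page.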

Example

curl --request POST \
  --url 'https://www.jimmyai.cn/api/open-api/v1/soundCloning/clones' \
  --header 'Authorization: Bearer sk_xxx' \
  --header 'Content-Type: application/json' \
  --data '{
    "fileUrl": "https://example.com/source-audio.mp3",
    "contentText": "A short preview sentence for voice cloning.",
    "soundVersion": "v1",
    "language": "Chinese"
  }'

Response example

{
  "code": 20000,
  "msg": "ok",
  "data": {
    "id": "audio_16b635ba-5889-4fa5-bbcc-bf67a38c353a",
    "object": "audio",
    "created": 1781777280,
    "model": "soundCloningClone",
    "status": "queued",
    "error": null
  }
}
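
For callers who prefer Python to curl, the same request can be sketched with the standard library alone. This is an illustrative sketch, not an official SDK: the function names are hypothetical, the opener parameter exists only so the network call can be replaced in tests, and treating any code other than 20000 as a failure is an assumption.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

CLONES_URL = "https://www.jimmyai.cn/api/open-api/v1/soundCloning/clones"


def build_request(token: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build the POST request with the documented headers and a JSON body."""
    return urllib.request.Request(
        CLONES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_preview_task(
    token: str,
    payload: Dict[str, Any],
    opener: Callable[..., Any] = urllib.request.urlopen,  # injectable for testing
) -> str:
    """Submit the preview task and return the task id from the data payload."""
    with opener(build_request(token, payload)) as resp:
        envelope = json.loads(resp.read().decode("utf-8"))
    # ASSUMPTION: any code other than 20000 signals an error.
    if envelope.get("code") != 20000:
        raise RuntimeError(f"API error {envelope.get('code')}: {envelope.get('msg')}")
    return envelope["data"]["id"]
```

The returned id is the value to pass to Query SoundClone Task when polling for completion.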

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
fileUrl (string, required)

Public URL of source audio or video

contentText (string)

Preview script, max 270 characters

soundVersion (enum<string>, default: v1)

Available options: v1, v2

language (string, default: auto)

Response

200 - application/json

Task created

code (integer)

Example: 20000

msg (string)

Example: "ok"

data (object)

Example:
{
  "id": "audio_16b635ba-5889-4fa5-bbcc-bf67a38c353a",
  "object": "audio",
  "created": 1781777280,
  "model": "soundCloningClone",
  "status": "queued",
  "error": null
}