SoundClone - Create Audio Task
Generate production audio using modelId from a completed preview task.
POST
Use the modelId from a completed preview task. Poll Query SoundClone Task for the final audioUrl.

modelId is valid for 3 days. The first successful call to this endpoint within that window permanently activates the voice for future generation.

Response envelope: { "code": 20000, "msg": "ok", "data": { ... } }.

Request body
| Field | Type | Required | Description |
|---|---|---|---|
| modelId | string | Yes | Voice model ID from the preview query result. |
| contentText | string | Yes | Text to synthesize, max 10,000 characters. Insert <#x#> between words for pauses (x in seconds, 0.01–99.99). |
| soundVersion | string | No | v1 or v2. |
| language | string | No | Language code, default auto. |
| emotion | string | No | Default neutral. Values: happy, sad, angry, fearful, disgusted, surprised, neutral. |
| speed | number | No | Speech rate [0.5, 2], default 1.0. |
| vol | number | No | Volume (0, 10], default 1.0. |
| pitch | integer | No | Pitch [-12, 12], default 0. |
| subtitleEnable | boolean | No | Generate subtitles, default false. |
| subtitleType | string | No | Used only when subtitles are enabled. Set to word for word-level subtitles; omit for sentence-level subtitles. |
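The limits in the table above can be checked on the client before a task is submitted. The following is a minimal sketch, not an official client: the helper names `pause` and `build_audio_body` are hypothetical, the two-decimal formatting of the pause marker is an assumption, and the table does not say whether pause markers count toward the 10,000-character limit (here the raw string length is checked).

```python
EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"}
MAX_CHARS = 10_000


def pause(seconds: float) -> str:
    """Return a <#x#> pause marker. Two-decimal formatting is an assumption."""
    if not 0.01 <= seconds <= 99.99:
        raise ValueError("pause must be between 0.01 and 99.99 seconds")
    return f"<#{seconds:.2f}#>"


def build_audio_body(model_id: str, content_text: str, **options) -> dict:
    """Validate the documented limits and return a JSON-serializable body."""
    if not model_id:
        raise ValueError("modelId is required")
    if not content_text or len(content_text) > MAX_CHARS:
        raise ValueError("contentText must be 1 to 10,000 characters")
    # One predicate per optional field, mirroring the request body table.
    checks = {
        "soundVersion": lambda v: v in ("v1", "v2"),
        "language": lambda v: isinstance(v, str),
        "emotion": lambda v: v in EMOTIONS,
        "speed": lambda v: 0.5 <= v <= 2,
        "vol": lambda v: 0 < v <= 10,
        "pitch": lambda v: isinstance(v, int) and not isinstance(v, bool) and -12 <= v <= 12,
        "subtitleEnable": lambda v: isinstance(v, bool),
        "subtitleType": lambda v: v == "word",
    }
    body = {"modelId": model_id, "contentText": content_text}
    for name, value in options.items():
        if name not in checks:
            raise ValueError(f"unknown field: {name}")
        if not checks[name](value):
            raise ValueError(f"invalid value for {name}: {value!r}")
        body[name] = value
    return body
```

For example, `build_audio_body("m1", "Hello " + pause(1.5) + " world", speed=1.2)` returns a body with a 1.5-second pause between the two words, while `vol=0` raises `ValueError` because the volume range excludes 0.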
Billing
Production audio has two fee components:

| Model config | Description |
|---|---|
| sound-cloning-audio | Character fee per 10,000 characters from contentText |
| sound-cloning-voice | Voice fee: charged on every production audio submission (per task) |
<#x#> markers.
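As a rough illustration of how the two components combine for one task, here is a minimal sketch. The function name and both prices are hypothetical, and this section does not say whether the character fee is prorated or rounded up to a full 10,000-character unit, so that choice is exposed as a parameter. The raw length of contentText is used as the character count, which is also an assumption.

```python
import math


def estimate_task_cost(content_text: str,
                       char_fee_per_10k: float,
                       voice_fee_per_task: float,
                       round_up: bool = False) -> float:
    """Estimate one task's cost: character fee plus the per-task voice fee.

    Prices are caller-supplied placeholders, not published rates.
    """
    units = len(content_text) / 10_000  # fraction of a 10,000-character unit
    if round_up:
        units = math.ceil(units)  # assumption: bill whole units only
    return units * char_fee_per_10k + voice_fee_per_task
```

With placeholder prices of 2.0 per 10,000 characters and 1.0 per task, a 5,000-character text costs 2.0 when prorated and 3.0 when rounded up.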
Example
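A minimal sketch of building the request with Python's standard library. The endpoint URL below is a placeholder because this section does not give the path, and `make_request` is a hypothetical helper name. The Authorization header follows the Bearer form described under Authorizations, and the body is sent as application/json.

```python
import json
import urllib.request

# Placeholder only: the real endpoint URL is not given in this section.
API_URL = "https://api.example.com/soundclone/audio"


def make_request(token: str, body: dict, url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = make_request(
    "YOUR_TOKEN",
    {"modelId": "YOUR_MODEL_ID", "contentText": "Hello world", "emotion": "happy"},
)
```

Sending it is one call to `urllib.request.urlopen(request)`; the response body is JSON in the envelope format described at the top of this page.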
Response example
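Responses arrive in the { "code": 20000, "msg": "ok", "data": { ... } } envelope, and the final audioUrl comes from polling Query SoundClone Task. The sketch below shows one way to handle both. It assumes that any code other than 20000 is an error and that audioUrl appears inside data once the task finishes; neither is confirmed here. `fetch_task` is a hypothetical caller-supplied function that performs the query call and returns the decoded JSON, and failure states of the task are not modeled.

```python
import time
from typing import Callable

SUCCESS_CODE = 20000  # taken from the documented envelope example


class ApiError(Exception):
    """Raised when the envelope's code is not the success value."""


def unwrap(envelope: dict) -> dict:
    """Return the envelope's data, or raise ApiError on a non-success code."""
    if envelope.get("code") != SUCCESS_CODE:
        raise ApiError(f"{envelope.get('code')}: {envelope.get('msg')}")
    return envelope.get("data") or {}


def wait_for_audio_url(fetch_task: Callable[[], dict],
                       interval: float = 2.0,
                       max_attempts: int = 30,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until data contains audioUrl (assumed location), then return it."""
    for _ in range(max_attempts):
        data = unwrap(fetch_task())
        url = data.get("audioUrl")
        if url:
            return url
        sleep(interval)  # wait before the next query
    raise TimeoutError("audioUrl not available after polling")
```

Passing `sleep` as a parameter keeps the loop testable without real delays; the interval and attempt count are arbitrary defaults, not documented recommendations.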
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
application/json
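| Field | Constraint |
|---|---|
| contentText | Text to synthesize, max 10,000 characters |
| soundVersion | Available options: v1, v2 |
| emotion | Available options: happy, sad, angry, fearful, disgusted, surprised, neutral |
| speed | Required range: 0.5 <= x <= 2 |
| vol | Required range: 0 < x <= 10 |
| pitch | Required range: -12 <= x <= 12 |
| subtitleType | Available options: word |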