aodianyun/indextts2-thai

Public
5 runs

Run aodianyun/indextts2-thai with an API

Use one of our client libraries to get started quickly. Clicking on a library will take you to the Playground tab where you can tweak different inputs, see the results, and copy the corresponding code to use in your own project.

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value is used.

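| Field | Type | Default value | Description |
| --- | --- | --- | --- |
| prompt_audio | string | | Speaker reference audio (Thai; WAV or MP3). |
| text | string | | Text to synthesize (preferably mostly Thai; a small amount of English may be mixed in). |
| temperature | number | 0.5 | Sampling temperature. Lower values are more stable and produce fewer off-target timbres; higher values are more expressive (recommended 0.3 to 0.8). |
| top_p | number | 0.7 | Nucleus sampling cutoff probability. Lower values are more conservative; higher values are more open (recommended 0.5 to 0.9). |
| top_k | integer | 20 | Number of candidate tokens per step. Smaller values are more stable; larger values are more creative but more prone to noise (recommended 10 to 30). |
| do_sample | boolean | True | Whether to enable sampling. When False, beam search is used, which is generally more stable and less noisy. |
| num_beams | integer | 1 | Number of beams for beam search. Takes effect only when greater than 1 and do_sample=False. Larger values are slower but more stable (recommended 3 to 5). |
| length_penalty | number | 0 | Length penalty coefficient for beam search. 0 means no excessive preference for long sentences; keeping it at 0 is generally fine. |
| repetition_penalty | number | 1.2 | Repetition penalty coefficient. A value slightly above 1 reduces odd repetitions and stuttering (recommended 1.1 to 1.3). |
| max_mel_tokens | integer | 1500 | Maximum number of mel tokens. A higher limit makes truncation less likely but slows generation (recommended 1500 to 2200). |
| max_text_tokens_per_segment | integer | 120 | Maximum text length per segment. Lowering it moderately can improve stability on long sentences (recommended 80 to 120). |
| interval_silence | integer | 200 | Duration of silence inserted between segments, in milliseconds. Controls the pause between sentences (recommended 200 to 400). |
| emo_alpha | number | 0.8 | Emotion intensity in [0, 1]. Smaller values stay closer to the original voice and are more stable; larger values make the emotion more exaggerated (recommended 0.5 to 0.8). |
| use_emo_text | boolean | False | Whether to infer the emotion vector automatically from the text or emo_text, without relying on an emotion reference audio. |
| emo_text | string | | Separate emotion prompt text. When left empty, the synthesis text itself is used. |
| use_random | boolean | False | Whether to add randomness when sampling the emotion vector. Keeping it off is generally recommended for reproducibility and stability. |
| verbose | boolean | False | Whether to print detailed debug information. Enable it only when troubleshooting. |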
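The defaults and recommended ranges listed for the input fields can be collected into a small helper that builds a request payload. The following is a minimal sketch, not an official client: `build_input`, `out_of_range`, `beam_search_overrides`, and `run_model` are hypothetical helper names, and the reference audio URL in the usage is a placeholder. `run_model` assumes this page is hosted on Replicate and that the third-party `replicate` Python package is installed with `REPLICATE_API_TOKEN` set; the model reference may also need a `:version` suffix.

```python
from typing import Any, Dict, List

# Defaults copied from the input schema table on this page.
DEFAULTS: Dict[str, Any] = {
    "temperature": 0.5, "top_p": 0.7, "top_k": 20,
    "do_sample": True, "num_beams": 1, "length_penalty": 0,
    "repetition_penalty": 1.2, "max_mel_tokens": 1500,
    "max_text_tokens_per_segment": 120, "interval_silence": 200,
    "emo_alpha": 0.8, "use_emo_text": False,
    "use_random": False, "verbose": False,
}

# Recommended ranges from the field descriptions (advice, not hard limits).
RECOMMENDED = {
    "temperature": (0.3, 0.8), "top_p": (0.5, 0.9), "top_k": (10, 30),
    "repetition_penalty": (1.1, 1.3), "max_mel_tokens": (1500, 2200),
    "max_text_tokens_per_segment": (80, 120),
    "interval_silence": (200, 400), "emo_alpha": (0.5, 0.8),
}


def build_input(prompt_audio: str, text: str, **overrides: Any) -> Dict[str, Any]:
    """Merge caller overrides onto the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS) - {"emo_text"}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {"prompt_audio": prompt_audio, "text": text, **DEFAULTS, **overrides}
    # emo_alpha is documented as an intensity in [0, 1].
    if not 0 <= payload["emo_alpha"] <= 1:
        raise ValueError("emo_alpha must be in [0, 1]")
    return payload


def out_of_range(payload: Dict[str, Any]) -> List[str]:
    """Return the fields whose values fall outside the recommended ranges."""
    return [name for name, (lo, hi) in RECOMMENDED.items()
            if not lo <= payload[name] <= hi]


def beam_search_overrides(num_beams: int = 3) -> Dict[str, Any]:
    """num_beams only takes effect when it is > 1 and do_sample is False."""
    if num_beams <= 1:
        raise ValueError("num_beams must be greater than 1 for beam search")
    return {"do_sample": False, "num_beams": num_beams}


def run_model(payload: Dict[str, Any], model_ref: str = "aodianyun/indextts2-thai"):
    """Assumption: hosted on Replicate; needs the `replicate` package and a token."""
    import replicate  # third-party, imported lazily so the helpers work without it
    return replicate.run(model_ref, input=payload)


if __name__ == "__main__":
    payload = build_input(
        "https://example.com/reference.wav",  # placeholder URL
        "สวัสดีครับ",
        **beam_search_overrides(4),
    )
    print(payload["do_sample"], payload["num_beams"], out_of_range(payload))
```

Sending the defaults explicitly is equivalent to omitting them; the merged payload is only there so the range check can see every value.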

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{
  "type": "string",
  "title": "Output",
  "format": "uri"
}
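Since the output schema describes a single string in URI format, a caller will usually want to check the value and save the referenced file. The sketch below uses only the Python standard library; `output_filename` and `download` are hypothetical helper names, and the `output.wav` fallback is an assumption, because the page does not state the audio format of the result. If a client library wraps the result in an object rather than a plain string, converting it with `str()` first is assumed to yield the URI.

```python
import shutil
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def output_filename(output_uri: str, fallback: str = "output.wav") -> str:
    """Validate an http(s) URI and derive a local file name from its path."""
    parsed = urlparse(output_uri)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an http(s) URI: {output_uri!r}")
    name = PurePosixPath(parsed.path).name
    # Fall back to an assumed name when the URI path has no final component.
    return name or fallback


def download(output_uri: str, dest_dir: str = ".") -> Path:
    """Stream the file behind the output URI into dest_dir and return its path."""
    dest = Path(dest_dir) / output_filename(output_uri)
    with urlopen(output_uri) as response, open(dest, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest
```

A call such as `download(str(result), "out")` would then write the file into an existing `out` directory, where `result` is whatever the API call returned.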