jichengdu / spark-tts

0.5B

  • Public
  • 46 runs
  • L40S

Input

*string

Text for TTS generation - REQUIRED in both modes

string

TTS mode: voice cloning requires a prompt audio file to mimic the voice; voice creation generates speech with specified gender/pitch/speed parameters.

Default: "voice_creation"

file

[Voice Cloning] Path to the prompt audio file - REQUIRED in voice cloning mode

string

[Voice Cloning] Transcript of prompt audio - Optional, but providing it improves quality

Default: ""

string

[Voice Creation] Voice gender - REQUIRED in voice creation mode

Default: "female"

string

[Voice Creation] Voice pitch level - REQUIRED in voice creation mode

Default: "moderate"

string

[Voice Creation] Speaking speed - REQUIRED in voice creation mode

Default: "moderate"
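
The mode-dependent requirements above can be sketched as a small helper that assembles an input payload and rejects incomplete combinations. This is only an illustration: the page does not show the model's actual input field names, so every key used here (`text`, `mode`, `prompt_audio`, `prompt_text`, `gender`, `pitch`, `speed`) is a hypothetical placeholder, and the mode string `"voice_cloning"` is an assumed counterpart to the documented default `"voice_creation"`.

```python
from typing import Any, Dict, Optional

# NOTE: all key names below are hypothetical placeholders; the page does not
# list the model's real input field names. "voice_cloning" is likewise an
# assumed counterpart to the documented default "voice_creation".


def build_tts_input(
    text: str,
    mode: str = "voice_creation",
    prompt_audio: Optional[str] = None,
    prompt_text: str = "",
    gender: str = "female",
    pitch: str = "moderate",
    speed: str = "moderate",
) -> Dict[str, Any]:
    """Assemble an input payload, enforcing the per-mode required fields."""
    # The text to synthesize is required in both modes.
    if not text.strip():
        raise ValueError("text is required in both modes")

    payload: Dict[str, Any] = {"text": text, "mode": mode}

    if mode == "voice_cloning":
        # Voice cloning needs a prompt audio file; the transcript is optional.
        if not prompt_audio:
            raise ValueError("prompt_audio is required in voice cloning mode")
        payload["prompt_audio"] = prompt_audio
        if prompt_text:
            payload["prompt_text"] = prompt_text
    elif mode == "voice_creation":
        # Voice creation needs gender, pitch, and speed.
        for name, value in (("gender", gender), ("pitch", pitch), ("speed", speed)):
            if not value:
                raise ValueError(f"{name} is required in voice creation mode")
            payload[name] = value
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    return payload
```

Calling `build_tts_input("Hello")` relies on the documented defaults, while passing `mode="voice_cloning"` without a prompt audio path raises a `ValueError` before any request would be sent.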

number

Sampling temperature (0.0-1.0) - Controls randomness in generation

Default: 0.8

integer

Top-k sampling parameter - Limits the token selection to the top k options

Default: 50

number

Top-p sampling parameter - Nucleus sampling probability threshold

Default: 0.95
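
To make the three sampling parameters concrete, here is a minimal sketch of how temperature, top-k, and top-p are commonly combined when picking the next token. It is a generic illustration in plain Python, not this model's decoding code; the function name `sample_token`, the toy token dictionary, and the order of the filters (temperature, then top-k, then top-p) are assumptions, and real implementations may differ in detail.

```python
import math
import random
from typing import Dict, Optional


def sample_token(
    logits: Dict[str, float],
    temperature: float = 0.8,
    top_k: int = 50,
    top_p: float = 0.95,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick one token from a {token: logit} mapping (illustrative only)."""
    rng = rng or random.Random()

    # A temperature of zero is treated here as greedy selection.
    if temperature <= 0:
        return max(logits, key=logits.get)

    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # Top-k: keep only the k most probable tokens, then renormalize.
    ranked = ranked[: max(1, top_k)]
    kept_mass = sum(p for _, p in ranked)
    ranked = [(tok, p / kept_mass) for tok, p in ranked]

    # Top-p: keep the smallest prefix whose cumulative probability
    # reaches the threshold.
    nucleus = []
    cumulative = 0.0
    for tok, p in ranked:
        nucleus.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    tokens = [tok for tok, _ in nucleus]
    weights = [p for _, p in nucleus]
    # random.choices normalizes the weights internally.
    return rng.choices(tokens, weights=weights, k=1)[0]
```

Lower temperatures and smaller `top_k` or `top_p` values concentrate the choice on the most likely tokens, which tends to give more stable output; higher values allow more variation between runs.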

Output


Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This model doesn't have a readme.