You're looking at a specific version of this model.

qwen/qwen3-tts:23e60a41

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.

| Field | Type | Default value | Description |
| --- | --- | --- | --- |
| text | string | | Text to synthesize into speech |
| mode | None | custom_voice | TTS mode: 'custom_voice' uses preset speakers, 'voice_clone' clones from reference audio, 'voice_design' creates a voice from a description |
| language | None | auto | Language of the text (use 'auto' for automatic detection) |
| speaker | None | Serena | Preset speaker voice (only for 'custom_voice' mode) |
| voice_description | string | | Natural language description of the desired voice (only for 'voice_design' mode). Example: 'A warm, friendly female voice with a slight British accent' |
| reference_audio | string | | Reference audio file for voice cloning (only for 'voice_clone' mode) |
| reference_text | string | | Transcript of the reference audio (recommended for 'voice_clone' mode) |
| style_instruction | string | | Optional style/emotion instruction (e.g., 'speak slowly and calmly', 'excited tone') |
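
The fields above interact through `mode`, so it can help to assemble the input in one place. The following is a minimal sketch, not part of the model's API: `build_input` is a hypothetical client-side helper, and two of its behaviors are assumptions rather than documented rules, namely that `reference_audio` and `voice_description` are treated as required in their respective modes, and that fields belonging to other modes are left out of the payload.

```python
from typing import Any, Dict, Optional

# Defaults and mode names copied from the input schema fields.
DEFAULTS = {"mode": "custom_voice", "language": "auto", "speaker": "Serena"}
MODES = ("custom_voice", "voice_clone", "voice_design")


def build_input(
    text: str,
    mode: str = DEFAULTS["mode"],
    language: str = DEFAULTS["language"],
    speaker: Optional[str] = None,
    voice_description: Optional[str] = None,
    reference_audio: Optional[str] = None,
    reference_text: Optional[str] = None,
    style_instruction: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build an input dict with only the fields the mode uses."""
    if not text:
        raise ValueError("text is required")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")

    payload: Dict[str, Any] = {"text": text, "mode": mode, "language": language}

    if mode == "custom_voice":
        # Preset speaker; fall back to the documented default.
        payload["speaker"] = speaker or DEFAULTS["speaker"]
    elif mode == "voice_clone":
        # Assumption: cloning cannot work without reference audio.
        if not reference_audio:
            raise ValueError("voice_clone mode needs reference_audio")
        payload["reference_audio"] = reference_audio
        if reference_text:  # recommended, not required
            payload["reference_text"] = reference_text
    else:  # voice_design
        # Assumption: designing a voice needs a description to work from.
        if not voice_description:
            raise ValueError("voice_design mode needs voice_description")
        payload["voice_description"] = voice_description

    if style_instruction:
        payload["style_instruction"] = style_instruction
    return payload


payload = build_input(
    "Hello there.",
    mode="voice_design",
    voice_description="A warm, friendly female voice with a slight British accent",
    style_instruction="speak slowly and calmly",
)
```

The resulting dict would then be passed as the input when running the model through whichever API client you use; the exact call depends on that client and is not shown here.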

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{"format": "uri", "title": "Output", "type": "string"}
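
Since the output is a single string in URI format, client code usually only needs to check that it looks like a URI and decide where to save the file it points to. A minimal sketch under assumptions: `output_filename` is a hypothetical helper, and the fallback name `output.wav` is a guess, because the schema does not state the audio format.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_filename(output: str, fallback: str = "output.wav") -> str:
    """Hypothetical helper: derive a local file name from the output URI."""
    parsed = urlparse(output)
    if not parsed.scheme:
        # The schema promises a URI, so a missing scheme signals a bad response.
        raise ValueError(f"expected a URI, got {output!r}")
    # Use the last path segment as the file name, if there is one.
    name = PurePosixPath(parsed.path).name
    return name or fallback


# Example with a made-up URI; real output URIs will differ.
print(output_filename("https://example.com/files/abc123/out.wav"))
```

The file itself can then be fetched from the URI with any HTTP client and written under the derived name.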