minimax/speech-2.8-turbo:fb87e0f2

Input schema

The fields you can use to run this model through the API. If you don’t give a value for a field, its default value is used.

| Field | Type | Default value | Description |
| --- | --- | --- | --- |
| text | string | | Text to narrate (max 10,000 characters). Use markers like <#0.5#> to insert pauses in seconds. |
| voice_id | string | Wise_Woman | Voice to synthesize. Pick any MiniMax system voice or a voice_id returned by https://replicate.com/minimax/voice-cloning. |
| speed | number | 1 (min 0.5, max 2) | Speech speed multiplier (0.5–2.0). Lower is slower, higher is faster. |
| volume | number | 1 (max 10) | Relative loudness. 1.0 is the default MiniMax gain. Range 0–10. |
| pitch | integer | 0 (min -12, max 12) | Semitone offset applied to the voice (−12 to +12). |
| emotion | not specified | auto | Desired delivery style. Use auto to let MiniMax choose, or pick a specific emotion. |
| english_normalization | boolean | False | Improves number and date reading for English text (adds a small amount of latency). |
| sample_rate | not specified | 32000 | Audio sample rate in Hz. |
| bitrate | not specified | 128000 | MP3 bitrate in bits per second. Only used when audio_format is mp3. |
| audio_format | not specified | mp3 | File format for the generated audio. Choose mp3 for general use, wav or flac for lossless, or pcm for raw bytes. |
| channel | not specified | mono | mono for 1 channel (default), stereo for 2 channels. |
| subtitle_enable | boolean | False | Return MiniMax subtitle metadata with sentence timestamps (non-streaming only). |
| language_boost | not specified | None | Optional language hint. Choose Automatic to let MiniMax detect the language, or pick a specific locale. |
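The numeric limits listed above can be checked on the client before a request is sent. The following is a minimal sketch only: `build_input` is a hypothetical helper name, it covers just the fields whose limits are documented here, and the rule that text must be non-empty is an assumption rather than something the schema states.

```python
def build_input(text, voice_id="Wise_Woman", speed=1.0, volume=1.0, pitch=0):
    """Return an input dict, raising ValueError if a documented limit is violated."""
    # Assumption: empty text is rejected; the schema only states the 10,000 maximum.
    if not text or len(text) > 10_000:
        raise ValueError("text must be 1 to 10,000 characters")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if not 0 <= volume <= 10:
        raise ValueError("volume must be between 0 and 10")
    # pitch is typed as an integer, so reject floats and booleans.
    if isinstance(pitch, bool) or not isinstance(pitch, int) or not -12 <= pitch <= 12:
        raise ValueError("pitch must be an integer between -12 and 12")
    return {
        "text": text,
        "voice_id": voice_id,
        "speed": speed,
        "volume": volume,
        "pitch": pitch,
    }


payload = build_input("Hello there.", speed=1.25, pitch=-2)
print(payload)
```

The resulting dict is intended to be passed as the model input by whichever client you use. Fields left out of it fall back to the defaults listed in the schema.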

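The text field accepts pause markers of the form <#0.5#>. As an illustration, the sketch below builds such markers and joins sentences with them. The helper names `pause` and `join_with_pauses` are hypothetical, and both the requirement that a pause be positive and the choice to place the marker with no surrounding spaces are assumptions, since the schema does not give a permitted range or spacing rule.

```python
def pause(seconds):
    """Format a pause marker such as <#0.5#> for a duration in seconds."""
    # Assumption: only positive durations make sense; no range is documented here.
    if seconds <= 0:
        raise ValueError("pause length must be positive")
    # The 'g' format drops trailing zeros, so 1.0 becomes "1" and 0.5 stays "0.5".
    return f"<#{seconds:g}#>"


def join_with_pauses(sentences, seconds=0.5):
    """Join sentences with the same pause marker between each pair."""
    return pause(seconds).join(s.strip() for s in sentences)


narration = join_with_pauses(["Welcome back.", "Let's begin."], seconds=0.5)
print(narration)
```

The joined string still counts toward the 10,000-character limit on text, markers included.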
Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{"type": "string", "format": "uri", "title": "Output"}
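Because the output is a single URI string, a client typically checks the value and then downloads the audio. The sketch below uses only the Python standard library. The function names are hypothetical, and it assumes the URI uses http or https, which the schema does not state.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def is_http_uri(value):
    """Return True if value is a string holding an absolute http(s) URI."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def download_audio(uri, dest_path):
    """Fetch the generated audio and write it to dest_path (makes a network request)."""
    if not is_http_uri(uri):
        raise ValueError(f"unexpected output: {uri!r}")
    with urlopen(uri) as response, open(dest_path, "wb") as fh:
        fh.write(response.read())
    return dest_path


# Offline check using a placeholder URI, not a real model output.
print(is_http_uri("https://example.com/output.mp3"))
```

Pick the file extension for `dest_path` to match the audio_format you requested (mp3 by default).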