Official

minimax / speech-02-turbo

A Text-to-Audio (T2A) model that offers voice synthesis, emotional expression, and multilingual capabilities. Designed for real-time applications with low latency.

  • Public
  • 237.8K runs
  • Commercial use
  • Prediction

    minimax/speech-02-turbo
    ID: by67sg9dxdrm80cpjat9x3apxw
    Status: Succeeded
    Source: Web

    Input

    text: Speech-02-series is a Text-to-Audio and voice cloning technology that offers voice synthesis, emotional expression, and multilingual capabilities. The HD version is optimized for high-fidelity applications like voiceovers and audiobooks. While the turbo one is designed for real-time applications with low latency. When using this model on Replicate, each character represents 1 token.
    pitch: 0
    speed: 1
    volume: 1
    bitrate: 128000
    channel: mono
    emotion: angry
    voice_id: Deep_Voice_Man
    sample_rate: 32000
    language_boost: English
    english_normalization:
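
The input above could be submitted from code roughly as follows. This is a minimal sketch, not an official snippet: it assumes the `replicate` Python client is installed and that `REPLICATE_API_TOKEN` is set in the environment. The names `EXAMPLE_SETTINGS`, `build_input`, and `run_prediction` are hypothetical helpers introduced for illustration. `english_normalization` is left out because the page shows no value for it.

```python
# Settings copied from the example prediction on this page.
# They are this one prediction's values, not necessarily the model's defaults.
EXAMPLE_SETTINGS = {
    "pitch": 0,
    "speed": 1,
    "volume": 1,
    "bitrate": 128000,
    "channel": "mono",
    "emotion": "angry",
    "voice_id": "Deep_Voice_Man",
    "sample_rate": 32000,
    "language_boost": "English",
}


def build_input(text: str, **overrides) -> dict:
    """Combine the text with the example settings, allowing per-call overrides."""
    unknown = set(overrides) - set(EXAMPLE_SETTINGS)
    if unknown:
        # Fail early on typos instead of sending an unexpected key to the API.
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    return {"text": text, **EXAMPLE_SETTINGS, **overrides}


def run_prediction(model_input: dict):
    """Send the input to the model. Requires network access and an API token."""
    import replicate  # third-party client, imported lazily (assumed installed)

    return replicate.run("minimax/speech-02-turbo", input=model_input)


if __name__ == "__main__":
    params = build_input("Hello from speech-02-turbo.", speed=1.2)
    print(params)
    # Uncomment to call the model; the exact shape of the returned
    # audio output is not shown on this page, so inspect it before use.
    # output = run_prediction(params)
```

Keeping the input construction separate from the network call makes it possible to check the parameters locally before spending any tokens.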

    Output

    Input tokens: 380
    Output tokens: 1
    Tokens per second: 0.42 tokens / second
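
The example text states that each character represents 1 token on Replicate, so the input token count can be estimated before submitting a request. The sketch below takes that statement literally and counts every character, including spaces and punctuation; whether the service counts them exactly this way is an assumption, so treat the result as an estimate. `estimate_input_tokens` is a hypothetical helper name.

```python
def estimate_input_tokens(text: str) -> int:
    """Estimate input tokens, assuming one token per character of text."""
    return len(text)


if __name__ == "__main__":
    sample = "Hello, world"
    print(estimate_input_tokens(sample))
```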
