bzikst/s2-pro

Fish Audio S2 Pro is a leading text-to-speech (TTS) model with fine-grained inline control of prosody and emotion. It was trained on more than 10 million hours of audio data across more than 80 languages.

Public
36 runs

Run bzikst/s2-pro with an API

Use one of our client libraries to get started quickly.
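As a rough illustration of calling the model from Python, the sketch below uses the `replicate` client package, which must be installed separately and needs `REPLICATE_API_TOKEN` set in the environment. The helper names `model_ref` and `synthesize` are hypothetical. The version ID is left as an optional argument because community models may need to be pinned to a specific version, and this page does not list one.

```python
from typing import Optional


def model_ref(owner: str, name: str, version: Optional[str] = None) -> str:
    """Build a model reference of the form owner/name or owner/name:version."""
    ref = f"{owner}/{name}"
    return f"{ref}:{version}" if version else ref


def synthesize(text: str, version: Optional[str] = None):
    """Run the model once and return the client's output value.

    Assumes `pip install replicate` and REPLICATE_API_TOKEN in the environment.
    """
    import replicate  # imported lazily so model_ref works without the package

    return replicate.run(
        model_ref("bzikst", "s2-pro", version),
        input={"text": text},
    )


# Example call (requires network access and a valid API token):
# output = synthesize("Hello from S2 Pro.")
```

Only `text` is passed here, so every other field falls back to its default.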

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.

| Field | Type | Default value | Range | Description |
| --- | --- | --- | --- | --- |
| text | string | (none) | | Text to synthesize |
| reference_audio | string | (none) | | Reference audio for voice cloning |
| reference_text | string | (none) | | Transcript of the reference audio |
| chunk_length | integer | 200 | Min: 100, Max: 300 | Chunk length for iterative prompting |
| max_new_tokens | integer | 1024 | Max: 4096 | Maximum new tokens, 0 means no limit |
| top_p | number | 0.8 | Min: 0.1, Max: 1 | Top-p |
| repetition_penalty | number | 1.1 | Min: 0.9, Max: 2 | Repetition penalty |
| temperature | number | 0.8 | Min: 0.1, Max: 1 | Sampling temperature |
| seed | integer | (none) | | Deterministic seed, omit for random generation |
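The ranges in the input schema can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official client: `BOUNDS` and `build_input` are hypothetical names, the bounds are copied from the schema above, and treating `text` as required is an assumption.

```python
from typing import Any, Dict

# (min, max) bounds copied from the input schema; None means no bound is listed.
BOUNDS = {
    "chunk_length": (100, 300),
    "max_new_tokens": (None, 4096),
    "top_p": (0.1, 1),
    "repetition_penalty": (0.9, 2),
    "temperature": (0.1, 1),
}

# Fields without a documented numeric range.
UNBOUNDED = {"reference_audio", "reference_text", "seed"}


def build_input(text: str, **options: Any) -> Dict[str, Any]:
    """Return an input dict, rejecting values outside the documented ranges.

    Fields that are not passed are left out so the model's defaults apply.
    """
    payload: Dict[str, Any] = {"text": text}
    for key, value in options.items():
        if key not in BOUNDS and key not in UNBOUNDED:
            raise KeyError(f"unknown field: {key}")
        low, high = BOUNDS.get(key, (None, None))
        if low is not None and value < low:
            raise ValueError(f"{key}={value} is below the minimum {low}")
        if high is not None and value > high:
            raise ValueError(f"{key}={value} is above the maximum {high}")
        payload[key] = value
    return payload
```

For example, `build_input("Hi", temperature=0.7, seed=42)` returns a dict with just those three fields, while `build_input("Hi", chunk_length=50)` raises `ValueError` because 50 is below the minimum of 100.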

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{
  "type": "string",
  "title": "Output",
  "format": "uri"
}
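Since the output is a single URI string, one way to consume it is to download the file it points to. The sketch below uses only the Python standard library. `output_filename` and `save_output` are hypothetical helpers, and the fallback filename is an arbitrary choice. Depending on the client version, the returned value may be a file-like wrapper rather than a plain string, so this assumes you already have the URI as a string.

```python
import os
import urllib.request
from urllib.parse import urlparse


def output_filename(uri: str, fallback: str = "output.audio") -> str:
    """Derive a local filename from the last path segment of the output URI."""
    name = os.path.basename(urlparse(uri).path)
    return name or fallback


def save_output(uri: str, directory: str = ".") -> str:
    """Download the file behind the URI and return the local path."""
    path = os.path.join(directory, output_filename(uri))
    with urllib.request.urlopen(uri) as response, open(path, "wb") as fh:
        fh.write(response.read())
    return path
```

The audio format is not stated in the schema, so the file extension is taken from the URI itself rather than assumed.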