bzikst/higgs-audio-v3-tts-4b

Public

3.6K runs

Run bzikst/higgs-audio-v3-tts-4b with an API

Use one of our client libraries to get started quickly. In the Playground tab you can tweak the inputs, see the results, and copy the corresponding code into your own project.
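As a rough illustration of what calling the model from a client library could look like, here is a minimal sketch using the `replicate` Python package. The helper names `build_input` and `synthesize` are hypothetical and not part of any library. The sketch assumes the `replicate` package is installed and that `REPLICATE_API_TOKEN` is set in the environment. Depending on the client version, the model reference may need a `:<version>` suffix, and the return value may be a URL string or a file-like object.

```python
MODEL = "bzikst/higgs-audio-v3-tts-4b"  # may need ":<version>" appended (assumption)


def build_input(text, **overrides):
    """Build the input dict (hypothetical helper).

    Overrides set to None are dropped, so the model's own defaults
    apply to every field that is not given a value.
    """
    payload = {"text": text}
    payload.update({k: v for k, v in overrides.items() if v is not None})
    return payload


def synthesize(text, model_ref=MODEL, **overrides):
    """Run the model and return whatever the client hands back (hypothetical helper)."""
    # Imported lazily so build_input stays usable without the third-party package.
    import replicate  # pip install replicate; reads REPLICATE_API_TOKEN

    return replicate.run(model_ref, input=build_input(text, **overrides))
```

A call such as `synthesize("Hello there.", temperature=0.7)` would send only `text` and `temperature`, leaving the remaining fields at their defaults.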

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.

Field	Type	Default value	Constraints	Description
text	string	Hello, this is Higgs Audio v3 TTS.	(none)	Text to synthesize. Supports inline Higgs control tokens.
sentence_pause_mode	string (enum)	off	One of: off, pause, long_pause	Automatically insert Higgs pause tokens after sentence endings.
reference_audio	string (URI)	(none)	(none)	Optional reference audio for zero-shot voice cloning (WAV/MP3).
reference_text	string	(none)	(none)	Transcript of the reference audio; materially improves cloning quality.
voice_id	string	(none)	(none)	Private embedded voice id. Leave empty unless you have a valid id.
response_format	string (enum)	opus	One of: wav, mp3, flac, opus, aac, pcm	Output audio format.
temperature	number	0.8	Min: 0, Max: 2	Sampling temperature.
top_p	number	(none)	Min: 0, Max: 1	Top-p (nucleus) sampling. Unset = server default.
top_k	integer	50	Min: 0	Top-k sampling. Higgs examples use 50.
max_new_tokens	integer	2048	Min: 1, Max: 8192	Maximum number of generated multi-codebook steps.
seed	integer	63359	(none)	Random seed for reproducibility. Unset = random.
debug_codec_tail	boolean	false	(none)	Log the final generated codec rows without modifying audio.
debug_reference_payload	boolean	false	(none)	(no description provided)

{
  "type": "object",
  "title": "Input",
  "properties": {
    "seed": {
      "type": "integer",
      "title": "Seed",
      "default": 63359,
      "x-order": 10,
      "nullable": true,
      "description": "Random seed for reproducibility. Unset = random."
    },
    "text": {
      "type": "string",
      "title": "Text",
      "default": "Hello, this is Higgs Audio v3 TTS.",
      "x-order": 0,
      "description": "Text to synthesize. Supports inline Higgs control tokens."
    },
    "top_k": {
      "type": "integer",
      "title": "Top K",
      "default": 50,
      "minimum": 0,
      "x-order": 8,
      "nullable": true,
      "description": "Top-k sampling. Higgs examples use 50."
    },
    "top_p": {
      "type": "number",
      "title": "Top P",
      "maximum": 1,
      "minimum": 0,
      "x-order": 7,
      "nullable": true,
      "description": "Top-p (nucleus) sampling. Unset = server default."
    },
    "voice_id": {
      "type": "string",
      "title": "Voice Id",
      "x-order": 4,
      "nullable": true,
      "description": "Private embedded voice id. Leave empty unless you have a valid id."
    },
    "temperature": {
      "type": "number",
      "title": "Temperature",
      "default": 0.8,
      "maximum": 2,
      "minimum": 0,
      "x-order": 6,
      "description": "Sampling temperature."
    },
    "max_new_tokens": {
      "type": "integer",
      "title": "Max New Tokens",
      "default": 2048,
      "maximum": 8192,
      "minimum": 1,
      "x-order": 9,
      "description": "Maximum number of generated multi-codebook steps."
    },
    "reference_text": {
      "type": "string",
      "title": "Reference Text",
      "x-order": 3,
      "nullable": true,
      "description": "Transcript of the reference audio; materially improves cloning quality."
    },
    "reference_audio": {
      "type": "string",
      "title": "Reference Audio",
      "format": "uri",
      "x-order": 2,
      "nullable": true,
      "description": "Optional reference audio for zero-shot voice cloning (WAV/MP3)."
    },
    "response_format": {
      "enum": [
        "wav",
        "mp3",
        "flac",
        "opus",
        "aac",
        "pcm"
      ],
      "type": "string",
      "title": "response_format",
      "description": "Output audio format.",
      "default": "opus",
      "x-order": 5
    },
    "debug_codec_tail": {
      "type": "boolean",
      "title": "Debug Codec Tail",
      "default": false,
      "x-order": 11,
      "description": "Log the final generated codec rows without modifying audio."
    },
    "sentence_pause_mode": {
      "enum": [
        "off",
        "pause",
        "long_pause"
      ],
      "type": "string",
      "title": "sentence_pause_mode",
      "description": "Automatically insert Higgs pause tokens after sentence endings.",
      "default": "off",
      "x-order": 1
    },
    "debug_reference_payload": {
      "type": "boolean",
      "title": "Debug Reference Payload",
      "default": false,
      "x-order": 12
    }
  }
}
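
The enum values and numeric bounds in the schema above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official client: `ENUMS`, `RANGES`, and `validate_input` are hypothetical names, and the values are copied by hand from the schema, so they would need updating if the schema changes.

```python
# Allowed values, copied from the "enum" entries in the input schema.
ENUMS = {
    "response_format": {"wav", "mp3", "flac", "opus", "aac", "pcm"},
    "sentence_pause_mode": {"off", "pause", "long_pause"},
}

# (minimum, maximum) per numeric field; None means the schema sets no bound.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "top_k": (0, None),
    "max_new_tokens": (1, 8192),
}


def validate_input(payload):
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    for name, allowed in ENUMS.items():
        if name in payload and payload[name] not in allowed:
            errors.append(f"{name}: {payload[name]!r} is not one of {sorted(allowed)}")
    for name, (low, high) in RANGES.items():
        value = payload.get(name)
        if value is None:  # field omitted or explicitly unset
            continue
        if low is not None and value < low:
            errors.append(f"{name}: {value} is below the minimum {low}")
        if high is not None and value > high:
            errors.append(f"{name}: {value} is above the maximum {high}")
    return errors
```

For example, `validate_input({"text": "Hi", "temperature": 3})` reports one problem, because the schema caps `temperature` at 2.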

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema

{
  "type": "string",
  "title": "Output",
  "format": "uri"
}
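
Because the output is a single URI string, a caller typically has to download the audio itself. The sketch below shows one way to do that with the Python standard library. `output_filename` and `save_output` are hypothetical helpers, and the choice to use the requested `response_format` as the file extension is an assumption for illustration rather than documented behavior.

```python
import urllib.request
from pathlib import Path


def output_filename(stem, response_format="opus"):
    """Name the local file after the requested format (assumed convention)."""
    return f"{stem}.{response_format}"


def save_output(url, stem="speech", response_format="opus"):
    """Download the audio at `url` and write it to the working directory."""
    target = Path(output_filename(stem, response_format))
    # str() lets this accept either a plain URL string or an object that
    # stringifies to its URL.
    with urllib.request.urlopen(str(url)) as response:
        target.write_bytes(response.read())
    return target
```

Calling `save_output(url, response_format="wav")` would write the result to `speech.wav`.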