platform-kit / mars5-tts

A novel speech model for insane prosody.

Readme

This is a demo of MARS5, the English text-to-speech (TTS) model from CAMB.AI.

The model follows a two-stage AR-NAR pipeline (an autoregressive stage followed by a non-autoregressive stage) with a novel NAR component (see Architecture for details).
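
To make the two-stage idea concrete, here is a toy sketch of a generic AR-then-NAR data flow. It is not MARS5's code: the function names (`ar_stage`, `nar_stage`, `synthesize`), the integer "tokens", and the arithmetic are all hypothetical stand-ins chosen only to show the control flow. The AR stage produces one token per step, each depending on the previous one, while the NAR stage refines every position independently of the others.

```python
from typing import List


def ar_stage(text: str, ref_tokens: List[int], n_steps: int) -> List[int]:
    """Toy autoregressive stage: each token depends on the one before it."""
    tokens: List[int] = []
    # Hypothetical conditioning signal built from the text and reference tokens.
    seed = sum(ord(c) for c in text) + sum(ref_tokens)
    for step in range(n_steps):
        prev = tokens[-1] if tokens else seed
        tokens.append((prev * 31 + step) % 256)  # must run sequentially
    return tokens


def nar_stage(coarse: List[int], n_levels: int) -> List[List[int]]:
    """Toy non-autoregressive stage: positions are refined independently,
    so a real model could process them all in parallel."""
    return [[(tok + level) % 256 for level in range(n_levels)] for tok in coarse]


def synthesize(text: str, ref_tokens: List[int],
               n_steps: int = 8, n_levels: int = 4) -> List[List[int]]:
    """Run the coarse sequential pass, then the parallel refinement pass."""
    coarse = ar_stage(text, ref_tokens, n_steps)
    return nar_stage(coarse, n_levels)


if __name__ == "__main__":
    frames = synthesize("hello world", ref_tokens=[1, 2, 3])
    print(len(frames), "positions,", len(frames[0]), "values per position")
```

In this sketch the sequential loop is the only place where one output depends on another, which is the property that distinguishes the AR stage from the NAR stage.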

With just 5 seconds of audio and a snippet of text, MARS5 can generate speech even for prosodically hard and diverse scenarios like sports commentary, anime and more.
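
Since the model needs only about 5 seconds of audio, a longer recording can be trimmed before use. The sketch below is one way to do that with Python's standard `wave` module, assuming the clip is an uncompressed PCM WAV file. The helper name `trim_wav` and the file paths are hypothetical and are not part of MARS5 or this demo.

```python
import wave


def trim_wav(src_path: str, dst_path: str, max_seconds: float = 5.0) -> float:
    """Copy at most the first `max_seconds` of a PCM WAV file to `dst_path`.

    Returns the duration, in seconds, of the clip that was written.
    """
    with wave.open(src_path, "rb") as reader:
        params = reader.getparams()
        # Keep the whole file if it is already shorter than the limit.
        n_keep = min(reader.getnframes(), int(params.framerate * max_seconds))
        frames = reader.readframes(n_keep)

    with wave.open(dst_path, "wb") as writer:
        # Reuse channels, sample width, and sample rate from the source;
        # the frame count in the header is corrected when the file is closed.
        writer.setparams(params)
        writer.writeframes(frames)

    return n_keep / params.framerate


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    seconds = trim_wav("reference_full.wav", "reference_5s.wav")
    print(f"wrote {seconds:.2f} s of audio")
```

The helper keeps the first seconds of the recording for simplicity; choosing a different window with clear speech may work better in practice.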