vladpolbennikov/kokoro-82m-all-voices | Run with an API on Replicate

Kokoro v1.0 2025 Jan 27 - text-to-speech (82M params, based on StyleTTS2)

Public

4.5K runs

Run time and cost

This model costs approximately $0.0061 to run on Replicate, or 163 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
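The per-run figure above can be turned into a quick budget estimate. The sketch below is plain arithmetic on the quoted approximate cost; the function names are illustrative and not part of any Replicate tooling, and actual billing depends on your inputs.

```python
# Approximate per-run cost quoted on the model page (USD); real cost varies with inputs.
COST_PER_RUN_USD = 0.0061


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in one dollar at the given per-run cost."""
    return int(1 / cost_per_run)


def estimated_cost(runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Rough total cost in USD for a number of runs."""
    return runs * cost_per_run


if __name__ == "__main__":
    print(runs_per_dollar())               # whole runs per dollar
    print(round(estimated_cost(1000), 2))  # rough cost of 1000 runs
```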

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 28 seconds, but prediction time varies significantly with the inputs.

Readme

license: apache-2.0
language:
- en
base_model:
- yl4579/StyleTTS2-LJSpeech
pipeline_tag: text-to-speech

This is a fork of the original Kokoro repo, made to provide easy inference on Replicate. I am not affiliated with the original Kokoro authors, and this is not an official release of the Kokoro model. Like the Hugging Face Space, this implementation splits text automatically to support long-form inputs. See the original README below for more details.
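To illustrate what automatic text splitting for long-form input can look like, here is a minimal sketch that groups sentences into chunks under a character limit. It is not the code used by this fork or by the Space; the function name `split_text` and the `max_chars` default are assumptions made for the example.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 300) -> List[str]:
    """Group sentences into chunks of at most max_chars characters.

    Hypothetical helper for illustration only. A single sentence longer
    than max_chars is kept whole as its own (oversized) chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Close the current chunk and start a new one with this sentence.
            if current:
                chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_text("One. Two. Three.", max_chars=9):
        print(chunk)
```

Each chunk could then be synthesized separately and the resulting audio concatenated in order; splitting on sentence boundaries keeps pauses in natural places.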

Model created 9 months, 4 weeks ago