vladpolbennikov/kokoro-82m-all-voices

Kokoro v1.0 2025 Jan 27 - text-to-speech (82M params, based on StyleTTS2)

Public
928 runs

Run time and cost

This model costs approximately $0.00022 to run on Replicate, or 4545 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 1 second.

Readme

license: apache-2.0
language:
- en
base_model:
- yl4579/StyleTTS2-LJSpeech
pipeline_tag: text-to-speech

This is a fork of the original Kokoro repo, created to provide easy inference on Replicate. I am not affiliated with the original Kokoro authors, and this is not an official release of the Kokoro model. Like the Hugging Face Space, this implementation automatically splits text so that it can handle long-form inputs. See the original README below for more details.
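
To illustrate the idea of automatic text splitting mentioned above, here is a minimal sketch of one common approach: greedily packing whole sentences into chunks under a character limit, so each chunk can be synthesized separately and the audio concatenated. This is not the code used by this fork or by the Hugging Face Space; the function name `split_text` and the `max_chars` parameter (and its default of 200) are assumptions made for illustration only.

```python
import re


def split_text(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    Hypothetical helper for illustration; not this model's actual splitter.
    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            # Sentence fits, or the chunk is empty and must take it anyway.
            current = candidate
        else:
            # Chunk is full: close it and start a new one with this sentence.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Splitting on sentence boundaries rather than at a fixed character offset keeps each chunk a natural unit of speech, which tends to matter for prosody. A production splitter would likely also handle abbreviations, very long sentences, and non-English punctuation.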