Kokoro is a frontier text-to-speech (TTS) model for its size, at 82 million parameters (text in, audio out).
SoTA zero-shot voice cloning and TTS model
SoTA depth estimation