suno-ai / bark

🔊 Text-Prompted Generative Audio Model

  • Public
  • 252.3K runs

Run time and cost

This model costs approximately $0.0087 per run on Replicate (about 114 runs per $1), though the cost varies with your inputs. It is also open source, and you can run it on your own computer with Docker.

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 39 seconds.
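As a rough illustration of the figures quoted above, the following sketch turns the per-run cost and typical completion time into a batch estimate. The function names are hypothetical, and the result assumes every run matches the quoted averages and that runs execute one after another, which real workloads may not.

```python
import math

# Approximate figures quoted on this page; actual values vary with inputs.
COST_PER_RUN_USD = 0.0087
TYPICAL_RUN_SECONDS = 39


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in one dollar."""
    return math.floor(1 / cost_per_run)


def estimate_batch(n_runs: int,
                   cost_per_run: float = COST_PER_RUN_USD,
                   seconds_per_run: float = TYPICAL_RUN_SECONDS) -> dict:
    """Estimate total cost and serial wall-clock time for n_runs predictions.

    Assumes each run costs and takes the quoted average (a simplification).
    """
    return {
        "cost_usd": round(n_runs * cost_per_run, 2),
        "minutes": n_runs * seconds_per_run / 60,
    }


print(runs_per_dollar())      # 114
print(estimate_batch(1000))
```

Under these assumptions, 1,000 runs would cost roughly $8.70 and take about 650 minutes if executed serially.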

Readme

🐶 Bark

Original repo: https://github.com/suno-ai/bark

Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio, including music, background noise and simple sound effects. The model can also produce nonverbal sounds such as laughing, sighing and crying. To support the research community, we are providing access to pretrained model checkpoints ready for inference.
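A minimal sketch of local inference with the pretrained checkpoints might look like the following. It assumes the `bark` package from the original repository and `scipy` are installed; `build_prompt` and `synthesize` are hypothetical helper names, and the bracketed cue convention (for example `[laughs]`) is taken from the upstream repository's documentation rather than from this page.

```python
from pathlib import Path
from typing import Optional


def build_prompt(text: str, nonverbal: Optional[str] = None) -> str:
    """Return the text with an optional bracketed nonverbal cue appended.

    The "[cue]" convention is an assumption based on the upstream README.
    """
    text = text.strip()
    if nonverbal:
        return f"{text} [{nonverbal.strip()}]"
    return text


def synthesize(prompt: str, out_path: str = "bark_generation.wav") -> Path:
    """Generate audio for the prompt and write it to a WAV file."""
    # Imported lazily so the helper above works without Bark installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()                      # downloads checkpoints on first use
    audio_array = generate_audio(prompt)  # NumPy array of audio samples
    write_wav(out_path, SAMPLE_RATE, audio_array)
    return Path(out_path)
```

Calling `synthesize(build_prompt("Hello, my name is Suno.", "laughs"))` would write a WAV file to the working directory. The first call is slow because the checkpoints are downloaded and loaded, and a GPU is strongly advisable.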

🙏 Appreciation

  • nanoGPT for a dead-simple and blazing fast implementation of GPT-style models
  • EnCodec for a state-of-the-art implementation of a fantastic audio codec
  • AudioLM for closely related training and inference code
  • Vall-E, AudioLM and many other ground-breaking papers that enabled the development of Bark

© License

Bark is licensed under the MIT License.

Please contact us at bark@suno.ai to request access to a larger version of the model.