Original repo: https://github.com/suno-ai/bark
Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio, including music, background noise, and simple sound effects. The model can also produce nonverbal sounds such as laughing, sighing, and crying. To support the research community, we are providing access to pretrained model checkpoints that are ready for inference.
Acknowledgements:
- nanoGPT for a dead-simple and blazing fast implementation of GPT-style models
- EnCodec for a state-of-the-art implementation of a fantastic audio codec
- AudioLM for very related training and inference code
- Vall-E, AudioLM and many other ground-breaking papers that enabled the development of Bark
Bark is licensed under a non-commercial license: CC BY-NC 4.0. The Suno models themselves may be used commercially. However, this version of Bark uses EnCodec as a neural codec backend, which is licensed under a non-commercial license.
Please contact us at firstname.lastname@example.org if you need access to a larger version of the model and/or a version of the model you can use commercially.