Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization