mtg / effnet-discogs

An EfficientNet for classifying music into 400 styles from the Discogs taxonomy

Run time and cost

This model runs on CPU hardware. Predictions typically complete within 10 seconds, although prediction time varies significantly with the inputs.

Readme

effnet-discogs

effnet-discogs is an EfficientNet model trained to classify music into 400 of the most popular Discogs music styles. The output plot also shows the Discogs genre that each predicted style belongs to.

This model was trained on more than two million music recordings from an in-house dataset annotated with Discogs metadata and is part of ongoing research.

The architecture consists of an EfficientNet in its B0 configuration with an additional penultimate dense layer plus batch normalization, which makes it easier to use the model as an embedding extractor.
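To illustrate why a penultimate dense layer helps with embedding extraction, here is a minimal sketch of such a classification head in plain Python. It is not the model's actual implementation. The layer sizes, the parameter names (`w_embed`, `bn`, `w_out`), the placement of batch normalization directly after the dense layer, and the sigmoid output are all assumptions made for this example.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def batch_norm(x, mean, var, gamma, beta, eps=1e-5):
    """Inference-time batch normalization using stored statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + b
            for v, m, s, g, b in zip(x, mean, var, gamma, beta)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def head(features, params):
    """Return (embedding, activations) for one backbone feature vector."""
    # Penultimate dense layer + batch norm: this output is the embedding.
    embedding = batch_norm(
        dense(features, params["w_embed"], params["b_embed"]),
        **params["bn"],
    )
    # Final layer: one score per style (sigmoid is assumed here).
    logits = dense(embedding, params["w_out"], params["b_out"])
    return embedding, [sigmoid(z) for z in logits]


# Toy, hypothetical parameters: 3 backbone features -> 2-d embedding -> 2 styles.
params = {
    "w_embed": [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    "b_embed": [0.0, 0.0],
    "bn": {"mean": [0.0, 0.0], "var": [1.0, 1.0],
           "gamma": [1.0, 1.0], "beta": [0.0, 0.0]},
    "w_out": [[1.0, 0.0], [0.0, -1.0]],
    "b_out": [0.0, 0.0],
}
embedding, activations = head([0.5, 0.2, 0.3], params)
print(embedding, activations)
```

Because the embedding is returned alongside the style activations, a downstream task can reuse it without running the final classification layer.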

This demo outputs the top_n music style activations, summarized as their mean and standard deviation through time.
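The summary step can be sketched as follows, assuming the model produces one activation vector per time frame. The function name `summarize_top_n`, the ranking by mean activation, the use of the population standard deviation, and the toy labels are assumptions for illustration, not the demo's actual code.

```python
from statistics import mean, pstdev


def summarize_top_n(activations, labels, top_n=5):
    """Return the top_n styles ranked by mean activation over time.

    activations: list of time frames, each a list of per-style scores.
    labels: one (genre, style) pair per column of `activations`.
    """
    if not activations:
        return []
    # Transpose so that each entry holds one style's scores over time.
    columns = list(zip(*activations))
    summary = [(label, mean(col), pstdev(col))
               for label, col in zip(labels, columns)]
    # Highest mean activation first.
    summary.sort(key=lambda row: row[1], reverse=True)
    return summary[:top_n]


# Toy example: 3 time frames, 4 hypothetical (genre, style) labels.
labels = [("Electronic", "House"), ("Electronic", "Techno"),
          ("Rock", "Punk"), ("Jazz", "Bop")]
frames = [
    [0.8, 0.4, 0.1, 0.0],
    [0.6, 0.5, 0.1, 0.0],
    [0.7, 0.3, 0.1, 0.0],
]
for (genre, style), mu, sigma in summarize_top_n(frames, labels, top_n=2):
    print(f"{style} ({genre}): mean={mu:.2f}, std={sigma:.2f}")
```

Keeping the genre next to each style mirrors the output plot, which shows the genre that each predicted style belongs to.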

License

This model is part of Essentia Models made by MTG-UPF.