mtg / effnet-discogs

An EfficientNet that classifies music into 400 styles from the Discogs taxonomy

  • Public
  • 153.9K runs
  • CPU

Input

file

Audio file to process

string

YouTube URL to process (overrides audio input)

integer

Top n music styles to show

Default: 10

string

Output either a bar chart visualization or a JSON blob

Default: "Visualization"

Output


This example was created by a different version, mtg/effnet-discogs:3b1d08bb.

Run time and cost

This model costs approximately $0.00025 to run on Replicate, or 4000 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
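As a rough illustration of the arithmetic above, the following sketch estimates the cost of a batch of predictions. The function name and the flat per-run price are assumptions made for the example; actual cost varies with the inputs, as noted.

```python
# Approximate per-run price quoted on this page (an estimate, not a guarantee).
APPROX_COST_PER_RUN_USD = 0.00025


def estimate_cost_usd(n_runs, cost_per_run=APPROX_COST_PER_RUN_USD):
    """Return the approximate cost in USD of n_runs predictions.

    Hypothetical helper: assumes every run costs the same flat amount.
    """
    return n_runs * cost_per_run


# About 4000 runs fit into one dollar at the quoted price.
runs_per_dollar = round(1 / APPROX_COST_PER_RUN_USD)
```

For example, `estimate_cost_usd(10_000)` gives roughly 2.5 dollars under this flat-price assumption.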

This model runs on CPU hardware. Predictions typically complete within 3 seconds.

Readme

effnet-discogs

effnet-discogs is an EfficientNet architecture trained to predict 400 of the most popular Discogs music styles. The output plot also shows the Discogs genre that each predicted style belongs to.

This model was trained on more than two million music recordings from an in-house dataset annotated with Discogs metadata and is part of ongoing research.

The architecture consists of an EfficientNet in its B0 configuration with an additional penultimate dense layer plus batch normalization, which makes it easier to use the model as an embedding extractor.

This demo outputs the top_n music style activations, summarized as their mean and standard deviation over time.
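The summary described above can be sketched as follows. This is a minimal illustration, not the demo's actual code: it assumes the model yields one activation vector per time frame, ranks styles by their mean activation, and uses the population standard deviation. The function name, the toy activations, and the style labels are placeholders.

```python
from statistics import fmean, pstdev


def summarize_top_n(frame_activations, labels, top_n=10):
    """Return the top_n (label, mean, std) tuples, ranked by mean activation.

    frame_activations: list of per-frame lists, one activation per style.
    labels: style names, in the same order as the activation columns.
    """
    # Transpose frames x styles into one sequence of activations per style.
    per_style = zip(*frame_activations)
    summary = [
        (label, fmean(values), pstdev(values))
        for label, values in zip(labels, per_style)
    ]
    # Highest mean activation first.
    summary.sort(key=lambda row: row[1], reverse=True)
    return summary[:top_n]


# Toy input: 3 time frames, 4 placeholder styles (not real model output).
toy_labels = ["style_a", "style_b", "style_c", "style_d"]
toy_activations = [
    [0.9, 0.1, 0.3, 0.0],
    [0.7, 0.2, 0.3, 0.0],
    [0.8, 0.0, 0.3, 0.1],
]
top_two = summarize_top_n(toy_activations, toy_labels, top_n=2)
```

Here `top_two` lists `style_a` first and `style_c` second. A style whose activation is constant across frames, such as `style_c`, has a standard deviation of zero.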

License

This model is part of Essentia Models made by MTG-UPF.