mtg / music-classifiers

Transfer learning models for music classification by genres, moods, and instrumentation




Run time and cost

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 133 seconds, but the prediction time varies significantly with the inputs.


This demo runs transfer learning classifiers trained on various public and in-house MTG datasets using different audio embeddings.

Source models used for embeddings

  • MusiCNN. A musically motivated CNN with two variants, one trained on the Million Song Dataset and the other on MagnaTagATune.
  • VGGish. A large VGG variant trained on a preliminary version of the AudioSet dataset.

Transfer learning classifiers

Each of our classifiers is a single-hidden-layer MLP trained on the embeddings produced by one of the source models above.
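
To make the classifier structure concrete, here is a minimal sketch of a single-hidden-layer MLP applied to a sequence of embedding frames. It is illustrative only, not the released implementation. The ReLU activation, softmax output, averaging of per-frame predictions, and all sizes are assumptions. The weights are random placeholders, not the trained ones, and the names `dense`, `softmax`, and `classify_track` are hypothetical.

```python
import math
import random


def dense(x, weights, bias):
    """Affine layer; `weights` holds one row of coefficients per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def classify_track(frames, w1, b1, w2, b2):
    """Run the MLP on each embedding frame, then average over frames."""
    per_frame = []
    for x in frames:
        hidden = [max(0.0, h) for h in dense(x, w1, b1)]  # ReLU (assumed)
        per_frame.append(softmax(dense(hidden, w2, b2)))  # softmax (assumed)
    n = len(per_frame)
    return [sum(p[k] for p in per_frame) / n for k in range(len(b2))]


def rand_matrix(rows, cols):
    """Random placeholder values standing in for weights or embeddings."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes, chosen small for readability.
random.seed(0)
EMB_DIM, HIDDEN_DIM, N_CLASSES, N_FRAMES = 16, 8, 3, 5

w1, b1 = rand_matrix(HIDDEN_DIM, EMB_DIM), [0.0] * HIDDEN_DIM
w2, b2 = rand_matrix(N_CLASSES, HIDDEN_DIM), [0.0] * N_CLASSES
frames = rand_matrix(N_FRAMES, EMB_DIM)  # stand-in for real audio embeddings

track_probs = classify_track(frames, w1, b1, w2, b2)
print(track_probs)
```

In practice the frames would come from MusiCNN or VGGish, and the weights from training on the target dataset. The sketch only shows how a small MLP head can turn frame-level embeddings into one track-level prediction.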


These models are part of Essentia Models, made by MTG-UPF, and are publicly available under the CC BY-NC-SA license and a commercial license.