mtg / music-arousal-valence

Regression of musical arousal and valence values

  • Public
  • 8.6K runs
  • CPU
  • GitHub
  • License

Input

file

Audio file to process

string

YouTube URL to process (overrides audio input)

string

Embedding type to use: VGGish or MusiCNN

Default: "msd-musicnn"

string

Arousal/Valence training dataset

Default: "emomusic"

string

Output either a bar chart visualization or a JSON blob

Default: "Visualization"

Output


This example was created by a different version, mtg/music-arousal-valence:1064850e.

Run time and cost

This model costs approximately $0.00046 to run on Replicate, or 2173 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
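As a rough check of the figures above, the runs-per-dollar number follows directly from the per-run cost. The sketch below uses the quoted $0.00046 per run; the function names `runs_per_dollar` and `batch_cost` are hypothetical helpers for illustration and are not part of Replicate's tooling. Actual cost varies with inputs, as noted above.

```python
def runs_per_dollar(cost_per_run_usd: float) -> int:
    """Whole number of runs that one US dollar pays for."""
    return int(1 / cost_per_run_usd)


def batch_cost(num_runs: int, cost_per_run_usd: float) -> float:
    """Approximate total cost in USD for a batch of runs."""
    return num_runs * cost_per_run_usd


COST_PER_RUN_USD = 0.00046  # approximate figure quoted on this page

print(runs_per_dollar(COST_PER_RUN_USD))             # 2173
print(round(batch_cost(1000, COST_PER_RUN_USD), 2))  # 0.46
```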

This model runs on CPU hardware. Predictions typically complete within 5 seconds.

Readme

This demo runs a set of transfer learning regression models that predict musical arousal and valence values. The models were trained on a mixture of public and in-house MTG datasets.

Source models

  • MusiCNN. A musically motivated CNN with two variants, one trained on the Million Song Dataset and one on MagnaTagATune.
  • VGGish. A large VGG variant trained on a preliminary version of the AudioSet Dataset.

Transfer learning models

Our models are single-hidden-layer MLPs trained on embeddings extracted by the source models.
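To make the architecture concrete, here is a minimal NumPy sketch of a single-hidden-layer MLP that maps per-frame embeddings to an (arousal, valence) pair. It is not the Essentia implementation: the embedding and hidden sizes, the ReLU activation, the random weights, and the averaging of per-frame predictions into one track-level value are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def init_mlp(embedding_dim: int, hidden_dim: int, seed: int = 0) -> dict:
    """Randomly initialised weights (stand-ins for trained parameters)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, size=(embedding_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, size=(hidden_dim, 2)),  # 2 outputs
        "b2": np.zeros(2),
    }


def predict(params: dict, embeddings: np.ndarray) -> np.ndarray:
    """Forward pass: (n_frames, embedding_dim) -> (n_frames, 2)."""
    # Single hidden layer with ReLU (activation choice is an assumption).
    hidden = np.maximum(0.0, embeddings @ params["W1"] + params["b1"])
    # Linear output layer, since this is regression rather than classification.
    return hidden @ params["W2"] + params["b2"]


def predict_track(params: dict, embeddings: np.ndarray) -> np.ndarray:
    """Average per-frame predictions into one pair per track (assumed pooling)."""
    return predict(params, embeddings).mean(axis=0)


# Hypothetical sizes: 5 frames of 16-dimensional embeddings, 8 hidden units.
params = init_mlp(embedding_dim=16, hidden_dim=8)
frames = np.ones((5, 16))
print(predict(params, frames).shape)        # (5, 2)
print(predict_track(params, frames).shape)  # (2,)
```

In a transfer learning setup of this kind, the embedding extractor stays fixed and only the small MLP is trained, which keeps the regression head cheap enough to run on CPU.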

License

These models are part of Essentia Models, made by MTG-UPF, and are publicly available under the CC BY-NC-SA license. A commercial license is also available.