Speech Emotion Recognition
Introduction
- This repository handles building and training a Speech Emotion Recognition system.
- The basic idea behind this tool is to build and train/test a suitable machine learning (as well as deep learning) algorithm that can recognize and detect human emotions from speech.
- This is useful in many industry fields, such as product recommendations, affective computing, etc.
- Check this tutorial for more information.
Emotions available
There are 3 emotions available: "neutral", "happy" and "sad".
Feature Extraction
Feature extraction is the main part of the speech emotion recognition system. It is basically accomplished by converting the speech waveform into a parametric representation at a relatively lower data rate.
In this repository, we have used the most common features available in the librosa library, including:
- MFCC
- Chromagram
- MEL Spectrogram Frequency (mel)
- Contrast
- Tonnetz (tonal centroid features)
Example prediction output (a probability score for each emotion):
{'happy': 0.8502438, 'sad': 1.15252915e-05, 'neutral': 8.986728e-05}
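A dictionary like the one above can be produced by pairing a scikit-learn classifier's `predict_proba` output with the emotion labels. The sketch below illustrates the idea; the function name and variable names are assumptions, not the repository's API:

```python
def predict_proba_dict(model, feature_vector):
    """Map a trained classifier's class probabilities to emotion names.

    `model` is any fitted scikit-learn classifier that supports
    predict_proba; `feature_vector` is one extracted feature vector.
    """
    probs = model.predict_proba([feature_vector])[0]
    # model.classes_ gives the label order matching predict_proba's columns
    return {label: float(p) for label, p in zip(model.classes_, probs)}
```

Using `model.classes_` rather than a hard-coded label list keeps the mapping correct regardless of the order in which labels appeared during training.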
Algorithms Used
This repository can be used to build machine learning classifiers as well as regressors.
Classifiers/regressors:
- SVC
- RandomForestClassifier
- GradientBoostingClassifier
- KNeighborsClassifier
- MLPClassifier
- BaggingClassifier
- Recurrent Neural Networks (Keras)
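To tie the pieces together, here is a minimal end-to-end training sketch using one of the classifiers listed above (SVC). The data is synthetic and stands in for extracted speech features; the 193-dimensional vector size and the three emotion labels follow the earlier sections, but everything else is illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic placeholder data: one feature vector per utterance
# (assumption: 193-dimensional vectors, three emotion classes).
rng = np.random.default_rng(42)
X = rng.normal(size=(90, 193))
y = np.repeat(["neutral", "happy", "sad"], 30)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# probability=True enables predict_proba, needed for per-emotion scores
clf = SVC(kernel="rbf", probability=True, random_state=42)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```

Any of the other scikit-learn estimators in the list (RandomForestClassifier, KNeighborsClassifier, MLPClassifier, ...) can be dropped in place of `SVC` with the same fit/score interface.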