You are on page 1of 2

PROGRESS REPORT

ON
MUSIC EMOTION RECOGNITION
USING GAUSSIAN PROCESSES
(INTRODUCTION TO ARTIFICIAL
NEURAL NETWORK MEL G622)

Submitted by:
Submitted To:
Sarthak Jain
Dr. Ananthakrishna Chintanpalli
Ankit Pareek
Instructor Incharge

Page | 1
INTRODUCTION
Alot of music data have become available recently either locally or over the Internet but in order
for users to benefit from them, an efficient music information retrieval technology is necessary.
Research in this area has focused on tasks such as genre classification, artist identification, music
mood estimation, cover song identification, music annotation, melody extraction, etc. which
facilitate efficient music search and recommendation services, intelligent play-list generation and
other attractive applications. The main power of music is in its ability to communicate and trigger
emotions in listeners. Thus, determining computationally the emotional content of music is an
important task. In this study, we concern ourselves with audio based music emotion estimation
tasks. To alleviate the challenge of ensuring consistent interpretation of mood categories, some
studies propose to describe emotion using continuous multidimensional metrics defined on low-
dimensional spaces. Most widely accepted is the Russell's two-dimensional Valence-Arousal (VA)
space where emotions are represented by points in the VA plane. In this project, the task of music
emotion recognition is to automatically find the point in the VA plane which corresponds to the
emotion induced by a given music piece. Since the Valence and Arousal are by definition
continuous and independent parameters, we can estimate them separately using the same music
feature representation and two different regression models. Gaussian Processes are based on
Kernel functions and Gram matrices. The advantage of Gaussian Processes is that their predictions
are truly probabilistic and that they provide a measure of the output uncertainty. Another big plus
is the availability of algorithms for their hyper parameter learning.

WORK COMPLETED AND WHERE WE STAND NOW


Literature review is almost completed. For the music emotion recognition we are using the
MediaEval database . It consists of 1000 clips (each 45 seconds long) taken at random locations
from 1000 different songs. They belong to the following 8 music genres: Blues, Electronic, Rock,
Classical, Folk, Jazz, Country, and Pop, and were distributed uniformly, i.e., 125 songs per genre.
We are exploring Marsyas tool to extract the features from this database.

WHAT WILL BE DONE NEXT & WHAT WE AIM TO ACHIEVE


The goal of this work is to investigate the applicability of Gaussian Process models to music
emotion recognition tasks and to compare their performance with the current state-of-the-
art Support Vector Machines. Out feature extraction task is on the verge of completion. Next,
we will be performing regression using the given dataset using both Gaussian Processes and
SVM and compare the results. As far as the novel idea is concerned, we plan to study further
about the kernel functions and try to come up with some novel modification in the kernel
function. Another possible extension would be to convert the regression problem into
classification and use supervised and unsupervised schemes for the task.

Page | 2

You might also like