Music Genre Classification

Rahul K. Prajapati

Rahul Saha

Muskan Gautam

Samarth Sala

Project Information

Report | Demo code | Dataset | GitHub code | Spotlight video

Abstract

This project proposes an automated music genre classification system based on machine learning. Features are extracted from a dataset of audio samples and used to train several classifiers. The system evaluates the performance of each classifier and provides real-time feedback. The results demonstrate the system’s effectiveness and suggest room for further enhancement. This project contributes to the advancement of automated music genre classification, improving the organization and retrieval of music content in digital libraries and streaming platforms.

Introduction

Music genre classification assigns each audio file to the genre category it belongs to. Automating the task reduces manual error and saves time: classifying music by hand requires someone to listen to every file for its full duration. To automate the process, we use machine learning and deep learning algorithms. Given a collection of audio files, the task is to assign each file to a category such as disco or hip-hop. A music genre classifier can be built using several different approaches.

Models Used

1. Decision Trees (DecisionTreeClassifier)

2. K-Nearest Neighbors (KNN Classifier)

3. Naive Bayes (GaussianNB)

4. Stochastic Gradient Descent (SGDClassifier)

5. Random Forest (RandomForestClassifier)

6. Support Vector Machine (SVC)
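The class names in parentheses above match scikit-learn estimators, so the six models can be set up as in the following minimal sketch. It assumes scikit-learn is the library in use and that "KNN Classifier" corresponds to `KNeighborsClassifier`. The hyperparameters, the `build_models` helper, and the synthetic feature matrix are illustrative assumptions, not the project's actual settings or data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_models(seed=0):
    """Return the six classifiers keyed by display name (hypothetical helper)."""
    return {
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
        "Naive Bayes": GaussianNB(),
        "Stochastic Gradient Descent": SGDClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Support Vector Machine": SVC(),
    }


# Synthetic stand-in for extracted audio features (NOT GTZAN data).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=10, n_classes=4, random_state=0
)

models = build_models()
for name, model in models.items():
    model.fit(X, y)  # every estimator exposes the same fit/predict interface
    print(name, model.predict(X[:3]))
```

Keeping the models in one dictionary makes it straightforward to loop over them later when comparing accuracies.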

Best Model

We selected the K-Nearest Neighbours (KNN) classifier as our best model. Prior studies report that it performs well, and it is still used, alongside more heavily optimized models, as a supporting component in recommendation systems. KNN is a machine learning algorithm used for both regression and classification, and it is also known as a lazy learner. For a new data point, it uses a distance measure to find the K most similar neighbours and outputs the class to which the majority of those neighbours belong.

Dataset Overview

We use the GTZAN genre collection, a widely used audio dataset. It contains approximately 1000 audio files in .wav format, spread across 10 classes: blues, hip-hop, classical, pop, disco, country, metal, jazz, reggae, and rock.
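The distance-based majority vote that KNN performs can be illustrated with a small pure-Python sketch. This is not the project's implementation: the `knn_predict` function, the use of Euclidean distance, and the toy two-dimensional points are assumptions made for illustration only.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance and keep the labels of the k closest points.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the neighbours is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy 2-D feature vectors (made up for illustration, not GTZAN features).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
labels = ["blues", "blues", "blues", "metal", "metal", "metal"]

print(knn_predict(points, labels, (0.15, 0.15)))  # blues
print(knn_predict(points, labels, (0.95, 1.0)))   # metal
```

There is no training step beyond storing the points, which is why KNN is called a lazy learner: all the work happens at prediction time.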

Model Accuracy Graphical Result

Summary

In this project, we began by loading and exploring a dataset of features related to music genres, examining its contents and class distribution. We then preprocessed the data by scaling the features and splitting it into training and testing sets for model evaluation. We defined and evaluated several machine learning models, including Naive Bayes, Stochastic Gradient Descent, K-Nearest Neighbors, Decision Trees, Random Forest, and Support Vector Machine, assessing their performance with accuracy scores and visualizing confusion matrices. We also compared the accuracies of these models in a bar plot, which shows their relative performance. Visualizations of the class distribution shed light on the composition of the dataset. Lastly, we conducted an error analysis focused on the Random Forest model, identifying and visualizing misclassified samples to understand where the model struggles and where improvements could be made. Together, these steps gave us a clearer picture of the dataset, the models' performance, and potential areas for refinement in the classification task.
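The workflow summarized above (scaling, splitting, accuracy scoring, a confusion matrix, and locating misclassified samples) might look roughly like the sketch below. It assumes scikit-learn and uses synthetic data in place of the real feature table. The split ratio, the choice to fit the scaler on the training split only, and all variable names are illustrative assumptions rather than the project's actual code. Only two of the six models are shown to keep it short.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the genre feature table (NOT the real dataset).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=10, n_classes=5, random_state=42
)

# Hold out 20% for testing, keeping class proportions equal in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

models = {
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Train each model and record its test accuracy.
accuracies = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test_s))
    print(f"{name}: {accuracies[name]:.3f}")

# Error analysis for Random Forest: confusion matrix and misclassified indices.
rf_pred = models["Random Forest"].predict(X_test_s)
cm = confusion_matrix(y_test, rf_pred)
error_idx = np.flatnonzero(rf_pred != y_test)
print(cm)
print("Misclassified test samples:", len(error_idx))
```

The `accuracies` dictionary can be passed directly to a plotting library to produce a bar plot, and `error_idx` selects the test rows worth inspecting by hand.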