Machine Learning Analysis for MCON

MCON Team USA


October 2024

1 Introduction
This report presents an analysis that uses several machine learning models to
predict popular music tracks. The primary goal is to identify tracks, artists,
and albums that are likely to become hits, based on audio features such as
danceability, energy, and loudness. The workflow combines data preprocessing
with the application of predictive models to the music dataset.

2 Methodology
Data preprocessing began by loading the music dataset and handling missing
values. The features were then scaled with StandardScaler so that each has zero
mean and unit variance. Popularity was converted into a binary class label for
the classification task, where the goal was to predict whether or not a track
would become a hit.
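
The preprocessing steps above can be sketched in plain Python as follows. This is a minimal illustration, not the code used for the report: the toy rows, the mean-imputation strategy, and the HIT_THRESHOLD cut-off are all assumptions, since the report does not state how missing values were handled or where the hit boundary was placed. The standardize function mirrors what StandardScaler computes per column (population standard deviation).

```python
from statistics import fmean, pstdev

# Hypothetical toy rows; column names and values are illustrative only.
tracks = [
    {"danceability": 0.80, "energy": 0.70, "loudness": -5.0, "popularity": 82},
    {"danceability": 0.40, "energy": None, "loudness": -11.0, "popularity": 35},
    {"danceability": 0.60, "energy": 0.50, "loudness": -8.0, "popularity": 64},
    {"danceability": 0.20, "energy": 0.30, "loudness": -14.0, "popularity": 12},
]
FEATURES = ["danceability", "energy", "loudness"]
HIT_THRESHOLD = 60  # assumed cut-off; the report does not state the one used


def impute_mean(rows, column):
    """Replace missing values in `column` with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = fmean(observed)
    for r in rows:
        if r[column] is None:
            r[column] = fill


def standardize(rows, column):
    """Scale `column` to zero mean and unit variance (population std)."""
    values = [r[column] for r in rows]
    mean, std = fmean(values), pstdev(values)
    for r in rows:
        # A constant column carries no information; map it to 0.0.
        r[column] = (r[column] - mean) / std if std else 0.0


for col in FEATURES:
    impute_mean(tracks, col)
    standardize(tracks, col)

# Binary target: 1 for a hit, 0 otherwise.
labels = [1 if r["popularity"] >= HIT_THRESHOLD else 0 for r in tracks]
```

After this step every feature column has mean 0 and standard deviation 1, and labels holds the binary target for the classifiers.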

We employed multiple machine learning models, including Logistic Regression,
Support Vector Machines (SVM), Decision Trees, and others. Logistic Regression
estimates the probability that a track is popular by applying the logistic
function to a linear combination of the input features. SVM, by contrast, seeks
the maximum-margin hyperplane that separates hit tracks from non-hits. Decision
Trees break the decision-making process into a series of conditional branches,
each testing a feature such as danceability or tempo.
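
To make the three decision rules concrete, the sketch below evaluates each one on a single track with already-scaled features. All weights, the bias, and the tree thresholds are made-up values chosen for illustration, not parameters fitted in this analysis. For brevity the logistic and SVM examples share the same weights, although in practice each model would learn its own.

```python
import math

# Hypothetical parameters for illustration only (not fitted values).
WEIGHTS = {"danceability": 1.2, "energy": 0.8, "loudness": 0.3}
BIAS = -0.5


def linear_score(track):
    """Linear combination w . x + b of the scaled input features."""
    return sum(WEIGHTS[name] * track[name] for name in WEIGHTS) + BIAS


def logistic_probability(track):
    """Logistic Regression: map the linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_score(track)))


def svm_decision(track):
    """Linear SVM: the sign of the score says which side of the hyperplane
    the track falls on; the magnitude grows with distance from it."""
    return linear_score(track)


def tree_predict(track):
    """Decision Tree: a chain of conditional branches on single features."""
    if track["danceability"] > 0.5:
        return 1 if track["energy"] > 0.0 else 0
    return 1 if track["loudness"] > 1.0 else 0


track = {"danceability": 1.0, "energy": 0.5, "loudness": -0.2}
print(logistic_probability(track), svm_decision(track), tree_predict(track))
```

The logistic output can be read directly as a probability, whereas the SVM output is an unbounded signed score and the tree returns a class label.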

After training the models, we computed a probability or decision-function
score for each track, depending on the model, and used it to rank tracks by how
likely they were to become popular. The top five tracks for each model were
selected by these scores, highlighting potential hits.
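
Selecting the top five tracks from a model's scores can be sketched as below. The track names and scores are hypothetical, and top_k is an illustrative helper rather than a function from the original analysis. The same routine works whether the score is a predicted probability or a decision-function value, since only the ordering matters.

```python
import heapq

# Hypothetical track names and scores for one model.
model_scores = {
    "Track A": 0.91,
    "Track B": 0.15,
    "Track C": 0.78,
    "Track D": 0.66,
    "Track E": 0.42,
    "Track F": 0.97,
    "Track G": 0.58,
}


def top_k(scores, k=5):
    """Return the k (track, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


top_five = top_k(model_scores)
for name, score in top_five:
    print(f"{name}: {score:.2f}")
```

Running the routine once per model yields one top-five list per model, which can then be compared across models.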

In addition, we computed the normalized means of ten audio features, including
energy, danceability, and tempo, and examined how each feature correlates with
popularity. This provided a deeper understanding of the factors influencing
popularity.
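
One possible reading of this step is sketched below. It assumes that "normalized mean" refers to the mean of a min-max normalized feature and that the correlation is the Pearson coefficient; the report confirms neither choice, and the sample values are invented.

```python
import math
from statistics import fmean


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


def normalized_mean(values):
    """Mean of the min-max normalized values (assumed definition)."""
    return fmean(min_max_normalize(values))


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither input is constant."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical feature columns and popularity scores.
energy = [0.2, 0.4, 0.6, 0.8]
tempo = [120.0, 90.0, 150.0, 100.0]
popularity = [10, 30, 50, 70]

for name, column in [("energy", energy), ("tempo", tempo)]:
    print(name, normalized_mean(column), pearson(column, popularity))
```

Normalizing first puts features with very different units, such as tempo in beats per minute and energy on a 0 to 1 scale, on a common footing before their means are compared.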

3 Results and Analysis
The machine learning models identified a number of tracks predicted to be
hits. A detailed list of these tracks is available in the accompanying CSV
file.

Furthermore, artists and albums were ranked by their mean popularity and
danceability scores, offering insight into which artists and albums had the
most influence in the dataset. This ranking sheds light on key influencers
within the music industry.
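
The ranking by mean score can be sketched as a group-and-average step. The rows and the rank_by_mean helper below are hypothetical; the same approach applies to albums by grouping on an album column instead of the artist.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical rows: (artist, popularity, danceability).
rows = [
    ("Artist X", 80, 0.70),
    ("Artist Y", 55, 0.90),
    ("Artist X", 60, 0.50),
    ("Artist Z", 90, 0.40),
    ("Artist Y", 65, 0.80),
]


def rank_by_mean(rows, value_index):
    """Group rows by artist and sort artists by the mean of one column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row[value_index])
    means = {artist: fmean(values) for artist, values in groups.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


by_popularity = rank_by_mean(rows, value_index=1)
by_danceability = rank_by_mean(rows, value_index=2)
print(by_popularity)
print(by_danceability)
```

Note that the two rankings need not agree: in this toy data the most popular artist on average is the least danceable one.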

The normalized mean scores for the ten selected audio features varied
considerably across tracks. For instance, tracks with higher energy and
danceability were often predicted to be more popular, which suggests that
these characteristics may be more indicative of a track’s potential success.

4 Conclusion
By applying several machine learning models, this analysis provided insights
into the factors that contribute to a track’s popularity. By combining the
predictions of multiple models with statistical analysis of audio features, we
were able to predict top tracks and rank influential artists and albums. This
approach offers a framework for predicting future music trends.
