Deep Learning Football
4 authors, including:
Dwijen Rudrapal
National Institute of Technology, Agartala
All content following this page was uploaded by Dwijen Rudrapal on 27 January 2021.
1 Introduction
Football is the most played and most watched sport in the world. Football's
governing body, the International Federation of Association Football (FIFA),
estimated that at the turn of the 21st century there were approximately 250
million football players and over 1.3 billion people interested in football 1.
The popularity of the game means that a huge number of people follow it, so
predicting a match result in advance is very attractive to experts and
researchers. However, it is very difficult to guess the result of a football
match from expert opinion or past statistics alone. Many factors can influence
the match outcome, such as skills, player combination, the form of key players,
teamwork, home advantage and many others. Prediction becomes more challenging
when a match involves extra time, substitute players or injuries.
Football-related data is now publicly available, and interest has grown in
developing intelligent systems to forecast the outcomes of matches. In the last
two decades, researchers have proposed several state-of-the-art methodologies
to predict outcomes using past and available data. Most previous research
frames the prediction problem as a classification problem [1, 13]: the
classifier assigns each match to one of three classes, win, loss or draw [9].
1 https://www.britannica.com/sports/football-soccer
In this paper we focus on deep neural networks for this problem. Deep neural
network techniques have been shown to be effective in various classification
problems in many domains [4], including this one [5, 7].
The main contributions of this paper are as follows:
– We propose a Multi-Layer Perceptron based prediction model to predict
football match results. We compare our model against classical machine
learning based models and report our evaluation results.
– We propose a new set of features to train our predictive model, derived from
a detailed analysis of feature performance, the role of each feature in the
match outcome, and head-to-head match results.
The remainder of this paper is organized as follows. Section 2 summarizes
promising previous work on football prediction. Section 3 presents the
proposed approach, including the dataset, followed by the experimental
results in Section 4. We discuss the error analysis of our approach in
Section 5, and Section 6 concludes the paper.
2 Related Work
In this section we discuss promising research work related to predictive
models for football match results. Research on predicting the outcomes of
football matches started as early as 1977 with [11], which proposed a model to
predict the match outcome using a matrix of goal scoring distributions.
Purucker [10] proposed an ANN model to predict results in the National Football
League (NFL) using Hamming, adaptive resonance theory (ART), self-organizing
map (SOM) and back-propagation (BP). The approach uses 5 features and was
evaluated on data from 8 rounds of matches in the competition. Kahn [5]
extended this work and achieved greater accuracy by using a larger dataset of
208 matches from the 2003 season of the NFL. The work includes features such as
home team and away team indicators. The approach in [8] presents a framework
for football match result prediction based on two components: a rule-based
component and a Bayesian network component. The performance of the system
depends on sufficient expert knowledge of past data. McCabe and Trevathan [7]
predict results for 4 different kinds of sports: Rugby League, Australian Rules
football, Rugby Union and the EPL. A multi-layer perceptron model was trained
using BP and conjugate-gradient algorithms for the prediction task.
The work in [3] proposed a model to generate forecasts of football match
outcomes for EPL matches using objective and subjective information about
teams, such as team strength, team form, psychological impact and fatigue.
Ben Ulmer and Matthew Fernandez [14] used features such as game day data and
current team performance. Albina Yezus [15] used both static and dynamic
features. Static features include the form of the players and teams,
concentration and motivation, while dynamic features include goal difference,
score difference and history.
Tax and Joustra [12] proposed a match result prediction system for the Dutch
football competition based on 13 years of past data. Koopman and Lit [6]
developed a statistical model for predicting football match results. The work
used Bayesian Networks and assumes a bivariate Poisson distribution with
intensity coefficients that change randomly over time. Recently, Bunker and
Thabtah [2] investigated machine learning approaches for sport result
prediction, focusing on neural networks. The work identifies the learning
methodologies, data sources, evaluation methods and challenges, and proposes a
prediction framework that uses machine learning as the learning strategy.
Features class        Features count
                      Home team   Away team   Total
Team                  9           9           18
Player                7           7           14
Head-to-head match    4           4           8
Total                 20          20          40
1. Team Points: The average number of points achieved by a home or away
team. Every win earns 3 points; a draw and a loss earn 1 and 0 points
respectively.
2. Form Points: Describes the form of a team during head-to-head matches.
The last 5 home and 5 away matches prior to the match are considered.
3. Goal Difference: The difference between goals scored and goals conceded,
divided by the total number of matches.
4. Form Points Difference: The difference between the home form points and
the away form points.
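The four features listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `MatchRecord` schema and all function names are hypothetical, and because the text does not say how form points are aggregated, the sketch assumes the average points over the last `window` matches.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MatchRecord:
    """One past match seen from one team's side (hypothetical schema)."""
    goals_for: int
    goals_against: int


def match_points(m: MatchRecord) -> int:
    """3 points for a win, 1 for a draw, 0 for a loss."""
    if m.goals_for > m.goals_against:
        return 3
    if m.goals_for == m.goals_against:
        return 1
    return 0


def team_points(matches: List[MatchRecord]) -> float:
    """Average points per match; 0.0 when there is no history."""
    if not matches:
        return 0.0
    return sum(match_points(m) for m in matches) / len(matches)


def goal_difference(matches: List[MatchRecord]) -> float:
    """(goals scored - goals conceded) / number of matches."""
    if not matches:
        return 0.0
    scored = sum(m.goals_for for m in matches)
    conceded = sum(m.goals_against for m in matches)
    return (scored - conceded) / len(matches)


def form_points(matches: List[MatchRecord], window: int = 5) -> float:
    """ASSUMPTION: form = average points over the last `window` matches.

    `matches` is expected to be ordered from oldest to most recent.
    """
    recent = matches[-window:] if window > 0 else []
    return team_points(recent)


def form_points_difference(home: List[MatchRecord],
                           away: List[MatchRecord],
                           window: int = 5) -> float:
    """Home form points minus away form points."""
    return form_points(home, window) - form_points(away, window)


# Made-up histories, purely for illustration.
home_history = [MatchRecord(2, 1), MatchRecord(1, 1), MatchRecord(0, 3)]
away_history = [MatchRecord(1, 0), MatchRecord(2, 2)]

features = {
    "home_team_points": team_points(home_history),
    "away_team_points": team_points(away_history),
    "home_goal_difference": goal_difference(home_history),
    "form_points_difference": form_points_difference(home_history, away_history),
}
```

Keeping each feature as a small pure function makes it easy to compute the same quantity separately for the home team and the away team, which matches the home/away split in the feature count table.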
In this paper we propose a Multi-Layer Perceptron (MLP) for football match
result prediction. In earlier similar research [5, 10], MLPs have shown
comparable performance. To evaluate our approach, we also experimented with
popular statistical machine learning algorithms: Support Vector Machine (SVM),
Gaussian Naive Bayes and Random Forest.
The proposed MLP classifier, which has 10 hidden states, was trained on our
dataset. We used the holdout method to split the dataset into a training set
and a test set: 70% of the dataset is used for training and the remaining 30%
for testing. In Table 3, we report an accuracy of 73.57% on the test dataset.
We also report the sensitivity, specificity, precision, recall and F1-score of
our proposed approach.
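A 70/30 holdout split of the kind described here can be sketched as follows. This is a minimal illustration only: the function name `holdout_split`, the random shuffling before the cut, and the fixed seed are all assumptions, since the text does not say whether or how the matches were shuffled.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def holdout_split(samples: Sequence[T],
                  train_fraction: float = 0.7,
                  seed: int = 0) -> Tuple[List[T], List[T]]:
    """Split `samples` once into a training part and a test part.

    ASSUMPTION: samples are shuffled with a seeded generator before the
    cut, so the split is random but reproducible.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(samples)))
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test


# Stand-in data: 100 integer ids in place of real match feature vectors.
match_ids = list(range(100))
train_ids, test_ids = holdout_split(match_ids, train_fraction=0.7, seed=42)
```

Every sample lands in exactly one of the two parts, so the reported test accuracy is measured on matches the classifier never saw during training.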
Algorithm               Accuracy (in %)   F1-score (in %)
MLP                     73.57             71.45
SVM                     58.77             50.07
Gaussian Naive Bayes    65.84             64.26
Random Forest           72.92             66.07
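For reference, the accuracy and F1-score columns, together with the sensitivity, specificity, precision and recall mentioned earlier, can be computed from predicted and true labels as sketched below. The function names are hypothetical, the labels are made up, and macro-averaging the F1-score over the three classes is an assumption: the text does not state which averaging was used.

```python
from typing import Dict, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of matches whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def class_scores(y_true: Sequence[str], y_pred: Sequence[str],
                 label: str) -> Dict[str, float]:
    """One-vs-rest scores for a single class `label`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    tn = sum(1 for t, p in pairs if t != label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # also called sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str]) -> float:
    """ASSUMPTION: unweighted mean of the per-class F1-scores."""
    return sum(class_scores(y_true, y_pred, c)["f1"] for c in labels) / len(labels)


# Made-up labels, purely for illustration.
classes = ["win", "draw", "loss"]
y_true = ["win", "win", "draw", "loss", "loss", "draw"]
y_pred = ["win", "draw", "draw", "loss", "win", "draw"]

acc = accuracy(y_true, y_pred)
win_scores = class_scores(y_true, y_pred, "win")
f1_macro = macro_f1(y_true, y_pred, classes)
```

Because draws are typically the hardest class to predict, a model can reach a reasonable accuracy while its F1-score lags behind, which is one reason to report both.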
5 Error Analysis
In this section we make an in-depth analysis of our model and report the
situations where the model failed to predict the match outcome.
1. In our current work, we consider the features of only the starting 11
players. But a substitute sometimes plays a very important role in the
match, and the starting 11 may also change from one match to another.
These challenges can be addressed by considering all players, both the
starting 11 and the substitutes.
2. There are many hidden features, such as crowd support, unfairness of a
referee and the unpredictable nature of the game. In our current work we
could not include those features. Including them should improve our model.
3. The features we have considered are dynamic in nature: their values
change slowly and gradually over the seasons. As a result, the performance
of our approach depends mostly on recent past matches.
6 Conclusion