Article 4

Prediction of student academic performance using machine
learning algorithms
Jovana Jović1, Emilija Kisić1, Miroslava Raspopović Milić1, Dragan Domazet1, Kavitha
Chandra2
1
Faculty of Information Technology, Belgrade Metropolitan University, Tadeuša Košćuška 63, 11000 Belgrade,
Serbia
2
University of Massachusetts Lowell, Department of Electrical and Computer Engineering, 1 University Ave,
Lowell, MA, USA
Abstract
Educational data mining (EDM) can be used to identify students’ activities, progress,
achievements, and overall success in learning. EDM has become very popular in recent years
as a convergence of learning, analysis, visualization, and recommendation which makes the
learning process persistent and visible. In this paper, an EDM approach was conducted in order
to classify and predict student performance with machine learning techniques. Based on the
history educational dataset collected in Learning Management System (LMS) and Educational
Management System (EMS), a model for the classification of student performance was
conducted. A model is trained and evaluated on data from four different courses. Machine
learning algorithms such as Logistic Regression (LR), Linear Discriminant Analysis (LDA),
K-Nearest Neighbor (KNN), Decision Trees (DT), Naive Bayes (NB), and Support Vector
Machine (SVM) are analyzed. Support Vector Machine (SVM) classifier was finally selected
for model training and evaluation. Although the proposed model gave quite good results, there
is room for improvement in future work, which is discussed in the paper
Keywords 1
elearning, student academic performance prediction, educational data mining, machine
learning
1. Introduction
Wide usage of online educational learning and management systems led to a large amount of stored
data. The educational experience such as students’ interactions with forums, lectures, and online
assessments in the form of homework, projects, tests, etc. provide the possibility to discover valuable
and significant knowledge about student specifics and their further achievements [1].
Students’ performance is a term used for measuring not only students’ achievements but also the
quality of educational institutions. While some authors define student performance as a value obtained
from measuring a particular student learning assessment compared with study curriculum, grade point
average (GPA), or final grades, others define student academic performance only as the possibility of
gaining a long-term goal such as graduation or potential for future job prospects [2]–[4].
Analyzing collected data and predicting student performance has great importance for the efficiency
of educational institutions and can help in identifying students with low academic achievements at the
early stages of studying, tackling academic underachievement, increased university dropout rates,
graduation delays, etc.[5]. For educational institutions, it is very important to understand the potential
of using collected data in order to improve learning efficacy and academic achievements of individuals
and institutions [6].
Proceedings 13th International Conference on eLearning (eLearning-2022), September 29–30, 2022, Belgrade, Serbia
EMAIL: jovana.jovic@metropolitan.ac.rs (A. 1); emilija.kisic@metropolitan.ac.rs (A. 2); miroslava.raspopovic@metropolitan.ac.rs (A. 3);
dragan.domazet@metropolitan.ac.rs (A. 4); kavitha_chandra@uml.edu (A. 5)
ORCID: 0000-0002-4204-0233 (A. 1); 0000-0003-3059-2353 (A. 2); 0000-0003-2158-8707 (A. 3); 0000-0001-8095-5143 (A. 4); 0000-0001-
8074-2861 (A. 5);
©️ 2022 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
Educational data mining (EDM) is one approach that educational organizations can use to uncover
the patterns hidden in educational data, extend their knowledge or make predictions about further
student achievements [6]. While EDM is used for discovering knowledge from data, machine learning
(ML) algorithms provide tools for that purpose.
EDM uses a broad range of data features, metrics, and prediction methods. In order to make
conclusions or predictions on students’ academic performance, features like cumulative grade point
average (CGPA) and performance on online assessments (i.e. assessment scores, quizzes, attendance)
have been used most frequently [7]–[10]. A prior academic achievement (i.e. high school data) can also
help in understanding students’ performance [11]–[14]. Some authors include university entrance tests
as an important attribute as well [11]. Additionally, students’ demographics such as gender, age,
socioeconomic status, family background, and disability can also have an impact on students’ success
[8], [15], [16]. Learning in online learning environments means that the data recorded in the system,
such as the number of access to lessons, time spent in learning, and participation in forums, play an
important role and represent significant attributes in researching adequate metrics for addressing student
performance [8]. Psychological attributes such as motivation, student interests, and personality type are
usually interesting for research and are listed as important, but their qualitative nature sometimes makes
them difficult for analysis [17], [18].
The large number of features that have been found in different research bring with them different
prediction models to discover students’ performance. The prediction of students’ academic performance
consists in estimating the unknown score or grade usually obtained by using different classification and
regression techniques such as Decision Trees, Artificial Neural Networks, Naive Bayes, K-Nearest
Neighbor, and Support Vector Machines [19]. Object Oriented Programming course data obtained from
Politehnica University Timisoara was used for developing a model that could help in the identification
of students at risk by predicting student academic performance. Their dataset included attributes such
as student membership to the advanced study groups, number of credits earned in the previous year,
average activity mark, number of attendances in practical activity meetings, average examination mark,
and number of final exam attempts, with the conclusion that the Logistic Regression (LR) classifier
produced the best accuracy for prediction students’ academic performance [20]. In training small
dataset size in order to predict students' academic performance, Support Vector Machine (SVM) and
Learning Discriminant Analysis (LDA) algorithms showed the best accuracy [21]. A dataset from the
University of Minho in Portugal with 395 samples was used to predict students’ academic success using
SVM and KNN. The performance of both algorithms was compared, and it was discovered that SVM
performed better than KNN [22]. Review papers on machine learning-based student academic
performance prediction show that Neural Network has the highest prediction accuracy (98%), followed
by Decision Tree (91%), Support Vector Machine (83%), K-Nearest Neighbor (83%), and Naive Bayes
(76%) [7].
In this paper, we focus on predicting students’ academic performance by using historical data
collected at Belgrade Metropolitan University with the aim to identify a model suitable to predict
students’ success in a course. Data used in this work represent educational data collected in two Object-
oriented programming courses and two Information Technology based courses, gathered from academic
year 2017/18 to 2021/22. The collected data set contains students’ high school average grade, grades
on tests, homework, projects, and class participation, as well as student class attendance, number of
failed attempts to pass the final exam and final grade. Final grades are classified in two categories –
those who passed the course and those who failed it. This work provides comparative analysis on
different machine learning algorithms such as Logistic Regression (LR)[23], Linear Discriminant
Analysis (LDA) [24], K-Nearest Neighbor (KNN) [25], Decision Trees (DT) [26], Naive Bayes (NB)
[27], and Support Vector Machine (SVM) [28].
This paper is organized as follows. Section 2 presents short overview of the Educational Data Mining
techniques. Section 3 describes used methodology for data collection and analyses. Section 4 presents
and discusses obtained results. Finally, Section 5 concludes the paper.
2. Educational data mining
EDM develops and adopts different methods that are used in order to gain valuable knowledge
hidden in educational data from educational settings. EDM uses different statistical, machine learning,
and data-mining methods with the aim to better understand students and to try to predict patterns that
characterize students’ behaviors and performances [29], [30]. Education, statistics, and informatics
represent the main areas of EDM where overlaps of these areas lead to coupling of EDM with machine
learning, data mining, learning analytics, and computer-based education [31]. The goal of the EDM is
to transform raw data with a large number of attributes into meaningful data-driven decisions. EDM
can also lead to more accurate predictions of student knowledge, dropouts, and student motivational
state as it is based on different data, which in return provides a broader understanding of specific groups
of students [32], [33]. EDM can be classified in five main categories: (i) prediction, (ii) clustering, (iii)
relationship mining, (iv) distillation of data for human judgment, and (v) discovery with models [34]
Prediction - Develops a model that calculates assumptions for certain events and are made based on
available processed data. In data mining, independent variables are attributes that are already known,
and response factors are what needs to be predicted. Three main categories of prediction are
classification, regression, and density estimation [35].
Clustering - Identifies data that grouped together, respond to a similar logic and observations. In
online learning, an example of clustering would be grouping students based on their learning patterns
which allows one to further gain meaningful conclusions [36].
Relationship mining - Discovers relationships between numerous variables in a dataset and can
provide information on variables that are strongly associated with another variable. Additionally,
relationship mining can discover the strongest relationships between some variables. Four main
categories of relationship mining are: (i) association rule mining, (ii) correlation mining, (iii) sequential
pattern mining, and (iv) causal data mining [30].
Distillation of data for human judgment - Develops methods for appropriate presentation and
visualization of data for easier human judgment [37]. Presenting the data in different ways can help in
discovering new knowledge in order to achieve classification and/or identification. Data distillation for
classification can be used as a preparation stage for further prediction, while identification aims to
display data such that it is easily identifiable via well-known patterns [38].
Discovery with models - entails using previously defined models based on clustering, prediction, or
knowledge engineering using human reasoning rather than automated methods [34].
ML uses techniques that allow machines to learn and make accurate predictions from past
observations. In recent years coupling of ML with EDM has received high attention in research. Various
techniques and algorithms such as Clustering, Classification, Regression, Neural Networks, Association
Rules, Genetic Algorithms, Decision tree, etc. are used for knowledge discovery from databases [31].
3. Methodology
Methodology used to build a student performance prediction model is presented in Figure 1.
Methodology consists of three stages: (i) Data collection and integration, (ii) Data preprocessing, and
(iii) Model building and evaluation. In the Data collection and integration stage data is collected during
the student learning process. Data preprocessing stage includes tasks such as: (i) handling missing
values, (ii) solving inconsistency, (iii) removing redundancy, (iv) feature selection, and (v)
normalization. An output from this stage is a transformed dataset which is converted into a normalized
dataset. In the Model building and evaluation stage normalized data is divided into two sets: training
dataset (consists of 80% of received normalized data) and testing dataset (20%).
Figure 1. Student performance prediction methodology
In this study, the data were taken from Belgrade Metropolitan University’s EMS and LMS, where
all student records are stored. This dataset includes data collection from academic year 2017/18 to
2021/22. Data were narrowed down to four courses: (i) Introduction to object oriented programming
(ii) Objects and data abstraction, (iii) Introduction to information technologies, and (iv) Information
technology systems. The dataset included records for 1696 students.
In the first stage, the raw dataset was collected and after that dataset was preprocessed by removing
outliers, missing and noise values. Null, empty or negative values were removed from the dataset.
Students’ data used in this work includes: (i) homework assignments grades, (ii) online test grades, (iii)
project assignments grades, (iv) class participation grade, (v) number of failed attempts to pass the final
exam, (vi) class attendance, and (vii) high school average grade. In the selected courses, students had
assigned with weekly homework assignments, online assessments every three weeks, and one project
assignment per course. Student class attendance was taken each week. Besides the assigned grades,
EMS and LMS collected additional data about student learning such as time spent on the LMS, forum
participation, time when students submitted their assignments, etc.
The syllabus of each course defines a different number of assignments and their portion of the final
grade. Homework assignments, tests, projects and class participation grades represent 70% of the final
grade for the course. Final exam represents 30% of the grade.
The collected dataset was normalized using min-max normalization which performs a linear
transformation on the original data and scales the data in the range (0, 1). The numeric values of the
final exam score are classified into the categorical variables fail/pass (946 were classified with fail and
750 with pass). The fail class includes students who earned less than 50% of the exam score, while the
pass class includes those who successfully passed the exam and achieved 50% or more on the exam
score.
Exploratory data analysis was conducted in order to select suitable features. Correctness of feature
selection was ensured with Pearson’s correlation finding and correlation between the variables in data
set was explored. Six machine learning algorithms were applied for model validation. Selected ML
techniques were utilized for this purpose: (i) Logistic Regression (LR), (ii) Linear Discriminant
Analysis (LDA), (iii) K-Nearest Neighbor (KNN), (iv) Decision Trees (DT), (v) Naive Bayes (NB) and
(vi) Support Vector Machine (SVM). Evaluation of the built model was done on testing dataset with
SVM classifier. To undertake the classification of ML techniques Python programming language and
Google Colab Environment were used. Obtained results are presented using accuracy and confusion
matrix as metrics.
4. Results and Discussion

In order to perform feature selection and analyze correlation between variables, correlation matrix
with Pearson’s coefficient was calculated. Figure 2 illustrates the correlation heatmap graph of the input
dataset provided by using the Python Pandas library. Correlation degree was classified as follows: low
(below 0.29), moderate (from 0.3 to 0.49) and high (from 0.5 to 1). Based on the obtained results, we
can see the presence of moderate and high correlation degrees among all variables.
*** - p < 0.001, ** - p < 0.01, * - p < 0.05
Figure 2. Correlation HeatMap
It was of interest to analyze whether the correlation exists between all of the parameters. For
instance, it is interesting to see that homework is in high correlation with tests (0.62), projects (0.73),
class participation grade (0.70), number of final exam attempts (0.65), and class attendance (0.53). One
of the possible explanations is that attending class regularly helps with completing homework
successfully, and in return being ready for the project. Similarly, the project has a high correlation with
class participation grade (0.72), number of final exam attempts (0.76), and final exam scores (0.67).
However, the focus was placed on the correlation between the final exam grade and other parameters.
Based on the correlation matrix, it can be seen that the highest correlation is between the project and
number of attempts the final exam was taken (0.76). Additionally, a high level of correlation is shown
between the final exam score and projects’ grade (0.67) and homework (0.62).
Courses that were taken into consideration for our dataset, are similar not only in the structure of the
final grade, but also in the type of assessments. For instance, all of the courses are part of the computing
curricula, and have a high degree of practical assignments, and even assessments are based on problem
solving that is mainly relating to programming or some sort of technology (hands on) assignment.
Hence, this is the reason why it is expected to see correlation between the final exam scores and
homework and projects, as the final exam questions are similar to homework and project assignments.
It should also be noted that students' class participation grade have a high correlation with the final
exam score (0.61) because that grade in itself carries information about how actively and regularly the
student studied during the semester, which indicates student’s preparation for the exam. Poor
correlation is found between high school average grades and final score exams (0.3). That shows that
differences in high schools from which the students come do not have a great influence on the passing
of the course. This value reflects differences in the type of high schools the students attended or the
level of knowledge they acquired there. Moderate correlation is shown for the correlation with class
attendance and tests. Being present in the class does not necessarily mean that the student is active and
participating. This is the reason why there is a high correlation between the final exam scores and class
participation grade (0.61), but only moderate correlation between the final exam scores and class
attendance (0.48). Similarly, there is a moderate correlation between the final exam scores and grades
received on tests (0.48). This is interesting, as most of the tests are multiple choice questions, including
the covered theoretical work, where the class final exam questions contain both theoretical and practical
problems. Also, we can see moderate and high correlation degree levels between features and the target
variables (final exam score), so all features are kept in the dataset.
In order to choose a classifier for predicting the final exam outcome (whether the course was passed
or failed), a 10-fold cross-validation approach was conducted. Cross validation approach splits the
training dataset into 10 groups of approximately equal size, trains the model on nine groups and tests
the model on the tenth group in ten iterations. The outcomes of the experiments are summarized using
classifier accuracy that was calculated as the average accuracy after ten cross validation iterations. Six
ML algorithms were applied for validation of the model. Classifier accuracy for all ten iterations is
presented with boxplots in Figure 3.
Figure 3. Boxplots with accuracy after 10 iterations of cross validation for six different classifiers
SVM shows the best results with choosing the classifier in all ten iterations with the average
accuracy of 88.5%. Other examined algorithms show the average accuracy of LR (87.6%), LDA
(84.6%), KNN (86.5%), DT (84.4%), and NB (85.7%). Obtained results are in accordance with
conclusions in [20] that show that SVM performs well with small dataset size. As a supervised learning
algorithm for classification problems, SVM has the best performances when the class boundaries are
nonlinear because it is focused only on the class boundaries, while points that are anyway easily
classified are skipped [39].
Once the model was trained, it was tested on the collected dataset. The proposed model for predicting
students' academic performance based on SVM shows 90.3% accuracy after model evaluation on testing
dataset. Chosen model performance was additionally assessed using a confusion matrix. The confusion
matrix summarizes the selected model's overall performance as shown in Figure 4.
Figure 4. Confusion matrix of SVM proposed model

Based on Figure 4, we can conclude that the model predicts that 175 students from our dataset will
fail the exam and that 132 students will pass the exam. This is compared to 189 students who actually
failed the exam, and 151 students that passed. This means that model does not work perfectly and there
are present type I and type II errors. Model classifies 14 students that passed the course in the failing
group, and 19 students that failed the course are classified in the passing group. Also, the model predicts
that 132 students will pass the exam, out of 151 students who actually passed the exam. The model does
not work perfectly and there are present type I and type II errors. Model classifies 14 students that
passed the course in the failing group, and 19 students that failed the course are classified in the passing
group.
Evaluated SVM model performance is presented through precision and recall as shown in Table 1.
Table 1. Models performance
Precision (%) Recall (%)

FAIL 93% 93%
PASS 90% 87%
Based on the precision, we can see that from all “fail” predictions, 93% really failed the exam, while
from all “pass” predictions, 90% passed the exam. On the other hand, based on recall we can see that
from an overall number of students that actually failed the exam, the model predicted 93% successfully,
and from an overall number of students that actually passed the exam the model predicted 87%
successfully. These metrics confirmed that the chosen model gives very satisfactory results in predicting
the final exam outcome. The results show that the selected model can be used to predict the final exam
outcome (whether the course was passed or failed) with sufficiently high accuracy. This is important
for early identification of at-risk students, which can help in addressing their problems and challenges
early on.
5. Conclusion
Student academic performance is one of the important quality indicators for every university. Being
able to anticipate identification of at risk students at an early stage of student academic life, provides
an opportunity to improve the learning process and also reduce the dropout rates. In this work we have
examined the accuracy of six different machine learning algorithms in order to predict students’ passing
or failing the final exam. The six ML algorithms that were investigated were NB, LDA, LR, DT, KNN,
and SVM. For the analysis of the proposed model, dataset for four different courses was used.
Algorithms were evaluated based on characteristics such as accuracy and precision rate. SVM showed
as the most accurate in classifying a data set of student academic performance, and in predicting
students’ final exam outcome. Future work will analyze a larger number of ML algorithms and try to
include additional features in order to gain more accurate model for the prediction of student academic
performance and support the entire process of learning.
Acknowledgment
The work presented here was supported by the Ministry of Education, Science and Technological Development
of the Republic of Serbia ref. no. 451-03-68/2022-14/200169.
6. References
[1] R. Ghorbani and R. Ghousi, “Comparing different resampling methods in predicting students’
performance using machine learning techniques,” IEEE Access, vol. 8, pp. 67899–67911, 2020.
[2] Q. Liu et al., “EKT: Exercise-aware knowledge tracing for student performance prediction,” IEEE
Trans. Knowl. Data Eng., vol. 33, no. 1, pp. 100–115, Jan. 2021.
[3] A. Hellas et al., “Predicting academic performance: a systematic literature review,” in Proceedings
Companion of the 23rd Annual ACM Conference on Innovation and Technology in Computer
Science Education, Larnaca Cyprus, 2018.
[4] T. T. York, C. Gibson, and S. Rankin, “Defining and Measuring Academic Success,” Research &
Evaluation, vol. 20, 2015.
[5] B. Daniel, “Big Data and analytics in higher education: Opportunities and challenges,” Br. J. Educ.
Technol., vol. 46, no. 5, pp. 904–920, Sep. 2015.
[6] M. Nachouki and M. Abou Naaj, “Predicting student performance to improve academic advising
using the random forest algorithm,” Int. J. Distance Educ. Technol., vol. 20, no. 1, pp. 1–17, Jan.
2022.
[7] A. M. Shahiri, W. Husain, and N. A. Rashid, “A review on predicting student’s performance using
data mining techniques,” Procedia Comput. Sci., vol. 72, pp. 414–422, 2015.
[8] A. K. Hamoud, A. S. Hashim, and W. A. Awadh, “Predicting student performance in higher
education institutions using decision tree analysis,” Int. j. interact. multimed. artif. intell., vol.
inPress, no. inPress, p. 1, 2018.
[9] H. Almarabeh and King king Saud Bin Abdulaziz University for Health Sciences College of
Science and Health Professions Riyadh, Kingdom of Saudi Arabia, “Analysis of students’
performance by using different data mining classifiers,” Int. j. mod. educ. comput. sci., vol. 9, no.
8, pp. 9–15, Aug. 2017.
[10] S. T. Jishan, R. I. Rashu, N. Haque, and R. M. Rahman, “Improving accuracy of students’ final
grade prediction model using optimal equal width binning and synthetic minority over-sampling
technique,” Decis. Anal., vol. 2, no. 1, Dec. 2015.
[11] O. S. Oshodi, R. O. Aluko, E. I. Daniel, C. O. Aigbavboa, and A. O. Abisuga, “Towards reliable
prediction of academic performance of architecture students using data mining techniques,”
Journal of Engineering, Design and Technology, vol. 16, no. 3, pp. 385–397, 2018.
[12] F. Ahmad, N. H. Ismail, and A. A. Aziz, “The prediction of students’ academic performance using
classification data mining techniques,” Appl. Math. Sci., vol. 9, pp. 6415–6426, 2015.
[13] M. H. Mohamed and H. M. Waguih, “Early prediction of student success using a data mining
classification technique,” International Journal of Science and Research, vol. 6, no. 10, pp. 126–
139, 2017.
[14] V. Ramesh, P. Parkavi, and K. Ramar, “Predicting student performance: A statistical and data
mining approach,” Int. J. Comput. Appl., vol. 63, no. 8, pp. 35–39, Feb. 2013.
[15] G. Elakia and N. J. Aarthi, “Application of data mining in educational database for predicting
behavioural patterns of the students, Elakia et al,/(IJCSIT),” International Journal of Computer
Science and Information Technologies, vol. 5, no. 3, pp. 4649–4652, 2014.
[16] N. Putpuek, N. Rojanaprasert, K. Atchariyachanvanich, and T. Thamrongthanyawong,
“Comparative Study of Prediction Models for Final GPA Score: A Case Study of Rajabhat
Rajanagarindra University,” in 2018 IEEE/ACIS 17th International Conference on Computer and
Information Science, 2018, pp. 92–97.
[17] R. Steinmayr and B. Spinath, “Predicting school achievement from motivation and personality,”
Z. pädagog. Psychol., vol. 21, no. 3/4, pp. 207–216, Jan. 2007.
[18] A. Mueen, King Abdulaziz University, Saudi Arabia, Jeddah, B. Zafar, and U. Manzoor,
“Modeling and predicting students’ academic performance using data mining techniques,” Int. j.
mod. educ. comput. sci., vol. 8, no. 11, pp. 36–42, Nov. 2016.
[19] G. Bordea, A. M. Shahiri, W. Husain, and N. A. Rashid, “A review on predicting students
performance using data mining techniques,” Procedia Computer Science, vol. 72, pp. 414–422,
2015.
[20] M. Bucos and B. Drăgulescu, “Predicting student success using data generated in traditional
educational environments,” TEM Journal, vol. 7, no. 3, 2018.
[21] L. M. Abu Zohair, “Prediction of Student’s performance by modelling small dataset size,” Int. J.
Educ. Technol. High. Educ., vol. 16, no. 1, Dec. 2019.
[22] H. Al-Shehri et al., “Student performance prediction using Support Vector Machine and K-Nearest
Neighbor,” in 2017 IEEE 30th Canadian Conference on Electrical and Computer Engineering
(CCECE), Windsor, ON, 2017.
[23] M. Maalouf, “Logistic regression in data analysis: an overview,” Int. j. data anal. tech. strateg.,
vol. 3, no. 3, p. 281, 2011.
[24] A. Tharwat, T. Gaber, A. Ibrahim, and A. E. Hassanien, “Linear discriminant analysis: A detailed
tutorial,” AI Commun., vol. 30, no. 2, pp. 169–190, May 2017.
[25] K. Taunk, S. De, S. Verma, and A. Swetapadma, “A brief review of nearest neighbor algorithm
for learning and classification,” in 2019 International Conference on Intelligent Computing and
Control Systems (ICCS), Madurai, India, 2019.
[26] Y.-Y. Song and Y. Lu, “Decision tree methods: applications for classification and prediction,”
Shanghai Arch. Psychiatry, vol. 27, no. 2, pp. 130–135, Apr. 2015.
[27] G. I. Webb, “Naïve Bayes,” in Encyclopedia of Machine Learning and Data Mining, Boston, MA:
Springer US, 2017, pp. 895–896.
[28] S. Suthaharan, “Support Vector Machine,” in Machine Learning Models and Algorithms for Big
Data Classification, Boston, MA: Springer US, 2016, pp. 207–235.
[29] S. K. Mohamad and Z. Tasir, “Educational data mining: A review,” Procedia Soc. Behav. Sci.,
vol. 97, pp. 320–324, Nov. 2013.
[30] Jovic J, Raspopović Milić M, Domazet D. Chandra K, “Educational data mining and learning
analytics tools for online learning,” in 12th International Conference on eLearning (eLearning-
2021), Belgrade, pp. 22–27.
[31] A. Qazdar, B. Er-Raha, C. Cherkaoui, and D. Mammass, “A machine learning algorithm
framework for predicting students performance: A case study of baccalaureate students in
Morocco,” Educ. Inf. Technol., vol. 24, no. 6, pp. 3577–3589, Nov. 2019.
[32] H. A. A. Hamza and P. Kommers, “‘A review of educational data mining tools & techniques,’’,”
Int. J. Educ. Technol. Learn, vol. 3, no. 1, pp. 17–23, 2018.
[33] P. Kaur, M. Singh, and G. S. Josan, “Classification and prediction based data mining algorithms
to predict slow learners in education sector,” Procedia Comput. Sci., vol. 57, pp. 500–508, 2015.
[34] R. S. J. De Baker, T. Barnes, and J. E. Beck, “Educational data mining 2008,” in The 1st
International Conference on Educational Data Mining Montréal, Québec, Canada, 2008.
[35] R. S. Baker and G. Siemens, “Learning analytics and educational data mining,” in The Cambridge
Handbook of the Learning Sciences, Cambridge University Press, 2022, pp. 259–278.
[36] A. Dutt, “Clustering algorithms applied in educational data mining,” Int. j. inf. electron. eng., 2015.
[37] A. Algarni, “Data Mining in Education,” Int. J. Adv. Comput. Sci. Appl., vol. 7, no. 6, 2016.
[38] R. S. J. d. Baker, A. T. Corbett, and V. Aleven, “More accurate student modeling through
contextual estimation of slip and guess probabilities in Bayesian knowledge tracing,” in Intelligent
Tutoring Systems, Berlin, Heidelberg: Springer Berlin Heidelberg, 2008, pp. 406–415.
[39] G. Pratiyush and S. Manu, “Classifying educational data using support vector machines: A
supervised data mining technique,” Indian J. Sci. Technol., vol. 9, no. 34, Sep. 2016.

Article 4

Uploaded by

Copyright:

Available Formats

Article 4

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Article 4

Uploaded by

Copyright:

Available Formats

Prediction of student academic performance using machine

4. Results and Discussion

Figure 4. Confusion matrix of SVM proposed model

Table 1. Models performance

Precision (%) Recall (%)

You might also like