Early Prediction of Students' Performance in Higher Education
ABSTRACT Students' learning performance is one of the core components for assessing any educational system. It is crucial for tackling issues in the learning process and is one of the key measures of learning outcomes. The ability to use data and knowledge to improve education systems has led to the development of the research field known as educational data mining (EDM). EDM is concerned with developing techniques to investigate data gathered from educational settings, allowing for a more thorough and accurate understanding of students and the improvement of their educational outcomes. The use of machine learning (ML) technology has increased significantly in recent years. Researchers and teachers can use the measurements of success, failure, dropout, and more provided by educational data mining to predict and simulate education processes. Therefore, this work presents an analysis of students' performance using data mining methods. The paper applies both clustering and classification techniques to identify the impact of students' early-stage performance on their GPA. For the clustering technique, the paper applies dimensionality reduction with the t-SNE algorithm to various early-stage factors, such as admission scores, first-level course grades, academic achievement test (AAT) scores, and general aptitude test (GAT) scores, in order to explore the relationship between these factors and GPA. For the classification technique, the paper presents experiments with different machine learning models for predicting student performance at an early stage using different features, including course grades and admission test scores. We use several assessment metrics to evaluate the quality of the models. The results suggest that educational systems can mitigate the risks of student failure at the early stages.
INDEX TERMS Graph mining; Students’ performance prediction; Student academic performance;
Early prediction; Data mining
This paper applies both clustering and classification techniques to identify the impact of students' early-stage performance on the GPA.

The learning process generates a large amount of data about student performance. Identifying students who are likely to perform poorly in the future requires predicting student performance. Once such data has been transformed into knowledge, it can be used for these predictions. As a result, the information can improve the quality of education and learning and help students achieve their academic objectives. Data mining techniques are used in the research area known as educational data mining (EDM) to analyze information derived from educational settings [4]. Applying EDM also aids in planning strategies for raising student performance, which in turn improves teaching and learning and the students' experience at the educational institution.

Academic success is important because it is strongly linked to the positive outcomes we value. One of the factors of academic success is the student's performance in college or university, and the cumulative academic achievement of its students reflects the success of every college or university. Other factors that can be used to analyze and predict students' academic performance include aptitude test scores, secondary school GPA, and the school from which the student graduated. We believe that students' performance in their first year of college can be used to predict their performance in the remaining years of their studies. Such factors enable early remedies and actions that improve student performance. Artificial intelligence techniques have been applied to educational data to reveal the significant reasons behind student performance. The contributions of the paper are as follows:

• We propose a framework for predicting students' performance using their academic performance and social relationship features.
• We use admission scores, first-level course scores, and academic achievement test (AAT) and general aptitude test (GAT) scores.
• We explore a new way of using admission scores, first-level course scores, and AAT and GAT scores through t-SNE dimensionality reduction. To the best of our knowledge, this is the first attempt to use features from both admission scores and first-level course scores to predict students' performance early using machine learning.
• We also explore a new way of increasing the rounding threshold, which is to compute the absolute difference between a grade and the grade before or after it.
• We use state-of-the-art classification models to evaluate the effectiveness of our proposed idea.

We organize the paper as follows: Section II reviews the literature on students' performance prediction techniques in the field of education. Section III provides details of the dataset used, including data characterization and correlations. Section IV presents the research methodology followed by the paper. We evaluate and analyze our proposed method and report our findings in Section V. Finally, we conclude our work in Section VII.

II. RELATED WORK
Helal et al. [5] proposed the concept of using heterogeneity to create better prediction models. Four popular machine learning algorithms (JRip, sequential minimal optimization, C4.5, and Naive Bayes) were used to create these models. The results of the experiment showed that employing student subpopulations to predict academic performance is both successful and promising. It also demonstrated that both rule-based and tree-based algorithms offered clearer explanations, making them more effective in an educational environment [5]. This study did not consider the combined features corresponding to a particular module.

Xu et al. [6] proposed a technique utilizing three popular machine learning algorithms to predict student performance using Internet usage data. From the real Internet usage data of 4000 students, they collected, computed, and normalized parameters such as online time, Internet traffic volume, and connection frequency. Their findings indicated that it is possible to predict students' academic success using data on Internet usage.

Dien et al. [7] proposed a method to predict student performance using several deep learning methods. Twenty-one features from their model are used as input to a convolutional neural network. The dataset was obtained from the information system of a multidisciplinary university in Vietnam, and experimental findings on the dataset revealed successful prediction. This work used traditional and simple features to build a student performance prediction model.

Giannakas et al. [1] proposed a binary classification framework based on deep neural networks (DNN), and the most crucial features that affected the result were extracted. Different activation functions (Sigmoid, ReLU, and Tanh) and optimizers (Adagrad and Adadelta) were used to evaluate the framework. The experimental findings indicate that, when the Adadelta and Adagrad optimizers were utilized, the prediction accuracy of the framework was 76.73% and 82.39%, respectively. The learning performance was 80.76% and 86.57% overall.

Ha et al. [8] used various machine learning algorithms to determine the final grade average of students using a variety of criteria, including personal characteristics, university entrance scores, a gap year, and their first- and second-year academic performance. The dataset was obtained from the university's student management information system and a survey of
graduates from three different years. The findings demonstrated a connection between these factors and students' academic achievement.

Rastrollo-Guerrero et al. [9] studied and analyzed almost 70 papers to review the modern techniques applied for predicting students' performance, covering current research on predicting student behavior in a class environment. From the analysis of these papers, they concluded that there is a substantial tendency toward predicting university student performance. Due to its ability to deliver accurate and trustworthy outcomes, supervised learning has become a popular strategy for predicting students' behavior. On the other hand, due to the low accuracy of predicting students' conduct in the circumstances analyzed, unsupervised learning is a technique that is unattractive to researchers.

Mustafa Agaoglu [10] explains how classifier models are created using four different classification techniques: decision tree algorithms, support vector machines, artificial neural networks, and discriminant analysis. Using the performance criteria of accuracy, precision, recall, and specificity, the results are compared on a dataset composed of student replies to an actual course evaluation questionnaire.

In order to determine the most crucial factors for ensuring the academic performance of engineering students, Gonzalez-Nucamendi et al. [11] describe the determination of student profiles based on the constructs of multiple intelligences and on learning and affective techniques.

Alshanqiti and Namoun [12] proposed a hybrid regression model that optimizes the prediction accuracy of student academic performance and an optimized multi-label classifier that predicts the qualitative values for the influence of various factors associated with the obtained student performance, combining three dynamically weighted techniques, namely collaborative filtering, fuzzy set rules, and Lasso linear regression. They still need to demonstrate the generality of their approach to confirm the reliability of their model, and they also need to use real academic datasets.

In order to predict a student's placement in the IT business based on their academic achievement in class ten, class twelve, graduation, and backlog to date in graduation, Maurya et al. [13] presented a few supervised machine learning classifiers.

Alsalm et al. [14] proposed an approach to predict students' grades using the mathematics and Portuguese language course grades dataset and applied a deep learning model. The model of this work is only validated on two datasets and needs to be validated on other large, balanced datasets.

Ahmed et al. [15] proposed an approach to predict university students' performance in final exams using gradient boosted decision trees (GBDT), a machine learning technology used for regression, classification, and ranking tasks that is part of the boosting method family.

Liu and Niu [16] proposed a new approach called the Multi-Agent System (MAS). It is used to propose an Agent-based Modeling Feature Selection (ABMFS) model, and the selected feature subset effectively removes the features that are irrelevant to the prediction results. They then applied deep learning techniques to construct a convolutional neural network (CNN) based structure to predict student performance.

Evangelista and Descargar [17] provided a method for improving the performance prediction of several single classification algorithms by employing them as base classifiers of homogeneous ensembles (bagging and boosting) and heterogeneous ensembles (voting and stacking). Their model needs optimization techniques to determine suitable algorithm parameters and configurations.

Quy et al. [18] assess seven popular group fairness metrics for student performance prediction problems. On five educational datasets, they run trials with four traditional machine learning models and two fairness-aware machine learning techniques. Their study is limited to public schools and does not cover student academics in universities.

To predict undergraduate students' final test grades, Yağcı [19] proposed a prediction model. The final exam grades were divided into four groups ("<32.5", "32.5-55", "55-77.5", and ">77.5") using a comparison of a variety of machine learning methods, including RF, SVM, LR, NB, ANN, and k-nearest neighbor (KNN). The grades of 1854 students who took Turkish Language-I were taken into account in the proposed model, using data gathered from a Turkish state university. Three student characteristics, namely department, faculty, and midterm exam scores, were used to forecast the final exam grades. In comparison to other classification models, RF and ANN performed the best, classifying final exam grades with an area under the curve (AUC) of 86% and an accuracy of 74%, respectively.

III. DATASET DESCRIPTION
In this study, we use a students' records dataset covering five consecutive years. The dataset provides more than 275 thousand records for almost five thousand students, including their identification, instructors, gender, sections, program, admission criteria and time, graduation time and GPA, and course details and grades. This dataset is used in this paper to understand the factors that influence the academic performance of the students.

A. DATA CHARACTERIZATION
From this dataset, we explored various attributes that support our task of quantifying and predicting students' performance at an early stage.
Figure 1: Density distribution functions of the admission criteria including the scores of the academic achievement
test, the general aptitude test, and the final grade of high school.
By analyzing the admission criteria, Figure 1 shows their density distribution functions. The admission criteria include the academic achievement test, the general aptitude test, and the final grade of high school. These scores are combined in a weighted ratio to satisfy the admission requirements. The academic achievement test scores are normally distributed, with a mean and median of 66 and 65, respectively. Similarly, the general aptitude test scores show a mean and median of 68 and 69, respectively. In contrast, the distribution of the final high school grades is left-skewed, with a mean and median of 92 and 94, respectively.
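As a rough illustration of how such summary statistics can be reproduced, the following Python sketch computes the mean, median, and skewness of the three admission criteria with pandas. The column names are assumptions, since the dataset schema is not published here.

```python
import pandas as pd

# Hypothetical column names for the three admission criteria; the actual
# dataset schema is not shown in the paper, so these are placeholders.
ADMISSION_COLS = ["aat_score", "gat_score", "high_school_grade"]

def summarize_admission_criteria(df: pd.DataFrame) -> pd.DataFrame:
    """Return mean, median, and skewness for each admission criterion."""
    stats = df[ADMISSION_COLS].agg(["mean", "median", "skew"]).T
    stats.columns = ["mean", "median", "skewness"]
    return stats.round(2)

# Example usage with the real records table, e.g.:
# records = pd.read_csv("student_records.csv")
# print(summarize_admission_criteria(records))
```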
When comparing the admission criteria with respect to gender over the five years, Figure 2 shows only slight variations between them. It is clearly noticeable that the male scores are more stable than the female scores in all the admission criteria.

In addition to the admission criteria, Figures 3 and 4 describe the courses that are the main obstacles for the students. It is clearly noticeable that the Math courses are the most challenging, and the English courses are considered the second most challenging for the students.

IV. PROPOSED METHOD
A. PREPARING THE DATA
In this phase, we use a sample of students' academic records at our CS college to explore the significant factors in students' performance. Our target is to use the data from students' records and features at an early stage to predict their performance at the final stage (final cumulative GPA).
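A minimal sketch of this preparation step is given below, assuming a long-format records table with one row per student-course enrollment; all column names are hypothetical. It pivots first-level course grades into per-student features, attaches admission scores, and keeps the final cumulative GPA as the prediction target.

```python
import pandas as pd

def build_early_stage_dataset(records: pd.DataFrame) -> pd.DataFrame:
    """Assemble one row per student: early-stage features plus final GPA label.

    Assumed (hypothetical) columns in `records`:
    student_id, course_code, course_level, grade, gender,
    aat_score, gat_score, high_school_grade, final_gpa
    """
    # Grades of first-level courses only, pivoted into one column per course.
    first_level = records[records["course_level"] == 1]
    course_features = first_level.pivot_table(
        index="student_id", columns="course_code", values="grade"
    )

    # Admission scores, gender, and the target are constant per student.
    per_student = records.groupby("student_id").first()[
        ["gender", "aat_score", "gat_score", "high_school_grade", "final_gpa"]
    ]

    return course_features.join(per_student)
```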
Figure 2: Gender comparison in admission criteria over admission years and terms including the scores of the academic
achievement test, the general aptitude test, and the final grade of high school.
Figure 3: Total number of students who failed each course (left: during the five years; right: in 2021)
Figure 4: Correlation networks between the failed courses (edge direction denotes the next-level course), computed using the Jaccard similarity index over the sets of students who failed each course.
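As an illustration of the similarity measure behind Figure 4, the sketch below computes the Jaccard index between the sets of students who failed each pair of courses; the data and names are hypothetical and only meant to show the calculation.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two sets of student IDs."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical failure sets: course code -> IDs of students who failed it.
failed_by_course = {
    "MATH101": {"s1", "s2", "s3", "s7"},
    "MATH102": {"s2", "s3", "s8"},
    "ENG101": {"s3", "s4", "s9"},
}

# Weight every pair of courses by how much their failing populations overlap.
edges = {
    (c1, c2): jaccard(failed_by_course[c1], failed_by_course[c2])
    for c1, c2 in combinations(failed_by_course, 2)
}
print(edges)
```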
B. FEATURES USED
We use the students' admission scores, gender, and the grades of all first-level required courses as features. We also
list all the features used in Table 1. We find that these features influence the students' academic success, and we attempt to predict and improve student performance by utilizing them.

C. T-SNE ALGORITHM
t-SNE (t-distributed Stochastic Neighbor Embedding) is a non-linear dimensionality reduction technique used for exploring high-dimensional data [20]. We use the t-SNE algorithm with various factors, such as GAT and AAT scores, to analyze and explore the relationship between these factors and the students' GPA.
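A minimal sketch of this step with scikit-learn is shown below; the feature column names and the perplexity value are assumptions rather than the paper's exact settings.

```python
import pandas as pd
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

def embed_early_features(df: pd.DataFrame) -> pd.DataFrame:
    """Project early-stage features into 2-D with t-SNE for visual exploration.

    Assumed (hypothetical) columns: aat_score, gat_score, high_school_grade,
    plus first-level course grade columns and a final_gpa column.
    """
    feature_cols = [c for c in df.columns if c != "final_gpa"]
    X = StandardScaler().fit_transform(df[feature_cols])

    # Perplexity 30 is a typical default-range value, not taken from the paper.
    emb = TSNE(n_components=2, perplexity=30, random_state=42).fit_transform(X)

    out = pd.DataFrame(emb, columns=["tsne_1", "tsne_2"], index=df.index)
    out["final_gpa"] = df["final_gpa"].values  # color points by GPA when plotting
    return out
```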
score features results beat the model with the combination of admission scores and gender features.

C. EXPERIMENTING WITH ADMISSION SCORES AND GENDER FEATURES
In this experiment, we combined two different types of features, admission scores and gender, to observe whether the gender features improve the accuracy achieved using only admission scores as features. Unfortunately, we noticed that the accuracy is reduced for all the classifiers used, as shown in Figure 7 and Table 3.
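To make this comparison concrete, here is a hedged sketch of how such an experiment could be run with scikit-learn, training the three model families reported in the tables (random forest, SVM, and Gaussian Naive Bayes) on two feature sets and comparing held-out accuracy; the column names, split sizes, and hyperparameters are assumptions.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Hypothetical feature groups; the real feature lists come from Table 1.
ADMISSION = ["aat_score", "gat_score", "high_school_grade"]
ADMISSION_GENDER = ADMISSION + ["gender_encoded"]

MODELS = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": SVC(kernel="rbf"),
    "GNB": GaussianNB(),
}

def compare_feature_sets(df: pd.DataFrame, target: str = "grade_label") -> pd.DataFrame:
    """Train each model on each feature set and report held-out accuracy."""
    rows = []
    feature_sets = {"admission": ADMISSION, "admission+gender": ADMISSION_GENDER}
    for set_name, cols in feature_sets.items():
        X_tr, X_te, y_tr, y_te = train_test_split(
            df[cols], df[target], test_size=0.25, random_state=0, stratify=df[target]
        )
        for model_name, model in MODELS.items():
            acc = model.fit(X_tr, y_tr).score(X_te, y_te)
            rows.append({"features": set_name, "model": model_name, "accuracy": acc})
    return pd.DataFrame(rows)
```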
Table 2: Results of model performance with admission score features on a sample of 226 students (grade support: A's=26, B's=53, C's=88, D's=59)
Normal Rounding ±0.10 Rounding ±0.30 Rounding ±0.50
Model GPA precision recall f1 score precision recall f1 score precision recall f1 score precision recall f1 score
A 80.00% 30.77% 44.44% 81.82% 34.62% 48.65% 90.91% 38.46% 54.05% 93.75% 57.69% 71.43%
B 54.05% 37.74% 44.44% 55.56% 37.74% 44.94% 64.29% 50.94% 56.84% 76.32% 54.72% 63.74%
RFR C 50.00% 80.68% 61.74% 50.35% 81.82% 62.34% 55.88% 86.36% 67.86% 63.57% 93.18% 75.58%
D 59.46% 37.29% 45.83% 61.11% 37.29% 46.32% 72.97% 45.76% 56.25% 88.37% 64.41% 74.51%
Accuracy 53.54% 54.42% 61.95% 72.57%
A 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00% 11.54% 20.69% 100.00% 34.62% 51.43%
B 44.44% 22.64% 30.00% 48.28% 26.42% 34.15% 62.07% 33.96% 43.90% 86.21% 47.17% 60.98%
SVM C 38.46% 51.14% 43.90% 39.13% 51.14% 44.33% 41.07% 52.27% 46.00% 47.62% 56.82% 51.81%
D 42.68% 59.32% 49.65% 42.68% 59.32% 49.65% 42.68% 59.32% 49.65% 46.99% 66.10% 54.93%
Accuracy 40.71% 41.59% 45.13% 54.42%
A 47.06% 30.77% 37.21% 47.06% 30.77% 37.21% 56.25% 34.62% 42.86% 81.25% 50.00% 61.90%
B 35.42% 32.08% 33.66% 35.42% 32.08% 33.66% 42.55% 37.74% 40.00% 58.00% 54.72% 56.31%
GNB C 46.07% 46.59% 46.33% 46.07% 46.59% 46.33% 52.58% 57.95% 55.14% 62.50% 68.18% 65.22%
D 33.33% 40.68% 36.64% 33.33% 40.68% 36.64% 37.88% 42.37% 40.00% 46.88% 50.85% 48.78%
Accuracy 39.82% 39.82% 46.46% 58.41%
Table 3: Results of model performance with admission score and gender features on a sample of 226 students (grade support: A's=26, B's=53, C's=88, D's=59)
Normal Rounding ±0.10 Rounding ±0.30 Rounding ±0.50
Model GPA precision recall f1 score precision recall f1 score precision recall f1 score precision recall f1 score
A 73.33% 42.31% 53.66% 75.00% 46.15% 57.14% 82.35% 53.85% 65.12% 94.74% 69.23% 80.00%
B 50.00% 52.83% 51.38% 51.85% 52.83% 52.34% 58.93% 62.26% 60.55% 72.22% 73.58% 72.90%
RFR C 49.62% 73.86% 59.36% 50.38% 76.14% 60.63% 56.80% 80.68% 66.67% 65.00% 88.64% 75.00%
D 62.50% 25.42% 36.14% 65.22% 25.42% 36.59% 82.14% 38.98% 52.87% 93.94% 52.54% 67.39%
Accuracy 52.65% 53.98% 62.39% 73.45%
A 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
B 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
SVM C 38.94% 100.00% 56.05% 39.11% 100.00% 56.23% 44.67% 100.00% 61.75% 48.89% 100.00% 65.67%
D 0.00% 0.00% 0.00% 100.00% 1.69% 3.33% 100.00% 49.15% 65.91% 100.00% 77.97% 87.62%
Accuracy 38.94% 39.38% 51.77% 59.29%
A 36.67% 42.31% 39.29% 36.67% 42.31% 39.29% 37.93% 42.31% 40.00% 48.28% 53.85% 50.91%
B 31.25% 37.74% 34.19% 31.75% 37.74% 34.48% 37.29% 41.51% 39.29% 46.55% 50.94% 48.65%
GNB C 39.62% 23.86% 29.79% 41.82% 26.14% 32.17% 53.45% 35.23% 42.47% 67.19% 48.86% 56.58%
D 32.91% 44.07% 37.68% 33.33% 44.07% 37.96% 37.50% 50.85% 43.17% 45.33% 57.63% 50.75%
Accuracy 34.51% 35.40% 41.59% 52.21%
Table 4: Results of model performance with admission scores and first-level English course score features on a sample of 226 students (grade support: A's=26, B's=53, C's=88, D's=59)
Normal Rounding ±0.10 Rounding ±0.30 Rounding ±0.50
Model GPA precision recall f1 score precision recall f1 score precision recall f1 score precision recall f1 score
A 61.90% 50.00% 55.32% 61.90% 50.00% 55.32% 70.83% 65.38% 68.00% 88.46% 88.46% 88.46%
B 54.00% 50.94% 52.43% 57.14% 52.83% 54.90% 65.91% 54.72% 59.79% 88.64% 73.58% 80.41%
RFR C 53.85% 71.59% 61.46% 55.56% 73.86% 63.41% 60.00% 81.82% 69.23% 76.64% 93.18% 84.10%
D 60.53% 38.98% 47.42% 61.54% 40.68% 48.98% 73.68% 47.46% 57.73% 91.84% 76.27% 83.33%
Accuracy 55.75% 57.52% 64.6% 83.63%
A 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00% 3.85% 7.41% 100.00% 23.08% 37.50%
B 21.43% 5.66% 8.96% 21.43% 5.66% 8.96% 23.08% 5.66% 9.09% 61.54% 15.09% 24.24%
SVM C 42.11% 100.00% 59.26% 43.78% 100.00% 60.90% 49.16% 100.00% 65.92% 57.14% 100.00% 72.73%
D 100.00% 5.08% 9.68% 100.00% 18.64% 31.43% 100.00% 55.93% 71.74% 100.00% 89.83% 94.64%
Accuracy 41.59% 45.13% 55.31% 68.58%
A 66.67% 53.85% 59.57% 70.00% 53.85% 60.87% 77.78% 53.85% 63.64% 84.21% 61.54% 71.11%
B 41.94% 49.06% 45.22% 42.86% 50.94% 46.55% 46.15% 56.60% 50.85% 54.55% 67.92% 60.50%
GNB C 46.39% 51.14% 48.65% 47.37% 51.14% 49.18% 52.13% 55.68% 53.85% 63.64% 63.64% 63.64%
D 52.17% 40.68% 45.71% 54.17% 44.07% 48.60% 61.22% 50.85% 55.56% 71.70% 64.41% 67.86%
Accuracy 48.23% 49.56% 54.42% 64.6%
Table 5: Results of model performance with admission scores and first-level Math course score features on a sample of 226 students (grade support: A's=26, B's=53, C's=88, D's=59)
Normal Rounding ±0.10 Rounding ±0.30 Rounding ±0.50
Model GPA precision recall f1 score precision recall f1 score precision recall f1 score precision recall f1 score
A 66.67% 46.15% 54.55% 66.67% 46.15% 54.55% 71.43% 57.69% 63.83% 77.27% 65.38% 70.83%
B 58.97% 43.40% 50.00% 60.00% 45.28% 51.61% 67.50% 50.94% 58.06% 79.07% 64.15% 70.83%
RFR C 52.99% 80.68% 63.96% 53.38% 80.68% 64.25% 60.63% 87.50% 71.63% 70.43% 92.05% 79.80%
D 62.86% 37.29% 46.81% 62.86% 37.29% 46.81% 81.58% 52.54% 63.92% 89.13% 69.49% 78.10%
Accuracy 56.64% 57.08% 66.37% 76.55%
A 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00% 7.69% 14.29%
B 35.00% 13.21% 19.18% 35.00% 13.21% 19.18% 50.00% 24.53% 32.91% 70.27% 49.06% 57.78%
SVM C 41.04% 62.50% 49.55% 41.04% 62.50% 49.55% 42.97% 62.50% 50.93% 49.57% 64.77% 56.16%
D 51.39% 62.71% 56.49% 51.39% 62.71% 56.49% 51.39% 62.71% 56.49% 54.17% 66.10% 59.54%
Accuracy 43.81% 43.81% 46.46% 54.87%
A 53.57% 57.69% 55.56% 53.57% 57.69% 55.56% 53.57% 57.69% 55.56% 61.54% 61.54% 61.54%
B 35.90% 26.42% 30.43% 36.84% 26.42% 30.77% 43.90% 33.96% 38.30% 53.49% 43.40% 47.92%
GNB C 47.47% 53.41% 50.27% 48.51% 55.68% 51.85% 53.47% 61.36% 57.14% 63.16% 68.18% 65.57%
D 45.00% 45.76% 45.38% 45.76% 45.76% 45.76% 50.00% 47.46% 48.70% 61.29% 64.41% 62.81%
Accuracy 45.58% 46.46% 50.88% 60.62%
Table 6: Results of model performance with admission scores and first-level CS course score features on a sample of 226 students (grade support: A's=26, B's=53, C's=88, D's=59)
Normal Rounding ±0.10 Rounding ±0.30 Rounding ±0.50
Model GPA precision recall f1 score precision recall f1 score precision recall f1 score precision recall f1 score
A 70.00% 28.00% 40.00% 72.73% 32.00% 44.44% 75.00% 36.00% 48.65% 94.44% 68.00% 79.07%
B 50.85% 55.56% 53.10% 52.63% 55.56% 54.05% 59.32% 64.81% 61.95% 75.86% 81.48% 78.57%
RFR C 55.46% 75.00% 63.77% 55.83% 76.14% 64.42% 66.97% 82.95% 74.11% 78.79% 88.64% 83.42%
D 65.79% 42.37% 51.55% 65.79% 42.37% 51.55% 80.43% 62.71% 70.48% 88.24% 76.27% 81.82%
Accuracy 56.64% 57.52% 68.14% 81.42%
A 59.09% 52.00% 55.32% 59.09% 52.00% 55.32% 65.38% 68.00% 66.67% 71.43% 80.00% 75.47%
B 53.06% 48.15% 50.49% 53.06% 48.15% 50.49% 57.78% 48.15% 52.53% 63.64% 51.85% 57.14%
SVM C 52.17% 81.82% 63.72% 52.55% 81.82% 64.00% 59.02% 81.82% 68.57% 68.57% 81.82% 74.61%
D 70.59% 20.34% 31.58% 72.22% 22.03% 33.77% 84.85% 47.46% 60.87% 89.80% 74.58% 81.48%
Accuracy 54.42% 54.87% 63.27% 72.57%
A 61.54% 32.00% 42.11% 61.54% 32.00% 42.11% 61.54% 32.00% 42.11% 78.57% 44.00% 56.41%
B 42.86% 44.44% 43.64% 42.86% 44.44% 43.64% 49.06% 48.15% 48.60% 55.77% 53.70% 54.72%
GNB C 52.08% 56.82% 54.35% 52.63% 56.82% 54.64% 59.14% 62.50% 60.77% 67.03% 69.32% 68.16%
D 45.90% 47.46% 46.67% 46.77% 49.15% 47.93% 50.75% 57.63% 53.97% 59.42% 69.49% 64.06%
Accuracy 48.67% 49.12% 54.42% 62.83%
Table 7: Results of model performance with admission scores and first-level course (English, Math, CS) score features on a sample of 226 students (grade support: A's=26, B's=53, C's=88, D's=59)
Normal Rounding ±0.10 Rounding ±0.30 Rounding ±0.50
Model GPA precision recall f1 score precision recall f1 score precision recall f1 score precision recall f1 score
A 78.26% 69.23% 73.47% 79.17% 73.08% 76.00% 80.00% 76.92% 78.43% 92.31% 92.31% 92.31%
B 67.39% 58.49% 62.63% 68.89% 58.49% 63.27% 78.57% 62.26% 69.47% 91.11% 77.36% 83.67%
RFR C 61.26% 77.27% 68.34% 61.26% 77.27% 68.34% 69.64% 88.64% 78.00% 81.82% 92.05% 86.63%
D 71.74% 55.93% 62.86% 71.74% 55.93% 62.86% 85.11% 67.80% 75.47% 91.07% 86.44% 88.70%
Accuracy 66.37% 66.81% 75.66% 87.17%
A 76.00% 73.08% 74.51% 76.92% 76.92% 76.92% 77.78% 80.77% 79.25% 77.78% 80.77% 79.25%
B 77.27% 31.48% 44.74% 80.95% 31.48% 45.33% 85.00% 31.48% 45.95% 86.96% 37.04% 51.95%
SVM C 51.32% 88.64% 65.00% 52.00% 88.64% 65.55% 55.17% 90.91% 68.67% 59.42% 93.18% 72.57%
D 70.37% 32.76% 44.71% 72.41% 36.21% 48.28% 82.35% 48.28% 60.87% 89.47% 58.62% 70.83%
Accuracy 58.85% 60.18% 64.6% 69.47%
A 66.67% 53.85% 59.57% 66.67% 53.85% 59.57% 75.00% 57.69% 65.22% 87.50% 80.77% 84.00%
B 48.78% 37.74% 42.55% 48.78% 37.74% 42.55% 61.90% 49.06% 54.74% 80.00% 60.38% 68.82%
GNB C 53.77% 64.77% 58.76% 54.29% 64.77% 59.07% 63.64% 71.59% 67.38% 74.73% 77.27% 75.98%
D 50.00% 49.15% 49.57% 50.85% 50.85% 50.85% 58.46% 64.41% 61.29% 66.20% 79.66% 72.31%
Accuracy 53.1% 53.54% 62.83% 74.34%
Figure 14: Admission scores and English course grades using dimensionality reduction by t-SNE

Figure 15: Admission scores and all first-level course grades using dimensionality reduction by t-SNE

I. PERFORMANCE OF MODELS AFTER ROUNDING GRADES
We perform six different kinds of experiments with varying combinations of features (admission scores; admission scores + gender; admission scores + first-level English course scores; admission scores + first-level Math course scores; admission scores + first-level CS course scores; and admission scores + first-level English, Math, and CS course scores), as shown in Tables 2, 3, 4, 5, 6, and 7. The intention behind these experiments is to show the significance of our features, after rounding students' grades at different scales, in achieving better accuracy. Specifically, we round the GPA to the closest decimal point within different ranges, including 0.10, 0.30, and 0.50, before measuring the accuracy of the model based on the GPA label. This method simplifies the assessment of the different model performance results and avoids the difficulties of error-based metrics that can be hard to interpret.

We compare the performance of our method with different common machine learning classifiers. Tables 2 and 3 show a comparison of our results with different classifiers based on recall, precision, and F1-score on a sample of 226 students with grade support: A's=26, B's=53, C's=88, and D's=59. It is evident that our approach performs better when increasing the rounding threshold, which is to compute the absolute difference between a grade and the grade before or after it.
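The rounding-based relabeling can be illustrated with the following sketch, which snaps a predicted GPA to the true GPA when their absolute difference is within a chosen tolerance and then compares grade labels; the GPA-to-grade cut-offs, the 5-point scale, and the exact snapping rule are assumptions about one plausible reading of the scheme, not the paper's own code.

```python
import numpy as np

# Assumed GPA-to-grade cut-offs on a 5-point scale; the paper does not list
# its exact boundaries, so these are illustrative only.
def gpa_to_grade(gpa: float) -> str:
    if gpa >= 4.5:
        return "A"
    if gpa >= 3.75:
        return "B"
    if gpa >= 2.75:
        return "C"
    return "D"

def rounded_accuracy(y_true: np.ndarray, y_pred: np.ndarray, tol: float = 0.0) -> float:
    """Accuracy after tolerance-based rounding.

    One plausible reading of the scheme: a predicted GPA is snapped to the
    true GPA whenever their absolute difference is within `tol` (e.g. 0.10,
    0.30, or 0.50), and the grade labels are compared afterwards.
    """
    snapped = np.where(np.abs(y_pred - y_true) <= tol, y_true, y_pred)
    true_labels = [gpa_to_grade(g) for g in y_true]
    pred_labels = [gpa_to_grade(g) for g in snapped]
    return float(np.mean([t == p for t, p in zip(true_labels, pred_labels)]))

# Example with tolerance 0.30:
# rounded_accuracy(np.array([4.6, 3.1, 2.4]), np.array([4.4, 3.3, 2.9]), tol=0.30)
```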
J. EVALUATION METRICS
The evaluation of machine learning classifiers is critical when studying the learning models and their performance. To evaluate the performance of the classifier models, we use evaluation measures similar to those adopted in most previous research experiments. They cover the prediction accuracy and F1-score under varying conditions of the input parameters. Most of the time, we use classification accuracy to measure the performance of machine learning models, and we have also used confusion matrices to compare the prediction accuracy and failures.

Accuracy = (TP + TN) / (TP + FP + TN + FN)  (1)

Precision (P) = TP / (TP + FP)  (2)

Recall (R) = TP / (TP + FN)  (3)

F1-score = (2 × Precision × Recall) / (Precision + Recall)  (4)

In the above equations, TP, TN, FP, and FN are true positives, true negatives, false positives, and false negatives. We use the F1-score as the primary performance indicator to evaluate all the classifier models used in our experiments. The F1-score is a single metric that combines both precision and recall. Precision measures the accuracy of the positive predictions made by a classifier and is defined as precision = TP / (TP + FP), where the true positives (TP) are correct predictions of positive samples and the false positives (FP) are wrong positive predictions. Precision is used together with another parameter called recall, defined as TP / (TP + FN).

We have also used a confusion matrix to study the performance of the classifiers. The confusion matrix is a table with rows and columns that report false positives, false negatives, true positives, and true negatives. This allows a more detailed analysis than the mere proportion of correct classifications, i.e., the prediction accuracy.
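As a reference point, these metrics can be reproduced per grade class with scikit-learn as in the sketch below; the label set and the reporting format are assumptions, not the paper's exact evaluation code.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)

GRADES = ["A", "B", "C", "D"]  # assumed label order matching the tables

def report_metrics(y_true, y_pred):
    """Per-class precision/recall/F1, overall accuracy, and confusion matrix."""
    precision, recall, f1, support = precision_recall_fscore_support(
        y_true, y_pred, labels=GRADES, zero_division=0
    )
    per_class = {
        g: {"precision": p, "recall": r, "f1": f, "support": int(s)}
        for g, p, r, f, s in zip(GRADES, precision, recall, f1, support)
    }
    return {
        "per_class": per_class,
        "accuracy": accuracy_score(y_true, y_pred),
        "confusion_matrix": confusion_matrix(y_true, y_pred, labels=GRADES),
    }

# Example:
# report_metrics(["A", "B", "C"], ["A", "C", "C"])
```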