Paper 65-Fraud Detection in Credit Cards

(IJACSA) International Journal of Advanced Computer Science and Applications,

Vol. 11, No. 12, 2020

Fraud Detection in Credit Cards using Logistic Regression

Hala Z Alenzi1, Nojood O Aljehane2
Department of Computer Science
Tabuk University, Tabuk City
Kingdom of Saudi Arabia

Abstract—Due to the increasing number of customers as well as the increasing number of companies that use credit cards for completing financial transactions, the number of fraud cases has increased dramatically. Dealing with noisy and imbalanced data, as well as with outliers, has accentuated this problem. In this work, fraud detection using artificial intelligence is proposed. The proposed system uses logistic regression to build the classifier to prevent fraud in credit card transactions. To handle dirty data and to ensure a high degree of detection accuracy, a pre-processing step is used. The pre-processing step uses two novel main methods to clean the data: the mean-based method and the clustering-based method. Compared to two well-known classifiers, the support vector machine classifier and voting classifier, the proposed classifier shows better results in terms of accuracy, sensitivity, and error rate.

Keywords—Classifier; logistic regression; accuracy; smoothing; artificial intelligence; cross validation

I. INTRODUCTION
According to the definition of fraud [1], the aim of fraud is to achieve personal or financial gain through deception. Based on this, fraud detection and prevention are the two significant methods for avoiding loss due to fraud. Fraud prevention is the proactive technique for avoiding the occurrence of fraudulent acts, and fraud detection is the technique for the detection of fraudulent transactions by fraudsters [2]. A variety of payment cards, including credit, charge, debit, and prepaid cards, are currently widely available. They are the most popular means of payment in some countries [3]. Indeed, advances in digital technologies have paved the way for changes in how we handle money, especially for payment methods that have changed from being a physical activity to a digital activity using electronic means [4]. This has revolutionized the landscape of monetary policy, including the business strategies and operations of both large and small companies. Credit card fraud is the fraudulent use of credit card details to buy a product or service. These transactions can be physically or digitally performed [5]. In physical transactions, the credit card is physically present. On the other hand, digital transactions take place over the internet or telephone. A cardholder normally provides their card number, card verification number, and expiration date through a website or telephone call. With the rapid rise in e-commerce over the past few years, credit card use has increased tremendously [1,3].

In Malaysia, the number of transactions performed through credit cards in 2011 was approximately 317 million, and this number increased to 447 million in 2018 [4]. In 2015, global credit card fraud reached a record of $21.84 billion, as reported by [2]. The number of fraud cases has been rising with the increased use of credit cards. While various verification methods have been implemented, the number of fraud cases involving credit cards has not decreased significantly [6]. The potential for substantial monetary gains, combined with the ever-changing nature of financial services, creates a wide range of opportunities for fraudsters [7]. Funds from payment card fraud are often used in criminal activities that are hard to prevent, e.g., to support terrorist acts [8]. The internet is where fraudsters prefer to be because they are able to conceal their location and identity. The recent increase in credit card fraud has hit the financial sector hard. Losses due to credit card fraud mainly impact merchants because they bear all expenses, including the fees from their card issuer, administrative fees and other charges [9]. All the losses are borne by the merchants, leading to increases in the prices of goods and decreases in discounts. Hence, reducing this loss is highly important. An effective fraud detection system is required to minimize the number of cases of fraud.

A. Motivation
The use of credit cards to perform financial transactions at banks or other institutions is a common action in light of the currently available technology. Online payments (or any other online transactions) bring benefits to companies and individuals in terms of the convenience, velocity, and flexibility of performing daily duties [10,11]. The work in [12] presented a statistical analysis related to the usage of credit cards over five years (from 2006 to 2010). This reflected the huge dependency on credit cards by both people and organizations. To take advantage of advanced technologies, companies try to use advanced techniques to provide high-quality services to customers. Automation can be seen as the best solution for attracting more customers and consequently collecting more financial gain [13]. The process of converting a manual system to a fully automatic one, as found in smart cities, is not without risk.

B. Problem Statement
According to [14], it is estimated that 10,000 transactions take place via credit cards every second worldwide. Owing to such a high transaction frequency, credit cards have become the primary targets of fraud. Indeed, since the Diners Club released its first credit card in 1950, credit card companies have been fighting against fraud [15]. Every year, billions of dollars are lost directly because of credit card fraud. Fraud cases occur under different conditions, e.g., transactions at points of sale (POSs), transactions made online or over the telephone (i.e., card-not-present (CNP) cases), or transactions with lost and stolen cards. Credit card fraud in 2015 alone amounted to $21.84 billion, with issuers bearing $15.72 billion of the cost [16]. Based on information from the European Central Bank, in 2012, the majority (60%) of fraud stemmed from CNP transactions, and another 23% stemmed from POS terminals. The value of fraud is high globally and locally in Malaysia. The volumes of credit, debit, and charge cards were 383.8 million, 107.6 million, and 4.1 million, respectively, in 2016 and increased to 447.1 million, 245.7 million, and 5.2 million, respectively, in 2018 [9]. The overall percentage of fraudulent payments (i.e., with credit, debit, and charge cards) was 0.0186% in 2016 and increased by 37.6% to 0.0256% in 2018 [17]. The potential for huge monetary gains combined with the ever-changing nature of financial services provides opportunities for fraudsters. In Malaysia, 1,000 card transactions occur every minute. Fraud directly impacts merchants and financial institutions because they incur all the costs. An increase in fraud affects customers’ confidence in using electronic payments [18].

Many surveys have shown that the increase in the dependence on credit cards to perform financial transactions is accompanied by an increasing rate of fraud, as seen in [1,3]. The increasing capabilities of the attackers or the hackers have accentuated the problem since these people can exploit security gaps to obtain sensitive information about users or their credit information to perform malicious activities, such as fraud [4,5]. To define this problem accurately, Fig. 1 shows the general scenario of performing credit card fraud.

Fig. 1. General Scenario of Online Fraud.

As shown in Fig. 1, the attacker can perform malicious activities on many sides of the online process. To solve this problem, a fraud detection system is needed. Artificial intelligence (AI) is defined as the research field that aims at performing machine learning to obtain an intelligent machine that can perform tasks on behalf of the user. This can be done through two main steps: training and testing. AI is employed to build systems for fraud detection, such as classification-based systems [19,6,7,8], clustering-based systems [17,20,21], neural network-based systems [18,22,23], and support vector machine-based systems [9]. Although AI-based systems can perform well, they suffer from some critical issues. First, the term “imbalanced data” refers to unbalanced data used for training, where one class of the data is dominated by the other (i.e., the majority of data belong to one class and the rest belong to the other). This negatively affects the accuracy of detection [24,25]. Second, the term “noisy data” refers to the existence of outliers within the data employed for training. Outliers are values that fall outside of the normal context of the data. This issue also leads to poor detection accuracy [26,16]. Third, the concept of drift means that the behaviour of the client changes, resulting in changes in the data stream when dealing with online data detection in real time [15,14].

C. Research Questions
On the basis of the empirical evidence, the following research questions are developed to guide this study and meet its objectives.
• How can a fraud detection system be built using AI that can deal with imbalanced data effectively?
• How can we smooth (or clean) the data before using it for training the machine to ensure high detection accuracy?
• How can the system detect fraud by adapting to the behaviour of the user?

D. Contributions
The contributions of this work can be summarized as follows:
• An AI-based system for fraud detection is proposed. The system uses logistic regression to build a classifier called the LogR classifier. The LogR classifier has the ability to deal with imbalanced data and adapt to the behaviour of the user by employing the cross-validation method.
• To ensure high accuracy detection, two main methods are used to clean the data. The mean-based method deals with missing values, and the clustering-based method deals with outliers.
• Extensive experiments are conducted to train and test the proposed classifier using a standard database.

E. Structure of the Paper
The rest of this work is organized as follows. Section II reviews the related work. Section III describes the proposed artificial intelligence system in detail. In Section IV, the metrics used are presented for evaluation purposes. Section V presents the experiments and discusses the results in light of a comparison with similar approaches. Finally, the paper is concluded in Section VI.


II. RELATED WORK
This section first provides a brief background about the research domain. Then, the related work is presented in detail.

A. Background
The background refers to the credit card research field in terms of the intersection of multiple research sectors. This field can be viewed as the intersection of four main domains, as illustrated in Fig. 2.

Fig. 2. The Intersection of Credit Card Research and other Research Fields.

The definitions of the domains and terms that are applied in this study are listed below.

Artificial Intelligence (AI): It can be defined as the science that addresses the methods used for training machines to mimic the brains of humans. In other words, machines can be used to make decisions on behalf of human users. In this context, data mining tasks, such as classification, clustering, applying association rules, and using neural networks, are employed [2].

Financial Systems: These can be defined as the systems that are used to convert manual transactions into digital transactions. In this context, the term “transaction” denotes any financial activity that may be performed by a user based on a specific system [27].

Chip Industry: This term refers to the manufacturing of chips to store critical information on the card of the user. The information acts as a key to trigger any transaction. However, the chip is programmed to match some passwords to allow access to financial interfaces [28].

Internet of Things (IoT): It can be defined as a collection of devices connected via a network. The devices vary from small devices with low processing power (such as watches) to large devices with high processing power, such as mobile devices. Using IoT devices to perform financial transactions is vital, especially in light of the goal of shifting toward smart cities [29].

B. Groups of AI-based Techniques
Artificial intelligence (AI) is defined as enabling machines to make decisions on behalf of human users. In this context, data mining tasks, such as classification, clustering, applying association rules, and using neural networks, are employed [2]. In addition, AI is employed to build systems for fraud detection, such as classification-based systems [19,6,7,8], clustering-based systems [17,20,21], neural network-based systems [18,22,23] and support vector machine-based systems [9].

The techniques employed to construct credit card fraud detection systems using AI can be categorized into four main groups. This idea is shown in Fig. 3.

Fig. 3. Categories of AI-based Techniques for Fraud Detection.

1) Classification-based systems: The authors in [19] tried to achieve two main objectives in their work: (1) enhancing the accuracy of the classifications output by credit card fraud detection systems and (2) lowering the response times of these systems. To achieve the first goal, the authors proposed a hybrid model that fuses two classifiers to generate a new (or enhanced) one. The first classifier used is the K-means classifier, which deals with overlapping data because such data cause poor accuracy. The second classifier is the artificial bee colony algorithm (ABC), which is used to enhance the performance of the system. The first classifier forms the first level, and the second classifier forms the second level of the classification process proposed in the same model. The database used in this work was generated by using the C# programming language, where the number of instances was 100,000. In addition, 12 features were selected to include in the training phase. The selected features were based on a rule engine.

Moreover, previous systems suffered from problems in real-time environments [6]. These are problems in the context of credit card fraud detection. Such problems include imbalanced data, noisy data, and the concept of drift. The authors applied the bag creation technique to solve the data problems; this technique involves performing the sampling process on the collected data in real time. To clean the data, they applied naïve Bayes networks for the effective manipulation of noisy data. An incremental learning-based method was presented to address the concept of drift. The data set used in this work is summarized in Table I.

Start day    End day           Instances
July 2004    September 2004    0.3 million    3.74%

The strength of this study is the enhancement in performance achieved by using Spark to implement the system in parallel. In addition, the reduction in cost is considered an important feature of this system, and this was achieved by employing naïve Bayes networks in the process of classification. The weakness of the proposed system is that it does not manipulate cyclic recurrences that may be included in the concept of drift. Cyclic recurrences refer to cyclic repetitions in the distributions of data.


The authors in [7] evaluated the current fraud detection system with regard to credit card transactions. The problem is that there are two stages for automatic classification: real-time (RT) and near-real-time (NRT). They focused on the NRT stage by using a rule-based classification technique that considers the final evaluation of the human element of fraud. The authors did not improve the design of the system, discover any new rules, or improve the arithmetic efficiency of individual rules. Instead, they manipulated the rules to form a decision-making system to improve both the accuracy and the performance. The key idea is to calculate the contribution of each rule involved in the system. Calculating the contribution of a rule depends on the difference between two values, which are (1) the performance of the system when the rule is used and (2) the performance of the system without using the rule. The degree of performance improvement is high if the rule is not redundant and is low if it is redundant with other rules or rule groups. For the measurement of performance, the precision, recall and F-score metrics were employed. A real database, which consists of 359,862 records provided by some industrial partners, was used for the training phase.

The authors in [8] addressed credit card fraud detection. In this study, the authors relied on the fact that "the features of the financial transactions in institutions change over time". This shows that the problem of credit card fraud detection should be considered in real time. Therefore, they converted this problem into real working transactions. In terms of artificial intelligence, the class should not be provided to the classifier immediately during the training stage. The key idea of the proposed approach is to follow a strict strategy that has three main steps: (1) analysing the real conditions under which the real transactions are performed; (2) employing these conditions to train the classifier using two main data sets; and (3) testing the classifier after the training stage is completed and supporting it by using the feedback of the users (their interactions) to improve the accuracy of the classifier. Table II summarizes the dataset used.

TABLE II. DETAILS OF THE DATASET USED [8]

Id           End day       Instances     Features
2013         2014-01-18    21'830'330    51          0.19%
2014-2015    2015-05-31    54'764'384    51          0.24%

2) Clustering-based systems: To address the problem of detecting credit card fraud through transactions, the authors in [17] dealt with the problem of online shopping fraud and the concept of drift. They proposed a strategy consisting of four stages: (1) based on both the previous transaction data and the information of the cardholders, they used the clustering method to divide the cardholders into different groups for the purpose of comparing their behaviours; (2) they proposed a sliding window strategy to group the transactions in each group to extract the behavioural patterns for each cardholder; (3) they trained a set of classifications for each group to measure behavioural patterns; and (4) they used a group of classifiers by training them on cardholder behaviours and output the highest behaviour pattern. A feedback mechanism was used to solve the concept of drift problem. Four dataset simulators were generated to manually create the data sets.

The authors in [20] proposed a clustering-based method. In this study, the fraud detection problem in e-commerce is addressed, a setting that may be exploited by hackers who are highly skilled. The methods proposed to address such problems suffer from low accuracy and effectiveness. In addition, the methods used for detecting fraud may make some mistakes in identifying fraudulent transactions. The reason behind such shortcomings is that the proposed approaches focus on order analysis rather than anything else. Motivated by these facts, the authors proposed a method that focuses on the hackers themselves. The key idea is to extract some recognized features, such as the address of delivery, customer name, and methods of payment, and then, based on these features, the similarity among the attackers is calculated. Based on these similarities, the attackers are grouped in some clusters for detection. A main feature of their proposed method is that two current methods, agglomerative clustering and sampling, are selectively used in a reasonable amount of time for recursively grouping orders into small clusters. The dataset used for the training process was inspired by the Zalando website. This website periodically receives approximately 29 million orders (some of them are normal and others are fraudulent).

The authors in [21] tried to evaluate the detection problem by extracting the general pattern of the dataset to represent the fraud. In other words, the enhancement of the clustering methods relies only on the clusters used; this technique is called general enhancement. The authors proposed an approach that enables the application of local enhancement as well as general enhancement for fraud detection in financial transactions. They proposed the “Hierarchical Clusters-based Deep Neural Networks (HC-DNN)” method that uses the anomalous features of hierarchical clusters that are pretrained based on an autoencoder as the initial weights for neural networks. In detail, the data are grouped based on abnormal features that refer to fraud. These features are then used as the initial weights for the input layers of neural networks, as shown in Fig. 4.

Fig. 4. Key Idea of the HC-DNN Method [21].


The authors used a dataset containing 19,505 records, including fraudulent and non-fraudulent records. The dataset is skewed and consists of 19,313 non-fraudulent and 192 fraudulent cases. Some preprocessing steps were performed on the data to mitigate the negative impact of the imbalanced data before using them for actual training.

3) Neural network-based systems: The authors in [18] discussed issues related to increasing fraud detection in online shopping transactions and payments, especially those related to credit cards. To detect credit card fraud, they proposed a neural network-based system. It uses backpropagation to enhance the output of the neural network so that the error (the difference between the actual or desirable value and the output of the neural network) is distributed back by adjusting the weights of the inputs. The strategy followed in this work can be summarized through the following steps:
a) A new Neuroph Project was created in Neuroph Studio using the Java programming language.
b) The actual perceptron network was constructed.
c) The training data set was prepared.
d) The training process was started by considering the desired value (the accuracy of fraud detection) set by an expert in the field.
e) The trained network was tested.

The data used for training were collected from a data mining blog. They include 20,000 active credit card holders with transactions spanning more than six months.

The authors in [22] proposed a “Convolutional Neural Network (CNN)” in their work. Similar to previous works, the problem studied was how to detect a pattern that represents fraudulent transactions. In their method, the CNN forms a classifier that takes features of the transactions as inputs. The features are extracted from each transaction and stored in a feature matrix. The classifier has the ability to deal with imbalanced data based on the sampling technique. The key idea behind the sampling technique is to use higher than normal costs to generate fraudulent transactions. Fig. 5 illustrates the general scenario of the CNN. The data used include more than 260 million credit card transactions in one year. Approximately four thousand transfers are listed as fraudulent, and the remainder are legal.

Fig. 5. General Scenario of the Fraud Detection System Proposed in the Work in [22].

A hybrid fraud detection system was proposed in [23]. The key idea is to use neural networks as classifiers. Since the network needs to update the weights of the input layer, a swarm optimization method was employed for this purpose. Finally, the model was tested and evaluated. Fig. 6 illustrates the general structure of the proposed system, which is called the “Particle Swarm Optimization Auto-associative Neural Network (PSOAANN)”.

Fig. 6. Structure of the PSOAANN-based System [23].

4) Support vector machine-based systems: The authors in [9] used a support vector machine (SVM) to improve the accuracy of the classifier in the process of detecting fraud in credit card transactions. The key idea behind using an SVM is to split the features that represent transactions, where these features are used for the clustering process. In other words, the data are cleaned initially. Then, the features of transactions are extracted. Third, the features are measured to calculate the similarity among them. To isolate the features as much as possible, the SVM is used. Fourth, the K-means clustering algorithm is used to cluster the data based on the isolated (i.e., as far as possible) features. The classifier is then trained on the clusters. The classifier that deals with fraudulent transactions is used to detect fraud. The database used for training contains 5310 records in total. Among them, 490 records are fraudulent data and 4820 are non-fraudulent data, and 1174 characteristic variables are included.



This section describes the proposed approach in detail. Fig. 7 illustrates the steps of the proposed approach.

As shown in Fig. 7, there are nine steps, starting with the selection of the database and ending with the use of the classifier in real-life situations. The reason behind selecting logistic regression to build the classifier is related to its efficiency in detecting fraud based on its ability to isolate the data that belong to different binary classes.

Fig. 9. Interface for Selecting (or Loading) the Data Set.

A. Selecting the Database
This work uses a standard dataset that is available on the
internet [30]. The dataset contains transactions made using
credit cards in September 2013 by European cardholders. This
dataset presents transactions that occurred over two days,
where we have 492 fraudulent cases out of 284,807
transactions. The dataset is highly unbalanced, and the
positive class (fraudulent cases) accounts for 0.172% of all
transactions. Fig. 8 shows the selection step in the
implemented programme represented by “Load DB”.
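
As a quick check of the figures quoted above, the fraud share of the dataset can be recomputed from the two reported counts. The sketch below also computes the accuracy of a trivial classifier that labels every transaction as non-fraudulent. The function and variable names are illustrative and are not part of the implemented programme.

```python
def class_share(positives: int, total: int) -> float:
    """Return the share of the positive class as a percentage."""
    return 100.0 * positives / total


def majority_baseline_accuracy(positives: int, total: int) -> float:
    """Accuracy (in %) of a classifier that always predicts the majority class."""
    majority = max(positives, total - positives)
    return 100.0 * majority / total


FRAUD_CASES = 492            # fraudulent transactions reported for the dataset
ALL_TRANSACTIONS = 284_807   # all transactions reported for the dataset

fraud_share = class_share(FRAUD_CASES, ALL_TRANSACTIONS)
baseline = majority_baseline_accuracy(FRAUD_CASES, ALL_TRANSACTIONS)
print(f"fraud share: {fraud_share:.4f}%")
print(f"always-non-fraud accuracy: {baseline:.4f}%")
```

Because such a trivial baseline is already correct for more than 99.8% of the transactions, accuracy on its own says little on this dataset, which is one reason to report sensitivity alongside it.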
As shown in Fig. 8, the loading of the data is completed, and the size of the dataset can be seen.

To explore the data contained in this data set, Fig. 9 shows the data exploration options that can be chosen. As shown in Fig. 9, there are 6 views of the used data set. This enables us to clearly explore the database. In terms of exploring the database, Fig. 10 and 11 show two examples of data exploration.

Fig. 10. Data Exploration based on the Observation Number.

Fig. 11. Data Exploration based on the Two Main Classes of the Data.

Fig. 7. Flow Chart of the Proposed Approach.

Fig. 8. Loading the used Dataset.

B. Data Cleaning
The goal of this step is to clean the data and prepare it for the training phase of the classifier. In general, real-world data are noisy. Therefore, a cleaning step is necessary. The data cleaning procedure is as follows:
1) Fill in the missing values. A missing value means that a cell of a given record is empty due to a mistake during entry.
2) Solve any inconsistencies. This means that if there is a collision in the data, this collision must be resolved.
3) Remove any outliers. Outliers refer to abnormal values (i.e., very high values or very low values).


Fortunately, most of the data in the data set are clean, except for some missing values and outliers. The mechanism used for handling the missing values relies on the mean (mathematical operation) since the data are numeric. Fig. 12 illustrates the process of filling in a missing value.
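
The mean-based filling step can be sketched as follows. This is a minimal illustration, assuming that a missing cell is represented by `None` and that the mean is taken per column over the observed values; the function name is hypothetical and the paper's own implementation may differ.

```python
from typing import List, Optional


def fill_missing_with_mean(column: List[Optional[float]]) -> List[float]:
    """Replace None entries with the mean of the observed values in the column."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to average")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


# One numeric feature with a single missing cell (assumed encoding: None).
amounts = [10.0, None, 30.0, 20.0]
filled = fill_missing_with_mean(amounts)
print(filled)
```

Here the missing cell receives the mean of the three observed values, so the other entries are left unchanged.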
For the handling of outliers, a clustering-based method is
employed in this work. The key idea is to create three clusters
(one for the normal data, a second one for high values, and a
third for low values). After grouping the data into the clusters,
the last two clusters (i.e., those that contain outliers) are
deleted. Fig. 13 illustrates the mechanism of outlier removal.
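
One way to realise the three-cluster idea on a single numeric feature is a one-dimensional k-means with three centres. The sketch below is an assumption about how the clusters could be formed (the paper does not name the clustering algorithm here): the centres start at the minimum, median, and maximum, and only the values in the middle ("normal") cluster are kept. The function name is hypothetical.

```python
from statistics import median
from typing import List


def remove_outliers_three_clusters(values: List[float], max_iter: int = 100) -> List[float]:
    """Keep only the values assigned to the middle of three 1-D k-means clusters."""
    # Assumed initialisation: low, normal and high centres start at min, median, max.
    centres = [min(values), median(values), max(values)]
    assignment: List[int] = []
    for _ in range(max_iter):
        # Assign every value to its nearest centre (0 = low, 1 = normal, 2 = high).
        new_assignment = [
            min(range(3), key=lambda c: abs(v - centres[c])) for v in values
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the clustering has converged
        assignment = new_assignment
        # Move each centre to the mean of its members; keep it if the cluster is empty.
        for c in range(3):
            members = [v for v, a in zip(values, assignment) if a == c]
            if members:
                centres[c] = sum(members) / len(members)
    # Delete the low and high clusters, i.e. keep cluster 1 only.
    return [v for v, a in zip(values, assignment) if a == 1]


readings = [10, 11, 9, 10, 12, 11, 10, 100, -80]
cleaned = remove_outliers_three_clusters(readings)
print(cleaned)
```

Note that with this initialisation the smallest and largest values always seed their own clusters, so something is discarded even when the data contain no real outliers. Whether the original method guards against this is not stated.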
C. Database Division
In this step, the database is divided into training and testing databases. The goal of the training database is to construct the classifier (model), while the goal of the testing database is to test (evaluate) the built classifier. In this work, the cross-validation method is used to divide the database, which is divided into 10 parts, as shown in Fig. 14.

Fig. 14. Division of the Database based on Cross Validation.

As shown in Fig. 14, the database is divided into 10 parts (i.e., k = 10 in the cross-validation method). In the first iteration, the first nine parts are considered a training set, while the last part of the database is considered a testing set. In the second iteration, the first eight parts and the tenth part are considered a training set, while the ninth part is considered a testing set. This process continues until the tenth iteration, where the first part is the testing set and the last nine parts are the training set.
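
The division described above can be sketched as a generator of record indices. This is a minimal illustration that follows the stated order (the last part is held out first, the first part last); the function name and the handling of a record count that is not a multiple of k are assumptions.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_records: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; iteration 1 tests the last part, iteration k the first."""
    # Part boundaries: the first (n_records % k) parts receive one extra record.
    base, extra = divmod(n_records, k)
    bounds = [0]
    for part in range(k):
        bounds.append(bounds[-1] + base + (1 if part < extra else 0))
    for iteration in range(1, k + 1):
        test_part = k - iteration  # 0-based index of the part held out for testing
        start, stop = bounds[test_part], bounds[test_part + 1]
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_records))
        yield train, test


splits = list(k_fold_indices(20, k=10))
print(splits[0][1])   # test indices in the first iteration
print(splits[-1][1])  # test indices in the last iteration
```

Every record appears in exactly one testing set and in nine training sets, which is what allows the whole database to be used for both training and testing.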
Fig. 15 illustrates a sample of the code execution process
based on the cross-validation method when clicking on the
“Split DB” button.

Fig. 12. Mean-based Mechanism for Handling Missing Values.

Fig. 13. Mechanism of Outlier Removal.

Fig. 15. Results of the Division Process.

D. Building the Classifier
In the context of building the classifier, logistic regression is employed. Logistic regression is better suited to classification than linear regression. The reason for this is that a straight line fitted by linear regression cannot separate classes whose data overlap in a given space, as shown in Fig. 16.
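
A logistic regression classifier of the kind built in this step can be sketched in a few lines. The sketch passes a linear combination of the features through the sigmoid function y = 1 / (1 + e^(-z)) and labels a record as fraud ("1") when the result reaches a 0.5 threshold. Training by batch gradient descent, the learning rate, the number of epochs, and all function names are assumptions made for illustration; the paper does not state how the coefficients are fitted.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Compute b + w1*x1 + ... + wd*xd."""
    return b + sum(wj * xj for wj, xj in zip(w, x))


def train_logr(X: List[List[float]], y: List[int],
               lr: float = 0.1, epochs: int = 500) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the logistic loss (assumed optimiser)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(linear_score(w, b, xi)) - yi  # predicted probability minus label
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X: List[List[float]], w: Sequence[float], b: float,
            threshold: float = 0.5) -> List[int]:
    """Label a record 1 (fraud) when its predicted probability reaches the threshold."""
    return [1 if sigmoid(linear_score(w, b, xi)) >= threshold else 0 for xi in X]


# Toy one-feature data: negative values are class 0, positive values are class 1.
X_toy = [[-2.0], [-1.0], [1.0], [2.0]]
y_toy = [0, 0, 1, 1]
w_fit, b_fit = train_logr(X_toy, y_toy)
print(predict(X_toy, w_fit, b_fit))
```

On a heavily imbalanced dataset the 0.5 threshold tends to favour the majority class, so in practice the threshold or the class weights are often tuned; that tuning is outside what the paper describes.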


Fig. 18. The Concept of Logistic Regression Classification [33].

Fig. 16. The Limitation of Linear Regression.
In other words, the fraud class takes the value “1”, while
the non-fraud class takes the value “0”. A threshold of 0.5 is used to differentiate between the two classes, as shown in Fig. 18.

As shown on the left side of Fig. 16, linear regression can classify the data when a straight line divides the given data into two main categories (or classes). The right side of Fig. 16 illustrates the limitation of linear regression: when the data overlap, the line cannot divide the data into two clear classes. This limitation is overcome by logistic regression. Fig. 17 provides a visual comparison between the linear regression and logistic regression methods to highlight this limitation.

Logistic regression has the following advantages [32]:

1) Logistic regression is easier to implement than linear regression and is very efficient to train.
2) It makes no assumptions about the distributions of classes in the feature space.
3) It can easily be extended to multiple classes (multinomial regression).
4) It is very efficient for classifying unknown records.

The logistic regression equation can be obtained from the linear regression equation. The mathematical steps are given below.

The equation of the straight line can be written as:

$$y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_k x_k \qquad (1)$$

In logistic regression, y can only lie between 0 and 1, so we divide y by (1 − y):

$$\frac{y}{1-y}, \quad \text{which is } 0 \text{ for } y = 0 \text{ and tends to } \infty \text{ for } y = 1 \qquad (2)$$

Taking the logarithm of this ratio extends its range to (−∞, +∞), which matches the range of the linear expression in (1). As a result, the logistic regression equation is defined as:

$$\log\left[\frac{y}{1-y}\right] = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_k x_k \qquad (3)$$

E. Testing the Classifier
Since the cross-validation method divides the database into 10 parts, there are 10 testing data sets. Each testing data set is used to test one classifier (there are 10 classifiers). This in turn gives the model an advantage by allowing it to use the whole database for testing as well as for training. The testing process is tightly coupled with the accuracy of the model. Calculating the final accuracy involves calculating the accuracy of each classifier. Formally, let $Acc_k^C$ denote the accuracy of the k-th trained classifier, as shown in Fig. 19.

Then, the final accuracy of the final classifier ($ACC_{FC}$) is obtained by averaging the ten accuracies:

$$ACC_{FC} = \frac{\sum_{k=1}^{10} Acc_k^C}{10} \qquad (4)$$

F. Evaluating the Classifier
In general, a confusion matrix is an effective benchmark for analysing how well a classifier can recognize records of different classes [34]. The confusion matrix is formed based on the following terms:

1) True positives (TP): positive records that are correctly labelled by the classifier.
2) True negatives (TN): negative records that are correctly labelled by the classifier.
3) False positives (FP): negative records that are incorrectly labelled positive.
4) False negatives (FN): positive records that are mislabelled negative.
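The threshold rule and the four confusion-matrix terms can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the coefficient values, and the sample transactions are hypothetical, the fraud class is assumed to be encoded as 1, and a probability exactly at the threshold is assumed to count as fraud.

```python
import math

def predict_probability(weights, bias, features):
    """Invert Eq. (3): y = 1 / (1 + exp(-(a0 + a1*x1 + ... + ak*xk)))."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(weights, bias, features, threshold=0.5):
    """Return 1 (fraud, assumed encoding) at or above the threshold, else 0."""
    return 1 if predict_probability(weights, bias, features) >= threshold else 0

def confusion_counts(actual, predicted):
    """Count TP, TN, FP, and FN, treating label 1 as the positive class."""
    pairs = list(zip(actual, predicted))
    return {
        "TP": sum(1 for a, p in pairs if a == 1 and p == 1),
        "TN": sum(1 for a, p in pairs if a == 0 and p == 0),
        "FP": sum(1 for a, p in pairs if a == 0 and p == 1),
        "FN": sum(1 for a, p in pairs if a == 1 and p == 0),
    }

# Hypothetical coefficients (a1, a2), intercept a0, and six toy transactions.
weights, bias = [2.0, -1.0], -0.5
transactions = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.5],
                [0.0, 0.0], [0.0, 0.2], [1.0, 1.0]]
actual = [1, 0, 1, 0, 1, 0]

predicted = [predict_label(weights, bias, t) for t in transactions]
counts = confusion_counts(actual, predicted)
print(predicted, counts)
```

With these toy values, the fifth transaction is a missed fraud (FN) and the sixth is a false alarm (FP), so each of the four terms is non-zero.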
Table III shows the confusion matrix in terms of the TP, FN, FP, and TN values.

Relying on the confusion matrix, the accuracy, sensitivity, and error rate metrics are derived. For a given classifier, the accuracy is the recognition rate, that is, the percentage of records in the test set that are correctly classified (fraudulent or non-fraudulent). The accuracy is defined as:

$$Accuracy = \frac{TP + TN}{\text{number of all records in the testing set}} \qquad (5)$$

Fig. 17. A Visual Comparison between Linear and Logistic Regression [31].
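Equations (4) and (5) amount to a per-fold ratio followed by an arithmetic mean. The following is a minimal sketch under stated assumptions: the function names and the confusion-matrix counts are hypothetical, while the ten per-fold accuracies are the values reported in Table IV.

```python
def accuracy(tp, tn, fp, fn):
    """Eq. (5): correctly classified records over all records in the test set."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("the test set is empty")
    return (tp + tn) / total

def final_accuracy(fold_accuracies):
    """Eq. (4): arithmetic mean of the per-fold accuracies."""
    if not fold_accuracies:
        raise ValueError("no fold accuracies given")
    return sum(fold_accuracies) / len(fold_accuracies)

# Hypothetical counts for a single test fold of 100 records.
fold_acc = accuracy(tp=45, tn=50, fp=3, fn=2)

# Per-fold accuracies (in percent) as listed in Table IV.
table_iv_accuracies = [96, 98, 98, 96, 97, 96, 97, 98, 98, 98]
acc_fc = final_accuracy(table_iv_accuracies)
print(fold_acc, acc_fc)
```

Averaging the ten Table IV values gives 97.2, which agrees with the average accuracy reported in that table.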


Fig. 19. Classifiers with Corresponding Accuracies.

Mechanisms for accuracy-based evaluation. In this context, a higher accuracy corresponds to a better classifier output. The maximum value of the accuracy metric is 1 (or 100%), which is achieved when the classifier classifies all the records correctly, without any errors in the classification process.

Sensitivity refers to the true positive recognition rate. It is given by:

$$Sensitivity = \frac{TP}{P} \qquad (6)$$

where P = TP + FN is the number of actual positive records (see Table III).

Mechanisms for sensitivity-based evaluation. In this context, a higher sensitivity corresponds to a better classifier output. The maximum value of the sensitivity metric is 1 (or 100%), which is achieved when the number of true positive cases equals the number of actual positive cases.

The error rate is defined as the proportion of mistakes made by the classifier during the prediction process. It is defined as:

$$Error\ rate = 1 - Accuracy \qquad (7)$$

Mechanisms for error rate-based evaluation. In this context, a higher error rate corresponds to a worse classifier output. The maximum value of the error rate metric is 1 (or 100%), which is achieved when the classifier classifies all the records incorrectly (i.e., the accuracy is zero).

G. Examining the Value of the Accuracy
In this step, the final calculated accuracy is examined. If it is accepted, then the classifier can be used in real-life situations. Otherwise, the process of building the classifier has a problem, and retraining the classifier is required.

TABLE III

Actual class    Confusion matrix (Predicted class)
                C1                      ¬C1                     Total
C1              True positives (TP)     False negatives (FN)    TP + FN = P
¬C1             False positives (FP)    True negatives (TN)     FP + TN = N

Security and privacy issues are highly stressed in many studies [35-43] when using data in the artificial intelligence research field. This is because the data reflect the policies and sensitive issues of the institution in question (banks, in our case, when the proposed classifier is applied in practice). The privacy and security of data are not considered in this work, but they will be considered in future work.

Since the domain of this work is artificial intelligence, two types of metrics are used: AI-based metrics and performance-based metrics.

A. AI-based Metrics
In this context, the confusion matrix dominates. In other words, the metrics that are derived from the confusion matrix are employed to measure the prediction accuracy of the classifier.

B. Performance-based Metrics
In this context, time dominates. In other words, the total time ($ToTi$) required to build, train, and test the classifier is used as a benchmark. The $ToTi$ is given by:

$$ToTi = T_{pre} + T_{dbs} + T_{tr} + T_{ts} \qquad (8)$$

where $T_{pre}$ refers to the preprocessing time, $T_{dbs}$ refers to the database splitting time, $T_{tr}$ refers to the training time, and $T_{ts}$ refers to the testing time. The lower the total time is, the higher the degree of performance.

V. RESULTS AND DISCUSSIONS
This section first introduces the specifications of the machine used to implement the proposed classifier. Then, the classifiers that are compared with the proposed classifier are described. Finally, the results are provided along with two discussions.

A. Setup
The system is implemented on a machine that has the specifications summarized in Fig. 20. The programming language used for the implementation of the classifier is Python.

Fig. 20. Specifications of the Machine used to Implement the Classifiers.

B. Selected Classifiers
Two classifiers are selected for comparison with the classifier proposed in this work: the K-nearest neighbours (KNN) classifier and the voting classifier (VC). Below, a brief description of each selected classifier is presented.

Fig. 21 shows the fundamental steps required to build the voting classifier.

Fig. 21. Basic Concept of the Voting Classifier.

As shown in Fig. 21, there are many classifiers, and a voting step is required to produce the final output class. The voting step means that the final output of the classifier depends on the majority of the classes (predictions) that are generated by the classifiers. For example, there are three classifiers in Fig. 21. The final prediction is either Fraud (F) or Non-Fraud (NF). The voting process works as follows:

1) Obtain the outputs of the classifiers.
2) Calculate the number of classifiers that generate the F class (let us say 2 classifiers).
3) Calculate the number of classifiers that generate the NF class (let us say 1 classifier).
4) The majority is 2. Therefore, the final prediction is the F class.

Fig. 22 shows the fundamental steps for building the KNN classifier.

Fig. 22. Basic Concept of the KNN Classifier.

As shown in Fig. 22, there are two clusters (one for fraudulent transactions and one for non-fraudulent transactions). Each cluster has a centre, which is represented numerically by (-1) for non-fraudulent transactions and (+1) for fraudulent transactions. For a given transaction, the KNN classifier processes the transaction and generates a corresponding number. Then, the distance between the generated value and the centre of each cluster is calculated. Finally, the transaction is assigned to the cluster whose centre is closest (in the example, it is assigned to the non-fraud cluster).

C. Results
Since the cross-validation method is used to divide the database, we obtain ten sub-classifiers, as mentioned previously. The final values of the AI-based metrics are obtained by averaging over the ten sub-classifiers. Table IV summarizes the obtained results.

TABLE IV. EVALUATING THE PROPOSED CLASSIFIER

K-value    Accuracy    Sensitivity    Error rate
1          96%         97%            4%
2          98%         96%            2%
3          98%         97%            2%
4          96%         96%            4%
5          97%         98%            3%
6          96%         98%            4%
7          97%         96%            3%
8          98%         98%            2%
9          98%         98%            2%
10         98%         96%            2%
Average    97.2%       97%            2.8%

Table V summarizes the comparison of the logistic regression (LogR)-based classifier with both the KNN-based classifier and the VC-based classifier.

TABLE V

                   Metrics
Classifier         Accuracy    Sensitivity    Error rate
LogR classifier    97.2%       97%            2.8%
KNN classifier     93%         94%            7%
VC classifier      90%         88%            10%

Discussion. From Table V, the LogR classifier achieves the best values in terms of accuracy, sensitivity, and error rate. The reason is related to the efficient preprocessing technique used to remove outliers and handle the missing values. In addition, cross validation ensures that the entire database is employed as both the training and testing data sets, and this in turn enhances the three metrics. The KNN classifier comes second, and the VC classifier comes third. This is because the KNN classifier includes a step that calculates the distances between the value of the new transaction and the centres of the clusters. This in turn reflects efficient processing in the prediction process compared to the poor processing in the VC classifier (i.e., only calculating the majority).

For performance comparison purposes, the bar chart shown in Fig. 23 illustrates the values of the response time for all classifiers involved in the comparison.
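The voting steps and the centre-distance rule described in Section V-B can be sketched as follows. This is a minimal illustration, not the implementation used in the comparison: the function names and the sample transaction value are hypothetical, the centres (-1 and +1) are those stated in the text, and ties between classes are not handled because the text does not say how they are resolved.

```python
from collections import Counter

def majority_vote(predictions):
    """Steps 1-4: return the class generated by the most classifiers."""
    counts = Counter(predictions)
    label, _ = counts.most_common(1)[0]
    return label

def nearest_centre(value, centres):
    """Assign a value to the cluster whose centre is closest to it."""
    return min(centres, key=lambda name: abs(value - centres[name]))

# Three classifiers, two of which generate the F class, as in the example.
final_prediction = majority_vote(["F", "NF", "F"])

# Cluster centres as given in the text: -1 for non-fraud, +1 for fraud.
centres = {"NF": -1.0, "F": 1.0}
# Hypothetical number generated for one transaction.
assigned = nearest_centre(-0.4, centres)
print(final_prediction, assigned)
```

Using an odd number of classifiers, as in the three-classifier example, avoids ties in a two-class vote.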


Future work. In future work, we intend to enhance the performance and take the security and privacy of the data in real time into consideration.
