Credit Card Fraud Detection
Government Boys Senior Secondary School No. 1, Mansarovar Park
PROJECT FILE
BANK MANAGEMENT
SUBMITTED TO: Ms. Yashika
SUBMITTED BY: Deepanshu
CLASS – XII B
Ms. YASHIKA TANMAY KAUSHAL
EXAM ROLL NO: 12228
ACKNOWLEDGEMENT
I would like to express my special thanks and
gratitude to our principal, Mr. S.P. Singh, as
well as our teacher, Ms. Yashika, who gave
me the golden opportunity to do this
wonderful project on the topic Scholar
Registration, which also helped me do a lot of
research, and I came to know about so many new
things. I am really thankful to them. Secondly,
I would also like to thank my parents and
friends, who helped me a lot in finalizing this
project within the limited time frame.
CERTIFICATE
This is to certify that the project entitled "Scholar Registration",
submitted by "Deepanshu" in partial fulfilment of the requirements
for the award of "Computer Science" in "PYTHON" at the
"Government Boys Senior Secondary School No. 1, Mansarovar
Park", is an authentic work carried out by him under my supervision
and guidance.
Principal Signature
(Ms.Yashika)
External Examiner
CREDIT CARD FRAUD DETECTION
How can credit card fraud happen?
Some of the most common ways it may happen are:
First and most obviously, when your card details are seen by some other
person.
When your card is lost or stolen and the person possessing it knows how to
use it.
A fake phone call that convinces you to share the details.
And lastly, and least likely, a high-level hack of the bank's account details.
The main challenges involved in credit card fraud detection are:
Enormous amounts of data are processed every day, and the model built must be
fast enough to respond to the scam in time.
Imbalanced data, i.e., most of the transactions (99.8%) are not fraudulent, which
makes it really hard to detect the fraudulent ones.
Data availability, as the data is mostly private.
Misclassified data can be another major issue, as not every fraudulent transaction
is caught and reported.
And last but not least, adaptive techniques used against the model by the
scammers.
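To see why the imbalance makes detection hard, consider a "model" that simply labels every transaction as valid. The sketch below is only an illustration with made-up counts (1,000 transactions, 2 of them fraudulent, chosen to match the 99.8% figure above); it is not part of the original notebook.

```python
# Hypothetical labels: 1 = fraud, 0 = valid (998 valid, 2 fraud)
actual = [0] * 998 + [1] * 2

# A useless "model" that predicts every transaction as valid
predicted = [0] * len(actual)

# Share of predictions that match the true label
correct = sum(1 for a, p in zip(actual, predicted) if a == p)
accuracy = correct / len(actual)

# Number of fraudulent transactions the model actually flagged
frauds_caught = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)

print("Accuracy:", accuracy)
print("Frauds caught:", frauds_caught)
```

The accuracy looks excellent even though not a single fraud is caught, which is why accuracy alone is a poor yardstick for this problem.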
How to tackle these challenges?
The model used must be simple and fast enough to detect the anomaly and
classify it as a fraudulent transaction as quickly as possible.
Imbalance can be dealt with by properly using some methods, which we will talk
about in the next section.
To protect the privacy of the user, the dimensionality of the data can be
reduced.
A more trustworthy source that double-checks the data should be used, at least
for training the model.
We can make the model simple and interpretable so that when the scammer
adapts to it, with just some tweaks we can have a new model up and running to
deploy.
Dealing with Imbalance
We will see in the later parts of the article that the data we received is highly
imbalanced, i.e., only 0.17% of all credit card transactions are fraudulent. Class
imbalance is a very common problem in real life and needs to be handled
before applying any algorithm to the data.
There are three common ways to deal with the imbalance of data.
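The list of methods itself does not survive in this copy. As an assumption, two widely used options are random undersampling (shrinking the majority class) and random oversampling (duplicating minority samples); generating synthetic minority samples, for example with SMOTE, is a third option often mentioned and is not shown here. The helper names and toy data below are hypothetical.

```python
import random


def random_undersample(majority, minority, seed=0):
    """Shrink the majority class to the size of the minority class."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)) + list(minority)


def random_oversample(majority, minority, seed=0):
    """Duplicate minority samples (with replacement) up to the majority size."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority) + list(minority) + extra


# Hypothetical toy data: (amount, label) pairs, label 1 = fraud
valid = [(float(i), 0) for i in range(98)]
fraud = [(500.0, 1), (750.0, 1)]

under = random_undersample(valid, fraud)
over = random_oversample(valid, fraud)

# Total rows and number of fraud rows after each resampling
print(len(under), sum(label for _, label in under))
print(len(over), sum(label for _, label in over))
```

Both approaches end with equally sized classes. Resampling is normally applied to the training data only, so that the test set still reflects the real class ratio.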
For those of you who are wondering why we should even bother if fraudulent
transactions are so rare, here is another fact. The amount of money involved in
fraudulent transactions reaches billions of USD, so improving sensitivity (the
share of frauds caught) by even 0.1% can save millions of USD, whereas higher
specificity (the share of valid transactions left alone) means fewer genuine
customers harassed.
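As a small worked example of the two measures, sensitivity is the share of actual frauds that are flagged and specificity is the share of valid transactions that are left alone. The labels below are made up purely for illustration, and the function name is hypothetical.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels, 1 = fraud."""
    tp = fn = tn = fp = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1  # fraud correctly flagged
        elif a == 1 and p == 0:
            fn += 1  # fraud missed
        elif a == 0 and p == 0:
            tn += 1  # valid transaction left alone
        else:
            fp += 1  # valid transaction wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: 4 frauds and 6 valid transactions
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sensitivity, specificity = sensitivity_specificity(actual, predicted)
print("Sensitivity:", sensitivity)
print("Specificity:", round(specificity, 3))
```

Here three of the four frauds are caught and one of the six valid transactions is wrongly flagged.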
CREDIT CARD FRAUD DETECTION – AN
INSIGHT INTO MACHINE LEARNING AND DATA
SCIENCE
THE CODE
Hello coders, in case you jumped directly to this part, here is what
you need to know. Credit card fraud is bad, and we have to find a
way to identify fraud using some of the features given to us in the
data, which you can rely on completely for now. So without
further ado, let’s get started.
Here is the GitHub link to the repository of the Notebook. You can
fork it and even open a pull request to suggest some changes in the
repository. Feel free to try it out.
Importing dependencies
You first have to download the data from the Kaggle website.
Click the download button next to the New Notebook button
in the middle of the screen.
Now you can use this code to load the dataset into the IPython
notebook you are working on.
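The loading code itself is not preserved in this copy. A minimal sketch, assuming pandas is used (the later snippets index `data` like a pandas DataFrame), might look like the following. The in-memory sample and its values are made up so the snippet runs without the download, and the file name `creditcard.csv` is an assumption.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the downloaded CSV (values are made up)
sample_csv = io.StringIO(
    "Time,Amount,Class\n"
    "0,149.62,0\n"
    "1,2.69,0\n"
    "2,378.66,1\n"
)

# With the real file, pass its path instead, e.g. pd.read_csv("creditcard.csv")
# (the file name is an assumption; use whatever name the download has)
data = pd.read_csv(sample_csv)

print(data.shape)   # (rows, columns)
print(data.head())  # first few rows
```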
Time
Amount: Transaction amount
Class
Let’s separate the Fraudulent cases from the authentic ones and
compare their occurrences in the dataset.
# Determine the number of fraud cases in the dataset
Fraud = data[data['Class'] == 1]
Valid = data[data['Class'] == 0]
# Ratio of fraudulent to valid transactions
outlier_fraction = len(Fraud) / float(len(Valid))
print(outlier_fraction)
print('Fraud Cases: {}'.format(len(Fraud)))
print('Valid Transactions: {}'.format(len(Valid)))
With that out of the way let’s proceed with dividing the data values
into Features and Target.
# Dividing the features (X) and the target (Y) from the dataset
X = data.drop(['Class'], axis=1)
Y = data['Class']
print(X.shape)
print(Y.shape)
# Getting just the values for the sake of processing
# (it's a NumPy array with no column names)
X_data = X.values
Y_data = Y.values
Using scikit-learn to split the data into training and testing sets.
# Using scikit-learn to split data into training and testing sets
from sklearn.model_selection import train_test_split
# Split the data into training (80%) and testing (20%) sets
X_train, X_test, Y_train, Y_test = train_test_split(
    X_data, Y_data, test_size=0.2, random_state=42)
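With data this imbalanced, a purely random split can leave very few fraud cases in the test set. One common remedy is a stratified split, which keeps the class ratio roughly equal on both sides; scikit-learn's `train_test_split` accepts a `stratify` argument for this. The hand-rolled, standard-library sketch below only illustrates the idea on made-up labels; the function name is hypothetical and it is not the notebook's own code.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size=0.2, seed=42):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same share of every class, but at least one sample
        n_test = max(1, round(len(indices) * test_size))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 95 valid transactions and 5 frauds
labels = [0] * 95 + [1] * 5
train_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(test_idx))
print("Frauds in test set:", sum(labels[i] for i in test_idx))
```

Because each class is split on its own, the 5% fraud share of the toy data is preserved in both the training and the testing indices.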