CPP Report
Submitted By :
Kartik Shinde [22]
Coordinator
Mrs.P.V.Javkar
In Partial Fulfilment Of
DIPLOMA IN COMPUTER TECHNOLOGY 2022-2023
SINHAGAD TECHNICAL EDUCATION SOCIETY’S
PUNE – 411041
CERTIFICATE
THIS IS TO CERTIFY THAT
DATE :
PLACE :
"The real spirit of achieving a goal is through the way of excellence and austere
discipline." The satisfaction and euphoria that accompany the successful
completion of any task would be incomplete without mentioning the people who
made it possible, and whose support was a constant source of encouragement that
crowned our efforts with success.
We are deeply indebted to our principal, Dr. (Mrs.) Mrunalini S. Jadhav, and we
would like to express our sincere thanks to her.
Finally, we express our honest and sincere gratitude to all the other staff members
of the computer department and to our colleagues, who directly or indirectly
encouraged us, helped us, and critiqued us in the accomplishment of this work.
Certificate
Acknowledgement
Abstract
1.1 Introduction
1.2 Background
3. Proposed Methodology
4. References and Bibliography
1.1 Introduction
Email (electronic mail) spam refers to the use of email to send unsolicited or advertising
messages to a group of recipients. Unsolicited means that the recipient has not granted
permission to receive those emails. The use of spam email has been increasing over the last
decade, and spam has become a major nuisance on the internet: it wastes storage and time and
slows message delivery. Automatic email filtering may be the most effective method of detecting
spam, but spammers can now bypass many spam filtering applications. Several years ago, most
spam could be blocked manually because it came from certain email addresses. In this project, a
machine learning approach is used for spam detection.
The major approaches adopted for spam filtering include text analysis, whitelists and blacklists
of domain names, and community-based techniques. Text analysis of mail contents is a widely used
method of combating spam, and many solutions deployable on the server and client sides are
available. Naive Bayes is one of the best-known algorithms applied in these methods. However,
rejecting mails purely on the basis of content analysis can be a problem in the case of false
positives, since users and organizations normally do not want any legitimate message to be lost.
The blacklist approach is probably the earliest technique adopted for filtering spam: accept all
mails except those from explicitly blacklisted domains or email IDs. As newer domains keep
entering the category of spamming domain names, this technique no longer works so well.
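The blacklist approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the domain names, addresses, and the helper names `is_blacklisted` and `filter_inbox` are invented for the example and are not part of the project's code.

```python
# Hypothetical blacklist entries, invented for illustration only.
BLACKLISTED_DOMAINS = {"spam-offers.example", "cheap-pills.example"}
BLACKLISTED_ADDRESSES = {"promo@mailer.example"}


def is_blacklisted(sender: str) -> bool:
    """Return True if the sender's address or domain is on the blacklist."""
    address = sender.strip().lower()
    # Everything after the last "@" is treated as the domain.
    domain = address.rpartition("@")[2]
    return address in BLACKLISTED_ADDRESSES or domain in BLACKLISTED_DOMAINS


def filter_inbox(senders):
    """Accept every sender except the explicitly blacklisted ones."""
    return [s for s in senders if not is_blacklisted(s)]


incoming = [
    "alice@college.example",
    "win@spam-offers.example",
    "PROMO@mailer.example",
    "deals@new-spam-domain.example",  # new spam domain, not yet listed
]
accepted = filter_inbox(incoming)
print(accepted)
```

The last sender is accepted because its domain has not yet been added to the list, which is the weakness noted above: a blacklist only stops sources that are already known.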
• The average office worker receives roughly 121 emails per day, half of which are
estimated to be spam. But even at 60 emails a day, it is easy to lose important
communications in the sheer number coming in. This is one of the lesser-known
benefits of spam filtering: it simply streamlines your inbox. With less garbage
coming into your inbox, you can go through your emails more effectively and stay
in touch with those who matter.
• Malware, viruses, and other forms of malicious attack head to people's email
inboxes every day, and spam filtering protects against them.
• Every day, someone falls prey to a phishing scam, a particular kind of spam-based
scheme in which someone thinks they are getting a legitimate email and ends up
divulging credit card information.
1.2 Background
Email is one of the most important media of communication today; with internet
connectivity, a message can be delivered anywhere in the world. More than
270 billion emails are exchanged daily, and about 57% of these are spam.
Spam emails, also known as non-self, can compromise personal information such as
bank details or anything related to money, and can cause harm to an individual, a
corporation, or a group of people. Besides advertising, these may contain links to
phishing or malware-hosting websites set up to steal confidential information. Spam
is a serious issue that is not just annoying to end users but also financially
damaging and a security risk.
Hence this system is designed to detect unsolicited and unwanted emails and block
them, reducing spam, which would be of great benefit to individuals as well as to
companies. In the future, this system can be implemented using different algorithms,
and more features can be added to the existing system.
Email Spam Detection was developed primarily for a simple reason: by detecting
unsolicited and unwanted emails, we can prevent spam messages from creeping into the
user's inbox, thereby improving the user experience. Emails are passed through a spam
detector.
2.3 Specification
An e-mail detection method was proposed for the detection of spam. In this
system, four predictive machine learning classifiers implemented in Python were
used with various data partitions for training and testing of the models.
Additionally, different hyperparameter values were used in the models. The system
obtained good results.
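The idea of training and testing with various data partitions can be sketched as follows. This is a minimal illustration, not the system's actual procedure: the placeholder dataset, the 60/70/80 percent ratios, and the helper `split_dataset` are assumptions, since the report does not state which partitions or classifiers were used.

```python
import random


def split_dataset(samples, train_percent, seed=0):
    """Shuffle a copy of the samples and split it into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the cut position.
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


# Placeholder dataset of (text, label) pairs, invented for illustration.
dataset = [(f"message {i}", "spam" if i % 2 else "ham") for i in range(10)]

# Assumed partitions; the report does not say which ratios were used.
partition_sizes = {}
for train_percent in (60, 70, 80):
    train, test = split_dataset(dataset, train_percent)
    partition_sizes[train_percent] = (len(train), len(test))
    print(f"{train_percent}% train: {len(train)} training, {len(test)} test messages")
```

Each classifier would then be fitted on `train` and scored on `test` for every partition, so that results can be compared across split ratios.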
05   the project for further use.   Shreyash Jangam, Pream Dongare
06   Cost Estimations.              Kartik Shinde, Shreyash Jangam, Pream Dongare
07   Prepare the project.           Kartik Shinde, Shreyash Jangam, Pream Dongare
4. References and Bibliography
4.1 Papers
1. When we receive a message in the inbox, that message is exported to the dataset. The
message is then classified as spam or not spam using a Naïve Bayes classifier. Before a
received message can be classified, the model has to be trained, which is explained in
the section below. This concept includes an information system.
2. When we receive a message in the inbox, that message is exported to the dataset as shown
below. The message is then classified as spam or not spam.
3. In this system, to solve the problem of spam, a spam classification system is created to
identify spam and non-spam. Since spammers may send spam messages many times, it is
difficult to identify them manually every time, so the proposed system uses several
strategies to detect spam. The proposed solution not only identifies spam words but also
identifies the IP address of the system from which the spam message was sent, so that the
next time a spam message is sent from the same system, the proposed system directly
identifies it as blacklisted based on the IP address. An information system offers a
litany of benefits that help to make the process of managing
4. The exported message is classified as spam or not spam using Bayes' theorem and the Naive
Bayes classifier, following all the steps discussed above, along with computing the
probability of words in spam and ham messages. The figures below show messages that were
detected as spam and ham.
5. If "Urgent! Please call 09062703810" is a message exported from the inbox to the
dataset, then based on the trained dataset and using Bayes' theorem and the Naive Bayes
classifier, the message is detected as spam, as shown below.
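The steps above (train on labelled messages, compute word probabilities for spam and ham, apply Bayes' theorem, and blacklist the sender's IP address once a message is detected as spam) can be sketched as below. This is a minimal sketch, not the project's implementation: the four training messages, the IP addresses, the add-one smoothing, and the function names are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny invented training set; the project's real dataset is not reproduced here.
TRAINING = [
    ("Urgent! You have won a prize, call now", "spam"),
    ("Free entry, please call to claim your prize", "spam"),
    ("Are we meeting for lunch today", "ham"),
    ("Please send me the project report", "ham"),
]
BLACKLISTED_IPS = set()  # sender IPs of messages already detected as spam


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(examples):
    """Count words per class and messages per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    message_counts = Counter()
    for text, label in examples:
        message_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, message_counts, vocabulary


def classify(text, model, sender_ip=None):
    """Return 'spam' or 'ham'; known spam IPs are rejected without scoring."""
    if sender_ip in BLACKLISTED_IPS:
        return "spam"
    word_counts, message_counts, vocabulary = model
    total_messages = sum(message_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # log P(label) + sum of log P(word | label), with add-one smoothing
        score = math.log(message_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    label = "spam" if scores["spam"] > scores["ham"] else "ham"
    if label == "spam" and sender_ip is not None:
        BLACKLISTED_IPS.add(sender_ip)  # remember the sender for next time
    return label


model = train(TRAINING)
first = classify("Urgent! Please call 09062703810", model, sender_ip="203.0.113.7")
second = classify("Are we meeting today", model, sender_ip="203.0.113.7")
third = classify("Are we meeting today", model, sender_ip="198.51.100.2")
print(first, second, third)  # prints: spam spam ham
```

The second call returns spam even though its text looks like ham, because the sender's IP address was blacklisted by the first call. The third call sends the same text from a different address and is scored by the classifier as ham.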
4.2 Books
1. IGERT Independent Publishing Platform (May 20, 2020), "Email Spam Detection: A
Complete Guide" by Thashina Sultana, K A Sapnaz
4.3 Websites
1. Email based Spam Detection – IJERT
2. E-mail spam detection - Machine Learning: End-to-End guide for Java developers [Book] (oreilly.com)