Part 3 description
INTRODUCTION
Phishing is a mechanism that is simple to carry out yet complex to defend against, and
it escalates threats to the security of the Internet community. With little information
about the victim, an attacker can produce a believable, personalized email or webpage.
Attackers are also hard to catch, as most of them hide their location and work in almost
complete anonymity. Even with advanced technology and good security software, users can
become victims of this scheme, because attackers can choose from a huge number of
methods to lure users. A report by Forbes highlighted that US businesses lose
approximately $500 million every year to phishing attacks.
A study by Hassan et al. raises concerns about the methods used to detect and filter
phishing webpages and emails. Phishing can be considered a semantic attack that tricks
users through deceptive semantic techniques. The phases of a phishing attack,
especially one delivered by email, are Lure, Hook, and Catch. Two mechanisms are
suggested to defend against this phishing vector: developing awareness programmes and
deploying detection and filtering systems. Awareness programmes educate users through
phishing defence training. Technical defences include applying two-factor
authentication in a robust, secure email system, detecting disguised executable files,
analysing executable files transferred via email, and adding another layer of security
by warning a user when abnormal data are detected in the message header source, as in
a spoofed email.
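The header check mentioned above can be pictured with a minimal sketch built on Python's
standard email package. The rule used here (warn when the Reply-To or Return-Path domain
differs from the From domain), the function name header_warnings, and the sample message
are assumptions made for illustration; they are not the method of the cited study.

```python
from email import message_from_string
from email.utils import parseaddr


def _domain(header_value):
    """Return the lower-cased domain of an address header, or '' if absent."""
    _, address = parseaddr(header_value or "")
    return address.rpartition("@")[2].lower()


def header_warnings(raw_message):
    """Hypothetical rule: warn when sender-related domains disagree."""
    msg = message_from_string(raw_message)
    from_domain = _domain(msg["From"])
    warnings = []
    for header in ("Return-Path", "Reply-To"):
        other = _domain(msg[header])
        if other and other != from_domain:
            warnings.append(
                f"{header} domain {other!r} differs from From domain {from_domain!r}"
            )
    return warnings


# Sample message (invented): Reply-To points at a different domain than From.
raw = (
    "From: Support <support@bank.example>\n"
    "Reply-To: helpdesk@attacker.example\n"
    "Return-Path: <bounce@bank.example>\n"
    "Subject: Verify your account\n"
    "\n"
    "Please confirm your password.\n"
)
for warning in header_warnings(raw):
    print(warning)
```

A real filter would combine many such signals; a single mismatched domain is also common
in legitimate mailing-list traffic, so this rule alone would produce false positives.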
CHAPTER 2
LITERATURE SURVEY
Phishing is a major threat to all Internet users and is difficult to trace or defend
against since it does not present itself as obviously malicious in nature. In today's
society, everything is put online and the safety of personal credentials is at risk.
Phishing can be seen as one of the oldest and easiest ways of stealing information
from people and it is used for obtaining a wide range of personal details. It also has
a fairly simple approach – send an email, email sends victim to a site, site steals
information.
Many anti-phishing schemes have recently been proposed in the literature. Despite all
those efforts, the threat of phishing attacks has not been mitigated. One of the main
reasons is that phishing attackers can change their tactics at little cost.
In this paper, we propose a novel approach, which is independent of any specific
phishing implementation. Our idea is to examine the anomalies in Web pages, in
particular, the discrepancy between a Web site's identity and its structural features
and HTTP transactions. It demands neither user expertise nor prior knowledge of
the Web site. The evasion of our phishing detection entails high cost to the
adversary. As shown by the experiments, our phishing detector functions with low
miss rate and low false-positive rate.
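One way to picture a discrepancy between a site's identity and its structure is to
measure how many of a page's links leave the page's own host. The sketch below is only
an illustration of that idea: the class LinkCollector, the function foreign_link_ratio,
and the sample page are hypothetical and do not reproduce the detector described above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href, src and action attribute values from a page."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "action") and value:
                self.urls.append(value)


def foreign_link_ratio(page_url, html):
    """Share of resolved links whose host differs from the page's own host."""
    collector = LinkCollector()
    collector.feed(html)
    own_host = urlparse(page_url).hostname
    hosts = [urlparse(urljoin(page_url, u)).hostname for u in collector.urls]
    hosts = [h for h in hosts if h]          # drop links without a host
    if not hosts:
        return 0.0
    return sum(h != own_host for h in hosts) / len(hosts)


# Invented sample page that borrows its logo and login form from another host.
page = (
    '<a href="/help">Help</a>'
    '<img src="https://www.realbank.example/logo.png">'
    '<form action="https://www.realbank.example/login"></form>'
)
ratio = foreign_link_ratio("http://login.example.test/index.html", page)
print(f"{ratio:.2f}")
```

A high ratio is only a hint, since many legitimate pages load resources from other
hosts; it would be one feature among several in a full detector.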
Phishing is a major problem on the Web. Despite the significant attention it has
received over the years, there has been no definitive solution. While the
state-of-the-art solutions have reasonably good performance, they suffer from several
drawbacks, including the potential to compromise user privacy, difficulty in detecting
phishing websites whose content changes dynamically, and reliance on features that are
too dependent on the training data. To address these limitations we present a new
approach for detecting phishing webpages in real time as they are visited by a
browser. It relies on modeling inherent phisher limitations stemming from the
constraints they face while building a webpage. Consequently, the implementation
of our approach, Off-the-Hook, exhibits several notable properties, including high
accuracy, brand independence and good language independence, speed of decision,
resilience to dynamic phish, and resilience to evolution in phishing techniques.
Off-the-Hook is implemented as a fully client-side browser add-on, which preserves
user privacy. In addition, Off-the-Hook identifies the target website that a phishing
webpage is attempting to mimic and includes this target in its warning. We
evaluated Off-the-Hook in two different user studies. Our results show that users
prefer Off-the-Hook warnings to Firefox warnings.
Many classification techniques have been devised and used to combat phishing
threats, but none of them can efficiently identify web phishing attacks because of
the continuous change and the short life cycle of phishing websites. In this paper,
we introduce a Case-Based Reasoning (CBR) Phishing Detection System (CBR-PDS),
which uses the CBR methodology as its core. The proposed system
is highly adaptive and dynamic, as it can easily adapt to detect new phishing attacks
with a relatively small data set, in contrast to other classifiers that need to be
heavily trained in advance. We test our system using different scenarios on a balanced
data set of 572 phishing and legitimate URLs. Experiments show that the accuracy of
CBR-PDS exceeds 95.62%, and that it significantly enhances the classification accuracy
with a small set of features and limited data sets.
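The retrieve, reuse and retain steps of case-based reasoning can be sketched in a few
lines. The feature set in url_features, the thresholds, and the example URLs are
hypothetical choices for illustration; they are not the features or data used by
CBR-PDS.

```python
from urllib.parse import urlparse


def url_features(url):
    """Hypothetical boolean features of a URL (illustrative only)."""
    host = urlparse(url).hostname or ""
    return (
        len(url) > 54,                      # unusually long URL
        "@" in url,                         # '@' appears in the URL
        host.count(".") > 3,                # many subdomain levels
        host.replace(".", "").isdigit(),    # host is a raw IPv4 address
        "-" in host,                        # hyphen in the host name
    )


def similarity(a, b):
    """Fraction of features on which two cases agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


class CaseBase:
    def __init__(self):
        self.cases = []                     # list of (features, label)

    def retain(self, url, label):
        """Store a solved case for later reuse."""
        self.cases.append((url_features(url), label))

    def classify(self, url):
        """Retrieve the most similar stored case and reuse its label."""
        if not self.cases:
            raise ValueError("case base is empty")
        features = url_features(url)
        best = max(self.cases, key=lambda case: similarity(features, case[0]))
        return best[1]


cases = CaseBase()
cases.retain("http://192.168.10.5/secure-login/update", "phishing")
cases.retain("https://www.example.org/about", "legitimate")
print(cases.classify("http://10.0.0.7/account/verify"))
print(cases.classify("https://docs.example.com/guide"))
```

Because new cases are simply appended with retain, the case base can absorb a newly
observed attack without retraining, which is the adaptivity the paragraph above describes.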
percentage of responses among the management and professional group can also be
classified as being at an alarming rate. This simulation is the first practice in UKM
and it helps to increase awareness and provide education about cyber security.
In this paper we study the impact that security awareness training has on the people
who click on malicious links contained in phishing emails. Phishing is a criminal
activity in which social engineering techniques and technology are used to obtain
personal information without one's consent. Currently, anti-phishing techniques
have little academic backing, usually only statistics and testimonials from security
organizations. This paper aims to provide an educational standard by which the
usefulness of internet security awareness and anti-phishing techniques can be
compared in the future.
Identity theft is an emerging threat in our networked world and more individuals
and companies fall victim to this type of fraud. User training is an important part of
ICT security awareness; however, IT management must know and identify where to
direct and focus these awareness training efforts. A phishing exercise was
conducted in an academic environment as part of an ongoing information security
awareness project where system data or evidence of users’ behavior was
accumulated. Information security culture is influenced by, amongst other aspects,
the behavior of users. This paper presents the findings of this phishing experiment
where alarming results on the staff behavior are shown. Educational and awareness
activities pertaining to email environments are of utmost importance to manage the
increased risks of identity theft.
Phishing is a kind of social engineering attack in which experienced persons or
entities fool novice users into sharing sensitive information such as usernames,
passwords, and credit card numbers through spoofed emails, spam, and Trojan
hosts. The proposed scheme is based on a secure two-factor authentication
web application that prevents phishing attacks instead of relying on phishing
detection methods and user experience. The proposed method is designed to ensure that
authenticating users to services, such as online banking or e-commerce websites, is
done in a very secure manner. The proposed system uses a mobile phone
as a software token that plays the role of a second factor in the user authentication
process. The web application generates a session-based one-time password and
delivers it securely to the mobile application after notifying the user through the
Google Cloud Messaging (GCM) service. After user confirmation, the mobile software
completes the authentication process by encrypting the received one-time password with
its own private key and sending it back to the server in a secure manner that is
transparent to the user. Once the server decrypts the received one-time
password and mutually authenticates the client, it automatically authenticates the
user’s web session. We implemented a prototype of our authentication
protocol that consists of an Android application, a Java-based web server, and
GCM connectivity between them. Our evaluation results indicate the viability of
the authentication protocol for securing web application authentication against
various types of threats.
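The challenge and response flow described above can be approximated with the Python
standard library. This sketch is a simplification under stated assumptions: it
substitutes an HMAC over a shared device key for the private-key operation in the
paper, it omits the GCM delivery step, and the names Server, start_login, finish_login
and phone_response are hypothetical.

```python
import hashlib
import hmac
import secrets


class Server:
    def __init__(self, device_key):
        self.device_key = device_key    # secret enrolled for the user's phone
        self.pending = {}               # session id -> one-time password

    def start_login(self, session_id):
        """Create a session-bound one-time password (sent to the phone)."""
        otp = secrets.token_hex(16)
        self.pending[session_id] = otp
        return otp

    def finish_login(self, session_id, response):
        """Accept the session only if the phone proved knowledge of the key."""
        otp = self.pending.pop(session_id, None)    # each password is single-use
        if otp is None:
            return False
        expected = hmac.new(self.device_key, otp.encode(), hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, response)


def phone_response(device_key, otp):
    """What the mobile application would send back after user confirmation."""
    return hmac.new(device_key, otp.encode(), hashlib.sha256).hexdigest()


key = secrets.token_bytes(32)
server = Server(key)
otp = server.start_login("session-1")
answer = phone_response(key, otp)
print(server.finish_login("session-1", answer))     # first use
print(server.finish_login("session-1", answer))     # replay of the same answer
```

Because the password is removed from pending on first use, a captured response cannot
be replayed, and a phished password alone is not enough without the device key.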
Advanced Persistent Threat (APT) is one of the most serious types of cyber attacks,
which is a new and more complex version of multi-step attack. Within the APT life
cycle, the most common technique used to get the point of entry is spear-phishing
emails which may contain disguised executable files. This paper presents the
disguised executable file detection (DeFD) module, which aims at detecting
disguised exe files transferred over the connections. The detection is based on a
comparison between the MIME type of the transferred file and the file name
extension. This module was experimentally evaluated and the results show
successful detection of disguised executable files.
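The comparison between MIME type and file name extension can be sketched as follows.
The two lookup sets and the function name is_disguised_executable are assumptions for
illustration and are not taken from the DeFD module itself; a real deployment would
need a fuller list of types and extensions.

```python
import os

# Assumed MIME types commonly reported for Windows executables.
EXECUTABLE_MIME_TYPES = {
    "application/x-msdownload",
    "application/x-dosexec",
    "application/vnd.microsoft.portable-executable",
}
# Assumed extensions that openly announce an executable file.
EXECUTABLE_EXTENSIONS = {".exe", ".dll", ".scr", ".com"}


def is_disguised_executable(mime_type, filename):
    """True when the content is executable but the name suggests otherwise."""
    extension = os.path.splitext(filename)[1].lower()
    looks_executable = mime_type.lower() in EXECUTABLE_MIME_TYPES
    return looks_executable and extension not in EXECUTABLE_EXTENSIONS


print(is_disguised_executable("application/x-msdownload", "invoice.pdf"))
print(is_disguised_executable("application/x-msdownload", "setup.exe"))
print(is_disguised_executable("application/pdf", "invoice.pdf"))
```

Only the first call reports a mismatch: the file claims to be a PDF by name while its
MIME type says it is an executable.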
CHAPTER 3
SYSTEM ANALYSIS
3.1.1 Disadvantages
1. Less accuracy.
3.2.1 Advantages
1. Higher accuracy.
[Figure: umbrella model of the software development lifecycle. Umbrella activities:
Document Control, Business Requirement Documentation, Training. Stages shown:
Feasibility Study, Team Formation, Project Specification Preparation, Requirements
Gathering, Analysis & Design, Code, Unit Test, Integration & System Testing,
Acceptance Test, Delivery/Installation, Assessment.]
Requirement Gathering
Analysis
Designing
Coding
Testing
Maintenance
3.4.1 Requirements Gathering stage
The requirements gathering process takes as its input the goals identified in the
high-level requirements section of the project plan. Each goal will be refined into a
set of one or more requirements. These requirements define the major functions of
the intended application, define operational data areas and reference data areas, and
define the initial data entities. Major functions include critical processes to be
managed, as well as mission critical inputs, outputs and reports. A user class
hierarchy is developed and associated with these major functions, data areas, and
data entities. Each of these definitions is termed a Requirement. Requirements are
identified by unique requirement identifiers and, at minimum, contain a requirement
title and textual description.
These requirements are fully described in the primary deliverables for this stage: the
Requirements Document and the Requirements Traceability Matrix (RTM). The
requirements document contains complete descriptions of each requirement,
including diagrams and references to external documents as necessary. Note that
detailed listings of database tables and fields are not included in the requirements
document.
The title of each requirement is also placed into the first version of the RTM,
along with the title of each goal from the project plan. The purpose of the RTM is to
show that the product components developed during each stage of the software
development lifecycle are formally connected to the components developed in prior
stages.
The outputs of the requirements definition stage include the requirements document,
the RTM, and an updated project plan.
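A Requirements Traceability Matrix of the kind described above can be modelled as a
small data structure. The sketch below is one possible shape, not a prescribed format:
the class names, the stage label "design", and the sample requirements are
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    req_id: str                 # unique requirement identifier
    title: str
    goal: str                   # title of the project-plan goal it refines
    links: dict = field(default_factory=dict)   # stage name -> component names


class TraceabilityMatrix:
    def __init__(self):
        self.rows = {}          # req_id -> Requirement

    def add_requirement(self, req_id, title, goal):
        if req_id in self.rows:
            raise ValueError(f"duplicate requirement id: {req_id}")
        self.rows[req_id] = Requirement(req_id, title, goal)

    def link(self, req_id, stage, component):
        """Record that a component built in a later stage traces to a requirement."""
        self.rows[req_id].links.setdefault(stage, []).append(component)

    def untraced(self, stage):
        """Requirements that have no component recorded for the given stage."""
        return [r.req_id for r in self.rows.values() if not r.links.get(stage)]


rtm = TraceabilityMatrix()
rtm.add_requirement("R1", "Classify a URL", "Detect phishing websites")
rtm.add_requirement("R2", "Show a warning", "Detect phishing websites")
rtm.link("R1", "design", "URL feature extractor")
print(rtm.untraced("design"))
```

The untraced query is what makes the matrix useful at a stage review: any requirement
it returns has no component connecting it to that stage yet.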
The planning stage establishes a bird's eye view of the intended software product,
and uses this to establish the basic project structure, evaluate feasibility and risks
associated with the project, and describe appropriate management and technical
approaches.
The most critical section of the project plan is a listing of high-level product
requirements, also referred to as goals. All of the software product requirements to
be developed during the requirements definition stage flow from one or more of
these goals. The minimum information for each goal consists of a title and textual
description, although additional information and references to external documents
may be included. The outputs of the project planning stage are the configuration
management plan, the quality assurance plan, and the project plan and schedule,
with a detailed listing of scheduled activities for the upcoming Requirements stage,
and high-level estimates of effort for the later stages.
Fig.3.3 Analysis
The design stage takes as its initial input the requirements identified in the approved
requirements document. For each requirement, a set of one or more design elements
will be produced as a result of interviews, workshops, and/or prototype efforts.
Design elements describe the desired software features in detail, and generally
include functional hierarchy diagrams, screen layout diagrams, tables of business
rules, business process diagrams, pseudo code, and a complete entity-relationship
diagram with a full data dictionary. These design elements are intended to describe
the software in sufficient detail that skilled programmers may develop the software
with minimal additional input.
When the design document is finalized and accepted, the RTM is updated to show
that each design element is formally associated with a specific requirement. The
outputs of the design stage are the design document, an updated RTM, and an
updated project plan.
Fig.3.4 Designing
The development stage takes as its primary input the design elements described in
the approved design document. For each design element, a set of one or more
software artifacts will be produced. Software artifacts include but are not limited to
menus, dialogs, and data management forms, data reporting formats, and specialized
procedures and functions. Appropriate test cases will be developed for each set of
functionally related software artifacts, and an online help system will be developed
to guide users in their interactions with the software.
The RTM will be updated to show that each developed artifact is linked to a
specific design element, and that each developed artifact has one or more
corresponding test case items. At this point, the RTM is in its final configuration.
The outputs of the development stage include a fully functional set of software that
satisfies the requirements and design elements previously documented, an online
help system that describes the operation of the software, an implementation map
that identifies the primary code entry points for all major system functions, a test
plan that describes the test cases to be used to validate the correctness and
completeness of the software, an updated RTM, and an updated project plan.
Fig.3.5 Coding
During the integration and test stage, the software artifacts, online help, and test data
are migrated from the development environment to a separate test environment. At
this point, all test cases are run to verify the correctness and completeness of the
software. Successful execution of the test suite confirms a robust and complete
migration capability. During this stage, reference data is finalized for production use
and production users are identified and linked to their appropriate roles. The final
reference data (or links to reference data source files) and production user list are
compiled into the Production Initiation Plan.
The outputs of the integration and test stage include an integrated set of software,
an online help system, an implementation map, a production initiation plan that
describes reference data and production users, an acceptance plan which contains
the final suite of test cases, and an updated project plan.
Fig.3.6 Testing
During the installation and acceptance stage, the software artifacts, online help, and
initial production data are loaded onto the production server. At this point, all test
cases are run to verify the correctness and completeness of the software. Successful
execution of the test suite is a prerequisite to acceptance of the software by the
customer.
After customer personnel have verified that the initial production data load is
correct and the test suite has been executed with satisfactory results, the customer
formally accepts the delivery of the software.
Fig.3.7 Maintenance
The primary outputs of the installation and acceptance stage include a production
application, a completed acceptance test suite, and a memorandum of customer
acceptance of the software. Finally, the PDR enters the last of the actual labor data
into the project schedule and locks the project as a permanent project record by
archiving all software items, the implementation map, the source code, and the
documentation for future reference.
3.4.7 Maintenance
CHAPTER 4
SOFTWARE REQUIREMENT SPECIFICATION
time. There are three aspects in the feasibility study portion of the preliminary
investigation:
• ECONOMIC FEASIBILITY
A system that can be developed technically and that will be used if installed must
still be a good investment for the organization. In economic feasibility, the
development cost of creating the system is evaluated against the ultimate benefit
derived from the new system. Financial benefits must equal or exceed the costs. The
system is economically feasible. It does not require any additional hardware or
software. Since the interface for this system is developed using the existing
resources and technologies available at NIC, the expenditure is nominal and the
system is economically feasible.
• OPERATIONAL FEASIBILITY
Proposed projects are beneficial only if they can be turned into information
systems that meet the organization’s operating requirements. Operational
feasibility is therefore treated as an important part of the project
implementation. This system is designed in accordance with the above-mentioned
issues. The management issues and user requirements have been taken into
consideration beforehand, so user resistance that could undermine the possible
application benefits is not expected. The well-planned design would
ensure optimal utilization of the computer resources and would help
improve performance.
• TECHNICAL FEASIBILITY
4.2. External Interface Requirements
User Interface
The user interface of this system is a user-friendly Python graphical user interface.
Hardware Interfaces
The interaction between the user and the console is achieved through Python
capabilities.
Software Interfaces
Operating Environment
Windows XP.
HARDWARE REQUIREMENTS:
SOFTWARE REQUIREMENTS:
Operating System - Windows 7/8
Programming Language - Python
CHAPTER 5
IMPLEMENTATION
5.1 Python
Python is a fairly old language created by Guido van Rossum. Its design began in
the late 1980s, and it was first released in February 1991.
In the late 1980s, Guido van Rossum was working in the Amoeba distributed operating
system group. He wanted to use an interpreted language like ABC (ABC has a simple,
easy-to-understand syntax) that could access the Amoeba system calls. So, he
decided to create a language that was extensible. This led to the design of a new
language, which was later named Python.
No, it wasn't named after a dangerous snake. Van Rossum was a fan of a BBC comedy
series that first aired in 1969, "Monty Python's Flying Circus", and the name
"Python" was adopted from it.
5.1.2 Features of Python:
Python has a very simple and elegant syntax. It's much easier to read and write
Python programs than programs in languages such as C++, Java, or C#. Python makes
programming fun and allows you to focus on the solution rather than the syntax.
If you are a newbie, Python is a great choice to start your journey.
You can freely use and distribute Python, even for commercial use. Not only can
you use and distribute software written in it, you can even make changes to
Python's source code.
Portability
You can move Python programs from one platform to another and usually run them
without any changes.
Suppose an application requires high performance. You can easily combine pieces
of C/C++ or other languages with Python code.
This will give your application high performance as well as scripting capabilities
which other languages may not provide out of the box.
Unlike C/C++, you don't have to worry about daunting tasks like memory
management, garbage collection and so on.
Likewise, when you run Python code, it automatically converts your code to the
language your computer understands. You don't need to worry about any lower-
level operations.
Python ships with a large standard library, which makes a programmer's life much
easier since you don't have to write all the code yourself. Third-party libraries
extend this further. For example, to connect to a MySQL database on a Web server you
can install the third-party MySQLdb library and load it with import MySQLdb.
The standard library is well tested and widely used, so it is unlikely to break
your application.
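As an illustration of a module that ships with Python itself, the sketch below uses
sqlite3 to store and query a few labelled URLs in an in-memory database. The table
name urls and the sample rows are hypothetical and are not part of this project's
design.

```python
import sqlite3

# An in-memory database needs no server and no installation beyond Python.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (url TEXT PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO urls VALUES (?, ?)",
    [
        ("https://www.example.org/about", "legitimate"),
        ("http://192.168.10.5/secure-login/update", "phishing"),
    ],
)
conn.commit()

# Parameterised queries keep user-supplied values out of the SQL text.
count = conn.execute(
    "SELECT COUNT(*) FROM urls WHERE label = ?", ("phishing",)
).fetchone()[0]
print(count)
conn.close()
```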
Object-oriented
With object-oriented programming (OOP), you can divide complex problems into
smaller parts by creating objects.
Programming in Python is fun. It's easier to understand and write Python code.
Why? The syntax feels natural. Take this source code for an example:
a = 2
b = 3
total = a + b  # named total to avoid shadowing the built-in sum()
print(total)
You don't need to declare the type of a variable in Python. Also, it's not necessary to
add a semicolon at the end of a statement.
Python requires you to follow good practices (like proper indentation). These small
things can make learning much easier for beginners.
Python allows you to write programs with greater functionality in fewer lines
of code. For example, a Tic-tac-toe game with a graphical interface and a smart
computer opponent can be written in fewer than 500 lines of code. You will be amazed
how much you can do with Python once you learn the basics.
Python has a large supporting community. There are numerous active forums online
which can be handy if you are stuck.
CHAPTER 6
EXPECTED RESULTS
CHAPTER 7
REFERENCES
[2] Phishing Activity Trends Report – 1st Quarter 2018. Available online:
https://docs.apwg.org/reports/apwg_trends_report_q1_2018.pdf (accessed on: 1
February 2019).
[3] Y. Pan and X. Ding, “Anomaly-based web phishing page detection,” in Proc.
of the 22nd ACSAC, IEEE, Miami, FL, USA, pp. 381-392, 2006.
[7] N. A. Bakar, M. Mohd, and R. Sulaiman, “Information leakage preventive
training,” in Proc. of 6th ICEEI, IEEE, Langkawi, Malaysia, 2018.
[9] T. Steyn, H. Kruger, and L. Drevin, “Identity theft - empirical evidence from a
phishing exercise,” in New Approaches for Security, Privacy and Trust in Complex
Environments; Venter, H.; Eloff, M.; Labuschagne, L.; Eloff, J.; von Solms, R., Eds.;
Springer: Boston, MA, USA, vol. 232, pp. 193-203, 2007.
[15] K. Firdous, B. Al-Otaibi, A. Al-Qadi, and N. Al-Dossari, “Hybrid client side
phishing websites detection approach,” International Journal of Advanced
Computer Science and Applications, vol. 5, pp. 132-140, 2014.