Fulldoc - Dsec Mca - Crime Prediction
ABSTRACT
An incredible increase in urban population over the past three decades has created a
desire for a safe, friendly, and sustainable society. The management of urbanization continues
to be a significant concern for administrative authorities due to the city's ever-expanding
population, which is devouring suburbs and rural areas. Cities are becoming overcrowded, forcing governments to launch smart city programs that would help improve infrastructure management and address the key issues of development, sustainability, and security. Despite
the enormous momentum that smart city efforts have gathered and the promises they make
about improving quality of life, they do have some difficult elements as well. Public safety is
one of the biggest obstacles to living in a smart city. To create a healthy society, crime rates
must be identified and reduced. Big Data approaches are used to gather and analyze data in order to identify the necessary characteristics and key traits that enable crime prediction. Traditional crime detection and machine learning-based algorithms frequently struggle to accurately forecast crime trends because they are unable to extract the important prime attributes from the crime dataset. In order to improve the accuracy of the underlying machine learning algorithm, this system aims to extract key features such as time zones, crime likelihood, and crime types. As an alternative to current modelling methodologies, we use the Support Vector Machine algorithm in this project to construct a framework for identifying and forecasting crime. SVM belongs to the new generation of machine learning methods used to determine the best class separation inside datasets. Using geographical datasets gathered from Kaggle data sources, experiments demonstrate that SVMs produce accurate results when used with a Python tool.
1. INTRODUCTION
In the real world, we are surrounded by humans who can learn everything from their
experiences with their learning capability, and we have computers or machines which work
on our instructions. Machine Learning is a subset of Artificial Intelligence that is mainly concerned with the development of algorithms which allow a computer to learn from data and past experiences on its own. The term machine learning was first introduced by Arthur Samuel in 1959. Machine Learning is a growing technology which gives computers the capability to learn without being explicitly programmed.
ML is one of the most exciting technologies that one would have ever come across. But can a machine also learn from experiences or past data the way a human does? This is where Machine Learning comes in. As the name suggests, it gives the computer the ability that makes it more similar to humans: the ability to learn. Machine learning is actively being
used today, perhaps in many more places than one would expect. Machine learning uses
various algorithms for building mathematical models and making predictions using historical
data or information. Currently, it is being used for various tasks such as image
recognition, speech recognition, email filtering, Facebook auto-tagging, recommender
system, and many more.
A Machine Learning system learns from historical data, builds the prediction
models, and whenever it receives new data, predicts the output for it. The accuracy of
predicted output depends upon the amount of data, as the huge amount of data helps to build
a better model which predicts the output more accurately. Suppose we have a complex
problem, where we need to perform some predictions, so instead of writing a code for it, we
just need to feed the data to generic algorithms, and with the help of these algorithms,
machine builds the logic as per the data and predicts the output. Machine learning has changed our way of thinking about such problems. The block diagram below explains the working of a Machine Learning algorithm:
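As a minimal sketch of this learn-from-data, then predict-on-new-data cycle, the following uses scikit-learn; the feature values and labels are made-up illustrative assumptions, not the project's crime dataset.

# Minimal sketch of the train-then-predict cycle described above.
# The numbers are illustrative only, not real crime records.
from sklearn.tree import DecisionTreeClassifier

# Historical data: each row is [hour_of_day, latitude, longitude]; labels are crime-type codes.
X_train = [[1, 49.28, -123.12], [14, 49.25, -123.10], [23, 49.27, -123.08]]
y_train = [0, 1, 0]

model = DecisionTreeClassifier()        # build the prediction model
model.fit(X_train, y_train)             # learn from historical data

# Whenever new data arrives, the trained model predicts an output for it.
print(model.predict([[2, 49.26, -123.11]]))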
NEED FOR MACHINE LEARNING
The need for machine learning is increasing day by day, and the reason behind this is that it is capable of doing tasks that are too complex for a person to implement directly. As humans, we have some limitations, since we cannot access and process huge amounts of data manually; for this we need computer systems, and here machine learning comes in to make
things easy for us. We can train machine learning algorithms by providing them the huge
amount of data and let them explore the data, construct the models, and predict the required
output automatically.
The performance of the machine learning algorithm depends on the amount of data,
and it can be determined by the cost function. With the help of machine learning, we can save
both time and money. The importance of machine learning can be easily understood by its use cases; currently, machine learning is used in self-driving cars, cyber fraud detection, face recognition, friend suggestions by Facebook, and so on. Various top companies such as Netflix and Amazon have built machine learning models that use a vast amount of data to analyze user interests and recommend products accordingly. Machine learning can be broadly classified into the following types:
Supervised Learning
Unsupervised Learning
Reinforcement Learning
SUPERVISED LEARNING
Supervised learning is a type of machine learning method in which we provide sample labelled data to the machine learning system in order to train it, and on that basis, it predicts the output. The system creates a model using labelled data to understand the datasets and learn about each of them; once the training and processing are done, we test the model by providing sample data to check whether it predicts the exact output or not. The goal of supervised learning is to map input data to output data. Supervised learning is based on supervision, and it is the same as when a student learns things under the supervision of a teacher. An example of supervised learning is spam filtering. Supervised learning can be grouped further into two categories of algorithms:
Classification
Regression
CLASSIFICATION
Classifier – It is an algorithm that is used to map the input data to a specific category.
Classification Model – The model draws a conclusion from the input data given for training; it predicts the class or category for new data.
Feature – A feature is an individual measurable property of the phenomenon being
observed.
Binary Classification – It is a type of classification with two possible outcomes, for example either true or false.
Multi-Class Classification – The classification with more than two classes; in multi-class classification each sample is assigned to one and only one label or target. A brief classification sketch is shown below.
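The following is a hedged illustration of binary versus multi-class classification, assuming scikit-learn and its bundled iris dataset rather than this project's crime data; the classifier choice and split ratio are assumptions for the example.

# Binary vs. multi-class classification on the iris dataset (illustrative only).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)                 # three classes -> multi-class classification
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000)           # the classifier maps features to a category
clf.fit(X_tr, y_tr)
print("multi-class accuracy:", clf.score(X_te, y_te))

# Binary classification: keep only two of the classes (a true/false style outcome).
mask = y < 2
clf_bin = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
print("binary predictions:", clf_bin.predict(X[mask][:5]))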
REGRESSION
The main goal of regression is the construction of an efficient model to predict the dependent attribute from a set of attribute variables. A regression problem is one where the output variable is a real or continuous value, e.g. salary, weight, or area. We can also define regression as a statistical method used in applications like housing and investing; it is used to predict the relationship between a dependent variable and a set of independent variables. Let us take a look at various types of regression techniques.
REGRESSION TYPES
Simple Linear Regression - One of the most interesting and common regression techniques is simple linear regression. In this, we predict the outcome of a dependent variable based on the independent variables, and the relationship between the variables is assumed to be linear.
Polynomial Regression - In this regression technique, we transform the original features into
polynomial features of a given degree and then perform regression on it.
Support Vector Regression - For support vector machine regression, or SVR, we identify a hyperplane with maximum margin such that the maximum number of data points lie within that margin. It is quite similar to the support vector machine classification algorithm.
Decision Tree Regression - A decision tree can be used for both regression and classification. It uses algorithms such as ID3 (Iterative Dichotomiser 3) to identify the splitting node by reducing the standard deviation. Brief sketches of these regression techniques are shown below.
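The sketch below illustrates the four regression types listed above, assuming scikit-learn and a small synthetic dataset generated just for the example; the hyperparameter values are illustrative assumptions.

# Illustrative sketches of the regression types above on a synthetic dataset.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 10, 60)).reshape(-1, 1)
y = 2.0 * X.ravel() + np.sin(X.ravel()) + rng.normal(0, 0.3, 60)   # continuous target

models = {
    "simple linear": LinearRegression(),
    "polynomial (degree 3)": make_pipeline(PolynomialFeatures(degree=3), LinearRegression()),
    "support vector (SVR)": SVR(kernel="rbf", C=10.0),
    "decision tree": DecisionTreeRegressor(max_depth=4),
}
for name, model in models.items():
    model.fit(X, y)                                 # fit each regressor on the same data
    print(name, "R^2 =", round(model.score(X, y), 3))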
UNSUPERVISED LEARNING
Unsupervised learning is a learning method in which a machine learns without any supervision. The training data is neither labelled nor classified, and the algorithm acts on that data on its own, grouping it according to similarities, patterns, and differences. It can be further categorised into two types of algorithms:
Clustering
Association
CLUSTERING
Clustering is the task of dividing the population or data points into a number of groups such that data points in the same group are more similar to each other than to the data points in other groups. It is basically a collection of objects grouped on the basis of similarity and dissimilarity between them. Clustering is very important as it determines the intrinsic grouping among the unlabelled data present. There are no fixed criteria for good clustering; it depends on the user and the criteria they use to satisfy their needs.
CLUSTERING METHODS:
Density-Based Methods: These methods consider the clusters as the dense region
having some similarities and differences from the lower dense region of the space.
These methods have good accuracy and the ability to merge two clusters.
Hierarchical Based Methods: The clusters formed in this method form a tree-type structure based on the hierarchy. New clusters are formed using the previously formed ones. It is divided into two categories: agglomerative (bottom-up) and divisive (top-down).
Partitioning Methods: These methods partition the objects into k clusters, and each partition forms one cluster. This method is used to optimize an objective criterion similarity function, for example when distance is a major parameter.
Grid-based Methods: In this method, the data space is formulated into a finite number of cells that form a grid-like structure. All the clustering operations done on these grids are fast and independent of the number of data objects. A brief sketch of the main clustering methods follows below.
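The sketch below shows partitioning, density-based, and hierarchical clustering side by side, assuming scikit-learn and synthetic sample points; the parameter values (eps, number of clusters) are illustrative assumptions.

# Partitioning, density-based, and hierarchical clustering on synthetic points.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, DBSCAN, AgglomerativeClustering

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)   # partitioning method
dbscan = DBSCAN(eps=0.8, min_samples=5).fit(X)                     # density-based method
agglo = AgglomerativeClustering(n_clusters=3).fit(X)               # hierarchical (agglomerative)

print("k-means labels:", kmeans.labels_[:10])
print("DBSCAN labels:", dbscan.labels_[:10])
print("agglomerative labels:", agglo.labels_[:10])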
ASSOCIATION
Association rule learning is a type of unsupervised learning technique that checks for the dependency of one data item on another data item and maps them accordingly so that it can be more profitable. It tries to find interesting relations or associations among the variables of a dataset. It is based on different rules for discovering the interesting relations between variables in the database.
Association rule learning is one of the very important concepts of machine learning, and it is employed in market basket analysis, web usage mining, continuous production, etc. Market basket analysis is a technique used by various big retailers to discover the associations between items. We can understand it by taking the example of a supermarket: products that are frequently purchased together are placed together. For example, if a customer buys bread, he is also likely to buy butter, eggs, or milk, so these products are stored on the same shelf or mostly nearby.
It has various applications in machine learning and data mining. Below are some
popular applications of association rule learning:
Market Basket Analysis - It is one of the popular examples and applications of
association rule mining. This technique is commonly used by big retailers to
determine the association between items.
Medical Diagnosis - With the help of association rules, patients can be cured
easily, as it helps in identifying the probability of illness for a particular disease.
Protein Sequence - The association rules help in determining the synthesis of
artificial Proteins.
It is also used for catalog design, loss-leader analysis, and many other applications; a small market-basket sketch is shown below.
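The support/confidence idea behind association rules can be sketched in plain Python; the supermarket baskets and the 0.5 support threshold below are made-up illustrative values, not data from this project.

# Small, self-contained sketch of support and confidence for item-pair rules.
from itertools import combinations

transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "eggs"},
    {"butter", "milk"},
]
n = len(transactions)

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / n

items = sorted(set().union(*transactions))
for a, b in combinations(items, 2):
    pair = {a, b}
    if support(pair) >= 0.5:                      # frequent-itemset threshold (assumed)
        conf = support(pair) / support({a})       # confidence of the rule a -> b
        print(f"{a} -> {b}: support={support(pair):.2f}, confidence={conf:.2f}")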
REINFORCEMENT LEARNING
Reinforcement learning is a feedback-based learning method in which a learning agent gets a reward for each right action and a penalty for each wrong action. The agent learns automatically from this feedback and improves its performance without any labelled data.
APPLICATIONS OF MACHINE LEARNING
Machine learning is a buzzword for today's technology, and it is growing very rapidly day by day. We are using machine learning in our daily life even without knowing it, for example in Google Maps, Google Assistant, Alexa, and so on. Some popular applications are described below.
EMAIL SPAM AND MALWARE FILTERING - Whenever we receive a new email, it is filtered automatically as important, normal, or spam with the help of machine learning classifiers. Some of the spam filters used for this purpose are:
Content Filter
Header filter
General blacklists filter
Rules-based filters
Permission filters
AUTOMATIC LANGUAGE TRANSLATION - Nowadays, if we visit a new place and are not aware of the language, it is not a problem at all, as machine learning helps us here as well by converting the text into languages we know. Google's GNMT (Google Neural Machine Translation) provides this feature; it is a neural machine translation model that translates text into our familiar language, and this is called automatic translation. The technology behind automatic translation is a sequence-to-sequence learning algorithm, which, combined with image recognition, translates text from one language to another.
ONLINE FRAUD DETECTION - Machine learning is making our online transactions safe and secure by detecting fraudulent transactions. Whenever we perform an online transaction, there are various ways that a fraudulent transaction can take place, such as fake accounts, fake IDs, and money being stolen in the middle of a transaction. To detect this, a feed-forward neural network helps us by checking whether a transaction is genuine or fraudulent. For each genuine transaction, the output is converted into some hash values, and these values become the input for the next round. For each genuine transaction there is a specific pattern, which changes for a fraudulent transaction; hence, the system detects it and makes our online transactions more secure.
TRAFFIC PREDICTION - If we want to visit a new place, we take the help of Google Maps, which shows us the correct path with the shortest route and predicts the traffic conditions. Everyone who is using Google Maps is helping this app to become better: it takes information from the user and sends it back to its database to improve the performance. It predicts traffic conditions such as whether traffic is clear, slow-moving, or heavily congested in two ways: from the real-time location of vehicles reported by the Google Maps app and road sensors, and from the average time taken on the same route on previous days at the same time.
2. LITERATURE SURVEY
This paper concluded by discussing the implications of these findings for research on technology and inequality in criminal justice. Whereas the current wave of critical scholarship on algorithmic bias often leans upon technologically deterministic narratives in order to make social justice claims, here we focus on the social and institutional contexts
within which such predictive systems are deployed and negotiated. In the process, we show
that these tools acquire political nuance and meaning through practice, which can lead to
unanticipated or undesirable outcomes: forms of workplace surveillance and the displacement
of discretion to less accountable places. We argue that this sheds new light on the
transformations of police and judicial discretion – with important consequences for social and
racial inequality – in the age of big data. Given the rationalizing impetus that guides the
adoption of algorithmic technologies in the criminal justice context, these profound changes
lead us to raise the question of the reception of predictive algorithms in the context of law
enforcement and criminal courts. Although there is strong theoretical work in surveillance
studies that focuses on the possibilities, good and bad, of new forms of algorithmic decision-
making, there is a dearth of empirical work on the social contexts of their reception in
policing and courts.
2.4 TITLE: Machine learning for risk assessment in gender-based crime.
AUTHOR: González-Prieto et al.
This paper proposes a hybrid model that combines statistical prediction methods with the ML method, permitting authorities to implement a smooth transition from the pre-existing model to the ML-based model. This hybrid nature enables a decision-making process that optimally balances the efficiency of the police system and the aggressiveness of the protection measures taken. Despite the apparently regular occurrence of crime, as was already recognized in the 19th century, it has defied the predictability provided by the scientific method in the natural sciences. Surprisingly, it is easier to accurately predict where a rocket will be after its launch on its way to a distant planet than to foresee the next victim of an offense. The unpredictable nature of crime raises the question of whether the classical scientific method can be a solving tool instead of only a descriptive framework. Given a large
amount of structured data about IPVAW cases, we will apply ML techniques to develop
novel models of risk assessment of recidivism of a victim, understood as the probability that a
female victim, who has been offended and has reported her case, is aggressed again. In our
case, the data will be provided by the Spanish VioGen system, a governmental program for
tracking and controlling gender violence, but the approach and applied methods are general
and can be straightforwardly translated to other data sources.
2.5 TITLE: Economic crime detection using support vector machine classification
3.1 EXISTING SYSTEM
The existing system for crime rate prediction using machine learning algorithms aims to analyze historical crime data and build predictive models that can forecast crime rates in specific regions. This system utilizes various machine learning algorithms and techniques to analyze the data and make accurate predictions. The process begins with data collection, where historical crime data from different sources, such as police records, crime databases, and public reports, is gathered. The existing system for crime rate prediction using machine learning algorithms has shown promising results in forecasting crime rates. By utilizing historical crime data and leveraging the power of machine learning, this system provides
valuable insights for law enforcement agencies and policymakers. The accuracy and
reliability of crime rate predictions heavily rely on the quality and representativeness of the
historical crime data. If the data used for training the models is biased or incomplete, it can
lead to biased predictions. Crime patterns can change over time due to various factors such as
socio-economic changes, policy interventions, or community dynamics. Machine learning
models trained on historical data may struggle to adapt to these changing patterns and may
not accurately predict future crime rates. Machine learning models trained on data from one region or period may not generalize well to other regions or future time periods. The models may fail to capture unique local factors or fail to adapt to changes in crime patterns over time.
3.1.1 DISADVANTAGES
3.2.1 ADVANTAGES
5.1 MODULES
Datasets Acquisition
Preprocessing
Features Extraction
Classification
DATASETS ACQUISITION
A data set (or dataset) is a collection of data. Most commonly a data set
corresponds to the contents of a single database table, or a single statistical data matrix,
where every column of the table represents a particular variable, and each row corresponds to
a given member of the data set in question. The data set lists values for each of the variables,
such as height and weight of an object, for each member of the data set. Each value is known
as a datum. The data set may comprise data for one or more members, corresponding to the
number of rows. The term data set may also be used more loosely, to refer to the data in a
collection of closely related tables, corresponding to a particular experiment or event. In this module, we can upload datasets containing year, month, day, hour, minute, latitude and longitude values.
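A minimal sketch of this acquisition step with pandas is shown below; the file name is a placeholder, and the column names follow the dataset preview shown in Appendix 2, so both are assumptions rather than values fixed by the source.

# Sketch of the dataset-acquisition step: loading an uploaded crime file with pandas.
import pandas as pd

df = pd.read_csv("crime_dataset.csv")   # hypothetical file exported from Kaggle

expected = ["TYPE", "YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "X", "Y"]
print(df[[c for c in expected if c in df.columns]].head())   # preview the relevant columns
print("rows:", len(df))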
PREPROCESSING
In this module, the uploaded crime dataset is prepared for learning: rows with missing or non-finite values are removed, and categorical attributes such as the crime type are encoded as numeric labels, so that the data can be passed on to feature selection and classification.
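The following is a minimal preprocessing sketch that mirrors the cleaning and label-encoding performed in the appendix listing (dropna and a TYPE mapping); the helper name and the use of pandas.factorize are assumptions made for illustration.

# Minimal preprocessing sketch: drop incomplete rows and encode the crime type as a number.
# 'df' is assumed to be the DataFrame loaded in the dataset-acquisition module.
import numpy as np
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    df = df.dropna()                                           # remove rows with missing values
    df = df[~df.isin([np.inf, -np.inf]).any(axis=1)]           # drop rows with non-finite values
    df = df.copy()
    if df["TYPE"].dtype == object:                             # encode labels only if still text
        df["TYPE"] = pd.factorize(df["TYPE"])[0]               # crime type -> integer code
    return df.astype(np.float64)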
FEATURES SELECTION
Feature selection refers to the process of reducing the inputs for processing and analysis, or of finding the most meaningful inputs. A related term, feature engineering (or feature extraction), refers to the process of extracting useful information or features from existing data. Filter feature selection methods apply a statistical measure to assign a score to each feature. The features are ranked by the score and either selected to be kept or removed from the dataset. The methods are often univariate and consider each feature independently, or with regard to the dependent variable. In this module, we select the most relevant features from the uploaded datasets and train on data labelled with multiple crime types such as murder, violence, abuse, vehicle theft, and so on.
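A hedged sketch of this filter-style feature selection using scikit-learn's SelectKBest follows; the f_classif scoring function and k=5 are illustrative assumptions, and df is the preprocessed frame from the previous module.

# Filter feature selection: score each feature against the crime type and keep the top k.
from sklearn.feature_selection import SelectKBest, f_classif

X = df.drop(columns="TYPE")            # candidate features: year, month, day, hour, minute, X, Y
y = df["TYPE"]                         # target: encoded crime type

selector = SelectKBest(score_func=f_classif, k=5)    # univariate statistical scoring
X_selected = selector.fit_transform(X, y)
print("kept features:", list(X.columns[selector.get_support()]))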
CLASSIFICATION
In this module we implement a classification algorithm to predict the crime types, using a machine learning algorithm such as the Support Vector Machine. A Support Vector Machine (SVM) is a supervised learning model that maps input feature vectors into a high-dimensional space and finds the hyperplane that separates the classes with the maximum margin; the training points closest to this hyperplane are called support vectors. With the help of kernel functions, an SVM can also distinguish data that are not linearly separable, which makes it well suited to finding the best class separation inside the crime dataset.
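A minimal sketch of this classification step with scikit-learn's SVC is given below. Note that the appendix listing actually trains an MLPClassifier; the SVM variant here only illustrates the description in this section, and the split ratio, scaling, and kernel settings are assumptions. X_selected and y come from the feature-selection sketch above.

# Train and evaluate an SVM classifier on the selected crime features.
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

X_train, X_test, y_train, y_test = train_test_split(
    X_selected, y, test_size=0.2, random_state=3)

scaler = StandardScaler().fit(X_train)              # SVMs are sensitive to feature scale
clf = SVC(kernel="rbf", C=1.0, gamma="scale")       # maximum-margin classifier with RBF kernel
clf.fit(scaler.transform(X_train), y_train)

y_pred = clf.predict(scaler.transform(X_test))
print(classification_report(y_test, y_pred))        # per-class precision, recall, F1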
6. SYSTEM DESIGN
A system architecture or systems architecture is the conceptual model that defines the
structure, behavior, and more views of a system. An architecture description is a formal
description and representation of a system, organized in a way that supports reasoning about
the structures and behaviors of the system. System architecture can comprise system
components, the externally visible properties of those components, the relationships (e.g. the
behavior) between them. It can provide a plan from which products can be procured, and
systems developed, that will work together to implement the overall system. There have been
efforts to formalize languages to describe system architecture; collectively these are called
architecture description languages (ADLs).
An allocated arrangement of physical elements which provides the design solution for
a consumer product or life-cycle process intended to satisfy the requirements of the
functional architecture and the requirements baseline.
Architecture comprises the most important, pervasive, top-level, strategic inventions,
decisions, and their associated rationales about the overall structure (i.e., essential
elements and their relationships) and associated characteristics and behavior.
If documented, it may include information such as a detailed inventory of current
hardware, software and networking capabilities; a description of long-range plans and
priorities for future purchases, and a plan for upgrading and/or replacing dated
equipment and software
The composite of the design architectures for products and their life-cycle processes
6.2 DATA FLOW DIAGRAM
A data flow diagram shows the way information flows through a process or system. It
includes data inputs and outputs, data stores, and the various sub processes the data moves
through. DFDs are built using standardized symbols and notation to describe various entities
and their relationships. Data flow diagrams visually represent systems and processes that
would be hard to describe in a chunk of text. You can use these diagrams to map out an
existing system and make it better or to plan out a new system for implementation.
Visualizing each element makes it easy to identify inefficiencies and produce the best
possible system.
LEVEL 0
The Level 0 DFD shows how the system is divided into 'sub-systems' (processes),
each of which deals with one or more of the data flows to or from an external agent, and
which together provide all of the functionality of the system as a whole. It also identifies
internal data stores that must be present in order for the system to do its job, and shows the
flow of data between the various parts of the system.
LEVEL-1
The next stage is to create the Level 1 Data Flow Diagram. This highlights the main functions carried out by the system. As a rule, we try to describe the system using between two and seven functions - two for a simple system and seven for a complicated system. This enables us to keep the model manageable on screen or paper.
LEVEL-2
A Data Flow Diagram (DFD) tracks processes and their data paths within the business
or system boundary under investigation. A DFD defines each domain boundary and
illustrates the logical movement and transformation of data within the defined boundary. The
diagram shows 'what' input data enters the domain, 'what' logical processes the domain
applies to that data, and 'what' output data leaves the domain. Essentially, a DFD is a tool for
process modeling and one of the oldest.
LEVEL-3
A data flow diagram (DFD) is a graphical representation of the flow of data through
an information system. A DFD shows the flow of data from data sources and data stores to
processes and from processes to data stores and data sinks. DFDs are used for modelling and
analyzing the flow of data in data processing systems, and are usually accompanied by a data
dictionary, an entity-relationship model, and a number of process descriptions.
6.3 UML DIAGRAMS
6.3.1 CLASS DIAGRAM
A class diagram is the main building block of object-oriented modelling. It is used for general conceptual modelling of the structure of the application and for detailed modelling, translating the models into programming code. Class diagrams can also be used for data modelling. The classes in a class diagram represent both the main elements and interactions in the application and the classes to be programmed.
System
Datasets Acquisition
Preprocessing
Features Selection
User
Rules construction
Classification
6.3.2 SEQUENCE DIAGRAM
A sequence diagram is an interaction diagram that shows how processes operate with one another and in what order. It is a construct of a Message Sequence Chart. A sequence diagram shows object interactions arranged in time sequence. Sequence diagrams are sometimes called event trace diagrams, event scenarios, or timing diagrams. A sequence diagram shows, as parallel vertical lines, the different processes that live simultaneously and, as horizontal arrows, the messages exchanged between them. The sequence diagram of this system has three objects; the connections between the objects are shown using stimulus and self-stimulus messages.
[Sequence diagram messages include Features selection(), MLP algorithm(), and Crime prediction().]
6.3.3 COLLABORATION DIAGRAM
Dataset upload
Preprocessing
Rules construction
Classification
7.1 FRONT END
Python's developers strive to avoid premature optimization, and reject patches to non-critical parts of CPython that would offer marginal increases in speed at the cost of clarity. When speed is important, a Python programmer can move time-critical functions to extension modules written in languages such as C, or use PyPy, a just-in-time compiler. Cython is also available, which translates a Python script into C and makes direct C-level API calls into the Python interpreter. An important goal of Python's developers is keeping it fun to use. This is reflected in the language's name, a tribute to the British comedy group Monty Python, and in occasionally playful approaches to tutorials and reference materials, such as examples that refer to spam and eggs (from a famous Monty Python sketch) instead of the standard foo and bar.
A common neologism in the Python community is pythonic, which can have a wide
range of meanings related to program style. To say that code is pythonic is to say that it uses
Python idioms well, that it is natural or shows fluency in the language, that it conforms with
Python's minimalist philosophy and emphasis on readability. In contrast, code that is difficult
to understand or reads like a rough transcription from another programming language is
called unpythonic. Users and admirers of Python, especially those considered knowledgeable
or experienced, are often referred to as Pythonists, Pythonistas, and Pythoneers. Python is an
interpreted, object-oriented, high-level programming language with dynamic semantics. Its
high-level built in data structures, combined with dynamic typing and dynamic binding, make
it very attractive for Rapid Application Development, as well as for use as a scripting or glue
language to connect existing components together. Python's simple, easy to learn syntax
emphasizes readability and therefore reduces the cost of program maintenance. Python
supports modules and packages, which encourages program modularity and code reuse. The
Python interpreter and the extensive standard library are available in source or binary form
without charge for all major platforms, and can be freely distributed. Often, programmers fall
in love with Python because of the increased productivity it provides. Since there is no
compilation step, the edit-test-debug cycle is incredibly fast. Debugging Python programs is
easy: a bug or bad input will never cause a segmentation fault. Instead, when the interpreter
discovers an error, it raises an exception. When the program doesn't catch the exception, the
interpreter prints a stack trace. A source level debugger allows inspection of local and global
variables, evaluation of arbitrary expressions, setting breakpoints, stepping through the code a
line at a time, and so on. The debugger is written in Python itself, testifying to Python's
introspective power. On the other hand, often the quickest way to debug a program is to add a
few print statements to the source: the fast edit-test-debug cycle makes this simple approach
very effective.
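A tiny illustration of the point above: bad input raises an exception that the program can catch and handle, rather than crashing the interpreter. The helper function is hypothetical and exists only for the example.

# Bad input raises an exception (with a stack trace if uncaught) instead of a crash.
def parse_hour(value):
    return int(value)

try:
    parse_hour("not-a-number")
except ValueError as exc:
    print("caught:", exc)        # the program keeps running after handling the error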
Python’s initial development was spearheaded by Guido van Rossum in the late
1980s. Today, it is developed by the Python Software Foundation. Because Python is a
multiparadigm language, Python programmers can accomplish their tasks using different
styles of programming: object oriented, imperative, functional or reflective. Python can be
used in Web development, numeric programming, game development, serial port access and
more.
There are two attributes that make development time in Python faster than in other
programming languages:
1. Python is an interpreted language, which precludes the need to compile code before
executing a program because Python does the compilation in the background. Because
Python is a high-level programming language, it abstracts many sophisticated details
from the programming code. Python focuses so much on this abstraction that its code
can be understood by most novice programmers.
2. Python code tends to be shorter than comparable code in other languages. Although Python offers fast development times, it lags slightly in terms of execution time. Compared to fully compiled languages like C and C++, Python programs execute more slowly. Of course,
with the processing speeds of computers these days, the speed differences are usually
only observed in benchmarking tests, not in real-world operations. In most cases,
Python is already included in Linux distributions and Mac OS X machines.
7.2 BACK END
MySQL is the world's most widely used open source relational database management system (RDBMS) as of 2008; it runs as a server providing multi-user access to a number of databases. The MySQL development project has made its source code available under the terms of the GNU General Public License, as well as under a variety of proprietary agreements. MySQL was owned and sponsored by a single for-profit firm, the Swedish company MySQL AB, which is now owned by Oracle Corporation.
MySQL is a popular choice of database for use in web applications, and is a central component of the widely used LAMP open source web application software stack (LAMP is an acronym for "Linux, Apache, MySQL, Perl/PHP/Python"). Free-software open source projects that require a full-featured database management system often use MySQL. For commercial use, several paid editions are available and offer additional functionality. Applications which use MySQL databases include TYPO3, Joomla, WordPress, phpBB, MyBB, Drupal and other software built on the LAMP software stack. MySQL is also used in many high-profile, large-scale World Wide Web products, including Wikipedia, Google (though not for searches), Facebook, Twitter, Flickr, Nokia.com, and YouTube.
Interfaces
MySQL is primarily an RDBMS and ships with no GUI tools to administer MySQL
databases or manage data contained within the databases. Users may use the included
command line tools, or use MySQL "front-ends", desktop software and web applications that
create and manage MySQL databases, build database structures, back up data, inspect status,
and work with data records. The official MySQL front-end tool, MySQL Workbench, is actively developed by Oracle and is freely available for use.
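Since the appendix listing imports mysql.connector, a minimal connection sketch is included below; the host, credentials, database, and table name are placeholders invented for illustration, not values taken from the source.

# Minimal sketch of connecting to the project's MySQL database with mysql.connector.
import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="root", password="password", database="crimedb")
cur = conn.cursor()
cur.execute("SELECT * FROM weather_info")   # hypothetical table used by the /WeatherInfo route
rows = cur.fetchall()
cur.close()
conn.close()
print(len(rows), "rows fetched")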
8. SYSTEM TESTING
Testing is a set of activities that can be planned and conducted systematically. Testing begins at the module level and works towards the integration of the entire computer-based system. Nothing is complete without testing, as it is vital to the success of the system.
Testing Objectives:
There are several rules that can serve as testing objectives:
1. Tests for correctness
2. Tests for implementation efficiency
3. Tests for computational complexity
Tests used for implementation efficiency attempt to find ways to make a correct
program faster or use less storage. It is a code-refining process, which reexamines the
implementation phase of algorithm development. Tests for computational complexity amount
to an experimental analysis of the complexity of an algorithm or an experimental comparison
of two or more algorithms which solve the same problem. The data is entered in all forms separately, and whenever an error occurs, it is corrected immediately. A quality team deputed by the management verified all the necessary documents and tested the software while entering the data at all levels.
Unit Testing
The first test in the development process is the unit test. The source code is normally
divided into modules, which in turn are divided into smaller units called units. These units
have specific behavior. The test done on these units of code is called unit test. Unit test
depends upon the language on which the project is developed. Unit tests ensure that each
unique path of the project performs accurately to the documented specifications and contains
clearly defined inputs and expected results.
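A small unit-test sketch using Python's built-in unittest module is shown below; the helper function under test is hypothetical and only illustrates how one isolated unit of the system could be verified against its expected results.

# Unit-test sketch for a hypothetical helper that maps a class code to a crime-type label.
import unittest

LABELS = {0: "Murder", 1: "violence", 2: "ChildAbusing"}

def label_for(code):
    return LABELS.get(code, "Unknown")

class LabelForTest(unittest.TestCase):
    def test_known_code(self):
        self.assertEqual(label_for(0), "Murder")

    def test_unknown_code(self):
        self.assertEqual(label_for(99), "Unknown")

if __name__ == "__main__":
    unittest.main()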
Integration Testing
In integration testing modules are combined and tested as a group. Modules are
typically code modules, individual applications, source and destination applications on a
network, etc. Integration testing follows unit testing and precedes system testing.
Beta Testing
Beta testing is testing done after the product is code complete. Betas are often widely distributed, or even distributed to the public at large, in the hope that users will buy the final product when it is released.
System Testing
System testing is performed on the complete, integrated system to evaluate its compliance with the specified requirements; it verifies that all modules of the application work together correctly as a whole.
Validation Testing
The process of evaluating software during the development process or at the end of
the development process to determine whether it satisfies specified business requirements.
Validation testing ensures that the product actually meets the client's needs. It can also be defined as demonstrating that the product fulfils its intended use when deployed in an appropriate environment.
9.1 CONCLUSION
In this project, key features such as time zones, crime likelihood, and crime types are extracted from geographical crime datasets collected from Kaggle, and a Support Vector Machine based framework is used to identify and forecast crime. Experiments with the Python implementation show that the proposed approach produces accurate predictions and can support law enforcement agencies in planning for a safer society.
APPENDIX 1 SOURCE CODE
import datetime
import sys
import pickle
import mysql.connector
import numpy as np
from flask import Flask, render_template, request   # Flask imports were missing from the listing

app = Flask(__name__)
app.config['DEBUG'] = True   # value assumed; the original line had no assignment
app.config['SECRET_KEY'] = '7d441f27d441f27567d441f2b6176a'
@app.route("/")
def homepage():
returnrender_template('index.html')
@app.route("/AdminLogin")
defAdminLogin():
returnrender_template('AdminLogin.html')
@app.route("/NewQueryReg")
defNewQueryReg():
returnrender_template('NewQueryReg.html')
@app.route("/UploadDataset")
defUploadDataset():
returnrender_template('ViewExcel.html')
@app.route("/AdminHome")
defAdminHome():
returnrender_template('AdminHome.html')
@app.route("/WeatherInfo")
defWeatherInfo():
cursor = conn.cursor()
cur = conn.cursor()
data = cur.fetchall()
returnrender_template('WeatherInfo.html',data=data)
@app.route("/adminlogin", methods=['GET', 'POST'])   # route path assumed; decorator missing in the listing
def adminlogin():
    error = None
    if request.method == 'POST':
        # credential check against the database is omitted in the original listing
        return render_template('AdminHome.html')
    else:
        return render_template('index.html', error=error)
@app.route("/newquery", methods=['GET', 'POST'])   # route path assumed; decorator missing in the listing
def newquery():
    if request.method == 'POST':
        t1 = request.form['t1']
        t2 = request.form['t2']
        t3 = request.form['t3']
        t4 = request.form['t4']
        t5 = request.form['t5']
        t6 = request.form['t6']
        t7 = request.form['t7']
        # Load the trained model and build the feature vector from the form fields
        # (the model loading and the construction of 'data' were missing from the listing).
        filename2 = "Model/Crime-prediction-rfc-model.pkl"
        classifier2 = pickle.load(open(filename2, 'rb'))
        data = np.array([[t1, t2, t3, t4, t5, t6, t7]], dtype=float)
        my_prediction = classifier2.predict(data)
        print(my_prediction[0])
        # Map the numeric class back to a crime-type label (matches the TYPE mapping below).
        if my_prediction[0] == 0:
            Predict = 'Murder'
        elif my_prediction[0] == 1:
            Predict = 'violence'
        elif my_prediction[0] == 2:
            Predict = 'ChildAbusing'
        elif my_prediction[0] == 3:
            Predict = 'Offence Against a Person'
        elif my_prediction[0] == 4:
            Predict = 'Mischief'
        elif my_prediction[0] == 5:
            Predict = 'TheftVehicle'
        else:
            Predict = 'Accident'
        return render_template('NewQueryReg.html', Predict=Predict)
    return render_template('NewQueryReg.html')
@app.route("/uploadassign", methods=['GET', 'POST'])   # route path assumed; decorator missing in the listing
def uploadassign():
    if request.method == 'POST':
        file = request.files['fileupload']
        file_extension = file.filename.split('.')[1]
        print(file_extension)
        # file.save("static/upload/" + secure_filename(file.filename))
        import pandas as pd
        import matplotlib.pyplot as plt
        import seaborn as sns
        df = ''
        if file_extension == 'xlsx':
            df = pd.read_excel(file.read(), engine='openpyxl')
        elif file_extension == 'xls':
            df = pd.read_excel(file.read())
        elif file_extension == 'csv':
            df = pd.read_csv(file)
        # Plot the distribution of crime types and save it for the results page.
        sns.countplot(x=df['TYPE'], label="Count")
        plt.savefig('static/images/out.jpg')
        img = 'static/images/out.jpg'
        print(df)
        # Encode the crime-type labels as integer classes.
        df.TYPE = df.TYPE.map({'Murder': 0,
                               'violence': 1,
                               'ChildAbusing': 2,
                               'Offence Against a Person': 3,
                               'Mischief': 4,
                               'TheftVehicle': 5,
                               'Accident': 6
                               })
        def clean_dataset(df):
            df.dropna(inplace=True)
            # keep only rows whose values are all finite (this mask was missing from the listing)
            indices_to_keep = ~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)
            return df[indices_to_keep].astype(np.float64)

        df = clean_dataset(df)
        df_copy = df.copy(deep=True)
        # Model Building
        # df.drop(df.columns[np.isnan(df).any()], axis=1)
        X = df.drop(columns='TYPE')
        y = df['TYPE']
        # The train/test split and classifier imports were missing from the listing.
        from sklearn.model_selection import train_test_split
        from sklearn.neural_network import MLPClassifier
        from sklearn.metrics import classification_report
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=3)
        classifier = MLPClassifier(random_state=3)
        classifier.fit(X_train, y_train)
        y_pred = classifier.predict(X_test)
        print(classification_report(y_test, y_pred))
        # Persist the trained model so the prediction route can load it.
        filename = 'Model/Crime-prediction-rfc-model.pkl'
        pickle.dump(classifier, open(filename, 'wb'))
        df = df.head(300)
        return render_template('ViewExcel.html', data=df.to_html(), img=img)   # template and arguments assumed; the original listing ends here

if __name__ == '__main__':
    app.run(debug=True, use_reloader=False)   # use_reloader value assumed; the original line is truncated
APPENDIX 2 SCREENSHOTS
[Screenshot: preview of the uploaded crime dataset with columns TYPE, YEAR, MONTH, ..., MINUTE, X, Y]