PAASBAAN - Crime Prediction and Classification in Indore City

This document summarizes a minor project report submitted to Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal by Kunal Diwan, Sahitya Nigam, Sourabh Tiwari, and Vikramaditya Singh Bhati. The project aims to predict and classify crimes in Indore City using machine learning algorithms. It analyzes crime data scraped from the Indore Police website. The document includes an introduction to machine learning processes, use case, activity and sequence diagrams depicting the project, proposed system architecture, different machine learning algorithms used for classification including KNN, decision trees and random forests, results of classification for different crime types by hour, and avenues for


“PAASBAAN – Crime Prediction and

Classification in Indore City”

A Minor Project Report submitted to

Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal

in partial fulfillment of the requirements for the award of

Degree of

Bachelor of Engineering
in
Information Technology
by
Kunal Diwan (0832IT151017)
Sahitya Nigam (0832IT151038)
Sourabh Tiwari (0832IT151054)
Vikramaditya Singh Bhati (0832IT151060)

Under the guidance of

Ms. Grishma Pandey
(Assistant Professor)

Session: 2017-18
Department of Information Technology
Chameli Devi Group of Institutions, Indore
452 020 (Madhya Pradesh)
DECLARATION

We certify that the work contained in this report is original and has been done by us under the
guidance of our supervisor.
a. The work has not been submitted to any other Institute for any degree or diploma.
b. We have followed the guidelines provided by the Institute in preparing the report.
c. We have conformed to the norms and guidelines given in the Ethical Code of Conduct
of the Institute.
d. Whenever we have used materials (data, theoretical analysis, figures, and text) from
other sources, we have given due credit to them by citing them in the text of the report
and giving their details in the references.

Name and Signature of Project Team Members:

Sr. No.   Enrollment No.   Name of student             Signature of student
1.        0832IT151017     Kunal Diwan
2.        0832IT151038     Sahitya Nigam
3.        0832IT151054     Sourabh Tiwari
4.        0832IT151060     Vikramaditya Singh Bhati
CHAMELI DEVI GROUP OF INSTITUTIONS, INDORE

CERTIFICATE

Certified that the project report entitled “Paasbaan – Crime Prediction and Classification in
Indore City” is a bona fide work done under my guidance by Kunal Diwan, Sahitya Nigam,
Sourabh Tiwari and Vikramaditya Singh Bhati in partial fulfillment of the requirements
for the award of the degree of Bachelor of Engineering in Information Technology.

Date: _________________________

(Ms. Grishma Pandey)

Guide

__________________________ _________________________

(Prof. Jasvant Mandloi) (Dr. K.S. Jairaj)

Head of the Department Dean, CDGI

__________________________ _________________________

( Internal Examiner ) (External Examiner)

CHAMELI DEVI GROUP OF INSTITUTIONS

INDORE

ACKNOWLEDGEMENT

We have immense pleasure in expressing our sincerest and deepest gratitude towards
our guide, Ms. Grishma Pandey, for her assistance, valuable guidance and cooperation in
carrying out this project successfully. We developed this project with the help of the
faculty members of our institute and we are extremely grateful to all of them. We also take
this opportunity to thank the Head of the Department, Prof. Jasvant Mandloi, and the Dean of
Chameli Devi Group of Institutions, Dr. K.S. Jairaj, for providing the facilities required to
complete this project. We are greatly thankful to our parents, friends and faculty members
for their motivation, guidance and help whenever needed.

Name and signature of team Members:

1. Kunal Diwan

2. Sahitya Nigam

3. Sourabh Tiwari

4. Vikramaditya Singh Bhati

Abstract

To be better prepared to respond to criminal activity, it is important to understand patterns in
crime. In our project, we analyze crime data from the city of Indore, scraped from the publicly
available website of the Indore Police.

At the outset, the task is to predict which category of crime is most likely to occur given a
time and place in Indore.
The use of AI and machine learning to detect crime via sound or cameras already exists, is
proven to work, and is expected to continue to expand.

The use of AI/ML in predicting crimes or an individual’s likelihood for committing a crime
has promise but is still more of an unknown. The biggest challenge will probably be “proving”
to politicians that it works. When a system is designed to stop something from happening, it is
difficult to prove the negative. Companies that are directly involved in providing governments
with AI tools to monitor areas or predict crime will likely benefit from a positive feedback
loop. Improvements in crime prevention technology will likely spur increased total spending
on this technology.

We also attempt to make our classification task more meaningful by merging multiple classes
into larger classes. Finally, we report and reflect on our results with different classifiers, and
dwell on avenues for future work.

List of Figures

S.No.  Figure number  Figure name
1      Fig 1.1        Machine learning process
2      Fig 3.1        Use case diagram of Paasbaan
3      Fig 3.2        Activity diagram of Paasbaan
4      Fig 3.3        Sequence diagram of Paasbaan
5      Fig 3.4        System architecture of Paasbaan
6      Fig 4.1.1      Principle diagram of KNN
7      Fig 4.1.2      Graphical representation of KNN
8      Fig 4.1.3      Distance functions
9      Fig 4.2.1      Decision tree
10     Fig 4.2.2      Decision Tree example
11     Fig 4.3.1      Random Forest Example
12     Fig 4.3.2      Decision Tree of Paasbaan
13     Fig 4.4.1      Act13 (Gambling vs Hour)
14     Fig 4.4.2      Act323 (Violence vs Hour)
15     Fig 4.4.3      Act363 (Kidnapping vs Hour)
16     Fig 4.4.4      Act379 (Robbery vs Hour)
17     Fig 4.4.5      Act302 (Murder vs Hour)
18     Fig 4.4.6      Act279 (Accident vs Hour)
19     Fig 5.1        Predicting Surges
20     Fig A.1        Snapshot 1
21     Fig A.2        Snapshot 2
22     Fig B.1        Snapshot 3
23     Fig B.2        Snapshot 4
24     Fig C.1        Snapshot 5
25     Fig C.2        Snapshot 6
List of Tables

S.No.  Table number  Table name
1      1.1           Police Dataset
2      1.2           Dataset after Preprocessing
3      1.5           Role and Responsibilities
4      4.5.3         Tests
TABLE OF CONTENTS

Title Page
Declaration
Certificate by the Supervisor
Acknowledgement
List of Figures
List of Tables
Abstract

Chapter 1: Introduction
    1.1 Rationale
    1.2 Goal
    1.3 Objective
    1.4 Methodology
    1.5 Roles and Responsibilities
    1.6 Contribution of Project
        1.6.1 Market Potential
        1.6.2 Innovativeness
        1.6.3 Usefulness
    1.7 Report Organization

Chapter 2: Requirement Engineering
    2.1 Functional Requirement
        2.1.1 Interface Requirement
    2.2 Non Functional Requirements

Chapter 3: Analysis & Design
    3.1 Use-case Diagrams
    3.2 Activity Diagrams
    3.3 Sequence Diagrams
    3.4 System Architecture

Chapter 4: Construction
    4.1 Implementation
    4.2 Implementation Details
        4.2.1 KNN (K-Nearest Neighbors)
        4.2.2 Decision Tree
        4.2.3 Random Forest
        4.2.4 Data Visualization
    4.3 Software Details
    4.4 Hardware Details
    4.5 Testing
        4.5.1 White Box Testing
        4.5.2 Black Box Testing
        4.5.3 Test Table

Chapter 5: Conclusion and Future Scope
    5.1 Conclusion
    5.2 Future Scope
    5.3 Expected Outcome

References
Appendix-A
Appendix-B
Appendix-C
Introduction

Chapter-1
Introduction

Paasbaan is an Urdu word meaning protector. Many important questions in public
safety and protection relate to crime, and a better understanding of crime is beneficial in
multiple ways: it can lead to targeted and sensitive practices by law enforcement authorities
to mitigate crime, and to more concerted efforts by citizens and authorities to create healthy
neighborhood environments.
With the advent of the Big Data era and the availability of fast, efficient
algorithms for data analysis, understanding patterns in crime from data is an active and
growing field of research.

The inputs to our algorithms are time (hour, day, month, and year) and place (latitude and
longitude). Each record is labelled with one of six classes of crime:

• Act 379 - Robbery
• Act 13 - Gambling
• Act 279 - Accident
• Act 323 - Violence
• Act 302 - Murder
• Act 363 - Kidnapping

The output is the class of crime that is likely to have occurred. We try out multiple
classification algorithms, such as KNN (K-Nearest Neighbors), Decision Trees, and Random
Forests.
We also perform multiple classification tasks – we first try to predict which of 6
classes of crimes are likely to have occurred, and later try to differentiate between violent and
non-violent crimes.

PAASBAAN
1.1
Introduction

1.1 Rationale

Indore, the commercial capital of Madhya Pradesh, topped the country's crime records in
2008, followed by Bhopal and Jaipur. According to the National Crime Records Bureau's (NCRB)
report "Crime in India 2008", Indore's crime rate was 941.4, the highest in the country.

With the rapid urbanization and development of big cities and towns, the graph of crimes is
also on the increase. This phenomenal rise in offences and crime in cities is a matter of great
concern and alarm to all of us.

Frequent and repeated thefts, burglaries, robberies, murders, rapes, shoplifting,
pickpocketing, drug abuse, illegal trafficking, smuggling and vehicle theft have left
ordinary citizens with sleepless nights and restless days.

They feel very insecure and vulnerable in the presence of anti-social and evil elements. The
criminals have been operating in an organized way and sometimes even have nationwide and
international connections and links.

1.2 Goal
Much of the current work is focused in two major directions:
• Predicting surges and hotspots of crime, and
• Understanding patterns of criminal behavior that could help in solving criminal
investigations.

1.3 Objective
The objectives of our work are to:
• Predict crime before it takes place.
• Predict hotspots of crime.
• Understand crime patterns.
• Classify crime based on location.
• Analyze crime in Indore.

1.4 Methodology
1.4.1 Machine learning
The term machine learning refers to the automated detection of meaningful patterns in data.
In the past couple of decades it has become a common tool in almost any task that requires
information extraction from large data sets. We are surrounded by machine learning based
technology: search engines learn how to bring us the best results (while placing profitable
ads), anti-spam software learns to filter our email messages, and credit card transactions are
secured by software that learns how to detect fraud. Digital cameras learn to detect faces,
and intelligent personal assistant applications on smartphones learn to recognize voice
commands. Cars are equipped with accident prevention systems that are built using machine
learning algorithms.
Machine learning is also widely used in scientific applications such as
bioinformatics, medicine, and astronomy. One common feature of all of these applications is
that, in contrast to more traditional uses of computers, the complexity of the patterns that
need to be detected means a human programmer cannot provide an explicit, fine-detailed
specification of how such tasks should be executed. Taking our example from
intelligent beings, many of our skills are acquired or refined through learning from
experience (rather than by following explicit instructions given to us). Machine learning tools
are concerned with endowing programs with the ability to learn and adapt.

Fig 1.1-Machine learning process

The inputs to our algorithms are time (hour, day, month, year) and place (latitude and longitude),
and each record is labelled with a class of crime:
• Act 379 - Robbery
• Act 13 - Gambling
• Act 279 - Accident
• Act 323 - Violence
• Act 302 - Murder
• Act 363 - Kidnapping
The output is the class of crime that is likely to have occurred. We try out multiple
classification algorithms, such as KNN (K-Nearest Neighbors), Decision Trees, and Random
Forests.

1.4.2 Our Dataset

The dataset we use is scraped daily from the publicly available website of the Indore Police.
The records are in Hindi, so the data cannot be used for machine learning as it is and must
first be processed.
The features of this dataset are:
• थाना : Police station
• थाना अपराध/मर्ग क्रमांक : Police station crime/case number
• धारा : I.P.C. act number
• फरियादी का नाम एवं पता : Complainant name & address
• आरोपी का नाम एवं पता : Accused name & address
• घटना स्थल : Incident place
• घटना दिनांक व समय : Incident date & time
• कायमी दिनांक व समय : Reporting date & time
• विलंब से कायमी का कारण : Reason for delay in reporting to police
• घटना के कारण सहित विवरण : Incident information in brief

Sample records (translated from the Hindi source):

Record 1
  Police station: Juni Indore
  Crime/case number: 89/18
  Section: 379
  Complainant name & address: Sunil Anjane, age 28 years, father/husband Suresh Anjane, resident of 56 Type 2 PNT Colony, Khatiwala Tank, Indore
  Accused name & address: Unknown
  Incident place: 46 Type 2 BSNL Quarter, Khatiwala Tank, Indore
  Incident date & time: 08-02-18, between 11:00 and 12:00
  Reporting date & time: 2/10/2018 12:45:00 PM
  Reason for delay in reporting: On the complainant's arrival at the police station
  Incident information in brief: An unknown person stole the complainant's unnumbered motorcycle from the place where it was parked.

Record 2
  Police station: Rau
  Crime/case number: 64/18
  Section: 13 Gambling Act
  Complainant name & address: On behalf of the State, Police ASI Mahesh Shrivastava, father/husband Bhagwan Das, resident of Police Station Rau
  Accused name & address: Dinesh - Kesharsingh, Suresh - Ramesh Yadav, Mukesh - Satyanarayan Panwar
  Incident place: Badi Mohalla, Rau
  Incident date & time: 10-02-18, between 19:10 and 20:00
  Reporting date & time: 2/10/2018 8:10:00 PM
  Reason for delay in reporting: (blank)
  Incident information in brief: On the incident date the accused were caught placing win/lose bets with playing cards.

Table 1.1: Police Dataset

1.4.3 Preprocessing
Before implementing machine learning algorithms on our data, we went through a series of
preprocessing steps with our classification task in mind. These included:
• Dropping features such as police station, station number, complainant name & address,
and accused name & address.
• Dropping features such as Resolution, Description and Address: the resolution and
description of a crime are only known once the crime has occurred, and have limited
significance in a practical, real-world scenario where one is trying to predict what
kind of crime has occurred, so these were omitted. The address was dropped
because we had information about the latitude and longitude and, in that context, the
address did not add much marginal value.

• The timestamp contained the year, date and time of occurrence of each crime. This
was decomposed into five features: Year (2018), Month (1-12), Date (1-31), Hour (0-23)
and Minute (0-59).
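
The timestamp decomposition described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual code. The function name decompose_timestamp is hypothetical, and the input format "DD-MM-YYYY HH:MM" is assumed from the timestamps shown in the preprocessed dataset.

```python
from datetime import datetime


def decompose_timestamp(raw: str) -> dict:
    """Split a 'DD-MM-YYYY HH:MM' string into five numeric features.

    The format string is an assumption based on the sample timestamps
    in the preprocessed dataset (e.g. '28-02-2018 21:00').
    """
    ts = datetime.strptime(raw, "%d-%m-%Y %H:%M")
    return {
        "year": ts.year,
        "month": ts.month,    # 1-12
        "date": ts.day,       # 1-31
        "hour": ts.hour,      # 0-23
        "minute": ts.minute,  # 0-59
    }


features = decompose_timestamp("28-02-2018 21:15")
print(features)
```

Keeping the five parts as separate integers lets a classifier pick up, for example, an hour-of-day effect independently of the calendar date.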

Following these preprocessing steps, we ran some out-of-the box learning algorithms as a
part of our initial exploratory steps. Our new feature set consisted of 9 features, all of which
were now numeric in nature.

timestamp         act379  act13  act279  act323  act363  act302  latitude  longitude
28-02-2018 21:00  1       0      0       0       0       0       22.73726  75.87599
28-02-2018 21:15  1       0      0       0       0       0       22.72099  75.87608
28-02-2018 10:15  0       0      1       0       0       0       22.73668  75.88317
28-02-2018 10:15  0       0      1       0       0       0       22.74653  75.88714

Table 1.2: Dataset after Preprocessing
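
In the preprocessed table the crime class is spread over six 0/1 columns. Most classifiers expect a single label per row, so one plausible intermediate step is to turn those flags back into one class name. The sketch below assumes exactly one flag is set per row; the helper name row_to_label is hypothetical and is not taken from the project.

```python
# Column names as they appear in the preprocessed dataset.
ACT_COLUMNS = ["act379", "act13", "act279", "act323", "act363", "act302"]


def row_to_label(row: dict) -> str:
    """Return the name of the single act column whose flag is 1.

    Assumption: every record carries exactly one crime class.
    """
    active = [col for col in ACT_COLUMNS if row[col] == 1]
    if len(active) != 1:
        raise ValueError(f"expected exactly one active class, got {active}")
    return active[0]


# One row copied from the preprocessed dataset above.
row = {
    "timestamp": "28-02-2018 10:15",
    "act379": 0, "act13": 0, "act279": 1,
    "act323": 0, "act363": 0, "act302": 0,
    "latitude": 22.73668, "longitude": 75.88317,
}
label = row_to_label(row)
print(label)
```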

1.4.4 Methodology
After the preprocessing described in the previous sections, we had three different
classification problems to solve, which we proceeded to attack with an assortment of
classification algorithms. The following are the algorithms we used:
• KNN (K-Nearest Neighbors)
• Decision Tree
• Random Forests

1.5 Role and Responsibilities

Role                            Name                      Responsibilities
Learner                         Kunal Diwan               Data Entry; Data Preprocessing; Testing
Learner                         Sahitya Nigam             Data Entry; Data Preprocessing; Testing
Data Scientist & GUI Developer  Sourabh Tiwari            Data Entry; Data Preprocessing; GUI (Flask); Documentation; Machine Learning; Data Analysis; Data Mining; Kernel Designing; Data Visualization
Data Scientist & GUI Developer  Vikramaditya Singh Bhati  Data Entry; Data Preprocessing; GUI (Flask); Documentation; Machine Learning; Data Analysis; Data Mining; Kernel Designing; Data Visualization

1.6 Contribution of Project

1.6.1 Market potential
The use of AI and machine learning to detect crime via sound or cameras already exists, is
proven to work, and is expected to continue to expand.

The use of AI/ML in predicting crimes or an individual’s likelihood of committing a crime
has promise but is still more of an unknown. The biggest challenge will probably be “proving”
to politicians that it works. When a system is designed to stop something from happening, it is
difficult to prove the negative. Companies that are directly involved in providing governments
with AI tools to monitor areas or predict crime will likely benefit from a positive feedback
loop. Improvements in crime prevention technology will likely spur increased total spending
on this technology.

For example, it could be the case that two or more classes of crime surge and sink together,
which would be an interesting relationship to uncover. Other areas to work on include
implementing a more accurate multi-class classifier and exploring better ways to visualize
our results.

1.6.2 Innovativeness
The idea behind this project is that crimes are relatively predictable; it just requires being
able to sort through a massive volume of data to find patterns that are useful to law
enforcement. This kind of data analysis was technologically impossible a few decades ago,
but the hope is that recent developments in machine learning are up to the task.

1.6.3 Usefulness
Public safety and protection are closely tied to crime, and a better understanding of crime is
beneficial in multiple ways: it can lead to targeted and sensitive practices by law enforcement
authorities to mitigate crime, and to more concerted efforts by citizens and authorities to create
healthy neighborhood environments. With the advent of the Big Data era and the availability
of fast, efficient algorithms for data analysis, understanding patterns in crime from data is an
active and growing field of research.

1.7 Report Organization

The remainder of the report is structured as follows:
• Chapter 2 provides detailed business and technical requirements.
• Chapter 3 provides the analysis and design of this project.
• Chapter 4 provides the construction and implementation details of this project.
• Chapter 5 provides the conclusion and future scope, as well as future applications of this
project.
Requirement Engineering

Chapter-2
Requirement Engineering

2.1 Functional Requirement

The functional requirements describe the core functionality of the application.

2.1.1 Interface Requirement:

• Screen 1 accepts user inputs.
• Field 1 accepts numeric data for latitude and longitude.
• Field 2 accepts date & time.
• Button 1 shows the overall analysis.
• Submit button sends the data of Fields 1 & 2 to the kernel.
• Screen 2 displays predicted values.
• Screen 3 displays analysis.
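
The checks implied by Field 1 and Field 2 can be sketched as a small validation helper that runs before anything is sent to the kernel. This is an illustrative, standard-library-only sketch. The function name parse_user_input, the coordinate ranges and the "DD-MM-YYYY HH:MM" date format are assumptions, not details taken from the project's Flask code.

```python
from datetime import datetime


def parse_user_input(latitude: str, longitude: str, when: str) -> dict:
    """Validate raw form text and convert it to typed values.

    Raises ValueError if a coordinate is not numeric or is out of range,
    or if the date & time text does not match the assumed format.
    """
    lat = float(latitude)   # float() raises ValueError on non-numeric text
    lon = float(longitude)
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be between -90 and 90")
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be between -180 and 180")
    timestamp = datetime.strptime(when, "%d-%m-%Y %H:%M")
    return {"latitude": lat, "longitude": lon, "timestamp": timestamp}


payload = parse_user_input("22.73726", "75.87599", "28-02-2018 21:00")
print(payload)
```

Rejecting malformed input at this boundary means the prediction kernel only ever receives numeric coordinates and a parsed timestamp.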

2.2 Non Functional Requirement

Non-functional requirements are those requirements of the system that are not directly
concerned with the specific functions delivered by the system. They may relate to emergent
properties such as reliability, extensibility and usability.

• To provide prediction of crime.
• To provide maximum accuracy.
• To provide visualized analysis.
• Ease of use.
• Availability
• Reliability
• Maintainability
Analysis and Design

Chapter-3
Analysis and Design

3.1 Use case diagram

A use case diagram represents the overall scenario of the system. A scenario is a
sequence of steps describing an interaction between a user and a system.

A use case is thus a set of scenarios tied together by some goal. Use case diagrams are drawn
to expose the functionalities of the system.

Fig 3.1-Use case diagram of Paasbaan

3.2 Activity diagram

The activity diagram is a graphical representation of the flow of interaction
within specific scenarios. It is similar to a flowchart in which the various activities that can be
performed in the system are represented.

Fig 3.2-Activity diagram of Paasbaan

3.3 Sequence diagram

The sequence diagram shows how objects interact with one another. It represents the
sequence of events that takes place between them.

It is a time-oriented view of the interaction between objects to accomplish a behavioural goal
of the system.

Fig 3.3-Sequence diagram of Paasbaan

3.4 System architecture

System architectural design is the design process for identifying the subsystems that make
up the system and the framework for subsystem control and communication. The goal of
architectural design is to establish the overall structure of the software system.

Fig 3.4-System architecture of Paasbaan
Construction

Chapter-4
Construction

4.1 Implementation

The project is implemented in Python. For machine learning in particular, the Anaconda
distribution is used.

Anaconda is one of several Python distributions. The company behind it, Anaconda, Inc., was
formerly known as Continuum Analytics. The distribution bundles more than 100
packages and is used for scientific computing, data science, statistical analysis, and
machine learning.

We found Anaconda easier to work with for Python, since it helps with the following
problems:

• Installing Python on multiple platforms.
• Separating out different environments.
• Dealing with not having correct privileges.
• Getting up and running with specific packages and libraries.

The data was scraped from the publicly available Indore Police website, where the records
are entered by staff at police stations in different areas. The implementation was
limited to Indore city itself, to restrict the prediction area and keep the problem less
complex. The data was sorted and converted into a new format of timestamp, longitude and
latitude, which is the input the machine takes to predict crime in a
particular location of the city.

These entries were used to train the models, that is, to show the machine what to do with the
data and what output is expected. Once the models were trained, the accuracy of the different
algorithms was measured, and the algorithm with the highest accuracy, Random Forest,
was used for the prediction kernel.
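
The selection step described above (measure each algorithm's accuracy, keep the best one) can be sketched in a few lines. The labels and predictions below are made up purely for illustration and are not the project's results; the model names are just dictionary keys, and the accuracy helper is hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Illustrative held-out labels and per-model predictions (NOT real results).
y_true = ["act379", "act13", "act279", "act379", "act323"]
predictions = {
    "knn":           ["act379", "act13", "act379", "act379", "act302"],
    "decision_tree": ["act379", "act279", "act279", "act13", "act323"],
    "random_forest": ["act379", "act13", "act279", "act379", "act302"],
}

scores = {name: accuracy(y_true, pred) for name, pred in predictions.items()}
best_model = max(scores, key=scores.get)
print(scores, best_model)
```

In practice the predictions would come from models evaluated on data they were not trained on, so that the comparison reflects how each model generalizes.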

4.2 Implementation Details

For proper implementation and functioning, several algorithms and techniques
were used. The algorithms are described below.

4.2.1 KNN (K-Nearest neighbors)

K-nearest neighbors is a powerful classification algorithm used in pattern recognition. It
stores all available cases and classifies new cases based on a similarity measure (e.g. a
distance function). It is one of the top data mining algorithms in use today, and it is a
non-parametric, lazy learning algorithm (an instance-based learning method).

KNN: Classification Approach

• An object (a new instance) is classified by a majority vote of its neighbors' classes.
• The object is assigned to the most common class amongst its K nearest
neighbors, as measured by a distance function.
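
The classification approach above can be sketched directly: compute the distance from the query to every stored case, keep the K closest, and take a majority vote. This is a minimal standard-library illustration, not the project's implementation. Euclidean distance on raw (latitude, longitude) pairs is an assumption, and the sample points are made up.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up (latitude, longitude) training cases with their crime class.
points = [(22.70, 75.85), (22.71, 75.86), (22.72, 75.85),
          (22.80, 75.95), (22.81, 75.96)]
labels = ["act379", "act379", "act13", "act279", "act279"]

prediction = knn_predict(points, labels, query=(22.705, 75.855), k=3)
print(prediction)
```

Because KNN is a lazy learner, there is no training step here: all the work happens at query time, which is why prediction cost grows with the number of stored cases.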

Fig 4.1.1 Principle diagram of KNN

Fig 4.1.2 Graphical representation of KNN

Fig 4.1.3 Distance functions

4.2.2 Decision Tree

As the name suggests, a decision tree is a tree structure that assists in decision-making.
Used for both classification and regression, it is a basic and important predictive learning
algorithm.

• It differs from other algorithms in that it works intuitively, i.e. it takes decisions one at a time.
• It is non-parametric, fast and efficient.

It consists of nodes which have parent-child relationships:

Fig 4.2.1 Decision tree

Fig 4.2.2 Decision Tree example

A decision tree picks the most important variable according to a splitting criterion (for
example, Gini impurity or information gain) and splits the dataset on it. This is repeated until
we reach homogeneous subsets that give predictions with high certainty.
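
One way to make the splitting step concrete is to score every candidate threshold on a feature by the weighted impurity of the two subsets it produces, and keep the threshold with the lowest score. The sketch below uses Gini impurity as the criterion. That choice is an assumption, since the report does not name the criterion used; the data is made up and the function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure subset, larger for more mixed subsets."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Find the split 'value <= threshold' with the lowest weighted Gini impurity.

    Returns (threshold, weighted_impurity).
    """
    best = (None, float("inf"))
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for val, lab in zip(values, labels) if val <= threshold]
        right = [lab for val, lab in zip(values, labels) if val > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up example: hour of day versus crime class.
hours = [1, 2, 3, 20, 21, 22]
crimes = ["act379", "act379", "act379", "act13", "act13", "act13"]

threshold, impurity = best_threshold(hours, crimes)
print(threshold, impurity)
```

A full tree repeats this search over every feature at every node and recurses into the two resulting subsets until they are homogeneous or a depth limit is reached.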

4.2.3 Random forest

Random Forests is a very popular ensemble learning method which builds a number of
classifiers on the training data and combines all their outputs to make the best predictions on
the test data.

Thus, the Random Forests algorithm is a variance-minimizing algorithm that uses
randomness when making split decisions to help avoid overfitting on the training data.

A random forests classifier is an ensemble classifier, which aggregates a family of classifiers
h(x|θ1), h(x|θ2), ..., h(x|θk). Each member of the family, h(x|θ), is a classification tree, and k is
the number of trees chosen from a model random vector.

Also, each θk is a randomly chosen parameter vector. If D(x,y) denotes the training dataset,
each classification tree in the ensemble is built using a different subset Dθk(x,y) ⊂ D(x,y) of
the training dataset.

Thus, h(x|θk) is the kth classification tree, which uses a subset of features xθk ⊂ x to build a
classification model. Each tree then works like a regular decision tree: it partitions the data
based on the value of a particular feature (which is selected randomly from the subset), until
the data is fully partitioned or the maximum allowed depth is reached. The final output y is
obtained by aggregating the results thus:

$$ y = \arg\max_{p \in \{h(x|\theta_1), \ldots, h(x|\theta_k)\}} \sum_{j=1}^{k} I\big(h(x|\theta_j) = p\big) $$

where I denotes the indicator function.
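
The aggregation step amounts to a majority vote: for each candidate class p, count the trees whose prediction equals p (the indicator I is 1 for a match and 0 otherwise) and return the class with the largest count. A minimal sketch, with made-up per-tree outputs and a hypothetical helper name:

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees.

    Counter tallies, for each class p, how many trees predicted p,
    which is the sum of indicator values in the aggregation rule.
    """
    votes = Counter(tree_predictions)
    return votes.most_common(1)[0][0]


# Made-up outputs of k = 5 trees for a single input x.
tree_outputs = ["act379", "act13", "act379", "act279", "act379"]
forest_output = majority_vote(tree_outputs)
print(forest_output)
```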

Fig 4.3.1 Random Forest Example


Fig 4.3.2 Decision Tree of Paasbaan

4.2.4 Data Visualization

Fig 4.4.1 Act13(Gambling vs Hour)


Fig 4.4.2 Act323(Violence vs Hour)

Fig 4.4.3 Act363(Kidnapping vs Hour)


Fig 4.4.4 Act379(Robbery vs Hour)

Fig 4.4.5 Act302(Murder vs Hour)


Fig 4.4.6 Act279(Accident vs Hour)
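
The per-act "crime vs hour" figures in this section plot how often each class of crime occurs in each hour of the day. The underlying counts could be computed along the following lines. This is a sketch with made-up records and a hypothetical helper name; the plotting itself is left out.

```python
from collections import Counter


def counts_by_hour(records, act):
    """Return a 24-element list: number of records of class `act` per hour (0-23)."""
    hourly = Counter(hour for hour, label in records if label == act)
    return [hourly.get(hour, 0) for hour in range(24)]


# Made-up (hour, crime class) pairs, for illustration only.
records = [(21, "act13"), (22, "act13"), (21, "act13"),
           (10, "act279"), (19, "act13")]

gambling = counts_by_hour(records, "act13")
print(gambling)
```

The resulting list can be passed to any bar-chart routine, with the list index on the horizontal axis as the hour.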

4.3 Software Details

• Anaconda Distribution (v5.1)
• Python (3.6.5)
• Packages used:
    o Flask (0.12.2)
    o Pandas (0.22.1)
    o NumPy (1.14.2)
    o Scikit-learn (0.19.1)
    o Geopy (1.13.0)
• HTML 5
• CSS 3
• Bootstrap 4
• JavaScript 1.8

4.4 Hardware Details

• Operating system: Windows 7 or newer, 64-bit macOS 10.9+, or Linux.
• System architecture: 64-bit x86, or 32-bit x86 with Windows or Linux.
• CPU: Intel Core 2 Quad CPU Q6600 @ 2.40GHz or greater.
• RAM: 4 GB or greater.

4.5 Testing

The development of software involves a series of production activities where opportunities for
the injection of human fallibility are enormous.

Errors may begin to occur at the very inception of the process, where the objectives may be
erroneously or imperfectly specified, as well as in later design and development stages.
Because of the human inability to perform and communicate with perfection, software
development is accompanied by quality assurance activities.

Software testing is a crucial element of software quality assurance and represents the ultimate
review of specification, design and coding.

4.5.1 White box testing

It focuses on the program control structure. Here all statements in the project have been
executed at least once during testing and all logical conditions have been exercised.

4.5.2 Black box testing

Black box testing is designed to uncover errors in functional requirements without regard to the
internal working of the project. It focuses on the information domain of the project, deriving
test cases by partitioning the input and output domains of the program in a manner that
provides thorough test coverage.


| Test Case ID | Test Name | Test Description | Steps | Expected result | Actual result | Test case status |
|---|---|---|---|---|---|---|
| 01 | Check for correct entered numeric values and date and time. | The entered values are in correct format. | 1. Enter details in fields. 2. Click submit. | If format is correct, details are sent to kernel successfully. | As expected. | Pass |
| 02 | Check for correct entered time. | The entered values are in correct format. | 1. Enter details in fields. 2. Click submit. | If format is correct, details are sent to kernel successfully. | As expected. | Pass |
| 03 | Check for correct entered location. | The entered values are correct. | 1. Enter details in fields. 2. Click submit. | If format is correct, details are sent to kernel successfully. | As expected. | Pass |
| 04 | Predicted Result | Output is displayed. | | If kernel predicts successfully, output is then shown on the screen. | As expected. | Pass |
| 05 | Analysis Button | Data visualization is displayed. | 1. Click Analysis. | Shows the overall analysis on screen 3. | As expected. | Pass |

Table 4.5.3 Tests
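
Test cases 01 to 03 check the format of the entered date, time and location before the details are sent on. A minimal sketch of such a check, with inputs drawn from valid and invalid partitions in the black box style of Section 4.5.2, could look as follows. The function name, the timestamp format and the use of latitude/longitude for location are assumptions for illustration and may differ from the actual PAASBAAN form.

```python
from datetime import datetime


def validate_input(timestamp, latitude, longitude):
    """Return a list of error messages; an empty list means the input is valid.

    The "YYYY-MM-DD HH:MM" format and the coordinate fields are assumed.
    """
    errors = []
    try:
        datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
    except (TypeError, ValueError):
        errors.append("timestamp must look like YYYY-MM-DD HH:MM")
    for name, value, low, high in (
        ("latitude", latitude, -90.0, 90.0),
        ("longitude", longitude, -180.0, 180.0),
    ):
        try:
            number = float(value)
        except (TypeError, ValueError):
            errors.append(f"{name} must be numeric")
            continue
        if not low <= number <= high:
            errors.append(f"{name} must be between {low} and {high}")
    return errors


# One representative input per partition (hypothetical values).
valid = validate_input("2018-03-01 22:30", "22.7196", "75.8577")
bad_time = validate_input("2018-03-01 25:00", "22.7196", "75.8577")
bad_location = validate_input("2018-03-01 22:30", "abc", "275.0")
print(valid, bad_time, bad_location)
```

Choosing one input from each partition keeps the number of test cases small while still exercising every validation branch.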


Chapter-5
Conclusion and future scope

5.1 Conclusion
The initial problem of classifying six different crime categories was a challenging multi-class
classification problem, and there was not enough predictability in our initial data set to
obtain very high accuracy on it. We found that a more meaningful approach was to collapse
the crime categories into fewer, larger groups in order to find structure in the data. This gave
high accuracy and precision on prediction. However, the Violent/Non-violent crime
classification did not yield remarkable results with the same classifiers; it was a
significantly harder classification problem. Thus, collapsing crime categories is not an
obvious task and requires careful choice and consideration.
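
Collapsing categories amounts to relabelling the targets before training. A minimal sketch is given here. The grouping into violent and non-violent acts is an assumption made purely for illustration, since the report does not list which act falls in which group.

```python
# Assumed grouping, for illustration only. Act numbers and descriptions
# follow the figure captions in Section 4.2.4.
GROUPS = {
    "Act13": "non-violent",   # gambling
    "Act279": "non-violent",  # accident
    "Act302": "violent",      # murder
    "Act323": "violent",      # violence
    "Act363": "violent",      # kidnapping
    "Act379": "violent",      # robbery
}


def collapse(labels, groups=GROUPS):
    """Replace each fine-grained act label with its coarse group label."""
    unknown = sorted(set(labels) - set(groups))
    if unknown:
        raise KeyError(f"no group defined for: {unknown}")
    return [groups[label] for label in labels]


fine = ["Act13", "Act302", "Act279", "Act379"]
coarse = collapse(fine)
print(coarse)
```

Changing the mapping changes the class balance and the difficulty of the problem, which is why the choice of groups needs care.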
Possible avenues through which to extend this work include time-series modeling of the
data to understand temporal correlations in it, which can then be used to predict surges in
different categories of crime. It would also be interesting to explore relationships between
surges in different categories of crimes – for example, it could be the case that two or more
classes of crimes surge and sink together, which would be an interesting relationship to
uncover. Other areas to work on include implementing a more accurate multi-class
classifier, and exploring better ways to visualize our results.
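
One simple way to check whether two crime categories surge and sink together is the Pearson correlation between their counts over the same time windows. The sketch below is a minimal illustration of that idea and not an analysis carried out in this project. The weekly counts are invented for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly incident counts for two categories.
category_a = [4, 6, 9, 12, 8, 5]
category_b = [2, 3, 5, 6, 4, 2]
r = pearson(category_a, category_b)
print(round(r, 3))
```

A value near +1 suggests the two categories rise and fall together, a value near -1 suggests they move in opposite directions, and correlation alone says nothing about cause.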

5.2 Future Scope

The goal of any society shouldn’t be just to catch criminals but to prevent crimes from
happening in the first place.

• Predicting Future Crime Spots: By using historical data and observing where recent
crimes took place, we can predict where future crimes are likely to happen. For example,
a rash of burglaries in one area could correlate with more burglaries in surrounding
areas in the near future. The system highlights possible hotspots on a map that the police
should consider patrolling more heavily.


Fig 5.1 Predicting Surges

• Predicting Who Will Commit a Crime: Using face recognition to predict whether an
individual will commit a crime before it happens. The system will detect any
suspicious changes in behavior or unusual movements. For example, an individual who
walks back and forth in a certain area over and over might be a pickpocket or might be
casing the area for a future crime. The system will also track individuals over time.

• Pretrial Release and Parole: After being charged with a crime, most individuals are
released until they actually stand trial. Deciding who should be released pretrial, or
what an individual’s bail should be set at, is mainly done by judges using their best
judgment. In just a few minutes, judges have to attempt to determine if someone is a
flight risk, a serious danger to society, or at risk of harming a witness if released. It is
an imperfect system open to bias. One media organization’s analysis of an automated
risk-assessment system indicated that the system might indirectly contain a strong
racial bias. It found “that black defendants who did not recidivate over a two-year
period were nearly twice as likely to be misclassified as higher risk compared to their
white counterparts (45 percent vs. 23 percent).” The report raises the question of
whether better AI/ML can eventually produce more accurate predictions or whether it
would reinforce existing problems. Any system will be based on real-world data, but if
that data is generated by biased police officers, it can make the AI/ML biased.
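
The hotspot idea in the first item of this list can be sketched by counting recent incidents per map grid cell and ranking the cells. This is a minimal illustration under stated assumptions and not a predictive model. The cell size, the function name and the coordinates are hypothetical.

```python
from collections import Counter
from math import floor


def hotspot_cells(points, cell_size=0.01, top=3):
    """Count incidents per grid cell and return the busiest cells.

    points: iterable of (latitude, longitude) pairs in degrees.
    cell_size: side of a square grid cell in degrees (assumed value).
    Returns a list of ((row, column), count), most frequent first.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (floor(lat / cell_size), floor(lon / cell_size))
        counts[cell] += 1
    return counts.most_common(top)


# Hypothetical incident locations: three close together, one elsewhere.
incidents = [
    (22.7196, 75.8577),
    (22.7191, 75.8573),
    (22.7199, 75.8571),
    (22.7533, 75.8937),
]
busiest = hotspot_cells(incidents, top=1)
print(busiest)
```

Cells with the highest counts are the candidates to highlight on the map. A real system would also weight recent incidents more heavily and account for neighbouring cells.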


5.3 Expected Outcome

The idea behind this project is that crimes are relatively predictable; it just requires being
able to sort through a massive volume of data to find patterns that are useful to law
enforcement. This kind of data analysis was technologically impossible a few decades ago,
but the hope is that recent developments in machine learning are up to the task.

The use of AI and machine learning to detect crime via sound or cameras already exists, is
proven to work, and is expected to continue to expand. The use of AI/ML to predict crimes or
an individual’s likelihood of committing a crime has promise but is still more of an unknown.
The biggest challenge will probably be “proving” to politicians that it works. When a system
is designed to stop something from happening, it is difficult to prove the negative.

Companies that are directly involved in providing governments with AI tools to monitor areas
or predict crime will likely benefit from a positive feedback loop. Improvements in crime
prevention technology will likely spur increased total spending on this technology.





Appendix-A

Fig A.1-Snapshot 1

Fig A.2-Snapshot 2


Appendix-B

Fig B.1-Snapshot 3

Fig B.2-Snapshot 4


Appendix-C

Fig C.1-Snapshot 5

Fig C.2-Snapshot 6
