
Road Damage Detection

A graduate project report submitted to AKTU in partial fulfillment
of the requirement for the award of the degree of

Bachelor of Technology
In

Computer Science and Engineering

SUBMITTED BY:

Ajit Raj Shekhar (1764110005)
Chandan Yadav (1764110018)
Rahul Kumar Yadav Kausal (1764110037)
Shreya Pandey (1864110910)

UNDER THE GUIDANCE OF:

Mr. Gaurav Ojha
(Assistant Professor)
Computer Science & Engineering

Department of Computer Science & Engineering

Ashoka Institute of Technology and Management


(A constituent institute of Dr. A.P.J. Abdul Kalam Technical University)
Varanasi - 221007

August, 2021
CERTIFICATE

This is to certify that Ajit Raj Shekhar, Chandan Yadav, Rahul Kumar Yadav Kausal and
Shreya Pandey have carried out the project work presented in the Project Report entitled
“Road Damage Detection”, submitted in partial fulfillment of the requirement for the award
of the degree of B.Tech. in Computer Science & Engineering from Ashoka Institute of
Technology & Management, Varanasi. The report is a record of the candidates’ own work,
carried out by them under my supervision. The matter embodied in this report is original and
has not been submitted for the award of any other degree.

(Mr. Gaurav Ojha)
Assistant Professor
(Department of CSE)

Forwarded By:
Mr. Arvind Kumar
Assistant Professor & Head of Department
(Department of CSE)

Date- ………………….

DECLARATION

We hereby declare that the project entitled “Road Damage Detection”, submitted by us
in partial fulfillment of the requirement for the award of the degree of Bachelor of
Technology (Computer Science & Engineering) of Dr. A.P.J. Abdul Kalam
Technical University, Lucknow, is a record of our own work carried out under the
supervision and guidance of Mr. Gaurav Ojha.
To the best of our knowledge, this project has not been submitted to any other
University or Institute for the award of a degree.

Signature : Signature :
Name : Ajit Raj Shekhar Name : Chandan Yadav
Roll No. : 1764110005 Roll No. : 1764110018
Date : Date :

Signature : Signature :
Name : Rahul Kumar Yadav Kausal Name : Shreya Pandey
Roll No. : 1764110037 Roll No. : 1864110910
Date : Date :

ACKNOWLEDGEMENT

In performing our project, we had to take the help and guidance of some respected
persons, who deserve our gratitude. The completion of this project gives us much
pleasure.
We take this opportunity to express our deep gratitude and regard to our guide Mr.
Gaurav Ojha, Assistant Professor, Department of Computer Science & Engineering, Ashoka
Institute of Technology & Management, Varanasi, for his exemplary guidance,
monitoring and encouragement throughout the course of this project, and for serving
as our internal guide and helping us complete this project work successfully.

We also take the opportunity to acknowledge the contribution of Mr. Arvind Kumar,
Assistant Professor & Head of Department, Computer Science & Engineering, Ashoka
Institute of Technology & Management, Varanasi, for his full support
and assistance during the development of the project.
A sincere thanks to the whole project team; they performed very well, and their constant
effort is truly appreciable. Their dedication encouraged us to perform well, and it was a
great experience to work with them.
We are thankful to all our faculty members for their cooperation, invaluable constructive
criticism and friendly advice during the project. We are also thankful to our colleagues
and classmates who helped us in the compilation of this project.
Finally, yet importantly, we would like to express our heartfelt thanks to our beloved
parents for their blessings. We perceive this opportunity as a big milestone in our career
development. We will strive to use the gained skills and knowledge in the best possible
way, and will work on their improvement, in order to continue cooperation with all of
you in the future.

ABSTRACT

The various defects that occur on asphalt pavement are a direct cause of car accidents,
and countermeasures are required because they create significantly dangerous
situations. In this project, we propose convolutional neural network (CNN)-based
road surface damage detection with deep learning. First, the training database is
collected through a camera installed in the vehicle while driving on the road.
The CNN model is then trained for semantic segmentation using a deep
convolutional autoencoder. Here, we augmented the training dataset by varying the
brightness, and finally generated a large number of training images. Furthermore, the
CNN model is updated using pseudo-labeled images from semi-supervised
learning methods to improve the performance of the road surface damage detection
technique. To demonstrate the effectiveness of the proposed method, various
evaluation datasets were created to verify the performance of the proposed road
surface damage detection, and four experts evaluated each image. As a result, it is
confirmed that the proposed method can properly segment the road surface damages.

LIST OF FIGURES

CONTENT

1. OBJECTIVE
2. INTRODUCTION
   2.1 What is road damage detection?
   2.2 What is deep learning?
   2.3 CNNs (Convolutional Neural Networks)

3. SYSTEM REQUIREMENTS
   3.1 Hardware components
   3.2 Software required

4. IMPLEMENTATION
   - Object detection system
   - Data collection
   - Deep ensemble learning
   - Table I: Model comparison

5. CODE

1. OBJECTIVE

Roads are one of the most crucial parts of the social as well as economic development
of any country, developing or developed. But as we are well aware, the
maintenance of roads by governmental organisations such as municipalities is a
big challenge, and many researchers are already engaged in finding
an efficient and apt way of helping the municipalities. If regular
inspection of road conditions is not maintained, the condition of roads will
worsen gradually due to many factors such as weather, traffic, aging, poor choice of
materials, etc.

Some agencies deploy road survey vehicles which consist of multiple expensive
sensors and high resolution cameras. There are some experienced road managers
who supervise and perform visual inspection of roads. But these methods are of
course really time consuming and expensive. Even after the completion of
inspection, these agencies struggle to maintain accurate and updated databases of
recorded structural damages. This poor management leads to unorganised and
inappropriate resource allocation for road maintenance.

So we need an inexpensive, fast and organised solution for such road damage
detection. Nowadays we are fortunate that almost everyone
carries a camera-based smartphone. So, with the advent of object detection
techniques in AI, people have started launching challenges and research in this
domain, and municipalities in Japan have already started using such smartphone-
based AI techniques to perform road damage inspection. This case study is an
attempt to use some state-of-the-art techniques to build a model which will try to
detect multiple types of road damages, such as potholes, alligator cracks, etc., using
artificial intelligence tools.

2. INTRODUCTION

Throughout this project we mainly talk about road damage detection, deep
learning, and CNNs (Convolutional Neural Networks). So before moving further,
we must have some knowledge of these terms.

2.1 What is road damage detection?

Road damage detection is critical for the maintenance of a road, which
traditionally has been performed using expensive high-performance sensors.
With the recent advances in technology, especially in computer vision, it is
now possible to detect and categorize different types of road damages, which
can facilitate efficient maintenance and resource management.

Can we automatically detect and classify the severity of road damage by
exploiting raw video footage taken from smartphones on car dashboards?
What are some of the technical challenges one would need to overcome in doing
so? This section walks through our approach to solving the task, highlighting the
unexpected issues we encountered along the way.

From different road damage detection methods, we found most approaches could
be broken down into the following categories:

 3D analysis: usage of stereo image analysis¹ or LIDAR point clouds² to
detect abnormalities in pavement.

 Vibration-based analysis: capitalizing on on-board accelerometers or
gyroscopes.³

 Vision-based models: ranging from traditional techniques like edge
detection and spectral segmentation⁴ to representation learning and
segmentation via Convolutional Neural Networks (CNNs).⁵

Figure: Example images from existing papers on road damage detection.

2.2 What is deep learning?

Developers apply end-to-end object detection methods based on deep
learning to the road surface damage detection problem, and verify their
detection accuracy. In particular, we examine whether we can detect eight
classes of road damage by applying state-of-the-art object detection methods.

This case study is a real-world application of deep learning to the
classification and detection of different kinds of road damages.

Figure: Road damage detection and classifications.

2.3 CNNs (Convolutional Neural Networks)

To detect road surface damages, CNN-based techniques have also been
studied. They detect road surface damages using object detection. Object
detection means finding the position of an object in the image, in the form of
bounding boxes, and determining what class the object belongs to.

In object detection, the parts of the road surface damages are not precisely
segmented. In this project, we focus on finding road surface damages in the form of
semantic segmentation. Although convolutional neural networks show high
performance on image processing in various applications, most approaches are
limited to supervised learning. Traditional machine learning methods can be
divided into supervised and unsupervised learning categories, where supervised
learning refers to the use of datasets that pair input data with labeled data to train
models, and CNNs often use image information as input data. Labeled data can
vary across segmentation, classification, and regression depending on the structure
of the neural network, and much time and effort are required to acquire such
labeled data. On the other hand, collecting unlabeled input data is a relatively
easy and simple alternative to acquiring labeled data. Unsupervised learning
refers to a type of machine learning algorithm used to generate new input data,
or to determine hidden structures from datasets consisting of input data without
labeled responses.

In the case of performing supervised learning for the road surface damage
detection technique based on semantic segmentation, labeled images that
segment only the road surface damages can be used as training data. In this
project, 5000 images were collected, and these datasets must be labeled
one by one to train the model, which requires a great deal of time and effort
compared to collecting simple unlabeled input data.
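To make the segmentation setup concrete, the following is a minimal sketch of a convolutional autoencoder for binary damage segmentation, written with Keras (which the code in Section 5 also uses). The layer sizes, the 600 × 600 input taken from the data collection setup, and the training call are our own illustrative assumptions, not the exact architecture used in the project.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, UpSampling2D

# Hypothetical encoder-decoder; sizes are illustrative assumptions.
model = Sequential([
    # encoder: downsample while extracting features
    Conv2D(16, (3, 3), activation='relu', padding='same',
           input_shape=(600, 600, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(32, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    # decoder: upsample back to the input resolution
    Conv2D(32, (3, 3), activation='relu', padding='same'),
    UpSampling2D((2, 2)),
    Conv2D(16, (3, 3), activation='relu', padding='same'),
    UpSampling2D((2, 2)),
    # one sigmoid channel: per-pixel probability of "damage"
    Conv2D(1, (1, 1), activation='sigmoid', padding='same'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
# model.fit(images, masks, ...)  # masks are the hand-labeled damage maps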

Figure. Examples of: (a) fully connected neural network (FCN) and (b) 1D and
(c) 2D convolutional neural networks (CNNs): all neurons are connected in (a),
while only adjacent neurons are connected in (b,c).

3. SYSTEM REQUIREMENTS

The project needs both hardware and software components. The hardware
components include an image capturing device (webcam, optical device, or
smartphone camera), a monitoring system, connection cables, a storage system,
etc.

Software components are Jupyter Notebook, PyCharm, etc. They are described in
detail below:

3.1 Hardware Components

i) Image Capturing Device

Webcams: A webcam is a video camera that feeds or streams an image or
video in real time to or through a computer network, such as the Internet.
Webcams are typically small cameras that sit on a desk, attach to a user’s
monitor, or are built into the hardware. Webcams can be used during a video
chat session involving two or more people, with conversations that include live
audio and video.

Webcam software enables users to record a video or stream the video on the
Internet. As video streaming over the Internet requires much bandwidth, such
streams usually use compressed formats. The maximum resolution of a webcam
is also lower than that of most handheld video cameras, as higher resolutions
would be reduced during transmission. The lower resolution enables webcams to
be relatively inexpensive compared to most video cameras, but the effect is
adequate for video chat sessions.

The webcam features mainly depend on the computer processor as well as
the operating system of the computer. Webcams can provide advanced features
such as image archiving, motion sensing, custom coding, or even automation.
Furthermore, webcams are used for social video recording, video broadcasting,
and computer vision, and are mainly used for security surveillance and in
videoconferencing.

Features of a webcam

Webcams differ in terms of size, shape, specification, and price. Several features
of a webcam help you choose the best one for your individual needs:

1. Megapixels

2. Frame rate

3. Lens quality

4. Autofocus

5. Low light quality

6. Resolution

Smartphone camera: Today’s smartphones come equipped with a very
comprehensive set of camera-related specifications. For many of us, our
smartphone has become our primary camera because it is the one we always have
with us.

In its purest form, smartphone photography is all about collecting photons (light)
and converting them into electrons (image). The capabilities of the supporting
hardware and software are paramount to producing high-quality images of your
chosen subject.

Features of good camera phones:

Bright aperture

Decent number of pixels

Large screen

Optical image stabilization

Lenses and zoom

HDR, etc.

ii) Monitoring System: By putting thresholds on the damage score, one can
automatically find areas of significant road damage. Most of the time the
classification was correct; in some cases there was noise in the data which
caused the score to be high. The damage score is meant to distinguish cracks and
potholes from a smooth pavement. However, there are many more objects and
deformations that can be found in roads; three examples are shown in Figure.

Two of them are part of the infrastructure, a manhole and a grate. Obviously
they are not considered road damage. The third example is an area where the
pavement is buckled. These structures can be clearly seen in the 3D maps. We
are currently working on algorithms to classify such objects and deformations.
The areas of damage can be shown on a map and used by maintenance. A display
concept is shown in Figure, where the user can click on the selected locations
and view detailed information about them.

3.2 SOFTWARE REQUIRED

i) Jupyter Notebook

The Jupyter Notebook is an open-source web application that allows you to
create and share documents that contain live code, equations, visualizations, and
narrative text. Its uses include data cleaning and transformation, numerical
simulation, statistical modeling, data visualization, machine learning, and much
more. Jupyter Notebook (formerly IPython Notebooks) is a web-based
interactive computational environment for creating Jupyter notebook documents.
The “notebook” term can colloquially refer to many different entities,
mainly the Jupyter web application, the Jupyter Python web server, or the Jupyter
document format, depending on context.

According to the official website of Jupyter, Project Jupyter exists to develop
open-source software, open standards, and services for interactive computing
across dozens of programming languages. Jupyter Book is an open-source
project for building books and documents from computational material. It allows
the user to construct the content in a mixture of Markdown, an extended version
of Markdown called MyST, maths and equations using MathJax, Jupyter
Notebooks, reStructuredText, and the output of running Jupyter Notebooks at
build time. Multiple output formats can be produced (currently single files,
multipage HTML web pages and PDF files).

ii) PyCharm

PyCharm is a dedicated Python Integrated Development Environment (IDE)
providing a wide range of essential tools for Python developers, tightly
integrated to create a convenient environment for productive Python, web, and
data science development.

PyCharm is available in three editions:

Community (free and open-source): for smart and intelligent Python
development, including code assistance, refactorings, visual debugging, and
version control integration.

Professional (paid): for professional Python, web, and data science development,
including code assistance, refactorings, visual debugging, version control
integration, remote configurations, deployment, support for popular web
frameworks such as Django and Flask, database support, scientific tools
(including Jupyter notebook support), and big data tools.

Edu (free and open-source): for learning programming languages and related
technologies with integrated educational tools.

PyCharm supports the following versions of Python:

Python 2: version 2.7

Python 3: from version 3.6 up to version 3.10

4. IMPLEMENTATION

Object Detection System

In general, for object detection, methods that apply an image classifier to an
object detection task have become mainstream; these methods entail varying the
size and position of the object in the test image, and then using the classifier to
identify the object. In the past few years, an approach involving the extraction of
multiple candidate regions of objects using region proposals, as typified by R-
CNN, and then making a classification decision on the candidate regions using
classifiers has also been reported. However, the R-CNN approach can be time
consuming because it requires more crops, leading to significant duplicate
computation from overlapping crops. This calculation redundancy was solved
by Fast R-CNN, which inputs the entire image once through a feature
extractor so that crops share the computation load of feature extraction. As
described above, image processing methods have historically developed at a
considerable pace. In our study, we primarily focus on four recent object
detection systems: Faster R-CNN, the You Only Look Once (YOLO)
system, the Region-based Fully Convolutional Networks (R-FCN) system and
the Single Shot Multibox Detector (SSD) system.

i) YOLO

YOLO is an object detection framework that can achieve high mean average
precision (mAP) and speed. In addition, YOLO can predict the region and class
of objects with a single CNN. An advantageous feature of YOLO is that its
processing speed is considerably fast because it solves the problem as a single
regression, detecting objects while taking background information into account.
The YOLO algorithm outputs the coordinates of the bounding box of each object
candidate and the confidence of the inference after receiving an image as input.

ii) R-FCN

R-FCN is another object detection framework, which was proposed by Dai et al.
(Dai et al., 2016). Its architecture is that of a region-based, fully convolutional
network for accurate and efficient object detection. Although Faster R-CNN is
several times faster than Fast R-CNN, the region-specific component must be
applied several hundred times per image. Instead of cropping features from the
same layer where the region proposals are predicted, as in the case of the Faster
R-CNN method, in the R-FCN method crops are taken from the last layer of the
features prior to prediction. This approach of pushing cropping to the last layer
minimizes the amount of per-region computation that must be performed. Dai et
al. (Dai et al., 2016) showed that the R-FCN model (using ResNet-101) could
achieve accuracy comparable to Faster R-CNN, often at faster running speeds.

iii) SSD

SSD is an object detection framework that uses a single feed-forward
convolutional network to directly predict classes and anchor offsets without
requiring a second-stage per-proposal classification operation. The key feature of
this framework is the use of multi-scale convolutional bounding box outputs
attached to multiple feature maps at the top of the network.

Data Collection

Thus far, in the study of damage detection on the road surface, images are
either captured from above the road surface or using on-board cameras on
vehicles. When models are trained with images captured from above,
the situations in which they can be applied in practice are limited, considering the
difficulty of capturing such images. In contrast, when a model is constructed
with images captured from an on-board vehicle camera, it is easy to apply these
images to train the model for practical situations. For example, using a readily
available camera, like the one on a smartphone, and a general passenger car, any
individual can easily detect road damages by running the model on the smartphone
or by transferring the images to an external server and processing them on the
server. We installed a smartphone (LG Nexus 5X) on the dashboard of a car, as
shown in Figure, and photographed images of 600 × 600 pixels once per second.
The reason we selected a photographing interval of 1 s is that it makes it possible
to photograph images while traveling on the road without gaps or duplication
when the average speed of the car is approximately 40 km/h (or approximately
10 m/s). For this purpose, we created a smartphone application that can capture
images of the roads and record the location information once per second.
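As a quick sanity check of that interval, a short sketch of the arithmetic (the ~10 m of forward road coverage per image is implied by the report’s numbers, not stated exactly):

speed_kmh = 40
speed_ms = speed_kmh * 1000 / 3600      # = 11.1 m/s, roughly the 10 m/s above
interval_s = 1.0
road_per_frame = speed_ms * interval_s  # = 11 m of road covered between shots
print(round(road_per_frame, 1))         # so consecutive images neither gap nor overlap much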

Deep Ensemble Learning

An object detection algorithm deals with detecting semantic objects and visual
content belonging to a certain class in a digital image. With the advances in
deep neural networks, several Convolutional Neural Network (CNN) based
object detection algorithms have been proposed. The first one was the Regions
with CNN features (R-CNN) method, which proposed to perform object detection
in two steps: object region proposal and classification. The first step generates
multiple regions using a selective search, which are then input to a CNN
classifier. However, due to its inherent computational complexity, several
optimized versions of R-CNN were proposed, such as the Fast R-CNN algorithm.
More recently, an algorithm known as ”You Only Look Once” (YOLO) was
proposed, which combined the two steps of the R-CNN algorithm and
significantly reduced the computational complexity. YOLO uses a CNN which
inherently decides regions from the image and outputs probabilities for each of
them. Hence, it is able to achieve a significant speedup compared to R-CNN
based algorithms and can be used for real-time processing as well. The goal of
this work is to improve upon the real-time detection capabilities for road damage
detection, hence we use YOLO as our base model. Ensemble methods, which
combine the predictions from various models, have been successfully employed
in various machine learning tasks to improve accuracy. In this work, we use
an ensemble of YOLO models trained for different numbers of iterations and at
different resolutions; a sketch of one possible box-fusion step follows below.
More details about the model selection and implementation can be found here.
We present the model performance in Table I.
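The report does not spell out how the ensemble’s boxes are merged; the following is a minimal sketch of one common choice, pooling all models’ detections and removing duplicates with non-maximum suppression (NMS). The function name and the IoU threshold of 0.5 are our own assumptions.

import numpy as np

def ensemble_nms(boxes, scores, iou_thresh=0.5):
    # boxes: (N, 4) array of [x0, y0, x1, y1] pooled from all ensemble members;
    # scores: (N,) confidences. Greedily keep the highest-scoring box and drop
    # any remaining box that overlaps it by more than iou_thresh.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        xx0 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy0 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx1 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy1 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, xx1 - xx0) * np.maximum(0, yy1 - yy0)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]
    return keep  # indices of the boxes that survive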

Results

Fig. 1 shows the detection results from a single YOLO model under varying
conditions.

In Fig. 2, we show some detection results for YOLO models trained on data
from Japan and India.

In Fig. 3, training data from Japan, India, and the Czech Republic are used.

The models trained on data from all the countries seem to perform better than
the models trained on data from Japan and India only, which is in contrast to the
results presented in [1].

Furthermore, the selection of the input image size considerably affects the
detection performance. Since YOLO requires the input image resolution to be a
multiple of 32, we focused on two specific sizes, 416 and 608. However, contrary
to common perception, increasing the resolution of the image decreased the
performance of the base model, as shown in Table II. We evaluated the
performance of the proposed models using the platform provided by the
organizers of the IEEE BigData Cup Challenge 2020. As described in Section III-A,
the bounding boxes whose class label matched the ground truth were
selected, and then those with a greater than 50% IoU were picked. Finally, the F1
score for these boxes was calculated.
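To make the scoring rule concrete, here is a small sketch of the IoU test it describes (the function below is our own illustration, not the organizers’ evaluation code):

def iou(a, b):
    # a, b: boxes as [x0, y0, x1, y1]; returns intersection-over-union
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

# A prediction counts as a true positive when its class matches the ground
# truth and iou(...) > 0.5; the F1 score is then
# 2 * precision * recall / (precision + recall).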

TABLE I: Model Comparison (F1 score)

Model Name              Test 1    Test 2
YOLOv4 (416×416)        0.5193    0.5137
YOLOv4 (608×608)        0.5122    0.5012
Ensemble (5 models)     0.5321    0.5226
Ensemble (15 models)    0.6091    0.5983
Ensemble (25 models)    0.6102    0.6297
Ensemble (30 models)    0.6275    0.6358

5. CODE
This section lists the code used in the project, fragment by fragment.

import os
from os import walk, getcwd
from PIL import Image

# classes = ["stopsign"]

def convert(size, box):
    # Convert a pixel-space box (xmin, xmax, ymin, ymax) on an image of
    # `size` = (width, height) into YOLO format: normalized
    # (x_center, y_center, width, height).
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)
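For example (our own numbers), convert((600, 600), (100, 200, 150, 250)) returns (0.25, 0.333..., 0.166..., 0.166...): a box centered a quarter of the way across the 600-pixel image, one sixth of the image wide and one sixth high.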

2 -

import glob
import pandas as pd
import xml.etree.ElementTree as ET

3 -

def class_text_to_int(row_label):
    # Map the dataset's damage class labels (D00-D44) to YOLO class ids.
    mapping = {'D00': '0', 'D01': '1', 'D10': '2', 'D11': '3',
               'D20': '4', 'D40': '5', 'D43': '6', 'D44': '7'}
    if row_label not in mapping:
        exit(0)  # unknown label: abort, as in the original listing
    return mapping[row_label]

4 -

def xml_to_csv(path, outpath):
    # For every VOC-style XML annotation in `path`, write a YOLO-format
    # .txt label file (one "class_id x y w h" line per object) to `outpath`.
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        fn = root.find('filename').text.split('.')
        print(fn[0])
        txt_outpath = outpath + fn[0] + '.txt'
        w = int(root.find('size')[0].text)
        h = int(root.find('size')[1].text)
        txt_outfile = open(txt_outpath, "w")
        for member in root.findall('object'):
            cls = member[0].text
            # note: this assumes the bndbox children are ordered
            # xmin, xmax, ymin, ymax
            xmin = int(member.find('bndbox')[0].text)
            xmax = int(member.find('bndbox')[1].text)
            ymin = int(member.find('bndbox')[2].text)
            ymax = int(member.find('bndbox')[3].text)
            b = (float(xmin), float(xmax), float(ymin), float(ymax))
            bb = convert((w, h), b)
            cls_id = class_text_to_int(cls)
            txt_outfile.write(cls_id + " " + " ".join([str(a) for a in bb]) + '\n')
        txt_outfile.close()
    print("Files created")

def main():
    # raw string: backslashes in the Windows path must not act as escapes
    image_path = r"C:\Users\Ajit\Downloads\Road-damage-detection-master\src"
    op = "F:/intern/RoadDamageDataset/Sumida/lb/"
    xml_to_csv(image_path, op)

5 -

import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

6 -

# def label2det(label):
#     f = open('val.txt', 'a+')
#     f.write('/media/erress/Personal/Programming/BennettUniversity/bdd100k/images/100k/val/%s.jpg' % (label['name']))
#     f.write('\n')
#     f.close()
7 -

def change_dir(path):
    # Write a list file ("testlist.txt") with the image path for every
    # XML annotation found under `path`.
    f = open('testlist.txt', 'a+')
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        value = root.find('filename').text
        f.write('train_data/images/%s' % (value))
        f.write('\n')
    f.close()
    print('file created')

8 -

def main():
    image_path = r"C:\Users\Ajit\Downloads\Road-damage-detection-master\src"
    xml_df = change_dir(image_path)
9 -

import os
import sys

import cv2
import numpy as np
import scipy.misc
import tensorflow as tf
import keras
from keras.models import load_model
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras import optimizers as opt
from PIL import Image

# project-local modules: network/training settings, custom YOLO loss,
# network construction, data loading, and argv helpers
import cfgconst
import ddd
import kerasmodel
import yolodata
import utils

# note: this listing targets TensorFlow 1.x and an old Keras API
# (ConfigProto/Session, nb_epoch), so it will not run unmodified on TF 2.x

# cpu/gpu config
config = tf.ConfigProto(device_count={'GPU': 1, 'CPU': 56})  # max: 1 gpu, 56 cpu
sess = tf.Session(config=config)
keras.backend.set_session(sess)

# the detection layer is the last layer of the configured network
det_l = cfgconst.net.layers[len(cfgconst.net.layers) - 1]
CLASSNUM = det_l.classes

# read the class names and check them against the cfg class count
voc_names = []
with open(cfgconst.labelnames) as f:
    for ln in f:
        voc_names.append(ln.strip())  # e.g. ["stopsign", "skis"]
print(voc_names)
if CLASSNUM != len(voc_names):
    print('cfg file class setting is not equal to ' + cfgconst.labelnames)
    exit()

# run_yolo
if len(sys.argv) < 2:
    print('usage: python %s [train/test/valid] [pretrained model (optional)]\n' % (sys.argv[0]))
    exit()

# one label image per class, drawn next to each detection
voc_labels = []
for i in range(CLASSNUM):
    voc_labels.append("ui_data/labels/" + voc_names[i] + ".PNG")
    if not os.path.isfile(voc_labels[i]):
        print('can not load image:%s' % (voc_labels[i]))
        exit()

thresh = utils.find_float_arg(sys.argv, "-thresh", .2)
cam_index = utils.find_int_arg(sys.argv, "-c", 0)
model_weights_path = sys.argv[2] if len(sys.argv) > 2 else 'noweight'
filename = sys.argv[3] if len(sys.argv) > 3 else 'nofilename'
print(sys.argv)
print(model_weights_path + ',' + filename)

def train_yolo(weights_path):
    # note: the original listing reads the global model_weights_path,
    # not the weights_path argument
    net = cfgconst.net
    train_images = cfgconst.train  # e.g. "train_data/train.txt"
    backup_directory = "backup/"

    # load a pretrained model if one was given, otherwise build the network
    if os.path.isfile(model_weights_path):
        print('Loading ' + model_weights_path)
        model = load_model(model_weights_path,
                           custom_objects={'yololoss': ddd.yololoss})
        sgd = opt.SGD(lr=net.learning_rate, decay=net.decay,
                      momentum=net.momentum, nesterov=True)
        model.compile(loss=ddd.yololoss, optimizer=sgd, metrics=["accuracy"])
    else:
        print('Learning Rate: %f, Momentum: %f, Decay: %f\n' %
              (net.learning_rate, net.momentum, net.decay))
        model = kerasmodel.makenetwork(net)

    (X_train, Y_train) = yolodata.load_data(train_images, net.h, net.w, net.c, net)
    print('max_batches : %d, X_train: %d, batch: %d\n' %
          (net.max_batches, len(X_train), net.batch))

    # callbacks defined here but not passed to fit() in the original listing
    early_stop = EarlyStopping(monitor='loss', min_delta=0.001, patience=3,
                               mode='min', verbose=1)
    checkpoint = ModelCheckpoint('yolo_weight.h5', monitor='loss', verbose=1,
                                 save_best_only=True, mode='min', period=1)

    batchesPerdataset = max(1, len(X_train) // net.batch)
    model.fit(X_train, Y_train, nb_epoch=net.max_batches // batchesPerdataset,
              batch_size=net.batch, verbose=1)
    model.save_weights('yolo_weight_rd.h5')
    model.save('yolo_kerasmodel_rd.h5')

def debug_yolo(cfg_path, model_weights_path='yolo_kerasmodel_rd.h5'):
    # evaluate the saved model on the held-out test list
    net = cfgconst.net
    testmodel = load_model(model_weights_path,
                           custom_objects={'yololoss': ddd.yololoss})
    (s, w, h, c) = testmodel.layers[0].input_shape
    x_test, y_test = yolodata.load_data('train_data/test.txt', h, w, c, net)
    testloss = testmodel.evaluate(x_test, y_test)
    print(y_test)
    print('testloss= ' + str(testloss))

def predict(X_test, testmodel, confid_thresh):
    # Decode the raw YOLO output vector. The layout assumed here is a 7x7
    # grid (side == 7, 49 cells), stored as 49 confidences, then 49 x values,
    # 49 y values, 49 widths, 49 heights, then 49 values per class probability.
    print('predict, confid_thresh=' + str(confid_thresh))
    pred = testmodel.predict(X_test)
    (s, w, h, c) = testmodel.layers[0].input_shape

    confid_index_list = []
    classprob_list = []
    class_id_list = []
    x0_list, y0_list, x1_list, y1_list = [], [], [], []

    det_l = cfgconst.net.layers[len(cfgconst.net.layers) - 1]
    side = det_l.side
    classes = det_l.classes
    foundindex = False
    max_confid = 0

    for p in pred:
        # pass 1: collect grid cells whose confidence exceeds the threshold
        for i in range(side):
            for j in range(side):
                max_confid = max(max_confid, p[i * 7 + j])
                if p[i * 7 + j] > confid_thresh:
                    confid_index_list.append(i * 7 + j)
                    foundindex = True
        print('max_confid=' + str(max_confid))

        # pass 2: decode geometry and the best class for each confident cell
        for confid_index in confid_index_list:
            confid_value = max(0, p[0 * 49 + confid_index])
            x_value = max(0, p[1 * 49 + confid_index])
            y_value = max(0, p[2 * 49 + confid_index])
            w_value = max(0, p[3 * 49 + confid_index])
            h_value = max(0, p[4 * 49 + confid_index])
            maxclassprob = 0
            maxclassprob_i = -1
            for i in range(classes):
                if p[(5 + i) * 49 + confid_index] > maxclassprob and foundindex:
                    maxclassprob = p[(5 + i) * 49 + confid_index]
                    maxclassprob_i = i
            classprob_list.append(maxclassprob)
            class_id_list.append(maxclassprob_i)
            print('max_confid=' + str(max_confid) + ',c=' + str(confid_value) +
                  ',x=' + str(x_value) + ',y=' + str(y_value) +
                  ',w=' + str(w_value) + ',h=' + str(h_value) +
                  ',cid=' + str(maxclassprob_i) + ',prob=' + str(maxclassprob))

            # convert cell-relative coordinates into image pixels
            row = confid_index // side
            col = confid_index % side
            x = (w / side) * (col + x_value)
            y = (w / side) * (row + y_value)
            print('confid_index=' + str(confid_index) + ',x=' + str(x) +
                  ',y=' + str(y) + ',row=' + str(row) + ',col=' + str(col))
            x0_list.append(max(0, int(x - (w_value / 2) * w)))
            y0_list.append(max(0, int(y - (h_value / 2) * h)))
            x1_list.append(int(x + (w_value / 2) * w))
            y1_list.append(int(y + (h_value / 2) * h))
        # only the first image in the batch is processed
        break

    sourceimage = X_test[0].copy()
    return sourceimage, x0_list, y0_list, x1_list, y1_list, classprob_list, class_id_list

def test_yolo(imglist_path, model_weights_path='yolo_kerasmodel_rd.h5',
              confid_thresh=0.3):
    # run detection on every image listed in `imglist_path` and display
    # the boxes; press 'q' to stop
    print('test_yolo: ' + imglist_path)
    if not os.path.isfile(imglist_path):
        print(imglist_path + ' does not exist')
        return
    testmodel = load_model(model_weights_path,
                           custom_objects={'yololoss': ddd.yololoss})
    (s, w, h, c) = testmodel.layers[0].input_shape
    f = open(imglist_path)
    for img_path in f:
        if os.path.isfile(img_path.strip()):
            frame = Image.open(img_path.strip())
            nim = scipy.misc.imresize(frame, (w, h, c))
            if nim.shape != (w, h, c):
                continue
            # note: the original passes the global `thresh`, not confid_thresh
            img, x0_list, y0_list, x1_list, y1_list, classprob_list, class_id_list = \
                predict(np.asarray([nim]), testmodel, thresh)
            # draw each confident box with its class image and probability
            for x0, y0, x1, y1, classprob, class_id in zip(
                    x0_list, y0_list, x1_list, y1_list, classprob_list, class_id_list):
                cv2.rectangle(img, (x0, y0), (x1, y1), (255, 255, 255), 2)
                classimg = cv2.imread(voc_labels[class_id])
                print('box=' + str(x0) + ',' + str(y0) + ',' + str(x1) + ',' + str(y1))
                yst = max(0, y0 - classimg.shape[0])
                yend = max(y0, classimg.shape[0])
                img[yst:yend, x0:x0 + classimg.shape[1]] = classimg
                font = cv2.FONT_HERSHEY_SIMPLEX
                cv2.putText(img, str(classprob), (x0, y0 - classimg.shape[0] - 1),
                            font, 1, (255, 255, 255), 2, cv2.LINE_AA)
            cv2.imshow('frame', img)
            if cv2.waitKey(1000) & 0xFF == ord('q'):
                break
        else:
            print(img_path + ' predict fail')
    cv2.destroyAllWindows()

def demo_yolo(model_weights_path, filename, thresh=0.3):
    # run detection frame by frame on a video file; press 'q' to stop
    print('demo_yolo')
    testmodel = load_model(model_weights_path,
                           custom_objects={'yololoss': ddd.yololoss})
    (s, w, h, c) = testmodel.layers[0].input_shape
    cap = cv2.VideoCapture(filename)
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        nim = scipy.misc.imresize(frame, (w, h, c))
        img, x0_list, y0_list, x1_list, y1_list, classprob_list, class_id_list = \
            predict(np.asarray([nim]), testmodel, thresh)
        for x0, y0, x1, y1, classprob, class_id in zip(
                x0_list, y0_list, x1_list, y1_list, classprob_list, class_id_list):
            # draw bounding box, class image, and probability
            cv2.rectangle(img, (x0, y0), (x1, y1), (255, 255, 255), 2)
            classimg = cv2.imread(voc_labels[class_id])
            print('box=' + str(x0) + ',' + str(y0) + ',' + str(x1) + ',' + str(y1))
            yst = max(0, y0 - classimg.shape[0])
            yend = max(y0, classimg.shape[0])
            img[yst:yend, x0:x0 + classimg.shape[1]] = classimg
            font = cv2.FONT_HERSHEY_SIMPLEX
            cv2.putText(img, str(classprob), (x0, y0 - classimg.shape[0] - 1),
                        font, 1, (255, 255, 255), 2, cv2.LINE_AA)
        cv2.imshow('frame', img)
        if cv2.waitKey(100) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()

# command dispatch
if sys.argv[1] == 'train':
    train_yolo(model_weights_path)
elif sys.argv[1] == 'test':
    if os.path.isfile(model_weights_path):
        test_yolo(filename, model_weights_path, confid_thresh=thresh)
    else:
        test_yolo(filename, confid_thresh=thresh)
elif sys.argv[1] == 'demo_video':
    if os.path.isfile(model_weights_path):
        print('pretrain model:' + model_weights_path + ', video:' + filename +
              ', thresh:' + str(thresh))
        demo_yolo(model_weights_path, filename, thresh)
    else:
        print('syntax error::need specify a pretrained model')
        exit()
elif sys.argv[1] == 'debug':
    # note: cfg_path is not defined in this listing (its assignment is
    # commented out near the top), so debug mode assumes it is set elsewhere
    debug_yolo(cfg_path, model_weights_path)
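For reference, given the argument handling at the top of the listing, typical invocations would look like the following (the script and file names are our own illustrations):

python run_yolo.py train
python run_yolo.py test yolo_kerasmodel_rd.h5 train_data/test.txt -thresh 0.3
python run_yolo.py demo_video yolo_kerasmodel_rd.h5 dashcam.mp4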
