
1st KEC Conference Proceedings| Volume I ISBN 978-9937-0-4872-9

KEC Conference September 27, 2018

TOMATO PLANT DISEASES DETECTION SYSTEM USING IMAGE PROCESSING

Santosh Adhikari
Kathford International College of Engineering and Management
Affiliated to Tribhuvan University
Kathmandu, Nepal
Santoshadhikari514@gmail.com

Bikesh Shrestha
Kathford International College of Engineering and Management
Affiliated to Tribhuvan University
Kathmandu, Nepal
bikesh151515@gmail.com

Bibek Baiju
Kathford International College of Engineering and Management
Affiliated to Tribhuvan University
Kathmandu, Nepal
megadextron911@gmail.com

Er. Saban Kumar K.C.
Research and Development Unit
Kathford International College of Engineering & Management, Tribhuvan University
Balkumari, Lalitpur
er.saban@kathford.edu.np

ABSTRACT- In the agriculture sector, one of the major problems of plants is disease. Plant diseases can be caused by various factors such as viruses, bacteria and fungi. Most farmers are unaware of such diseases, so detecting them is essential to prevent the damage they cause to the plants themselves, to the farmers and to the whole agricultural ecosystem. In response to this practical issue, this research aims to classify and detect plant diseases automatically, specifically for the tomato plant. On the hardware side, a Raspberry Pi is the main computing unit. Image processing is the key process of the project; it includes image acquisition, adjustment of the image region of interest (ROI), feature extraction and convolution neural network (CNN) based classification. The Python programming language and the OpenCV library are used to manipulate the raw input image. To train the CNN architecture and create a machine learning model that can predict the type of disease, image data is collected from an authenticated online source. As a result, a few diseases that commonly occur in tomato plants are detected: Late blight (training 100, test 21), Gray spot (training 95, test 18) and bacterial canker (training 90, test 21).

Keywords: Convolution Neural Network (CNN), Image Processing, Raspberry-Pi, YOLO

I. INTRODUCTION
Agriculture is the mainstay of the Nepalese people's livelihood. Two-thirds of the population still relies on agriculture directly or indirectly, and the agricultural sector alone contributed about 32.6% of the nation's GDP in the year 2015/16. In the year 2014/15, the average economic growth was confined to 0.77%. Tomato is one of the major cash crops cultivated in Nepal. Throughout the year, tomatoes are cultivated on 25.49 hectares of land and the harvesting rate is about 13419 kg/ha [1]. For disease recognition in such plants, the current system relies on visual observation, which is a time-consuming process.
Image processing in agriculture has been applied to sorting, grading of fresh products, and detection of defects such as dark spots, cracks and bruises on fresh fruits and seeds. Recent advances in hardware technology have allowed the evolution of deep Convolution Neural Networks (CNN) and a growing number of applications, including complex tasks such as object recognition and image classification. Smartphone-based applications for shape and disease identification in plant leaves have also been developed.
In this scenario, this research focuses on collecting data on diseases in tomato plants and training a model for disease detection.

II. LITERATURE REVIEW
Qin et al. proposed a feasible solution for lesion image segmentation and image recognition of alfalfa leaf disease. The Relief method was first used to extract a total of 129 features, and then an SVM model was trained with the most important features. The results indicated that image recognition of the four alfalfa leaf diseases can be implemented, with an average accuracy of 94.74% [2]. Such approaches have also been applied to classify tomato powdery mildew against healthy leaves using thermal and stereo images [3].
Rothe et al. presented a pattern recognition system for identifying and classifying three cotton leaf diseases. Using a captured dataset of natural images, an active contour model was used for image segmentation and Hu's moments were extracted as features for training an adaptive neuro-fuzzy inference system [4]. The pattern recognition system achieved an average accuracy of 85%.
Islam et al. presented an approach that integrated image processing and machine learning to allow the diagnosis of diseases from leaf images. This automated method classifies diseases on potato plants from 'Plant Village', a publicly available plant image database. The segmentation approach combined with an SVM demonstrated disease classification on over 300 images and obtained an average accuracy of 95% [5].
Ferentinos developed convolutional neural network models to perform plant disease detection and diagnosis using simple leaf images of healthy and diseased plants. The models were trained on an open database of 87,848 images containing 25 different plants in a set of 58 distinct classes of [plant, disease] combinations, including healthy plants [6].
Rathod et al. have discussed different machine learning methods for disease detection of plant leaf anomalies. The
KECConference2018, Kantipur Engineering College, Dhapakhel, Lalitpur 81
systems used image-based disease classification, boundary division and feature extraction, which give quick and exact discovery of plant leaf infection [7].
Sladojevic et al. proposed a novel approach based on deep convolutional networks to detect plant disease. By discriminating the plant leaves from their surroundings, 13 common types of plant diseases were recognized by the proposed CNN-based model. The experimental results showed that the model can reach good recognition performance, with an average accuracy of 96.3% [8].
Lu et al. proposed a novel identification approach for rice diseases based on deep convolutional neural networks. Using a dataset of 500 natural images of diseased and healthy rice leaves and stems, CNNs were trained to identify 10 common rice diseases. The experimental results showed that the proposed model achieved an average accuracy of 95.48% [9].
Mohanty et al. developed a CNN-based model to identify 14 crop species and 26 diseases. Using a public dataset of 54,306 images of diseased and healthy plant leaves, the proposed model was trained and achieved an accuracy of 99.35% [10].
Tan et al. presented a CNN-based approach to recognize apple pathologic images, and employed a self-adaptive momentum rule to update the CNN parameters. The results demonstrated a recognition accuracy of up to 96.08%, with fairly quick convergence [11].
Kawasaki et al. presented a novel cucumber leaf disease detection system based on convolutional neural networks. Under a fourfold cross-validation strategy, the proposed CNN-based system achieved an average accuracy of 94.9% in classifying cucumbers into two typical disease classes and a healthy class [12].

III. METHODOLOGY
A. System Overview
A general overview of the system is presented in the following block diagram:
[Block diagram. Labels recovered from the figure: Image Dataset; Data Annotation; Data Augmentation; Annotated & Augmented Data; Training, Testing, Validation (CNN); Diseases Training; Training Parameter; Diseases Detection; Performance Verification; Result]
Figure 1: System overview

B. Data Collection and Annotation
The dataset contains images of several diseases in tomato plants. Some of the images were taken from the internet and some were captured on a farm using a camera device. The images were collected at different times and under different conditions (e.g. illumination, light intensity, placement, rotation and scale). They show different infected areas of the plant (e.g. stem, leaves, fruits).
Starting with the dataset of images, the areas of every image containing the disease were annotated manually with a bounding box and a class. Some diseases might look similar depending on the infection status. Therefore, the knowledge for identifying the type of disease was provided by experts in the area, which helped us to visibly identify the categories and the infected areas of the plant in the images.
This annotation process aims to label the class and location of the infected areas in the image. The output of this step is the coordinates of bounding boxes of different sizes with their corresponding disease class, which are compared with the predicted results of the network during testing using Intersection over Union (IoU).

C. Designing Convolution Neural Network (CNN)
Inspired by the classical AlexNet [13], YOLO [14] and their performance improvements, a deep convolution neural network is designed to identify tomato plant diseases. The designed network has 24 convolution layers followed by 2 fully connected layers.
First, a structure is designed based on the standard YOLO model. Regarding the perception of the convolution kernel, a larger kernel has a stronger ability to extract the macro information of the image, and vice versa. A layer grid is smaller than the whole image, and other information in the image can be understood as "noise" that needs to be filtered. As a consequence, the first convolution layer is designed to have 64 kernels of size 7*7*2.

Table 1: CNN Network Filter
Type            Filters   Size/Stride   Output
Convolutional   32        3*3           224*224
Maxpool                   2*2/2         112*112
Convolutional   64        3*3           112*112
Maxpool                   2*2/2         56*56
Convolutional   128       3*3           56*56
Convolutional   64        1*1           56*56
Convolutional   128       3*3           56*56
Maxpool                   2*2/2         28*28
Convolutional   256       3*3           28*28
Convolutional   128       1*1           28*28
Convolutional   256       3*3           28*28
Maxpool                   2*2/2         14*14
Convolutional   512       3*3           14*14
Convolutional   256       1*1           14*14
Convolutional   512       3*3           14*14
Convolutional   256       1*1           14*14
Convolutional   512       3*3           14*14
Maxpool                   2*2/2         7*7
Convolutional   1024      3*3           7*7
Convolutional   512       1*1           7*7
Convolutional   1024      3*3           7*7
Convolutional   512       1*1           7*7
Convolutional   1024      3*3           7*7
Convolutional   1000      1*1           7*7
Avgpool                   Global        1000

D. Training Model
The input images are collected from the internet and annotated to label the region of interest. Objects with different backgrounds, light intensities, orientations, shapes and sizes are collected to increase accuracy and minimize false detection. The model is trained on the 4-class classification dataset for 2000 iterations using stochastic gradient descent with a starting learning rate of 0.1, polynomial rate decay with a power of 4, weight decay of 0.0005 and momentum of 0.9. In the initial training, the input image has a resolution of 580*580; it is then scaled down to 448*448 and trained for 10 epochs at a learning rate of 10^-3. After this training, the classifier achieves a top-1 accuracy of 76.5% and a top-5 accuracy of 93.3%.
After removing the fully connected layers, the classifier can take images of different sizes. If the width and height are doubled, there are 4x as many output grid cells and therefore 4x as many predictions. Since the CNN downsamples the input by a factor of 32, the width and height only need to be multiples of 32. During training, the classifier takes images of sizes from 320×320 and 352×352 up to 608×608 (in steps of 32). Every 10 batches, the classifier randomly selects another image size to train the model. This acts as data augmentation and forces the network to predict well for different input image dimensions and scales. In addition, lower-resolution images can be used for object detection at the cost of accuracy. This can be a good tradeoff for speed on devices with low GPU power. At 288×288 the algorithm runs at more than 90 FPS with mAP almost as good as Fast R-CNN.
Batch normalization leads to significant improvements in convergence while eliminating the need for other forms of regularization. Adding batch normalization to all of the convolutional layers in detection gives more than 2% improvement in mAP. Batch normalization also helps regularize the model. With batch normalization 4%. The fully connected layers and the last convolution layer are then removed to build a detector. The detection algorithm adds three 3×3 convolutional layers with 1024 filters each, followed by a final 1×1 convolutional layer with 125 output channels (5 box predictions, each with 25 parameters). The classifier also adds a passthrough layer. The network is trained for 160 epochs with a starting learning rate of 10^-3, which is divided by 10 at 60 and 90 epochs. Each input image is annotated, and the annotation file contains the regions of the infected plant parts. Over 500 images per class are used for training. Here, class represents the type of disease. Training was done on an Nvidia 1080 Ti.

E. Detection
The detection algorithm divides the input image into an S×S grid. Each grid cell predicts only one object. Each grid cell predicts a fixed number of boundary boxes. Each boundary box contains 5 elements: (x, y, w, h) and a box confidence score. The confidence score reflects how likely the box is to contain an object (objectness) and how accurate the boundary box is. The bounding box width w and height h are normalized by the image width and height. x and y are offsets to the corresponding cell. Hence, x, y, w and h are all between 0 and 1. Each cell has n conditional class probabilities. The conditional class probability is the probability that the detected object belongs to a particular class (one probability per category for each cell).
The major concept of this algorithm is to build a CNN that predicts a (7, 7, 30) tensor. It uses a CNN to reduce the spatial dimension to 7×7 with 1024 output channels at each location. Detection performs a linear regression using two fully connected layers to make 7×7×2 boundary box predictions (the middle picture below). To make a final prediction, we keep those with high box
confidence scores (greater than 0.25) as our final predictions (the right picture). The class confidence score for each prediction box is computed as:
class confidence score = box confidence score × conditional class probability
It measures the confidence in both the classification and the localization (where an object is located). In the real-life domain, the boundary boxes are not arbitrary. Gray spot regions have very similar shapes, and leaf mold regions have an approximate aspect ratio of 0.41.
Since we only need one guess to be right, the initial training will be more stable if we start with diverse guesses that are common for real-life objects. Instead of predicting 5 arbitrary boundary boxes, we predict offsets to each of the anchor boxes. If we constrain the offset values, we can maintain the diversity of the predictions and have each prediction focus on a specific shape, so the initial training will be more stable.

IV. RESULTS
The CNN-based classifiers are tested on a subset of the diseases dataset, including tomato plant leaf diseases. The dataset consists of 3 leaf diseases of the tomato plant: Gray spot (113 samples), Late Blight (121 samples) and Bacterial Canker (111 samples). Adding healthy tomato leaf images, the dataset used contains 520 images in 3 categories. Preliminary preparation and augmentation are applied to the dataset. The images are resized to fit 412×412 dimensions, which are chosen to be relatively small and close to a fraction of the average size of all images. After excluding 10% of the images as the test set, the remaining training images are augmented, in order to reduce overfitting, by adding a horizontally flipped copy of each image; a portion of these images is then further separated as the validation set. The models considered are those pre-trained on ImageNet and fine-tuned on the dataset, and the proposed CNN architecture with and without residual learning. First, the pre-trained YOLO models are fine-tuned on the dataset to serve as a baseline for comparison. Then a simplified CNN architecture is proposed and trained with and without the residual learning framework (residual and plain CNN) to compare the results. All the diseases which may affect the growth of the tomato plant have been analyzed. Different diseases have different features and symptoms; by classifying these visual symptoms, the disease data is trained on a convolution neural network (CNN). After training, a model is created which can detect all the diseases. After testing the trained model on data in Pascal VOC format, the Mean Average Precision (MAP) is found to be 0.76.
The system can predict diseases on images of different scales and resolutions. Size, orientation and light intensity do not affect the output result. However, detection accuracy is higher on high-resolution images. The system resizes the input image to 412*412 (width * height) and scales pixel values at this ratio.

A. Training Model on CNN
Figure 2: Iteration vs. loss function
The graph above shows the relation between the number of iterations and the average training loss. The Y-axis represents the average loss and the X-axis represents the number of iterations. The model is trained up to 8000 iterations with an average loss of 0.0634, and the Mean Average Precision (MAP) is found to be 0.76.

B. Detection of diseases
Figure 3: Detection of late blight (accuracy: 95%)
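
The accuracy value attached to each detection corresponds to a per-box score. As a minimal sketch of the scoring step described in Section III-E, the snippet below multiplies a box confidence score by the conditional class probabilities and keeps only boxes whose box confidence exceeds the 0.25 threshold quoted there. The function name, the dictionary layout, the label names and the sample numbers are illustrative assumptions and are not taken from the authors' implementation.

```python
# Hypothetical sketch: score and filter YOLO-style predictions.
CLASSES = ["late_blight", "gray_spot", "bacterial_canker", "healthy"]  # assumed label names
BOX_CONF_THRESHOLD = 0.25  # threshold quoted in Section III-E


def score_predictions(predictions, threshold=BOX_CONF_THRESHOLD):
    """Return (label, class_confidence, box) for every box that is kept.

    Each prediction is a dict with (assumed layout):
      box        -> (x, y, w, h), all normalized to [0, 1]
      box_conf   -> box confidence score
      class_prob -> conditional class probabilities, one per class
    """
    kept = []
    for pred in predictions:
        if pred["box_conf"] <= threshold:
            continue  # discard low-confidence boxes
        # class confidence = box confidence * conditional class probability
        scores = [pred["box_conf"] * p for p in pred["class_prob"]]
        best = max(range(len(scores)), key=scores.__getitem__)
        kept.append((CLASSES[best], scores[best], pred["box"]))
    return kept


# Made-up example values, for illustration only.
example = [
    {"box": (0.5, 0.5, 0.4, 0.3), "box_conf": 0.9,
     "class_prob": [0.8, 0.1, 0.05, 0.05]},
    {"box": (0.2, 0.7, 0.1, 0.1), "box_conf": 0.1,
     "class_prob": [0.25, 0.25, 0.25, 0.25]},
]

for label, conf, box in score_predictions(example):
    print(f"{label}: {conf:.2f} at {box}")
```

In this sketch the second box is dropped because its box confidence (0.1) is below the threshold, so only one detection is reported.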

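
Evaluating a detector with Mean Average Precision relies on matching each predicted box to a manually annotated one through Intersection over Union (IoU), the measure named in Section III-B. A minimal sketch of the IoU computation follows. It assumes boxes are given as corner coordinates (x_min, y_min, x_max, y_max), which is an illustrative convention and not necessarily the annotation format used by the authors; the sample boxes are made up.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this corner format is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


ground_truth = (50, 50, 150, 150)   # hypothetical annotated infected region
prediction = (100, 100, 200, 200)   # hypothetical network output
print(round(iou(ground_truth, prediction), 3))
```

A predicted box is typically counted as correct when its IoU with an annotated box of the same class exceeds a chosen threshold; 0.5 is the value conventionally used with Pascal VOC style evaluation.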
Figure 4: Detection of bacterial Canker (accuracy: 89%)
Figure 5: Detection of Gray-Spot (accuracy: 92%)
Figure 6: Detection of Healthy Plant

The accuracy of the system during training is 0.76 MAP. The overall accuracy of the system is found to be 89% on the Plant Village dataset. System failures and false positives on images with patterns similar to those of diseases were caused by mud and by insect residue such as white moth, pests and eggs.

V. CONCLUSION
In this way, data on various diseases of tomato plants was collected and processed to train a CNN architecture and create a machine learning model; the detected diseases are Late blight (training 100, test 21), Gray spot (training 95, test 18) and bacterial canker (training 90, test 21). For detection, the YOLO object detection algorithm built in the Darknet framework is used to train a model and predict diseases in tomato plants. The Python programming language and the OpenCV library are used to manipulate the raw input image. The model is trained on an Nvidia 1080 GPU. The system is implemented on a Raspberry Pi, and a desktop-based Graphical User Interface (GUI) is developed to capture images or video.
In addition, it is found that data annotation and augmentation result in better performance. As a limitation, this system is only capable of detecting three classes of diseases and the healthy plant. In order to detect other classes of diseases, additional data has to be trained on the current model. The algorithm will use a transfer learning method to classify other classes of diseases.
The main challenge while developing the object detection model was collecting a large number of training images with different shapes, sizes, backgrounds, light intensities, orientations and aspect ratios.
As a recommendation, further study can be done to detect all types of plant diseases, and not only to detect them but also to suggest remedies. Finally, this system can be integrated with an IoT server to deploy it in rural and remote areas.

ACKNOWLEDGEMENT
It gives us great pleasure to present this paper titled "TOMATO PLANT DISEASE DETECTION SYSTEM USING IMAGE PROCESSING". On this momentous occasion, we wish to express our immense gratitude to the range of people who provided invaluable support in the completion of this project. Their guidance and encouragement have helped in making this project a great success.
We would like to express our sincere gratitude to our respected principal Madhu Sudan Kayastha, Ph.D., and the management of Kathford International College of Engineering and Management for providing such an ideal atmosphere to build up this project, with a well-equipped store and a library with all the necessary reference materials. We are extremely thankful to all the staff and the management of the college for providing us all the facilities and resources required.

REFERENCES
[1] N. Ghimire, M. Kandel, M. Aryal and D. Bhattarai, "Assessment of tomato consumption and demand in Nepal," The Journal of Agriculture and Environment, vol. 18, p. 83, Jun. 2017.
[2] F. Qin, D. Liu, B. Sun, L. Ruan, Z. Ma and H. Wang, "Identification of alfalfa leaf diseases using image," 2016.
[3] G. Prince, J.P. Clarkson and N.M. Rajpoot, "Automatic detection of diseased tomato plants using thermal and stereo visible light images," PloS one, vol. 10, no. 4, p. e0123262, 2015.

[4] P. Rothe and R. Kshirsagar, "Cotton leaf disease identification using pattern recognition techniques," in International Conference on Pervasive Computing, Pune, India, 2015.
[5] Islam, A. Dinh, K. Wahid and P. Bhowmik, "Detection of potato diseases using image segmentation and multiclass support vector machine," Canadian Conference on Electrical and Computer Engineering, Canada, 30 April–3 May 2017.
[6] K. P. Ferentinos, "Deep Learning models for plant disease detection and diagnosis," 2018.
[7] P. Rothe and R. Kshirsagar, "Cotton leaf disease identification using pattern recognition techniques," in International Conference on Pervasive Computing, Pune, India, 2015.
[8] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," 2016.
[9] Y. Lu, S. Yi, N. Zeng, Y. Liu and Y. Zhang, "Identification of rice diseases using deep convolutional neural networks," 2017.
[10] S. Mohanty, D. Hughes and S. Marcel, "Using deep learning for image-based plant disease detection," 2016.
[11] W. Tan, C. Zhao and H. Wu, "CNN intelligent early warning for apple skin lesion image acquired by infrared video sensors," 2016.
[12] Y. Kawasaki, H. Uga, S. Kagiwada and H. Iyatomi, "Basic study of automated diagnosis of viral plant diseases using convolutional neural networks," in 12th International Symposium on Visual Computing, Las Vegas, 2015.
[13] A. Krizhevsky and G. Hinton, "ImageNet classification with deep convolution neural networks," in 25th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, 3 Dec. 2012.
[14] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You only look once: Unified, real-time object detection," in IEEE Conference on Computer Vision and Pattern Recognition, 2016.
