
Expert Systems With Applications 201 (2022) 117148


Automated prostate cancer grading and diagnosis system using deep learning-based Yolo object detection algorithm

Mehmet Emin Salman a, Gözde Çakirsoy Çakar b, Jahongir Azimjonov a,∗, Mustafa Kösem c, İsmail Hakkı Cedimoğlu d

a Department of Computer Engineering, Sakarya University, Sakarya, 54187, Turkey
b Medical Doctor with the Sakarya University Research and Training Hospital, Sakarya, 54187, Turkey
c Department of Pathology, Faculty of Medicine, Sakarya University, Sakarya, 54187, Turkey
d Department of Information Systems Engineering, Sakarya University, Sakarya, 54187, Turkey

ARTICLE INFO

Keywords:
Deep learning
Gleason grading
Prostate cancer detection
Prostate tissue classification

ABSTRACT

Purpose: To develop an artificial intelligence-based prostate cancer detection and diagnosis system that can automatically determine important regions on an input biopsy image and accurately classify the determined regions.

Method: The Yolo general-purpose object detection algorithm was utilized to detect important regions (the localization task) and to grade the detected regions (the classification task). The algorithm was re-trained with our prostate cancer dataset, which was created by annotating 500 real prostate tissue biopsy images. Before data augmentation, the dataset was split into train/test parts of 450/50 real prostate tissue images, respectively. The training set of 450 labeled biopsy images was then pre-processed with data augmentation, increasing the number of biopsy images from 450 to 1776. The algorithm was trained with this dataset, and the automatic prostate cancer detection and diagnosis tool was developed.

Results: The developed tool was tested with two test sets. The first test set contains 50 images that are similar to those in the train set; on it, 97% detection and classification accuracy was achieved. The second test set contains 137 completely different real prostate tissue biopsy images; on it, 89% detection accuracy was achieved.

Conclusion: In this study, an automatic prostate cancer detection and diagnosis tool was developed. The test results show that high-accuracy (high-performance) prostate cancer diagnosis tools can be developed using AI (computer vision) methods such as object detection algorithms. Such systems can decrease inter-observer variability among pathologists and help prevent delays in the diagnosis phase.

∗ Corresponding author.
E-mail addresses: eminsalman@sakarya.edu.tr (M.E. Salman), gozdec123@gmail.com (G. Çakirsoy Çakar), jahongir.azimjonov1@ogr.sakarya.edu.tr (J. Azimjonov), mustafakosem@sakarya.edu.tr (M. Kösem), cedim@sakarya.edu.tr (İ.H. Cedimoğlu).

https://doi.org/10.1016/j.eswa.2022.117148
Received 16 June 2021; Received in revised form 7 August 2021; Accepted 29 March 2022
Available online 12 April 2022
0957-4174/© 2022 Elsevier Ltd. All rights reserved.

1. Introduction

Pathological grading is crucial in prostate cancer (PCa) risk stratification and treatment selection. Traditionally, PCa is diagnosed from systematic core needle biopsies sampled throughout the prostate gland. Clinical risk assessment is primarily determined using pathological grades based on the Gleason grading system, which was initially proposed in the 1960s and modified multiple times thereafter (Gleason, 1966). Gleason grades 1 and 2 are not often used in pathology reporting, because these grades are morphologically indistinguishable from normal prostate tissue; they are not aggressive and tend to grow slowly, although patient follow-up with serum PSA (prostate-specific antigen) levels may necessitate recurrent biopsy. Hence, benign tissue and grades 1 and 2 are all considered as a benign class. Gleason pattern 3 consists of well-formed, individual glands of various sizes. Gleason pattern 4 includes poorly formed, fused, and cribriform glands; the presence of a cribriform pattern in radical prostatectomy specimens has been related to higher rates of extra-prostatic extension, positive surgical margins, biochemical recurrence, and cancer-specific mortality (Iczkowski et al., 2011; Kweldam et al., 2015). Gleason pattern 5 is the high-risk histologic pattern, including tumor sheets, individual cells, and cord cells. In general, tumors containing only Gleason grade 3 have a minimal risk of metastasis, while tumors with dominant Gleason grades 4 and 5 are related to cancer-specific mortality.

Fig. 1. An illustration of the developed system.

PCa shows a heterogeneous distribution of grades within the same patient. The Gleason score is reported based on the combination of the most dominant and second most dominant patterns observed (see Fig. 1). Because of this heterogeneity, the grading of prostate tissues is subjective, with a high degree of inter-observer variability, and variability even for the same pathologist (Allsbrook et al., 2001; Egevad et al., 2013). Despite this subjective nature, microscopic evaluation is the gold standard for the diagnosis and grading of PCa, as well as for the tissue diagnosis of tumors. Nevertheless, the microscopic evaluation of a single tissue takes about 20 min, so it is a time-consuming task. Recent studies show that artificial intelligence (AI) has the potential to aid pathologic evaluation, including the detection and classification of PCa on pathology imaging, and recent advances in deep learning applications have shown promise for improved performance of PCa grading algorithms (Arvaniti et al., 2018; Li et al., 2019; Nagpal et al., 2019; Nir et al., 2018; Zhou, Fedorov, Fennessy, Kikinis, & Gao, 2017). The research by Nir et al., evaluating multiple machine and deep learning methods, demonstrates 92% accuracy in cancer detection and 79% accuracy in the classification of low- and high-grade PCa. In addition, Ström et al. (2020) have reported high accuracy in discriminating benign versus malignant cores and in grading PCa. The deployment of such AI tools in pathology practice is important not only because of the inter-observer variability in the diagnosis of prostate adenocarcinoma, but also because of the significant increase in the workload of pathologists as cancer cases rise, and the increasing complexity of histopathological assessment as the PCa recommendation guidelines change. There is thus a concomitant increase in pathology workload as PCa cases grow each year, which potentially results in delayed cancer diagnoses and diagnostic errors (Nezhad, Sadati, Yang, & Zhu, 2019). To reduce the workload and prevent misdiagnosis, AI systems have the potential to assist the pathologist by prescreening. However, AI-based pathological diagnosis approaches face challenges such as benign mimickers of cancer, slides with thick cuts, fragmented cores, and poor staining. Additionally, AI systems cannot achieve as high a sensitivity as reporting pathologists do, since these methods are designed based on microscopic examination alone; the sensitivity could, however, be increased by enhancing AI systems with the opinions of different pathologists.

Furthermore, we are witnessing a transformation in pathology as a result of new horizons in digital pathology (medical image processing and artificial intelligence). Access to artificial intelligence may provide novel facilities in education, innovative research, and primary diagnostic practice as a consultant (Ribeiro et al., 2019). There has therefore been great interest in developing AI-based PCa diagnostic and treatment approaches, specifically medical image processing and machine learning-based methods that can help discriminate benign from malignant tissue by grading, predict the outcome of the disease, and automatically analyze histopathologic images (Madabhushi & Lee, 2016). In recent years, computer-aided pathology has provided solutions to assist pathologists in evaluating new tissue samples (Wang et al., 2014). Deep learning methods, which extract hidden features automatically from raw image data, have become very popular in diagnosing various types of cancer for which classical probabilistic, statistical, and conventional machine learning methods fall short. Deep learning (DL) has become a successful technique of modern machine learning in fields such as pattern recognition, object detection, and classification (Lee & Chen, 2015). In recent years, many studies have been conducted in digital pathology with this technique, as in radiological image processing (Kallen, Molin, Heyden, Lundström, & Åström, 2016; Litjens et al., 2016). Deep learning approaches offer improved performance with different methods in the biomedical field, including digital pathological image analysis (DPGA). Convolutional neural network (CNN) approaches, as one of many different DL architectures, provide superior performance for classification, segmentation, and detection tasks in DPGA (Alom et al., 2019). Therefore, CNN-based architectures have been widely utilized as tools for faster and more accurate diagnosis by processing radiological and pathological images (Cinar, Engin, Engin, & Ziya Atesci, 2009).

Although the Gleason grading system plays a vital role in the diagnosis and treatment of the disease, it shows substantial inter-observer variation among pathologists, which restricts its effectiveness for individual patients. This inconsistency may lead to unnecessary therapy or a missed diagnosis of PCa. Different deep learning-based methods have therefore been proposed to solve these problems. Nevertheless, these approaches fall short, since they do not contain object or region localization functionality, which has a direct, negative effect on detection (localization and classification) accuracy. Additionally, these techniques do not scan the whole input image; instead, they focus only on a randomly selected, fixed number of regions. These issues cause them to miss crucial regions or to focus on unimportant regions, leading to misdetection, misclassification, or complete failure of the whole detection system. The Yolo object detection algorithm, in contrast, performs the detection (localization and classification) task more accurately than previous detection algorithms, since Yolo scans the whole input image; hence, it is well suited to the object or region detection (localization and classification) task. For these reasons, we have chosen the general-purpose Yolo object detection algorithm for the PCa detection and classification system in this study. The study also differs from other approaches in that Yolo is a state-of-the-art object detection and classification method, and to date no study in the literature has applied Yolo to the diagnosis of PCa. As stated above, the main reason we use the Yolo approach is that the algorithm determines the location of regions more accurately by analyzing the entire input image, and it achieves higher localization and classification rates when trained on field-specific image data, for example, accurately annotated prostate tissue images. In addition, the Yolo algorithm is superior to other algorithms in its ability to perform real-time object detection much faster. Another specific feature of this study is that a new dataset has been created from nearly 500 real PCa tissue pattern images, annotated by a pathologist and reviewed by two other expert pathologists. The system was re-trained with this dataset; re-training the fine-tuned Yolo algorithm significantly increased the accuracy of the system.


Fig. 2. Data augmentation samples.
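The augmentation operations illustrated in Fig. 2, horizontal and vertical flips plus 90, 180, and 270-degree rotations (described in Section 3.2), can be sketched with NumPy. This is an illustrative sketch, not the authors' pipeline; a real pipeline would also have to transform the annotated bounding boxes alongside the pixels, which is omitted here.

```python
import numpy as np

def augment(img: np.ndarray) -> list[np.ndarray]:
    """Generate flipped and rotated variants of one biopsy image,
    mirroring the flip/rotation steps of the augmentation stage."""
    return [
        np.fliplr(img),       # horizontal flip
        np.flipud(img),       # vertical flip
        np.rot90(img, 1),     # 90-degree rotation
        np.rot90(img, 2),     # 180-degree rotation
        np.rot90(img, 3),     # 270-degree rotation
    ]

# a small dummy array stands in for a 4032 x 3024 RGB biopsy scan
demo = np.zeros((30, 40, 3), dtype=np.uint8)
variants = augment(demo)
print(len(variants))      # 5 variants per source image
print(variants[2].shape)  # (40, 30, 3): a 90-degree rotation swaps H and W
```

Each source image thus yields several geometric variants before the crop, scale, and merge steps, which is consistent with the dataset growing from 450 to 1776 images.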

For example, the accuracy rate of PCa diagnosis approaches such as traditional machine learning, probabilistic, and meta-heuristic methods, developed in the past with other datasets containing few case examples, is between 50% and 80%. Our system achieved 97% (Test 1) and 89% (Test 2) overall PCa detection and classification accuracy in PCa tissue analysis. Test set 1 contains prostate tissue images similar to the train set; thus, a higher accuracy was obtained. The system was also tested with test set 2, which contains 137 prostate tissue images that are completely different from the training set; the biopsy images in test set 2 (1960 × 4032 pixels) also differ in size from the biopsy images in the training set (3042 × 4032 pixels). The developed system is illustrated in Fig. 1.

2. Literature review

The goal of predictive models in medical image processing is to develop computer-aided algorithms that can predict future events and developments related to healthcare systems (Shariat, Kattan, Vickers, Karakiewicz, & Scardino, 2009). Therefore, many computer-aided traditional machine learning or artificial intelligence models have been developed to diagnose PCa disease. Many computational intelligence approaches, such as meta-heuristic optimization algorithms, artificial neural networks, support vector machines, fuzzy-based approaches, and deep learning, have been applied to PCa prediction modeling (de Oliveira et al., 2013).

For instance, meta-heuristic approaches such as genetic algorithms, ant colony optimization, and particle swarm optimization have been proposed to diagnose PCa by analyzing trans-rectal ultrasound images or to determine whether a biopsy is necessary. Underwood et al. developed a genetic algorithm-based simulation model, which was compared with previously recommended screening policies (Underwood, Zhang, Denton, Shah, & Inman, 2012). Shi et al. proposed a new approach to segment prostate ultrasound images using a genetic optimization algorithm: in the first step, the boundary curve representation was obtained using principal component analysis (PCA); next, the prostate margin features were determined; and in the third stage, the genetic algorithm was applied to optimize the implicit curve display parameters (Shi, Liu, & Wu, 2013). Thangavel et al. applied ant colony optimization to identify cancer from trans-rectal ultrasound (TRUS) images in a PCa classification system. First, the relevant regions were identified from the TRUS images, and ant colony optimization was applied to the relevant regions to find the best ones. After selecting the best features, support vector machines (SVM) were applied to classify them as benign or malignant cancers (Abraham & Nair, 2019; Thangavel & Manavalan, 2014). Sadoughi et al. developed a system for PCa diagnosis by combining back-propagation artificial neural network features with particle swarm optimization (PSO). In their study with a dataset of 360 patients, they achieved 98.33% success during training. The authors state that they are unable to make a concrete diagnosis of prostate cancer, but provide useful information to physicians in evaluating whether a biopsy is necessary (Sadoughi & Ghaderzadeh, 2014).

Meta-heuristic algorithms have generally been used to determine whether a biopsy is necessary. However, other computational intelligence algorithms have been deemed more appropriate for examining and evaluating biopsy tissue. For example, artificial neural networks have been widely used to diagnose biopsy images, and artificial neural network-based approaches have been shown to adapt better to the PCa detection process than other computational intelligence algorithms, because an artificial neural network generally performs well in classifying data that was not included in the learning process (Cosma, Brown, Archer, Khan, & Graham Pockley, 2017). Saritas et al. designed an artificial neural network model for the diagnosis of PCa, using data from 121 patients with a definite diagnosis; 70% of the data was used for training and 30% as test data. They achieved a success rate of 94.44%, which could help physicians make fast and reliable diagnoses (Saritas, Ozkan, & Sert, 2010). Greenblatt et al. integrated a neural network and a support vector machine (SVM) into a computer-aided diagnostic tool for predicting the Gleason grade of PCa from histopathological images. First, a quaternion neural network used for multi-classification was applied to predict the Gleason grade; later, a dual support vector machine (SVM) was used to improve the classification. They achieved a 98.87% success rate in estimating the Gleason grade (grades 3, 4, and 5) on a dataset of 71 patient images (Greenblatt, Mosquera-Lopez, & Agaian, 2013). Support vector machines are a method for classifying linear and nonlinear data. As an example, Haq et al. developed a dynamic contrast-enhanced MRI (DCE-MRI) system for prostate cancer prediction (Jensen, Carl, Boesen, Langkilde, & Østergaard, 2019; Vente, Vos, Hosseinzadeh, Pluim, & Veta, 2021). Various approaches were used for data-driven feature extraction. Using the extracted features, a prediction


Fig. 3. Annotated images. The pathologist annotated vital regions and graded the regions using the Gleason grading system.
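The annotations in Fig. 3 were produced with labelImg, which can export boxes in the Yolo text format: one `class x_center y_center width height` line per region, with all coordinates normalized to [0, 1]. Below is a minimal sketch of decoding such a line back to pixel coordinates; the class-id mapping used in the comment is an assumption, since the paper does not list the contents of its classes file.

```python
def parse_yolo_label(line: str, img_w: int, img_h: int):
    """Convert one Yolo-format label line (class cx cy w h, all
    normalized to [0, 1]) into a class id and pixel-space corners."""
    cls, cx, cy, w, h = line.split()
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    x1, y1 = cx - w / 2, cy - h / 2
    x2, y2 = cx + w / 2, cy + h / 2
    return int(cls), (x1, y1, x2, y2)

# class id 1 is assumed here to mean "Grade 3"; the real mapping
# depends on how the dataset's classes file was written
cls, box = parse_yolo_label("1 0.5 0.5 0.25 0.25", 4032, 3024)
print(cls, box)  # 1 (1512.0, 1134.0, 2520.0, 1890.0)
```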

model based on the SVM classification algorithm was created. To test the model, 449 tissue regions were obtained from 16 patients, and an 86% diagnosis accuracy was achieved in the study (Haq et al., 2015).

In probabilistic approaches, methods based on artificial neural networks, or SVM-based algorithms, the data belonging to the selected object region either belongs 100% to a set or not at all. This is contrary to the way a pathologist examines and diagnoses prostate tissue (López-Úbeda et al., 2020). In fuzzy logic-based approaches, however, many studies have shown that fuzzy diagnostic algorithms can examine and analyze prostate tissue patterns more closely to human assessment, since data can belong partially to one cluster and simultaneously be incorporated into another cluster to a certain extent. For example, one fuzzy-based study developed a fuzzy system combining clinical stage and biopsy Gleason score to predict the pathological stage of PCa, and showed that the developed system performed better than the Partin probability tables (Castanho, Hernandes, De Ré, Rautenberg, & Billis, 2013). In another study, function values generated from 190 patient records were determined using ROC curve performance analysis to predict PCa diagnosis; the approach was observed to perform similarly to the published probability tables (de Paula Castanho, de Barros, Yamakami, & Vendite, 2008). Benecchi predicted PCa using the coactive neuro-fuzzy inference system (CANFIS); the study, with data from 1030 patients, found CANFIS superior to a prostate-specific antigen blood test (Benecchi, 2006). Cosma et al. developed a neuro-fuzzy model for the classification and prediction of organ-confined disease (OCD) and extra-prostatic disease (ED) using a PCa dataset obtained from the Cancer Genome Atlas (TCGA). In the developed model, the age parameter was taken as an additional input, and the model's estimates revealed a better performance than other computational intelligence systems (Cosma et al., 2016).

Traditional machine learning, statistical, probabilistic, and fuzzy logic-based algorithms encounter problems such as an inability to handle rapid color or shape changes, brightness or lighting variation, and illusion or occlusion in object detection and classification processes (Covic et al., 2005). However, the CNN architecture, which consists of three main parts, namely the input layer, the convolution layers, and the fully connected classification layer, together with powerful activation and error functions in these layers, is robust against the problems that traditional machine learning, statistical, and probabilistic algorithms faced in the past. In addition, CNN architectures have been found to perform better in PCa diagnosis from pathological images (Arvaniti et al., 2018). Beyond that, there are many CNN-based object detection and classification algorithms, including ImageNet (Russakovsky et al., 2015), VGG-16 (Simonyan & Zisserman, 2015), Inception-V3 (Szegedy, Vanhoucke, Ioffe, Shlens, & Wojna, 2015), ResNet-50 (He, Zhang, Ren, & Sun, 2015), DenseNet-121 (Huang, Liu, Van Der Maaten, & Weinberger, 2017), MobileNet SSD (Howard et al., 2017), R-CNN, FR-CNN (Girshick, 2015), and Yolo (Redmon, Divvala, Girshick, & Farhadi, 2016), and they are actively used for object detection and classification tasks. Many studies have been conducted using CNN architectures for the diagnosis of PCa disease. The average accuracy of the previous CNN-based detection and classification systems is under 60%, which is neither sufficient nor confident enough for them to be used for PCa diagnosis. That these studies have different and low accuracy results is due to differences in model content, the small number of images in the datasets, and the training process.

For instance, the ImageNet, VGG-16, and Inception-V3 architectures do not contain an object or region localization feature, which has a direct, negative effect on detection and classification accuracy. Methods such as ResNet-50, DenseNet-121, MobileNet SSD, R-CNN, and FR-CNN do not scan the whole input image; instead, they focus only on a randomly selected, fixed number of regions. Consequently, these methods may miss crucial regions or focus on unimportant regions, and these two conditions lead to misdetection, misclassification, or complete failure of the detection system. The Yolo object detection algorithm, however, detects and classifies objects and regions by scanning the whole input image, which makes it powerful for object and region detection and classification tasks. To address the aforementioned PCa detection challenges, we have chosen the general-purpose Yolo object detection algorithm. Additionally, this study differs from others in that the Yolo method is new and, to date, no study has been conducted with Yolo in the diagnosis of prostate cancer disease. As stated above, the main reason we use the Yolo method in this study is that the algorithm determines the location of objects more accurately by analyzing the entire input image, and it has higher classification accuracy and higher confidence in detecting objects. In addition, the Yolo algorithm is superior to other algorithms in its ability to perform real-time object detection much faster (Redmon & Farhadi, 2018). Another specific feature of this study is that there are nearly 2000 real PCa tissue pattern images in the dataset; that the dataset contains a sufficient number of tissue samples has significantly increased the accuracy of the developed system. For example, the accuracy rate of PCa diagnosis approaches such as traditional machine learning, probabilistic, and meta-heuristic methods, developed in the past with datasets containing few case examples, is between 50% and 60%. The dataset used in this study contains patterns created from more than 1780 real human prostate tissues, and the system has an object detection rate of over 90%.

3. Methodology

3.1. Feasibility of vision-based PCa diagnosis using the Gleason grading system

Generally, the diagnosis and treatment of PCa rely on the histopathological analysis of prostate tissue samples and grading them with a Gleason score based on the architectural pattern of the tumor. The Gleason grade, Gleason score, and ISUP grade grouping systems are illustrated in Fig. 1 (Bulten et al., 2020). Benign, grade 3, grade 4, and grade 5 tissue patterns have distinctive pixel and shape features that are crucial for training deep learning-based PCa detection and classification algorithms. These distinctive features enable automatic, real-time PCa detection and diagnosis systems. In particular, the deep learning-based general-purpose object detection algorithm Yolo fits these tasks, since it has strong region localization functionality.


Fig. 4. The network architecture of Yolo.
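One concrete consequence of the architecture in Fig. 4 is the filter count of the Yolo output convolutional layers, which Section 3.3 computes as filters = (classes + 5) × 3. The arithmetic behind the 255 → 27 change can be checked directly:

```python
def yolo_output_filters(num_classes: int, boxes_per_cell: int = 3) -> int:
    # each predicted box carries 4 coordinates + 1 objectness score
    # + one confidence value per class; Yolo v3 predicts 3 boxes per cell
    return (num_classes + 5) * boxes_per_cell

print(yolo_output_filters(80))  # 255, the Microsoft COCO default
print(yolo_output_filters(4))   # 27, for benign / grade 3 / grade 4 / grade 5
```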

3.2. Data set preparation

This step is critical for the system, since the proposed algorithm is developed using a fully supervised deep learning approach (convolutional neural networks). The more accurately the training dataset is prepared by annotating images, the higher the accuracy of the whole cancer detection and classification system. Hence, we have collected more than 500 prostate tissue biopsy images from different patients. The biopsy images were acquired from the Pathology Department of the Research and Training Hospital of Sakarya University. The RGB images of prostate tissues were obtained by scanning the slides using a digital camera (Samsung Note9, wide-angle Super Speed Dual Pixel 12 MP AF sensor, 1.4-micron rear camera) attached to a microscope (Nikon Eclipse i80). The images were scanned at 20× optical magnification. Whole-slide tissue images of 500 biopsy cores, containing different grades of prostatic acinar adenocarcinoma and benign areas, were stained with H&E compounds. Regions of interest (ROIs) of grade 3, grade 4, and grade 5 were accurately annotated on the whole-slide images. The original size of the images in the training set and test set 1 is 4032 × 3024 (width × height) pixels; in test set 2, the image size is 4032 × 1960 (width × height) pixels. One pathology expert then prepared the dataset by annotating each image into four categories, namely Benign, Grade 3, Grade 4, and Grade 5. The "labelImg" image annotation tool (by MIT, the Massachusetts Institute of Technology) was utilized to annotate the biopsy tissue images in the dataset; the annotated samples are illustrated in Fig. 3. The ground truth images were prepared by a pathologist and then reviewed and approved by at least two other independent pathologists. The dataset was replicated through a two-step pre-processing stage comprising fine-tuning and data augmentation (replication). The augmentation stage includes horizontal and vertical flips, rotations (90, 180, and 270 degrees), crops (by the annotated bounding-box ratio, for example, 30% or 50% of an original image), scaling up and down, and merged images of cropped samples (grouped by cancer grade) from different images. Image examples from the data augmentation step are shown in Fig. 2. At the end of this process, the dataset contained about 1800 images and was ready to be forwarded to the main module (object detection and classification). The dataset was split into the training set and test set 1 after data augmentation; hence, the training set and test set 1 might contain similar or identical images. Therefore, we prepared test set 2, which consists of completely distinct biopsy images from different patients, and the size of the images in test set 2 also differs from the images in the training set (Tables 2–4).

3.3. The proposed cancer detection and classification system via re-training the fine-tuned Yolo algorithm

The general-purpose weight model of Yolo was initially trained, via the general-purpose architecture of Yolo (Fig. 4), on the Microsoft COCO dataset to detect 80 everyday object classes, from cars, phones, or planes to pens, books, or handbags. However, a pre-trained model of Yolo does not immediately recognize benign or malignant prostate tissue images, due to its hyperparameter settings and the features pre-learned from the COCO dataset. Therefore, a Yolo-based PCa detection system requires re-training and fine-tuning the general-purpose Yolo model to detect the important regions of a prostate biopsy image and to classify the detected regions as benign, grade 3, 4, or 5. Firstly, the Yolo model was fine-tuned; next, the fine-tuned model was re-trained with the PCa dataset. The Yolo model comes with a neural network configuration file, called yolov3.cfg, in which the structure of the Yolo network is defined. The network consists of 106 layers in total (input, hidden, pooling, and fully connected output layers). The linear ReLU and Leaky ReLU activation functions are used in the general-purpose model, but the features of the prostate biopsy images in the PCa dataset are not linearly distributed. Hence, the linear activation functions were changed to non-linear activation functions (Sigmoid and Tanh), illustrated in Eqs. (1) and (2), respectively.

f(x) = \mathrm{Sigmoid}(x) = \frac{e^{x}}{e^{x}+1}    (1)

f(x) = \mathrm{Tanh}(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}    (2)

The next step in fine-tuning the model was to change the default number of classes from 80 to four, where four corresponds to the Gleason grades (benign, grade 3, 4, and 5) and the default 80 corresponds to the classes of the Microsoft COCO dataset. With the newly defined class size, the filter count of each output convolutional layer must also shift from the default 255 to 27. The hyperparameter 'filters' is calculated as filters = (classes + 5) × 3, based on the formula in the original paper of the Yolo algorithm; in our case, filters = (4 + 5) × 3 = 27. Other hyperparameters were initialized to their optimal values based on the recommendations in the original Yolo paper, as follows:

• batch_size=24, subdivisions=8
• width=608, height=608, channels=3
• momentum=0.9, decay=0.0005, angle=0
• saturation=1.5, exposure=1.5, hue=0.1
• learning_rate=0.001, burn_in=1000
• max_batches=500200, policy=steps
• steps=400000,450000, scales=0.1,0.1

The fully connected and output layers are linked to each other with SoftMax (logistic regression) loss functions (see Eq. (3)). The loss function contains two stages: the localization loss, which predicts the four points of the bounding boxes, and the classification loss, which estimates the conditional labels and computes the confidence level of the detected provisional objects (regions) with probabilities. These two parts are calculated using sum-of-squared-error functions. Two scale parameters control how much to increase the loss from bounding box coordinate predictions (\lambda_{coord}) and how much to decrease the loss of confidence score predictions for boxes without objects (\lambda_{noobj}). Down-weighting the loss contributed by background boxes is important as most of the


Fig. 5. Computing intersection of union.
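The intersection-over-union computation of Fig. 5 (the study counts a detection as correct when IoU > 0.5) can be written out directly for axis-aligned boxes; a minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# identical boxes give 1.0; half-overlapping boxes give 1/3
print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # 1.0
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # ≈ 0.33, below the 0.5 threshold
```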

For this study, the model sets $\lambda_{coord} = 5$ and $\lambda_{noobj} = 0.5$. The loss function only penalizes classification error if an object is present in a grid cell, $\mathbb{1}_{i}^{obj} = 1$; it also only penalizes bounding box coordinate error if that predictor is "responsible" for the ground truth box, $\mathbb{1}_{ij}^{obj} = 1$. As a one-stage object detector, Yolo is very fast, and it produces sufficient results at recognizing regularly and irregularly shaped objects or groups of small objects, despite having a limited number of bounding box candidates (Weng, 2018).

$$
\begin{aligned}
L_{loc} &= \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 + (\sqrt{w_i} - \sqrt{\hat{w}_i})^2 + (\sqrt{h_i} - \sqrt{\hat{h}_i})^2 \right] \\
L_{cls} &= \sum_{i=0}^{S^2} \sum_{j=0}^{B} \left( \mathbb{1}_{ij}^{obj} + \lambda_{noobj}\,(1 - \mathbb{1}_{ij}^{obj}) \right) (C_{ij} - \hat{C}_{ij})^2 + \sum_{i=0}^{S^2} \sum_{c \in \mathcal{C}} \mathbb{1}_{i}^{obj} \left( p_i(c) - \hat{p}_i(c) \right)^2 \\
L &= L_{loc} + L_{cls}
\end{aligned} \tag{3}
$$

where $\mathbb{1}_{i}^{obj}$ is an indicator function of whether cell $i$ contains an object; $\mathbb{1}_{ij}^{obj}$ indicates whether the $j$th bounding box of cell $i$ is "responsible" for the object prediction; $C_{ij}$ is the confidence score of cell $i$, $\Pr(\text{containing an object}) \times IoU(pred, truth)$, with $IoU > 0.5$ in this study (see Fig. 5); $\hat{C}_{ij}$ is the predicted confidence score; $\mathcal{C}$ is the set of all classes; $p_i(c)$ is the conditional probability that cell $i$ contains an object of class $c \in \mathcal{C}$; and $\hat{p}_i(c)$ is the predicted conditional class probability.

$$
\begin{aligned}
Accuracy &= \frac{TP + TN}{TP + TN + FP + FN} \\
Recall &= \frac{TP}{TP + FN} \\
Precision &= \frac{TP}{TP + FP} \\
F_{score} &= \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}
\end{aligned} \tag{4}
$$

$$
ISUP(G1, G2) =
\begin{cases}
1, & \text{if } G1 = 3 \text{ and } G2 = 3 \\
2, & \text{if } G1 = 3 \text{ and } G2 = 4 \\
3, & \text{if } G1 = 4 \text{ and } G2 = 3 \\
4, & \text{if } G1 = 4 \text{ and } G2 = 4 \\
4, & \text{if } G1 = 3 \text{ and } G2 = 5 \\
4, & \text{if } G1 = 5 \text{ and } G2 = 3 \\
5, & \text{if } G1 = 4 \text{ and } G2 = 5 \\
5, & \text{if } G1 = 5 \text{ and } G2 = 4 \\
5, & \text{if } G1 = 5 \text{ and } G2 = 5 \\
Benign, & \text{otherwise.}
\end{cases} \tag{5}
$$

Fig. 6. Training loss over the first 500 epochs.

Table 1
Accuracy results of the system for test set 1, which contains 174 images similar to the images in the training set.

| Gleason grades | System accuracy | Three pathologists' decision | Number of detected cases in test image set |
|---|---|---|---|
| Benign | 96.00 | 100 | 74 |
| Grade 3 | 98.00 | 100 | 132 |
| Grade 4 | 95.00 | 100 | 224 |
| Grade 5 | 100.0 | 100 | 162 |
| Total/Average accuracy | 97.00 | 100 | 592 |

Table 2
Accuracy results of the system for test set 2, which contains 137 completely different biopsy images.

| Gleason grades | System accuracy | Three pathologists' decision | Number of detected cases in test image set |
|---|---|---|---|
| Benign | 71 | 100 | 59 |
| Grade 3 | 93 | 100 | 464 |
| Grade 4 | 89 | 100 | 222 |
| Grade 5 | 84 | 100 | 137 |
| Total/Average accuracy | 89 | 100 | 882 |
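The nine IF-THEN rules of Eq. (5) map the two most prevalent Gleason patterns (G1, G2) to an ISUP grade group. A direct transcription as a lookup table, assuming the grades are passed as integers; the names `ISUP_RULES` and `isup` are ours, not from the paper's code:

```python
# Decision-level fusion: ISUP grade group from the two most prevalent
# Gleason patterns (G1, G2), transcribing the nine rules of Eq. (5).
ISUP_RULES = {
    (3, 3): 1, (3, 4): 2, (4, 3): 3,
    (4, 4): 4, (3, 5): 4, (5, 3): 4,
    (4, 5): 5, (5, 4): 5, (5, 5): 5,
}

def isup(g1, g2):
    """Return the ISUP grade group, or 'Benign' for any other combination."""
    return ISUP_RULES.get((g1, g2), "Benign")

print(isup(4, 3))  # 3
print(isup(2, 2))  # Benign
```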

3.4. Data fusion at decision level

The Yolo-based PCa detection module evaluates an input image by dividing it into the determined S×S grid cells and returns one or several detected regions, each with label, confidence, and bounding-box information, as the outcome. However, when pathologists diagnose a biopsy pattern, they evaluate the entire image and give the result for the full picture. For this reason, data fusion is important for merging the detected regions and making the final decision for each input biopsy image. For this purpose, we developed a rule-based data fusion algorithm at the decision level using the Gleason grading and International Society of Urological Pathology (ISUP) grouping systems, as shown in Fig. 1. There are nine IF-THEN rules, given in Eq. (5).

4. Results and discussion

The proposed method was developed using the supervised learning algorithm Yolo (a CNN architecture of deep learning). 500 real prostate tissue pattern images were annotated by three pathologists. The dataset was split into train/test sets of 450/50 images, respectively. Then, the training set was increased to 1776 images using the data augmentation technique. In the training phase, the number of training epochs was 110,000 and the training time was 24 h. We stopped training when the value of the loss function dropped under 5%. Fig. 6 illustrates the training loss (Y axis) over the first 500 epochs (X axis). To test the detection accuracy of the system, we used intersection over union (see Fig. 5) and a conventional accuracy calculation metric known as the confusion matrix, as in Eq. (4).


Table 3
Confusion matrix results for test set 1. Test set 1 contains biopsy images that are similar to the biopsy images in the training set.

| | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Benign | 0.96 | 0.99 | 0.97 | 74 |
| Grade 3 | 0.98 | 0.98 | 0.98 | 132 |
| Grade 4 | 0.95 | 0.99 | 0.97 | 224 |
| Grade 5 | 1.00 | 0.93 | 0.96 | 162 |
| Accuracy | | | 0.97 | 592 |

Table 4
Confusion matrix results for test set 2. Test set 2 contains 137 biopsy images completely different from the biopsy images in the training set.

| | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Benign | 0.71 | 0.88 | 0.79 | 59 |
| Grade 3 | 0.93 | 0.99 | 0.96 | 464 |
| Grade 4 | 0.89 | 0.73 | 0.80 | 222 |
| Grade 5 | 0.84 | 0.80 | 0.82 | 137 |
| Accuracy | | | 0.89 | 882 |
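The per-class scores in Tables 3 and 4 follow from the definitions in Eq. (4). A small helper computing them from raw confusion counts; the counts below are illustrative values chosen by us (not taken from the paper) so that the scores land near the Benign row of Table 4:

```python
def scores(tp, tn, fp, fn):
    """Accuracy, recall, precision and F1-score, as defined in Eq. (4)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, precision, f1

# Illustrative counts: support 59 with 7 missed regions and 21 false alarms
# reproduce the Benign row of Table 4 (precision 0.71, recall 0.88, F1 0.79).
acc, rec, prec, f1 = scores(tp=52, tn=800, fp=21, fn=7)
print(round(prec, 2), round(rec, 2), round(f1, 2))  # 0.71 0.88 0.79
```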

Overall, the detection algorithm performed above 90% for each Gleason grade group, with 97% average accuracy on test set 1. However, the algorithm detects Grade 5 patterns with the highest accuracy rate (100%) compared with the other three groups, because the patterns belonging to the Grade 5 group have more distinguishing pixel color and shape features than the patterns belonging to the other three groups; see Table 1 and Fig. 7. These distinguishing pixel color and shape features help the algorithm detect and classify an input pattern with a high accuracy rate. Besides, in the dataset, Grade 3 and Grade 4 pattern images are similar to each other in color and shape content; in reality, these two groups are also very close according to pathologists, and the detection accuracy results reflect this fact with almost the same detection outcomes for Grade 3 (98%) and Grade 4 (95%). Furthermore, after a deep and detailed review of all the results, the findings of the system were divided into three parts, namely the best detected, worst detected, and mixed detection examples; see Figs. 7, 8 and 9, respectively. In the following, detailed explanations for each part are given from the viewpoints of computer experts and pathologists.

Fig. 7. Best detection examples of the system.
For the best detection cases (see Fig. 7), Computer experts: The main reason for the high-accuracy detection and classification in this category is that the pattern images, including Benign and Grades 3, 4, and 5, are all well structured and have clear distinguishing features in pixel color and shape, which are vital for the proposed detection algorithm. The algorithm takes the input image, splits it into S×S grid cells, extracts the hidden features, and compares the extracted hidden features to its trained weight model; it then makes a decision based on these hidden features. So, the system detects and labels such input images with a minimum error rate. Pathologists: In Fig. 7(a), the AI system detected and classified nearly the same regions that the three pathologists chose as containing benign glandular structures. Normally, we underestimate benign stromal areas without glands while evaluating a tissue, so the system may have also focused on benign areas with glandular structures. In the Fig. 7(b) image, we can see an artifactual blurring region in the inferior and superior detected regions of the image. However, the artificial intelligence system detected and classified Grade 3 regions accurately despite the presence of blurred regions in the image. For the image sample in Fig. 7(c), we can conclude that the AI system successfully detected and graded the malignant glands in this area. One area which was assigned Grade 4 is controversial: because of tissue fragmentation, the tissue entirety is not well preserved in this area and there are more empty spaces than expected, so the system might not have evaluated this area. For the image in Fig. 7(d), the AI system also identified Grade 5 tumor areas successfully, but it strikingly assigned Grade 3 to what is in fact a Grade 5 region. In this area, we can see vascular structures containing luminal spaces resembling Grade 3 tumor glands; the system might therefore have classified the vascular structures as a Grade 3 pattern. Overall, the developed AI system successfully detected and classified these cases, and other images similar to them.

Fig. 8. Worst detection examples of the system.

For the worst detection cases (see Fig. 8), Computer experts and pathologists: Four images out of the 174 test images contain misdetected and misclassified cases. The main reason for the worst detection


and classification cases is the data augmentation (cropped, rotated, and resized images). The system extracts hidden and crucial features from the augmented data, learns, and creates a weight model while training. The augmented image data help to improve the overall accuracy of the system, but they also cause some misdetection and misclassification cases, as in this condition. However, if there is a sufficient amount of real image data in the dataset, no data augmentation will be needed and these problems will disappear by themselves. In addition, some tissue images contain a mix of Benign and Grade 3, 4, and 5 glands; in this condition, the data fusion method is used to make the final decision using the ISUP approach. Additionally, in these cases the prominent failures are related to the outermost detections; indeed, accuracy is better for the inner detections. For the picture in Fig. 8(a), the detection and grading results do not match the pathologists' decision: one of these areas was graded by the AI system as 4 and 5 at the same time. In this area, which contains Grade 5, a few small foci may be assigned Grade 4; however, this determination should not be considered incorrect.

Fig. 9. Mixed label detection examples of the system.

For the mixed detection cases (see Fig. 9), Computer experts and pathologists: In some images there are various Gleason grade detection results within a single input pattern. Further, the AI system detected malignant glands successfully. However, grading was equivocal between Grade 4 and 5 for four detection regions in Figs. 9(c) and 9(d). Also, in two benign areas, it can be seen that the edges of a few Grade 3 glands are involved in the detection areas of Figs. 9(a) and 9(b). We have determined that this is not an abnormal case, because an input tissue pattern may include several malign and benign regions. Additionally, these patterns lie on a morphological spectrum from Benign to Grade 5, so Grade 3 tumors have morphological similarities with benign glands. Also, for some patterns in this spectrum, pathologists may be undecided between, for instance, Grade 4 and 5; this is the source of the proven inter-observer variability between pathologists. For these reasons, the system may misdetect and misclassify some patterns. However, the situation is not considered a very serious issue, since it can be solved, and the accuracy of the system improved, simply by re-training with a larger amount of annotated image data.

5. Conclusion

In this study, an intelligent PCa detection and classification system was developed. The system can help pathologists diagnose PCa by processing prostate biopsy images with the deep learning-based image processing technique of artificial intelligence. It detects carcinogenic regions in an input tissue image, and it classifies their grade based on the Gleason grading method. In this system, the CNN-based general-purpose Yolo object detection model of deep learning was utilized. The Yolo object detection algorithm was re-trained with around 1800 images of prostate tissue patterns that were obtained from the 450 real prostate biopsy images. The dataset was annotated by a pathologist and approved by two independent pathologists. The system was re-trained by fine-tuning the default hyperparameters over a 24-hour period. The training process was stopped when the value of the loss function fell below 5%, and the system was tested with two test sets. Test set 1 contains 50 whole-slide biopsy images that are similar to the images in the training set. Test set 2 includes biopsy images completely different from the images in the training set. Each detected region and its predicted Gleason grade in the evaluated input images was thoroughly reviewed by three pathologists. As a result, an average accuracy of 97% was obtained for test set 1 and an average accuracy of 89% for test set 2. The results of the system show that an AI system based on the Yolo object detection algorithm can successfully distinguish benign biopsy cores from cores containing malignancy. Moreover, it can be concluded that an AI system can grade prostate biopsies with high performance. As future work, we plan to prepare a dataset that will contain about 10,000 biopsy images to improve the reliability of the system. Besides, we are going to build a specific model based on the CNN architecture for PCa detection and classification tasks.

Compliance with ethical standards

Ethical approval: This article does not contain any studies with human participants or animals performed by any of the authors.
Funding: The author(s) received no specific funding for this work.

CRediT authorship contribution statement

Mehmet Emin Salman: Conceptualization, Methodology, Software, Formal analysis, Data curation, Writing – original draft, Visualization. Gözde Çakirsoy Çakar: Conceptualization, Dataset preparation, Formal analysis, Data curation, Writing – original draft. Jahongir Azimjonov: Conceptualization, Methodology, Software, Formal analysis, Data curation, Writing – original draft, Visualization. Mustafa Kösem: Investigation, Resources, Validation, Data curation, Writing – review & editing, Supervision, Project administration, Funding acquisition. İsmail Hakkı Cedimoğlu: Investigation, Resources, Validation, Data curation, Writing – review & editing, Supervision, Project administration, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

Abraham, B., & Nair, M. S. (2019). Automated grading of prostate cancer using convolutional neural network and ordinal class classifier. Informatics in Medicine Unlocked, 17, Article 100256. http://dx.doi.org/10.1016/j.imu.2019.100256.
Allsbrook, W. C., Mangold, K. A., Johnson, M. H., Lane, R. B., Lane, C. G., & Epstein, J. I. (2001). Interobserver reproducibility of gleason grading of prostatic carcinoma: General pathologist. Human Pathology, 32(1), 81–88. http://dx.doi.org/10.1053/hupa.2001.21135.


Alom, M. Z., Aspiras, T., Taha, T. M., Asari, V. K., Bowen, T., Billiter, D., et al. (2019). Advanced deep convolutional neural network approaches for digital pathology image analysis: a comprehensive evaluation with different use cases. arXiv:1904.09075.
Arvaniti, E., Fricker, K. S., Moret, M., Rupp, N., Hermanns, T., Fankhauser, C., et al. (2018). Automated gleason grading of prostate cancer tissue microarrays via deep learning. Scientific Reports, 8(1), 12054. http://dx.doi.org/10.1038/s41598-018-30535-1.
Benecchi, L. (2006). Neuro-fuzzy system for prostate cancer diagnosis. Urology, 68(2), 357–361.
Bulten, W., Pinckaers, H., van Boven, H., Vink, R., de Bel, T., van Ginneken, B., et al. (2020). Automated deep-learning system for Gleason grading of prostate cancer using biopsies: a diagnostic study. The Lancet Oncology, 21(2), 233–241.
Castanho, M., Hernandes, F., De Ré, A., Rautenberg, S., & Billis, A. (2013). Fuzzy expert system for predicting pathological stage of prostate cancer. Expert Systems with Applications, 40(2), 466–470. http://dx.doi.org/10.1016/j.eswa.2012.07.046.
Cinar, M., Engin, M., Engin, E. Z., & Ziya Atesci, Y. (2009). Early prostate cancer diagnosis by using artificial neural networks and support vector machines. Expert Systems with Applications, 36(3, Part 2), 6357–6361. http://dx.doi.org/10.1016/j.eswa.2008.08.010.
Cosma, G., Acampora, G., Brown, D., Rees, R. C., Khan, M., & Pockley, A. G. (2016). Prediction of pathological stage in patients with prostate cancer: A neuro-fuzzy model. PLoS One, 11(6), Article e0155856. http://dx.doi.org/10.1371/journal.pone.0155856.
Cosma, G., Brown, D., Archer, M., Khan, M., & Graham Pockley, A. (2017). A survey on computational intelligence approaches for predictive modeling in prostate cancer. Expert Systems with Applications, 70, 1–19. http://dx.doi.org/10.1016/j.eswa.2016.11.006.
Covic, A., Schiller, A., Volovat, C., Gluhovschi, G., Gusbeth-Tatomir, P., Petrica, L., et al. (2005). Epidemiology of renal disease in Romania: a 10 year review of two regional renal biopsy databases. Nephrology Dialysis Transplantation, 21(2), 419–424. http://dx.doi.org/10.1093/ndt/gfi207.
de Oliveira, D. L. L., do Nascimento, M. Z., Neves, L. A., de Godoy, M. F., de Arruda, P. F. F., & de Santi Neto, D. (2013). Unsupervised segmentation method for cuboidal cell nuclei in histological prostate images based on minimum cross entropy. Expert Systems with Applications, 40(18), 7331–7340. http://dx.doi.org/10.1016/j.eswa.2013.06.079.
de Paula Castanho, M. J., de Barros, L. C., Yamakami, A., & Vendite, L. L. (2008). Fuzzy expert system: An example in prostate cancer. Applied Mathematics and Computation, 202(1), 78–85. http://dx.doi.org/10.1016/j.amc.2007.11.055.
Egevad, L., Ahmad, A., Algaba, F., Berney, D., Boccon-Gibod, L., Compérat, E., et al. (2013). Standardization of gleason grading among 337 European pathologists. Histopathology, 62, 247–256. http://dx.doi.org/10.1111/his.12008.
Girshick, R. (2015). Fast R-CNN. In 2015 IEEE international conference on computer vision (ICCV) (pp. 1440–1448). http://dx.doi.org/10.1109/ICCV.2015.169.
Gleason, D. F. (1966). Automatic grading of prostate cancer in digitized histopathology images: Learning from multiple experts. Medical Image Analysis, 50(3), 125–128.
Greenblatt, A., Mosquera-Lopez, C., & Agaian, S. (2013). Quaternion neural networks applied to prostate cancer gleason grading. In 2013 IEEE international conference on systems, man, and cybernetics (pp. 1144–1149). http://dx.doi.org/10.1109/SMC.2013.199.
Haq, N. F., Kozlowski, P., Jones, E. C., Chang, S. D., Goldenberg, S. L., & Moradi, M. (2015). A data-driven approach to prostate cancer detection from dynamic contrast enhanced MRI. Computerized Medical Imaging and Graphics, 41, 37–45. http://dx.doi.org/10.1016/j.compmedimag.2014.06.017.
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. arXiv:1512.03385.
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., et al. (2017). MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861.
Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In 2017 IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2261–2269). http://dx.doi.org/10.1109/CVPR.2017.243.
Iczkowski, K. A., Torkko, K. C., Kotnis, G. R., Storey Wilson, R., Huang, W., Wheeler, T. M., et al. (2011). Digital quantification of five high-grade prostate cancer patterns, including the cribriform pattern, and their association with adverse outcome. American Journal of Clinical Pathology, 136(1), 98–107. http://dx.doi.org/10.1309/AJCPZ7WBU9YXSJPE.
Jensen, C., Carl, J., Boesen, L., Langkilde, N., & Østergaard, L. (2019). Assessment of prostate cancer prognostic Gleason grade group using zonal-specific features extracted from biparametric MRI using a KNN classifier. Journal of Applied Clinical Medical Physics, 20(2), 146–153. http://dx.doi.org/10.1002/acm2.12542.
Kallen, H., Molin, J., Heyden, A., Lundström, C., & Åström, K. (2016). Towards grading gleason score using generically trained deep convolutional neural networks. In 2016 IEEE 13th international symposium on biomedical imaging (ISBI) (pp. 1163–1167). http://dx.doi.org/10.1109/ISBI.2016.7493473.
Kweldam, C. F., Wildhagen, M. F., Steyerberg, E. W., Bangma, C. H., van der Kwast, T. H., & van Leenders, G. J. (2015). Cribriform growth is highly predictive for postoperative metastasis and disease-specific death in gleason score 7 prostate cancer. Modern Pathology, 28(3), 457–464. http://dx.doi.org/10.1038/modpathol.2014.116.
Lee, H., & Chen, Y.-P. P. (2015). Image based computer aided diagnosis system for cancer detection. Expert Systems with Applications, 42(12), 5356–5365. http://dx.doi.org/10.1016/j.eswa.2015.02.005.
Li, W., Li, J., Sarma, K. V., Ho, K. C., Shen, S., Knudsen, B. S., et al. (2019). Path R-CNN for prostate cancer diagnosis and gleason grading of histological images. IEEE Transactions on Medical Imaging, 38(4), 945–954. http://dx.doi.org/10.1109/TMI.2018.2875868.
Litjens, G., Sánchez, C. I., Timofeeva, N., Hermsen, M., Nagtegaal, I., Kovacs, I., et al. (2016). Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Scientific Reports, 6(1), 26286. http://dx.doi.org/10.1038/srep26286.
López-Úbeda, P., Díaz-Galiano, M. C., Martín-Noguerol, T., Ureña-López, A., Martín-Valdivia, M.-T., & Luna, A. (2020). Detection of unexpected findings in radiology reports: A comparative study of machine learning approaches. Expert Systems with Applications, 160, Article 113647. http://dx.doi.org/10.1016/j.eswa.2020.113647.
Madabhushi, A., & Lee, G. (2016). Image analysis and machine learning in digital pathology: Challenges and opportunities. Medical Image Analysis, 33, 170–175. http://dx.doi.org/10.1016/j.media.2016.06.037.
Nagpal, K., Foote, D., Liu, Y., Chen, P.-H. C., Wulczyn, E., Tan, F., et al. (2019). Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer. Npj Digital Medicine, 2(1), 48. http://dx.doi.org/10.1038/s41746-019-0112-2.
Nezhad, M. Z., Sadati, N., Yang, K., & Zhu, D. (2019). A Deep Active Survival Analysis approach for precision treatment recommendations: Application of prostate cancer. Expert Systems with Applications, 115, 16–26. http://dx.doi.org/10.1016/j.eswa.2018.07.070.
Nir, G., Hor, S., Karimi, D., Fazli, L., Skinnider, B. F., Tavassoli, P., et al. (2018). Automatic grading of prostate cancer in digitized histopathology images: Learning from multiple experts. Medical Image Analysis, 50, 167–180. http://dx.doi.org/10.1016/j.media.2018.09.005.
Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In 2016 IEEE conference on computer vision and pattern recognition (CVPR) (pp. 779–788). http://dx.doi.org/10.1109/CVPR.2016.91.
Redmon, J., & Farhadi, A. (2018). YOLOv3: An incremental improvement. arXiv:1804.02767.
Ribeiro, M. G., Neves, L. A., do Nascimento, M. Z., Roberto, G. F., Martins, A. S., & Azevedo Tosta, T. A. (2019). Classification of colorectal cancer based on the association of multidimensional and multiresolution features. Expert Systems with Applications, 120, 262–278. http://dx.doi.org/10.1016/j.eswa.2018.11.034.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., et al. (2015). ImageNet Large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252. http://dx.doi.org/10.1007/s11263-015-0816-y.
Sadoughi, F., & Ghaderzadeh, M. (2014). A hybrid particle swarm and neural network approach for detection of prostate cancer from benign hyperplasia of prostate. Studies in Health Technology and Informatics, 205, 481.
Saritas, I., Ozkan, I. A., & Sert, I. U. (2010). Prognosis of prostate cancer by artificial neural networks. Expert Systems with Applications, 37(9), 6646–6650. http://dx.doi.org/10.1016/j.eswa.2010.03.056.
Shariat, S. F., Kattan, M. W., Vickers, A. J., Karakiewicz, P. I., & Scardino, P. T. (2009). Critical review of prostate cancer predictive tools. Future Oncology (London, England), 5(10), 1555–1584. http://dx.doi.org/10.2217/fon.09.121.
Shi, Y., Liu, Y., & Wu, P. (2013). Level set priors based approach to the segmentation of prostate ultrasound image using genetic algorithm. Intelligent Automation & Soft Computing, 19(4), 537–544. http://dx.doi.org/10.1080/10798587.2013.869111.
Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556.
Ström, P., Kartasalo, K., Olsson, H., Solorzano, L., Delahunt, B., Berney, D. M., et al. (2020). Artificial intelligence for diagnosis and grading of prostate cancer in biopsies: a population-based, diagnostic study. The Lancet Oncology, 21(2), 222–232.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2015). Rethinking the inception architecture for computer vision. arXiv:1512.00567.
Thangavel, K., & Manavalan, R. (2014). Soft computing models based feature selection for TRUS prostate cancer image classification. Soft Computing, 18(6), 1165–1176. http://dx.doi.org/10.1007/s00500-013-1135-2.
Underwood, D. J., Zhang, J., Denton, B. T., Shah, N. D., & Inman, B. A. (2012). Simulation optimization of PSA-threshold based prostate cancer screening policies. Health Care Management Science, 15(4), 293–309. http://dx.doi.org/10.1007/s10729-012-9195-x.


Vente, C. d., Vos, P., Hosseinzadeh, M., Pluim, J., & Veta, M. (2021). Deep learning regression for prostate cancer detection and grading in bi-parametric MRI. IEEE Transactions on Biomedical Engineering, 68(2), 374–383. http://dx.doi.org/10.1109/TBME.2020.2993528.
Wang, H., Roa, A. C., Basavanhally, A. N., Gilmore, H. L., Shih, N., Feldman, M., et al. (2014). Mitosis detection in breast cancer pathology images by combining handcrafted and convolutional neural network features. Journal of Medical Imaging, 1(3), 1–8. http://dx.doi.org/10.1117/1.JMI.1.3.034003.
Weng, L. (2018). Object detection part 4: Fast detection models. https://lilianweng.github.io/lil-log/2018/12/27/object-detection-part-4.html. (Accessed: 2020-10-30).
Zhou, N., Fedorov, A., Fennessy, F., Kikinis, R., & Gao, Y. (2017). Large scale digital prostate pathology image analysis combining feature extraction and deep neural network. arXiv:1705.02678.
