
1 Introduction

Cell detection and classification are often the first key steps in a wide range of histology image analysis tasks, such as investigating the interplay of the tumor and immune cells [1]. Multiplex immunohistochemistry (mIHC) is a multi-parametric protocol that allows simultaneous examination of the expression of multiple markers in a single section [2, 3]. Combined with robust cell detection and classification techniques, mIHC has the potential to allow detailed investigation of the spatial interaction and signalling of cells for the study of tumor heterogeneity [2].

The field of digital pathology has recently witnessed a surge of interest in the application of deep learning for cell classification [4], cell detection [5, 6], and cell counting [7,8,9,10]. However, automated cell detection and classification remain challenging due to variation in slide preparation and the morphological diversity of cells in shape and size. For example, closely located cells with weak boundaries are often difficult to discern [5,6,7,8]. Moreover, a parameter such as kernel size often needs to be fixed in advance [5], which cannot cater for cells with a range of sizes and shapes. Furthermore, the need to differentiate cells with subtle differences in marker expression intensity, as exemplified in Fig. 1a, adds another layer of complexity to mIHC image analysis.

In this paper, to address the above stated challenges, we developed a new cell detection method followed by a multi-stage CNN to analyse mIHC images of breast cancer. Our work has the following main contributions: (1) We developed the Cell Count RegularizeD Convolutional neural Network (ConCORDe-Net), inspired by inception-v3, which incorporates a cell counter and is designed for cell detection without the need to pre-specify parameters such as cell size. (2) The parameters of ConCORDe-Net were optimized using an objective function that combines conventional Dice overlap and a new cell count loss function, which regularizes the network parameters to detect closely located cells. (3) Our quantitative experiments show that ConCORDe-Net outperforms state-of-the-art methods at detecting closely located as well as weakly stained cells.

2 Materials

The dataset used in this paper consisted of mIHC whole-tumor slide images from patients with breast cancer, scanned at 40X resolution. A total of 175 regions/patches were annotated by experts from different parts of 6 whole-tumor images. The patches were extracted from different regions of the slides to capture the variation in the data, and were then randomly split into training (120), validation (28), and testing (27) sets. Within these patches, 20477 cells were annotated, belonging to five different cell types as depicted in Table 1. Illustrative examples of patches are shown in Fig. 1a. The distribution of the data for each cell type is presented in Table 1.

Table 1. Distribution of dataset

3 Methodology

3.1 Dot Annotation to Cell Pseudo-segmentation

The reference ground truth was a dot annotation at the center of each cell rather than a segmentation of the cell's spatial extent, which is generally a tedious task to produce. However, to train the proposed cell detection pipeline, a cell mask (G) and the number of cells (\(C_t\)) were needed as targets. \(C_t\) is simply the number of annotated cells in the input patch. Cell pseudo-segmentation was generated from the dot annotations using Eq. (1).

$$G(i, j) = \begin{cases} 1 & \text{if } d < r \\ 0 & \text{otherwise} \end{cases}$$
(1)

where G(i, j) is the pixel intensity value at location (i, j) of the pseudo-segmentation image (G), \(d\) is the Euclidean distance between pixel location (i, j) and the nearest cell dot annotation, and \(r\) is a distance threshold. \(r\) was empirically set to 4 pixels to guarantee that the pseudo-segmentations of neighbouring cells do not touch each other.
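For concreteness, Eq. (1) can be implemented with a Euclidean distance transform. Below is a minimal sketch, assuming dot annotations are given as (row, col) pixel coordinates; the function name and signature are ours, not from the paper's code:

```python
# Sketch of Eq. (1): build a pseudo-segmentation mask from dot annotations.
import numpy as np
from scipy.ndimage import distance_transform_edt

def pseudo_segmentation(dots, shape, r=4):
    """Return a binary mask G with G[i, j] = 1 iff pixel (i, j) lies
    within distance r of the nearest dot annotation."""
    seeds = np.zeros(shape, dtype=bool)
    rows, cols = zip(*dots)
    seeds[rows, cols] = True
    # Distance from every pixel to the nearest annotated cell center.
    d = distance_transform_edt(~seeds)
    return (d < r).astype(np.uint8)
```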

3.2 Cell Counter

Our proposed cell counter network is shown in Fig. 1b. It is a mapping function, \(f:\mathbb {R}^{n \times n} \rightarrow \mathbb {R}^{1}\), where n is the size of the input patch, which is 224 in our case. It consists of a feature extraction part and a regression part. The feature extraction part is composed of four consecutive convolutional layers with \(3\,\times \,3\) filters and “same” padding. The numbers of neurons in these layers are \(\{16, 32, 64, 128\}\), respectively. Every convolutional layer was followed by a max-pooling layer of size \(2\,\times \,2\) with stride 2 to reduce the dimensionality of the features from the previous layer. The regression part is a series of two dense layers of \(\{200,\ 1\}\) neurons. The output dense layer has a single neuron which computes the estimated number of cells in the input tensor or image. The activation of all convolutional and dense layers was set to the rectified linear unit (ReLU).
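This description translates directly into a few lines of Keras. The following is an illustrative sketch of such a counter; layer choices follow the text above, while the function name and the single-channel input are our assumptions:

```python
# Minimal sketch of the cell counter: four 3x3 conv layers
# ({16, 32, 64, 128} filters, "same" padding), each followed by 2x2
# max-pooling with stride 2, then dense layers of {200, 1} neurons.
from tensorflow.keras import layers, models

def build_cell_counter(n=224):
    inp = layers.Input(shape=(n, n, 1))  # pseudo-segmentation / prediction map
    x = inp
    for filters in (16, 32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(pool_size=2, strides=2)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(200, activation="relu")(x)
    out = layers.Dense(1, activation="relu")(x)  # estimated cell count
    return models.Model(inp, out, name="cell_counter")
```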

Parameters of all layers were randomly initialized using uniform Glorot initialization [11]. Optimization of the parameters was done using Adam [12] with a learning rate of \(10^{-4}\). Initially, we experimented with Euclidean loss [10] and exponential loss functions. However, these suffered from loss explosion during the initial epochs, so we propose a new cell count loss (\(C_l\)) function, given in Eq. (2).

$$C_{l} = 1 - \frac{1}{1 + \frac{1}{B}\sum_{j=1}^{B}\left|C_{pj} - C_{tj}\right|}$$
(2)

where the summation is over the B mini-batch images, and \(C_{pj}\) and \(C_{tj}\) are the predicted and true numbers of cells in the \(j^{th}\) image, respectively. Figure 2a shows the profile of \(C_l\) as a function of the cell count difference (\(C_{p} - C_{t}\)); it is bounded between 0 and 1.
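Eq. (2) can be written as a short Keras-compatible loss; this is a sketch under the definitions above, not the authors' exact implementation:

```python
# Sketch of the cell count loss in Eq. (2), bounded in [0, 1) as in Fig. 2a.
import tensorflow as tf

def cell_count_loss(c_true, c_pred):
    # Mean absolute count error over the mini-batch, mapped into [0, 1).
    mae = tf.reduce_mean(tf.abs(c_pred - c_true))
    return 1.0 - 1.0 / (1.0 + mae)
```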

Before integrating the cell counter model into the cell detection pipeline, it was trained and evaluated using pseudo-segmentations as input and the number of cells as output. To increase the amount of data, horizontal and vertical flipping were applied to all input training patches. The pseudo-segmentation is a binary image; however, once integrated with the cell detection model, the counter will be fed a tensor of floating-point values. Thus, morphological and intensity deformations were applied as follows. Morphological erosion using a rectangular structuring element of width \(w=2\) was performed on every patch with probability \(p=0.4\), where p and w were empirically chosen. Then, the images were multiplied by a random matrix of the same size as the image, with an empirically chosen probability \(p=0.4\). All elements of the random matrix were in the range [0.7, 1], setting pixel values between 0.7 and 1.
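As an illustration, this deformation might be sketched as follows, with w = 2 and p = 0.4 as stated; `rng` is an assumed NumPy random generator and the function name is ours:

```python
# Sketch of the morphological/intensity deformation applied to the binary
# pseudo-segmentation during cell counter training.
import numpy as np
from scipy.ndimage import grey_erosion

def deform(patch, rng, p=0.4, w=2):
    patch = patch.astype(np.float32)
    if rng.random() < p:
        # Erosion with a rectangular structuring element of width w.
        patch = grey_erosion(patch, size=(w, w))
    if rng.random() < p:
        # Random per-pixel intensity scaling into [0.7, 1].
        patch = patch * rng.uniform(0.7, 1.0, size=patch.shape)
    return patch
```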

Fig. 1.

(a) Sample patches representing different types of cells. (b) Schematic of the ConCORDe-Net architecture. \(3\,\times \,3\) and \(1\,\times \,1\) indicate the filter sizes of convolutional layers. TC = Transposed Convolution, MP = Max-pooling, C = Concatenate. The network has two outputs: a probability map and the predicted number of cells (\(C_p\)). The probability map was thresholded using an empirically optimized threshold \(T=0.85\) to convert it to a binary image. The center of every binary object represents the center of a cell. (c) Schematic of the inception module.

3.3 Cell Detection

Figure 1b shows the proposed ConCORDe-Net cell detection convolutional neural network. The input is a patch of size \(224\,\times \,224\,\times \,3\). The network has three parts: encoder, decoder, and cell counter. The encoder-decoder section is an extended version of U-Net [13]. The standard U-Net architecture [13] uses VGG-style blocks in its encoder and decoder sections. We propose to use the inception-v3 module shown in Fig. 1c instead of the VGG block. The parallel filters of varying size in the inception block enable the network to extract multi-scale features in a given layer. The encoder contains three inception modules, the first two of which were followed by 2D max-pooling layers. The decoder is composed of transposed convolution, concatenation, and inception modules. The convolutional layer with \(1\,\times \,1\) filters at the end of the decoder reduces the dimension of the tensor from \(224\,\times \,224\,\times \,32\) to \(224\,\times \,224\,\times \,1\). The output of the decoder was taken as the cell location prediction map (P) and connected to the pretrained cell counter model (explained in Sect. 3.2), which generates the predicted number of cells (\(C_p\)). The activation of all layers was set to ReLU, except the last layer in the decoder section, which uses a sigmoid. Therefore, the cell detection architecture has two outputs: the cell location prediction map and the predicted number of cells.
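As an illustration, one such inception-style block could be sketched in Keras as follows. The branch layout follows Fig. 1c schematically, but the per-branch filter counts are our assumptions, since the paper does not list them:

```python
# Sketch of an inception-style block used in place of a VGG block:
# parallel 1x1, 3x3, stacked-3x3 (factorized 5x5), and pooling branches,
# concatenated along channels to yield multi-scale features.
from tensorflow.keras import layers

def inception_block(x, filters=32):
    b1 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(filters, 3, padding="same", activation="relu")(b3)
    b5 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b5 = layers.Conv2D(filters, 3, padding="same", activation="relu")(b5)
    b5 = layers.Conv2D(filters, 3, padding="same", activation="relu")(b5)
    bp = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    bp = layers.Conv2D(filters, 1, padding="same", activation="relu")(bp)
    return layers.Concatenate()([b1, b3, b5, bp])
```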

The parameters of the cell counter model were transferred from the model pretrained on cell pseudo-segmentations, as explained in Sect. 3.2. Parameters of the other layers were randomly initialized using uniform Glorot initialization [11] and optimized using Adam [12] with a learning rate of \(10^{-4}\) and the objective function in Eq. (3). The cell detection loss (\(D_l\)) in Eq. (3) has two parts: the first is a Dice overlap loss, and the second is the cell count loss.

$$D_{l} = \left(1 - \frac{2\sum_{j=1}^{B}\sum_{i=1}^{N} p_{ij}\, g_{ij}}{1 + \sum_{j=1}^{B}\sum_{i=1}^{N} p_{ij} + \sum_{j=1}^{B}\sum_{i=1}^{N} g_{ij}}\right) + K\left(1 - \frac{1}{1 + \frac{1}{B}\sum_{j=1}^{B}\left|C_{pj} - C_{tj}\right|}\right)$$
(3)

where the summations in the first part are over the B images of a mini-batch and the N pixels of the ground truth image G and prediction map P, with \(g_{ij} \in G\) and \(p_{ij} \in P\). The second part is the same as Eq. (2), but weighted by an empirically optimized constant \(K=0.3\).
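Putting Eq. (3) together, a sketch of the combined objective might look as follows; symbols are as defined above, and this is our reading of the equation rather than the authors' code:

```python
# Sketch of Eq. (3): Dice overlap loss on the prediction map plus the
# count loss of Eq. (2) weighted by K = 0.3.
import tensorflow as tf

K_COUNT = 0.3

def detection_loss(g, p, c_true, c_pred):
    intersection = tf.reduce_sum(p * g)
    dice_loss = 1.0 - 2.0 * intersection / (
        1.0 + tf.reduce_sum(p) + tf.reduce_sum(g))
    count_loss = 1.0 - 1.0 / (
        1.0 + tf.reduce_mean(tf.abs(c_pred - c_true)))
    return dice_loss + K_COUNT * count_loss
```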

Horizontal and vertical flipping was applied to training patches to increase the amount and diversity of our data.

3.4 Cell Classification

In our dataset, there were five types of cells: CD8, GAL8+ pSTAT−, GAL8+ pSTAT+ strong, GAL8+ pSTAT+ moderate, and GAL8+ pSTAT+ weak. GAL8+ pSTAT+ cells were divided according to the expression level of pSTAT into strong, moderate, and weak. However, discriminating among GAL8+ pSTAT+ cells is challenging, even for experts. Inspired by the principle of divide-and-conquer algorithms, we converted the problem into a multi-stage classification. The first classifier (classifier1) differentiates between CD8, GAL8+ pSTAT−, and all GAL8+ pSTAT+ cells. A second classifier (classifier2) was then trained to further divide GAL8+ pSTAT+ cells into GAL8+ pSTAT+ strong, GAL8+ pSTAT+ moderate, and GAL8+ pSTAT+ weak.

Both classifiers were trained using \(28\,\times \,28\,\times \,3\) patches, which cover the whole cell area for the majority of cells. A similar network architecture was used for both classifiers. Each classifier has a feature extraction section and a classification section. The feature extraction part is a modified version of the VGG architecture [14], consisting of four convolutional layers of \(\{32,\ 64,\ 128,\ 128\}\) neurons with filter size \(3\,\times \,3\), stride 1, and “same” padding. Each convolutional layer was followed by \(2\,\times \,2\) max-pooling. The classification part consists of two dense layers of \(\{200,\ 3\}\) neurons with a dropout layer (\(\text {rate}=0.3\)) in between. Softmax activation was applied to the last dense layer and ReLU to the other layers. A categorical cross-entropy objective function was used. Parameters of the layers were initialized using uniform Glorot initialization [11] and optimized using Adam [12] with a learning rate of \(10^{-4}\). To handle class imbalance, in each mini-batch an equal number of patches from all cell types was fed to the network, and the number of iterations was determined by the number of patches in the most under-represented class (a sketch of this balanced sampling is shown below). Moreover, runtime augmentation of flipping and zooming with scale \(s \in [0.85,\ 1.15]\) was applied with probability \(p=0.4\), where s and p were empirically optimized.
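A minimal sketch of such class-balanced sampling, assuming patches are grouped by label in a dictionary; `patches_by_class` and `per_class` are illustrative names, not from the paper:

```python
# Sketch of class-balanced mini-batch sampling: each batch draws an equal
# number of patches per cell type.
import numpy as np

def balanced_batches(patches_by_class, per_class=8, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    labels = sorted(patches_by_class)
    while True:
        xs, ys = [], []
        for label in labels:
            pool = patches_by_class[label]
            idx = rng.choice(len(pool), size=per_class, replace=True)
            xs.append(pool[idx])
            ys += [label] * per_class
        yield np.concatenate(xs), np.array(ys)
```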

4 Results and Discussion

The proposed deep learning based unified cell detection and classification pipeline was evaluated on mIHC whole-tumor slide images. The proposed approach was implemented in Python, using the Keras API [15] for the deep learning pipeline.

To investigate whether convolutional neural networks (CNNs) can regress the number of cells in an input image, the proposed cell counter model was trained and then evaluated on pseudo-segmentation images of the test patches before being integrated into ConCORDe-Net. A Pearson correlation of \(r=0.999\) was obtained between the true and predicted numbers of cells. This high correlation supports that the proposed cell counter network can be used as a cell count approximation function.

Quantitatively, we evaluated ConCORDe-Net using standard metrics: precision, recall, and F1-score. A detection was considered a true positive if it lay within a Euclidean distance of 8 pixels (2r, where r is defined in Eq. (1)) of a ground truth annotation.
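For illustration, this matching rule might be implemented greedily as follows; this is a sketch, since the exact matching procedure is not specified in the paper:

```python
# Sketch of detection scoring: a detection is a true positive if it falls
# within `radius` pixels of a not-yet-matched ground truth dot.
import numpy as np

def detection_f1(pred, gt, radius=8):
    if len(gt) == 0 or len(pred) == 0:
        return 0.0, 0.0, 0.0
    matched = np.zeros(len(gt), dtype=bool)
    tp = 0
    for p in pred:
        d = np.linalg.norm(np.asarray(gt) - np.asarray(p), axis=1)
        d[matched] = np.inf  # each ground truth dot matches at most once
        j = int(np.argmin(d))
        if d[j] <= radius:
            matched[j] = True
            tp += 1
    precision = tp / len(pred)
    recall = tp / len(gt)
    f1 = 2 * precision * recall / max(precision + recall, 1e-8)
    return precision, recall, f1
```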

Moreover, we compared ConCORDe-Net with the state-of-the-art methods MapDe [5] and U-Net [13], as shown in Table 2. The same data augmentation as explained in Sect. 3.3 was applied to all models in the table. U-Net [13] was trained to regress the pseudo-segmentation explained in Sect. 3.1. The output of each CNN model in Table 2 is a probability map that approximates the pseudo-segmentation. The centers of cells were regressed from the probability map as follows. Firstly, a global threshold maximizing the F1-score was applied for each model to generate a binary image. Secondly, a hole-filling morphological operation was applied to remove holes created by thresholding. Finally, the center of every connected component was computed, which corresponds to the center of a cell. ConCORDe-Net achieved the highest recall and F1-score compared to the state-of-the-art methods MapDe [5] and U-Net [13]. Moreover, for both ConCORDe-Net and U-Net [13], integrating the cell counter CNN improved the cell detection F1-score. For MapDe [5], we used the parameters specified in the paper; tuning the dimensions of its “mapping filter” might improve its results.
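A sketch of this three-step post-processing using SciPy, with the ConCORDe-Net threshold T = 0.85 as the default; the function name is ours:

```python
# Sketch of post-processing: global threshold, hole filling, then the
# centroid of each connected component as a cell center.
import numpy as np
from scipy.ndimage import binary_fill_holes, label, center_of_mass

def probability_map_to_centers(prob_map, threshold=0.85):
    binary = prob_map > threshold
    binary = binary_fill_holes(binary)
    labeled, n = label(binary)
    # One (row, col) centroid per connected component ~ one cell center.
    return center_of_mass(binary, labeled, range(1, n + 1))
```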

Table 2. Cell detection performance comparison. Model1 is ConCORDe-Net with the cell counter removed. U-Net [13] + Cell Counter is the CNN obtained by integrating the cell counter CNN into the original U-Net [13] architecture.

The precision of ConCORDe-Net was lower than that of the three other methods for the following reasons: (1) ConCORDe-Net identifies weakly stained cells that were missed by the other methods, and which could be missed by experts too. (2) It over-detects large cells when there is more than one intensity peak within the cell. We believe that these limitations could be mitigated by training and validating on a larger cohort.

Fig. 2.

(a) Cell count loss profile. ROC and AUC evaluation of (b) classifier1 and (c) classifier2 on test data, where s = strong, m = moderate, w = weak.

The performance of the proposed classifier models was evaluated using receiver operating characteristic (ROC) curves, area under the curve (AUC), accuracy, precision, recall, and F1-score on the test data shown in Table 1. ROC curves and AUC values for classifier1 are presented in Fig. 2b. An AUC value greater than 0.99 was achieved for all cell types. The overall accuracy computed on the original distribution of the data was around \(98\%\). Moreover, precision, recall, and F1-score were all 0.98. Figure 2c shows the ROC curves and AUC values for classifier2. For all cell types, the AUC value was higher than 0.97, and an overall accuracy of around \(93\%\) was obtained. After cascading the two classifiers, an overall accuracy of \(96.5\%\) was achieved.

Fig. 3.

Illustrative examples of the proposed unified cell detection and classification on test data, and comparison with the state-of-the-art methods MapDe [5] and U-Net [13]. White, red, yellow, cyan, and dark green points represent CD8, GAL8+ pSTAT−, GAL8+ pSTAT+ strong, GAL8+ pSTAT+ moderate, and GAL8+ pSTAT+ weak cells, respectively. The red circles on the top-left input images highlight cells that were missed by MapDe [5] and U-Net [13] but detected by ConCORDe-Net. (Color figure online)

Figure 3 shows visual outputs of ConCORDe-Net followed by cell classification, and a comparison with MapDe [5] and U-Net [13], which use Dice overlap loss as the objective function. ConCORDe-Net is better at discerning touching cells with weak boundary gradients and at detecting weakly stained GAL8+ pSTAT− cells than MapDe [5] and U-Net [13]. By regularizing the objective function with the cell count, the network was able to learn patterns that separate closely located cells and identify weakly stained cells.

5 Conclusions

In this paper, we proposed a deep learning based unified cell detection and classification method for mIHC whole-tumor slide images of breast cancer. A cell count regularized CNN was employed for cell detection, followed by a multi-stage CNN to classify cells. The parameters of the cell detection architecture were learnt using a new objective function which optimizes Dice overlap and cell count. An F1-score of 0.873 was achieved on test data, outperforming the state-of-the-art methods MapDe [5] and U-Net [13]. Our proposed approach is better at detecting closely located and weakly stained cells than MapDe [5] and U-Net [13]. Moreover, a \(96.5\%\) classification accuracy was achieved. Our experiments show that incorporating problem-specific knowledge such as cell count improves the robustness of the cell detection algorithm.