
Diagnostic and Interventional Imaging 104 (2023) 24−36

Review/Gastrointestinal imaging

Artificial intelligence: A review of current applications in hepatocellular carcinoma imaging
Anna Pellat a,b,*, Maxime Barat b,c, Romain Coriat a,b, Philippe Soyer b,c, Anthony Dohan b,c
a Department of Gastroenterology and Digestive Oncology, Hôpital Cochin, AP-HP, 75014 Paris, France
b Université Paris Cité, 75006 Paris, France
c Department of Radiology, Hôpital Cochin, AP-HP, 75014 Paris, France

ARTICLE INFO

Keywords: Hepatocellular carcinoma; Artificial intelligence; Machine learning; Deep learning; Diagnosis; Treatment

ABSTRACT

Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer and currently the third-leading cause of cancer-related death worldwide. Recently, artificial intelligence (AI) has emerged as an important tool to improve clinical management of HCC, including for diagnosis, prognostication and evaluation of treatment response. Different AI approaches, such as machine learning and deep learning, are both based on the concept of developing prediction algorithms from large amounts of data, or big data. The era of digital medicine has led to a rapidly expanding amount of routinely collected health data which can be leveraged for the development of AI models. Various studies have constructed AI models by using features extracted from ultrasound imaging, computed tomography imaging and magnetic resonance imaging. Most of these models have used convolutional neural networks. These tools have shown promising results for HCC detection, characterization of liver lesions and liver/tumor segmentation. Regarding treatment, studies have outlined a role for AI in evaluation of treatment response and improvement of pre-treatment planning. Several challenges remain to fully integrate AI models in clinical practice. Future research is still needed to robustly evaluate AI algorithms in prospective trials, and improve interpretability, generalizability and transparency. If such challenges can be overcome, AI has the potential to profoundly change the management of patients with HCC. The purpose of this review was to sum up current evidence on AI approaches using imaging for the clinical management of HCC.
© 2022 Société française de radiologie. Published by Elsevier Masson SAS. All rights reserved.

1. Introduction

Artificial intelligence (AI) refers to intelligence demonstrated by machines, as opposed to the natural intelligence of human beings. Indeed, the definition of intelligence itself is controversial, especially in medical imaging, where AI refers to machine learning (ML) algorithms developed to improve the analysis of medical images and to discover new biomarkers. ML includes multiple different techniques, such as deep learning (DL), based on the concept of developing prediction algorithms from large amounts of data, or big data (Fig. 1). The era of digital medicine has led to a rapidly expanding amount of routinely collected health data (RCD) extracted from electronic health records or other types of sources. By leveraging this large-scale data, we can see how AI can be a powerful ally to human-driven medicine. Various ML techniques have been widely used in medical studies [1,2]. Advances in the field of AI, primarily due to hardware advancements, have led to the development of DL, one of the most complex subsets of ML, with multi-layered neural network algorithms, using techniques such as convolutional neural networks (CNN), which have proven useful in the analysis of endoscopic images [3], radiological images [4], ophthalmological images [5] or histological samples [6,7]. Although AI has many applications in medicine, it also shows limitations.

Hepatocellular carcinoma (HCC), which usually occurs in the setting of liver cirrhosis, accounts for at least 80% of all primary liver cancers and is the third leading cause of cancer-related mortality worldwide [8]. Its incidence has been rising in the past years, with the highest incidences currently observed in Northern Africa, South-Eastern Asia and Eastern Asia [9]. With that in mind, populations at risk need lifetime surveillance for early detection of HCC in order to access curative-intent treatments, which can be particularly challenging on a background of chronic liver disease. Despite some improvements in HCC treatments, overall survival is poor, especially in the metastatic setting.

Abbreviations: AI, Artificial intelligence; ANN, Artificial neural networks; AUC, Area under the curve; BCLC, Barcelona clinic liver cancer; CAD, Computer-aided diagnosis; CEUS, Contrast-enhanced ultrasound; CNN, Convolutional neural network; CT, Computed tomography; DL, Deep learning; HCC, Hepatocellular carcinoma; ICC, Intrahepatic cholangiocarcinoma; LI-RADS, Liver Imaging Reporting and Data System; LR, Logistic regression; ML, Machine learning; MRI, Magnetic resonance imaging; MVI, Microvascular invasion; RCD, Routinely collected health data; RF, Random forest; SVM, Support vector machine; TACE, Transcatheter arterial chemoembolization
* Corresponding author: anna.pellat@aphp.fr
E-mail address: anna.pellat@aphp.fr (A. Pellat).

https://doi.org/10.1016/j.diii.2022.10.001
2211-5684/© 2022 Société française de radiologie. Published by Elsevier Masson SAS. All rights reserved.
HCC is a particular tumor to study with AI approaches because it shows specific imaging features that can sometimes be used to make a diagnosis, thus obviating the need for histopathological analysis. Therefore, various AI approaches have been developed over the years to try to help solve the aforementioned issues.

The purpose of this review was to sum up the recent evidence of AI approaches based on radiological modalities in the management of HCC, focusing on diagnosis and treatment.

Fig. 1. Diagram illustrates the artificial intelligence process when using routinely collected health data.

2. Artificial intelligence approaches in medicine and imaging: what to keep in mind for a better understanding

The massive trove of RCD offers a great opportunity to develop AI-based models (Fig. 1). The optimal AI approach usually depends on the type and size of the medical data at hand, as no model works best for all situations. ML has been increasingly applied in the field of radiology, where it is being studied in image diagnosis, organ segmentation, image quality improvement, medical report writing and to discover new biomarkers [10,11].

ML is based upon training of an algorithm on data (or training data). Unlike traditional techniques, ML does not rely on explicit rules, and does not consider an a priori type of relationship between potential predictors. It learns the implicit features, or patterns, of the data during training. This allows the model to generalize to unseen data and to be leveraged as a predictive tool. Supervised ML relies on labeled data, whereas unsupervised techniques leverage unlabeled data. Different ML techniques, such as logistic regression (LR), decision tree, random forest (RF), support vector machine (SVM), and k-nearest neighbor have all been used in various medical studies [1,2]. ML models are well suited to tabular health data, such as a patient’s clinicopathological or laboratory parameters.

DL is a subset of ML, based on artificial neural networks (ANN), processing large amounts of data through multiple layers of artificial neurons. The architecture of neural networks is inspired by the complex architecture of the human brain and its biological networks of neurons. Over the years, deep neural nets have proven to be an effective technology for processing and understanding patterns in very large datasets of images, texts and even audio. This is due to these models’ ability to automatically discover patterns in the raw data, where other ML techniques would need to be provided with an explicit set of features [10]. CNNs, which are commonly used in radiological AI approaches and on image data in general, are based on convolutional layers of neural nets.

On the basis of access to a rapidly expanding quantity of radiology images and progress in development of AI models, a new analysis approach has emerged in the field of imaging: radiomics. Radiomics allows for the extraction of a great amount of quantitative data from medical images to identify specific features quantifying tumor phenotypic characteristics. Radiomics is not in extensive use in clinical practice because of the lack of externally validated robust models and the need for manual segmentation of tumors, which limits its development. However, progress in the field of AI could contribute to increasing the clinical applicability and generalizability of radiomics in the future [12].

3. Artificial intelligence approaches for the diagnosis of hepatocellular carcinoma

3.1. Background

Current international guidelines recommend surveillance for early HCC detection in high-risk populations [13–15]. In these patients, ultrasonography is the most appropriate test for surveillance [13]. Ultrasound is highly dependent on operator experience, equipment quality and patient morphology, all of which can impact the quality of the exam. Hence, it is not always easy to detect focal liver lesions, especially in cirrhotic patients, or to characterize the encountered lesion. According to meta-analysis results, the sensitivity of conventional ultrasound for detection of HCC is only 59% to 78% [16,17]. Different ultrasound modalities have emerged in order to improve sensitivity and specificity, such as contrast-enhanced ultrasound (CEUS), which has improved the sensitivity for HCC detection [16,17].

Since 2001, non-invasive imaging diagnosis of HCC in the setting of cirrhosis has been accepted [13–15]. Multiphasic computed tomography (CT) and magnetic resonance imaging (MRI) are the only two validated contrast-enhanced imaging techniques for the diagnosis of HCC [13–15]. In practice, when a new liver lesion is found by ultrasound, patients will be referred for further imaging with CT or MRI. Nevertheless, the radiological diagnosis of HCC is not always simple, as there is a high prevalence of atypical radiological features and other malignant tumors, such as intrahepatic cholangiocarcinoma (ICC), can mimic HCC. Therefore, patients with liver lesions presenting atypical features will need either histological confirmation or close follow-up. In this context, various AI models have been constructed to help increase the accuracy of the non-invasive diagnosis of HCC. We will describe studies focused on diagnosis with AI approaches using radiological parameters. Mentioned studies are summarized in Tables 1, 2 and 3.

3.2. Artificial intelligence based on ultrasound data

3.2.1. Improving detection of focal liver lesions and/or distinction from cirrhotic liver

To assess and differentiate different stages of liver disease, Bharti et al. proposed a model combining three AI classifiers using data from ultrasound images: the classification accuracy of the ensemble model was 96.6% [18]. More recently, Brehar et al. showed that a CNN model, based on ultrasound images from two different ultrasound machines (two datasets), outperformed conventional ML
Table 1
Summary of studies evaluating artificial intelligence approaches based on ultrasound imaging for the diagnosis of hepatocellular carcinoma.

Author, year [Ref.] | Study outcome | Type and size of the training and/or test and/or validation set(s) (n) | Artificial intelligence approach(es) | Validation method(s) as described in each study | Performance value(s) (best values unless stated otherwise) | Sensitivity (%)/Specificity (%) (best values unless stated otherwise)

Ultrasound imaging
Virmani J, 2014 [20] | Characterization of focal liver lesions (including HCC) | 108 images | NNE | Internal validation and testing | Accuracy: 95% | N.R.
Bharti P, 2018 [18] | Distinction of different liver diseases (from normal liver to HCC) | 189 images from 94 patients: 48 normal liver, 50 chronic liver, 50 cirrhosis, 41 HCC | Ensemble model with 3 classifiers: k-NN, SVM, rotation forest | Internal validation | Accuracy: 96.6% | 95.5–96.9/98.0–99.8
Schmauch B, 2019 [21] | Detection and characterization of focal liver lesions as benign (cyst, FNH and angioma) vs. malignant (HCC, metastasis) | Training set: 367 images from 367 patients with radiological reports (109 with focal liver lesions); Test set: 177 patients | CNN | Internal validation and testing | Training set: Mean AUC: 0.935 for detection, Mean AUC: 0.916 for characterization; Test set: Mean AUC: 0.891 for detection | N.R.
Brehar R, 2020 [19] | Distinction between HCC and cirrhosis | 2 datasets: 200 patients with 823 images; 68 patients with 508 images (80% for training and 20% for testing) | CNN | Internal validation and testing | Accuracy 84.8–91%, AUC 0.91–0.95 | 86.79–94.37/82.95–88.38
Yang Q, 2020 [22] | Characterization of focal liver lesions (as benign vs. malignant) | Available pathological results for liver lesions. Training set: 16500 images/1446 patients; Internal validation set: 4125 images/369 patients; External validation set: 3718 images/328 patients | CNN: 3 models with and without background and clinical parameters | Internal and external validations | Training: AUC: 0.765–0.925; Internal validation: AUC: 0.859–0.966; External validation: AUC: 0.750–0.924 | N.R.
Mao B, 2021 [23] | Characterization of primary and secondary malignant liver lesions | 114 patients (pathologically confirmed liver cancer); Training set: 91 patients; Test set: 23 patients | LR, k-NN, MLP, RF, SVM | Internal validation and testing | Mean accuracy: 72.9–84.3%, Mean AUC: 0.737–0.816 (average values for all models) | 77.5–86.8/66.7–88.0 (average values for all models)
Ren S, 2021 [24] | Prediction of pathological grading of HCC | 193 patients (pathologically confirmed HCC); Training set: 128; Test set: 32; External validation set: 33 | SVM; 3 models: ultrasomics features, clinical factors alone and combined | Internal validation, testing and external validation | Training: AUC: 0.788–0.977, Accuracy: 74.22–92.19%; Test: AUC: 0.720–0.874, Accuracy: 68.75–84.38%; External validation: AUC: 0.770–0.849, Accuracy: 66.67–81.82% | Training: 64.71–90.20/80.52–93.51; Test: 57.14–85.71/72.00–84.00; External validation: 75.00/61.90–85.71
Ren S, 2021 [25] | Differentiation between HCC and ICC | 226 patients (pathologically confirmed HCC or ICC); Training set: 149; Test set: 38; External validation set: 39 | SVM; 3 models: ultrasomics features, clinical factors and combined | Internal and external validation | Training: AUC: 0.840–0.975, Accuracy: 70.47–89.26; Test: AUC: 0.711–0.936, Accuracy: 71.05–86.84; External validation: AUC: 0.730–0.874, Accuracy: 69.23–87.18 | Training: 77.42–96.77/68.64–87.29; Test: 70.00–90.00/71.43–85.71; External validation: 66.67–88.87/66.67–86.67

Contrast-enhanced ultrasound imaging
Ta CN, 2018 [27] | Characterization of focal liver lesions as benign vs. malignant | 106 lesions | SVM, ANN | Internal validation | AUC: 0.829–0.883, Accuracy: 80.0–81.1% | N.R.
Guo LH, 2018 [29] | Differentiating benign from malignant liver lesions | 93 lesions: 47 malignant and 46 benign | MKL | Internal validation | Accuracy: 90.41% | 93.56/86.89
Huang Q, 2020 [28] | Differentiation between atypical HCC and FNH | 257 images | SVM | Internal validation | AUC: 94.40% | 94.76/93.62
Wang W, 2021 [30] | Preoperative histological grading | 235 HCC lesions: 65 high grade and 170 low grade lesions | SVM; 3 models: ultrasomics features, clinical factors alone and combined | Internal and external validation | AUC: 0.665–0.785 | N.R.

ANN: Artificial neural network; AUC: Area under the curve; CNN: Convolutional neural network; FNH: Focal nodular hyperplasia; HCC: Hepatocellular carcinoma; ICC: Intrahepatic cholangiocarcinoma; k-NN: k-nearest neighbor; LR: Logistic regression; MKL: Multiple kernel learning; MLP: Multilayer perceptron; NNE: Neural network ensemble; N.R.: Not reported; RF: Random forest; SVM: Support vector machine
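
The performance columns of Table 1 combine several metrics (accuracy, sensitivity, specificity and AUC). As a reading aid, the following minimal Python sketch shows how these quantities are commonly computed for a binary benign/malignant classifier. The labels, scores, the 0.5 decision threshold and the function names are illustrative assumptions and are not taken from any of the cited studies.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = malignant."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Assumes at least one positive
    and one negative case.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 4 malignant and 4 benign lesions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)            # share of malignant lesions detected
specificity = tn / (tn + fp)            # share of benign lesions cleared
accuracy = (tp + tn) / len(y_true)      # share of all lesions classified correctly
auc_value = auc(y_true, scores)         # threshold-independent ranking quality

print(sensitivity, specificity, accuracy, auc_value)
```

Sensitivity, specificity and accuracy depend on the chosen threshold, whereas the AUC summarizes performance across all thresholds, which is one reason the studies in Table 1 are not directly comparable when they report different metrics.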


Table 2
Summary of studies evaluating artificial intelligence approaches based on computed tomography for the diagnosis of hepatocellular carcinoma.

Author, year [Ref.] | Study outcome | Type and size of the training and/or test and/or validation set(s) (n) | Artificial intelligence approach(es) | Validation method(s) as described by each study | Performance values (best values unless stated otherwise) | Sensitivity (%)/Specificity (%) (best values unless stated otherwise)

Characterization of liver lesions
Yasaka K, 2018 [35] | Characterization of liver lesions: 1/classification in five categories; 2/malignant (HCC and non-HCC liver cancers) vs indeterminate and benign lesions (hemangiomas and cysts) | Training set: 460 patients; Test set: 100 patients; CT images from non-contrast-enhanced, arterial, and delayed phases | CNN | Internal validation and testing | 1/Classification: Training: Median accuracy 95–97%; Test: Median accuracy 48–84%; 2/Malignant vs the rest: Test: Median AUC 0.61–0.92 | 1/Classification: Test: Se: 11–100
Khan A A, 2019 [32] | Characterization of liver lesions: malignant versus benign (hemangioma) | 179 patients: 98 benign and 81 malignant lesions | SVM, k-NN, ensemble classifier | Internal validation and testing | Accuracy: 96.6–98.3% | Sp for HCC: 94.23–97.03
Todoroki Y, 2019 [39] | Characterization of focal liver lesions (five categories) | 89 patients | CNN | Internal validation and testing | N.R. | Se: 79–100
Li J, 2019 [37] | Classification of nodular, diffuse and massive HCC | 165 CT images: 46 diffuse tumors, 43 nodular tumors and 76 massive tumors | ANN, SVM, CNN | Internal validation and testing | Average AUC: 0.957–0.990, Average accuracy: 92.6–98.4% (average values for all three models) | N.R.
Mokrane FZ, 2020 [31] | Characterization of indeterminate liver lesions in cirrhotic patients (HCC and non-HCC) | Training set: 106 patients; Calibration set: 36 patients; External validation set: 36 patients (77% of biopsy-confirmed HCC) | Radiomics; k-NN, RF, SVM | Internal and external validation | Training: AUC: 0.79–0.81; Calibration: AUC: 0.67–0.69; External validation: AUC: 0.66 | Training: 81/72; External validation: 70/54
Shi W, 2020 [36] | Differentiation of HCC from other focal liver lesions | 342 patients with 449 lesions (194 HCC); Training set: 359 lesions; Test set: 90 lesions | MP-CDN (3 models) | Internal validation and testing | Accuracy 81.1–85.6%, AUC 0.862–0.925 | 74.4–92.3/72.5–94.1
Zhou J, 2020 [38] | Characterization of liver lesions: 1/malignant (HCC, ICC and metastasis) versus benign lesions (cyst, hemangioma, and FNH); 2/classification in six categories of liver lesions | 616 focal liver lesions | CNN | Internal validation and testing | 1/Detection: Average precision of 82.8%; 2/Classification: Binary classification: Accuracy: 82.5%, AUC: 0.921; Six-class classification: Accuracy: 73.4%, AUC: 0.766–0.983 | 2/Classification: Binary classification: 76.6–88.4/76.6–88.4; Six-class classification: 46.6–93.1/91.9–98.6
Mao B, 2020 [34] | Grading of HCC | Training set: 237 patients with HCC; Test set: 60 patients with HCC | Radiomics; eXtreme Gradient Boosting | Internal validation and testing | Training: AUC: 0.6915–0.9964, Accuracy: 61.18–97.05%; Test: AUC: 0.6128–0.8014, Accuracy: 48.3–70.00% | Training: 60.67–95.51/51.35–80.41; Test: 43.48–65.22/37.84–81.08
Ponnoprat D, 2020 [40] | Differentiation between HCC and ICC | 187 HCC and 70 ICC lesions | CNN, SVM | Internal validation | Accuracy 88%, TPR for HCC 95.18%, TPR for ICC 69.44% | N.R.
Krishan A, 2021 [33] | 1/Detection and 2/classification of malignant liver lesions (HCC and secondary liver lesions) | 1638 CT scans | Several individual classifiers and multi-level ensemble | Internal validation | Detection: Accuracy 98.39–100%, AUC 0.99–1.00; Classification: Accuracy 76.38–87.01%, AUC 0.77–0.99 | N.R.

Liver and/or tumor segmentation
Li W, 2015 [41] | Tumor segmentation in the liver | 26 portal phase enhanced CT images | CNN | Internal validation and testing | Precision of 82.67 ± 1.43% | N.R.
Vivanti R, 2017 [46] | Detection and segmentation of new liver tumors | 246 tumors (97 new tumors) | CNN | Internal validation | TR 72–86% for detection | N.R.
Sun C, 2017 [42] | Tumor segmentation in the liver | CT images taken from 2 databases; Training set: 3809 images (2785+1024) | FCN | Internal validation and testing | VOE 15.6–38.2 and VOE 8.1–19.1 for each data set | N.R.
Chlebus G, 2018 [43] | Tumor segmentation in the liver | LiTS challenge dataset: 130 training CT, 70 test CT | FCN and RF | Internal validation and testing | Mean Dice score 0.69 | N.R.
Chen L, 2019 [44] | Tumor segmentation in the liver | LiTS challenge dataset: 130 training CT, 70 test CT | FCN | Internal validation and testing | Mean Dice score 0.684 | N.R.
Chen WF, 2020 [45] | Tumor segmentation in the liver | LiTS challenge dataset: 130 training CT, 70 test CT | SED | Internal validation and testing | Accuracy: 0.992; AUC: 0.95; Dice score: 0.75 | N.R.

ANN: Artificial neural networks; AUC: Area under the curve; CNN: Convolutional neural network; FCN: Fully convolutional network; FNH: Focal nodular hyperplasia; HCC: Hepatocellular carcinoma; ICC: Intrahepatic cholangiocarcinoma; k-NN: k-nearest neighbor; LiTS: Liver tumor segmentation; MP-CDN: Multiphase convolutional dense network; N.R.: Not reported; RF: Random forest; SED: Successive Encoder-Decoder; Se: Sensitivity; Sp: Specificity; SVM: Support vector machine; TPR: True positive ratio; VOE: Volumetric overlap error.
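
The segmentation rows of Table 2 report Dice scores and volumetric overlap error (VOE). A minimal sketch of the usual definitions is given below, with Dice = 2|A∩B| / (|A| + |B|) and VOE = 1 − |A∩B| / |A∪B|, where A is the predicted mask and B the reference mask. It assumes binary masks flattened into Python lists; the masks and function names are toy, hypothetical examples, and the cited studies may differ in implementation details (for instance, VOE is returned here as a fraction, whereas it is often reported as a percentage).

```python
def dice_score(pred, truth):
    """Dice similarity coefficient between two binary masks (1.0 = perfect overlap)."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def volumetric_overlap_error(pred, truth):
    """VOE = 1 - Jaccard index (0.0 = perfect overlap, 1.0 = no overlap)."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return 0.0 if union == 0 else 1.0 - intersection / union


# Toy 1D "masks" standing in for flattened CT voxels (hypothetical values).
truth = [0, 1, 1, 1, 1, 0, 0, 0]   # reference (manual) tumor delineation
pred = [0, 0, 1, 1, 1, 1, 0, 0]    # model output, shifted by one voxel

dice = dice_score(pred, truth)
voe = volumetric_overlap_error(pred, truth)
print(dice, voe)
```

Because Dice rewards overlap and VOE penalizes its absence, a higher Dice score and a lower VOE both indicate better segmentation; the two are linked through the Jaccard index J by Dice = 2J / (1 + J).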

models (SVM, RF, multi-layer perceptron and AdaBoost) in differentiating HCC from cirrhotic parenchyma [19].

3.2.2. Characterization of focal liver lesions

Virmani et al. developed a neural network ensemble-based computer-aided diagnosis (CAD) system that could distinguish normal liver from four different liver lesions with an accuracy of 95% [20]. Diagnosis for included liver lesions was confirmed by experienced participating radiologists, clinical follow-up and other associated findings [20]. Later, Schmauch et al. used a dataset provided by a French radiology public challenge to design a supervised DL model which could detect focal liver lesions and characterize them as benign or malignant with a mean area under the curve (AUC) of 0.935 and 0.916, respectively [21]. Considering the rather small number of images used for training, this model would need validation [21]. More recently, another CNN model was constructed by using a large multicentric database of ultrasound images along with background and clinical parameters [22]. The AUC for distinguishing benign from malignant lesions was 0.924 in the external validation cohort [22]. The combined model showed better accuracy than judgment by clinical radiologists or contrast-enhanced CT, but was slightly inferior to MRI [22]. In one study, radiomic features were extracted from ultrasound images and used for the development of various ML-based models to distinguish primary from secondary liver cancer; the LR model outperformed the other ML models [23]. In another work, a combined SVM-based ultrasomics model could predict pathological grading of HCC with an AUC of 0.874 in the test set [24]. Finally, another combined model could differentiate between HCC and ICC with good performance [25]. In these last four studies, liver lesions were pathologically confirmed and pathology was used as the standard of reference. However, the use of ultrasound for radiomics models raises some issues in terms of repeatability [26].

Ta et al. showed that an ML-based CAD using CEUS data could classify lesions as benign or malignant with an accuracy similar to that of expert radiologists and could even improve the physician’s accuracy [27]. Huang et al. constructed an SVM-based CAD model which could differentiate atypical HCC and focal nodular hyperplasia using CEUS data with an average accuracy of 94.4% compared with pathology reports and subsequent clinical follow-up [28]. Guo et al. demonstrated that a multiple-kernel learning-based model could increase the sensitivity, specificity, and overall accuracy of CEUS for detecting HCC [29]. Finally, a combined ultrasomics-based model could discriminate HCC pathological grading with an AUC of 0.720 [30].

Table 1 summarizes the results of studies evaluating AI approaches based on ultrasound data for the diagnosis of HCC.

3.3. Artificial intelligence based on computed tomography data

3.3.1. Characterization of focal liver lesions

Mokrane et al. built a radiomics signature, based on 13 920 CT imaging features, that achieved AUCs of 0.81 and 0.66 for distinguishing HCC from non-HCC lesions in cirrhotic patients in the training and external validation sets, respectively [31]. Biopsy was positive for HCC in 77% of patients evaluated in this study [31]. Khan et al. developed an SVM-based tool for classification of liver tumors as benign or malignant with an accuracy of 98.3% [32]. One study used CT data to develop several ML-based models that showed good performance for the detection and characterization of malignant liver lesions [33]. In another study, clinical parameters and contrast-enhanced CT-based radiomics were used either separately or combined to develop gradient boosting-based models for pathological grading of HCC: the combined model showed the best performance with an AUC of 0.8014 in the test set [34].

Several CNN-based models using CT images were also developed. CT images over three phases were used to construct a CNN model for distinguishing malignant liver lesions from indeterminate and benign liver lesions: the median AUC was 0.92 in the test set [35]. Another team compared the performance of a three-phase contrast-enhanced CT protocol combined with a DL model to that of a four-phase CT protocol for distinguishing HCC from other focal liver lesions [36]. Combination of the DL approach with a triphasic CT protocol without pre-contrast yielded similar diagnostic accuracy (85.6%) to a four-phase CT protocol (83.3%; P = 0.765) [36]. These findings suggest that skipping the pre-contrast phase might not compromise accuracy while reducing a patient’s radiation dose. Another team built a CAD system with three different classifiers to help diagnose different types of HCC (nodular, diffuse and massive): the CNN classifier was superior to both ANN and SVM classifiers for classification of nodular and massive tumors but not diffuse tumors [37]. In another recent work, a new CNN-based CAD was developed to automatically detect and classify focal liver lesions as benign or malignant: the classification performance of the model was placed between a junior and senior
Table 3
Summary of studies evaluating artificial intelligence approaches based on magnetic resonance imaging for the diagnosis of hepatocellular carcinoma.

Author, year [Ref.] | Study outcome | Type and size of the training and/or test and/or validation set(s) (n) | Artificial intelligence approach(es) | Validation method(s) as described by each study | Performance values (best values unless stated otherwise) | Sensitivity (%)/Specificity (%) (best values unless stated otherwise)

Jansen M, 2019 [49] | Classification among five different focal liver lesions | 95 patients: 125 benign lesions (adenoma, cyst, hemangioma) and 88 malignant lesions (HCC, metastases) | Extremely randomized trees classifier | Internal validation | Overall accuracy 77% | 62–93/56–93
Hamm C, 2019 [50] | Classification of six different focal liver lesions (adenoma, cyst, FNH, HCC, ICC, metastases) | 494 liver lesions: Training set: 434; Test set: 60 | CNN | Internal validation and testing | Training: Accuracy 98.7%; Test: Accuracy 91.9% | Training: 60–100/94–100; Test: 89–99/97–100
Oyama A, 2019 [56] | Classification of focal liver lesions | 150 liver lesions (50 HCC, 50 metastases, 50 hemangiomas) | Radiomics; LR and XGboost | Internal validation and testing | Accuracy: 52–85%; AUC: 0.45–0.89 | 18–86/30–94
Trivizakis E, 2019 [55] | Classification of malignant liver lesions: primary vs secondary | 134 patients | 2D and 3D fully CNN and SVM | Internal validation and testing | Accuracy: 65–93.9; AUC: 53–93.7 | 67–95/39–92.3
Kim J, 2020 [47] | HCC detection | Training set: 455 patients; External validation set: 45 patients | CNN | Internal validation, testing and external validation | Training: AUC: 0.97; Validation: AUC: 0.90 | Training: 94/99; Validation: 87/93
Zhen SH, 2020 [51] | Classification of focal liver lesions | Training set: 1210 patients; External validation set: 201 patients | CNN (various models: seven-way, binary and three-way classifiers) | Internal validation, testing and external validation | External validation: AUC: 0.841–0.989 | External validation: 40.5–100/67.3–100
Wu Y, 2020 [53] | Distinction between LI-RADS 3 and LI-RADS 4/5 tumors | 89 liver lesions (59 patients); MR images acquired at three time points (pre-contrast, arterial and washout phase) | CNN | Internal validation and testing | Accuracy: 76.7–90.0%; AUC: 0.90–0.95 | Se: 75.6–88.9
Liang W, 2020 [57] | Classification of three focal liver lesions (epithelioid angiomyolipoma, HCC and FNH) | 170 patients with CT; 137 patients with MRI | Radiomics; RF, ANN, RR | Internal validation and testing | CT model: Training: 0.861–0.996; Test: 0.731–0.879; MRI model: Training: 0.987–0.999; Test: 0.736–0.925 | −
Bousabarah K, 2021 [48] | 1/HCC detection and 2/segmentation | 174 patients with 231 lesions | CNN, RF | Internal validation and testing | 1/Detection: Validation: Dice score: 0.49; Test: Dice score: 0.48; 2/Segmentation: Validation: Dice score: 0.64; Test: Dice score: 0.68 | 1/Detection: Validation: Se 55–73; Test: Se 66–75
Oestmann P M, 2021 [52] | Differentiation between HCC and non-HCC lesions | 118 patients with 150 lesions (93 HCC and 57 non-HCC); Training set: 140; Test set: 10 | CNN | Internal validation and testing | Training: Accuracy: 94.1%; Test: Accuracy: 87.3% | Test: Classification of HCC: 92.7%/82.0%; Classification of non-HCC: 82.0%/92.7%

ANN: Artificial neural networks; AUC: Area under the curve; CNN: Convolutional neural network; CT: Computed tomography; FNH: Focal nodular hyperplasia; HCC: Hepatocellular carcinoma; ICC: Intrahepatic cholangiocarcinoma; LI-RADS: Liver Imaging Reporting and Data System; LR: Logistic regression; MRI: Magnetic resonance imaging; RF: Random forest; RR: Ridge regression; SVM: Support vector machine.
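
Several rows of Table 3 give sensitivity and specificity as ranges or per lesion class. A common way to obtain such values from a multi-class classifier is one-vs-rest evaluation, in which each class is in turn treated as "positive" and all others as "negative"; whether each cited study used exactly this procedure is not stated here. The sketch below uses toy labels and a hypothetical function name. It also illustrates why, with only two classes, the sensitivity for one class equals the specificity for the other, the pattern seen in the Oestmann et al. row.

```python
def per_class_se_sp(y_true, y_pred):
    """Return {label: (sensitivity, specificity)} from one-vs-rest counts.

    Assumes at least two distinct classes are present in y_true.
    """
    results = {}
    pairs = list(zip(y_true, y_pred))
    for label in sorted(set(y_true)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        tn = sum(1 for t, p in pairs if t != label and p != label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        results[label] = (tp / (tp + fn), tn / (tn + fp))
    return results


# Toy test set (hypothetical): 5 HCC and 4 non-HCC lesions.
y_true = ["HCC"] * 5 + ["non-HCC"] * 4
y_pred = ["HCC", "HCC", "HCC", "HCC", "non-HCC",       # one HCC missed
          "non-HCC", "non-HCC", "non-HCC", "HCC"]      # one false alarm

result = per_class_se_sp(y_true, y_pred)
for label, (se, sp) in result.items():
    print(f"{label}: Se={se:.2f}, Sp={sp:.2f}")
```

With more than two classes, the same function returns one sensitivity/specificity pair per class, and the minimum and maximum across classes give ranges of the kind reported in the table.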

physician’s evaluation [38]. Similarly, multiphasic CT images extracted from a smaller group of patients were used to develop a CNN-based model to detect and classify five different focal liver lesions [39]. Finally, another study constructed a two-step method for distinguishing HCC and ICC with a classification accuracy of 88% [40].

3.3.2. Liver and tumor segmentation
Liver and tumor (including HCC and other liver malignancies) segmentation are of great importance to assess the tumor burden, detect early recurrence, extract radiomic features and plan the ideal treatment. However, manual segmentation is not always easy, especially in the context of diffuse or small lesions.
Li et al. proposed a CNN-based model using 26 CT images which could perform liver tumor segmentation with a precision of 82.67% [41]. Later, Sun et al. designed and validated a fully convolutional neural network which achieved high accuracy for segmentation of liver tumors [42]. Since 2017, the Liver Tumor Segmentation Challenge has called upon investigators to develop AI-based algorithms for automatic liver tumor segmentation using a multinational dataset of 200 CT scans (130 training CT scans, 70 test CT scans) (https://competitions.codalab.org/competitions/17094). Various works using this dataset have been published over the years [43−45]. To date, 280 different teams

have already participated in the challenge, and top-scoring methods used fully convolutional neural networks or U-nets that separately segmented the liver and liver lesions. To date, the best-scoring algorithm has achieved a Dice score of 0.825 for lesion segmentation. While these findings are promising, there was notable variability both in the imaging characteristics of liver tumors and in their delineation, underscoring the need for universal and standardized methods for liver tumor segmentation. Vivanti et al. described a segmentation-based CNN model for automatic detection of recurrence during follow-up [46]. The method achieved a true positive rate of 86% for lesions >5 mm [46].
Table 2 summarizes the results of studies evaluating AI approaches based on CT data in the field of HCC.

3.4. Artificial intelligence based on magnetic resonance imaging data

Due to difficult access to manual MRI annotations as well as other technical limitations, AI approaches have been less frequently applied to MRI of HCC and most published studies have been conducted in rather small populations. Table 3 summarizes the results of studies evaluating AI approaches based on MRI data in the field of HCC.

3.4.1. Improving detection
In a multicenter retrospective study, a CNN-based model developed in 455 patients achieved 87% sensitivity and 93% specificity for the detection of HCC in the external validation dataset (45 patients) [47]. Notably, the model surpassed less experienced radiologists’ performance in the diagnosis of small HCC lesions [47]. More recently, in a retrospective single-center study, a CNN-based model using multiphasic contrast-enhanced MRI data could automatically detect and delineate HCC with sensitivities of 73/75% (validation/test) [48]. Mean Dice scores between automatically detected lesions using the model and corresponding manual segmentations were 0.64/0.68 (validation/test) [48].

3.4.2. Characterization of focal liver lesions
Jansen et al. developed a model combining clinical data and MRI-based features which could distinguish HCC from other malignant or benign lesions with a sensitivity of 73% but a specificity of 56% [49]. In addition, Hamm et al. developed a CNN-based model that could successfully classify six different liver lesions with an overall accuracy of 92% [50]. More recently, Zhen et al. developed and externally validated various CNN-based models incorporating both enhanced and unenhanced MR images and clinical data from 1210 patients with liver tumors [51]. This DL system demonstrated excellent performance for classifying liver tumors, on par with that observed for three experienced radiologists, including the combined model with unenhanced MR images, suggesting that, with further validation, these models may allow patients to avoid contrast agent injection and its complications [51]. Another recent study developed a CNN-based model using multiphasic MR images of patients with histologically proven liver lesions: the model was trained with a combination of images that met the Liver Imaging Reporting and Data System (LI-RADS) criteria for definitive HCC (LI-RADS 5/typical) and images that did not (atypical), and aimed to distinguish between HCC and non-HCC lesions [52]. Sensitivities/specificities for HCC and non-HCC lesions were 92.7%/82.0% and 82.0%/92.7%, respectively [52]. Similarly, another CNN-based model also showed high accuracy for distinguishing between LI-RADS 3 and LI-RADS 4/5 liver lesions [53]. In addition to developing a focal liver lesion classifier, Wang et al. developed a proof-of-concept prototype for the automatic identification, mapping, and scoring of radiological features within a DL system, enabling radiologists to interpret elements of decision-making behind classification decisions [50,54]. With further refinement and validation, interpretable and transparent DL systems could one day be used to improve standardized HCC reporting systems and thereby clinical outcomes. Of note, a CNN-based model using diffusion-weighted MRI could distinguish between primary liver cancer and liver metastasis with high accuracy [55]. Finally, radiomics approaches have also been applied to MRI for classification of liver lesions [56,57].

4. Artificial intelligence approaches to evaluate prognostication, treatment response and survival in hepatocellular carcinoma

4.1. Background

In the context of great individual variability between patients with HCC, the development of robust prognostic scoring systems is key to improving patient risk stratification in order to optimize treatment strategies and assess their effects. Here, AI can play a key role. Several DL algorithms have been developed to help predict response to a specific treatment by analyzing tumor characteristics, such as morphological features, which have a major impact on patient prognosis. The aim is to select the best treatment option for each patient. The studies discussed in this section are summarized in Table 4.

4.2. Hepatic transplantation

Only one study has assessed the risk of HCC recurrence after liver transplantation using an AI model based on CT imaging [58]. The radiomics nomogram based on a combined model (radiomics signature and clinical risk factors) showed good predictive performance for recurrence-free survival with a C-index of 0.785 in the training dataset and 0.789 in the validation dataset [58].

4.3. Surgical resection

Microvascular invasion (MVI), defined as the presence of micrometastatic HCC emboli within the vessels of the liver, is a critical determinant of early recurrence and survival [59]. Three different groups developed radiomics-based models using CT images which could predict MVI preoperatively with good performance, especially when combined with clinical factors [60−62]. Regarding MRI, three teams developed CNN-based models which could also predict MVI with high accuracy, including one that used diffusion-weighted MRI [63−65]. Finally, a radiomics model based on grayscale ultrasound images also showed promising results for the prediction of MVI [66].
Ji et al. combined several clinical and biological features (available before surgery) and radiomics signatures to assess the risk of HCC recurrence after surgical resection [67]. The preoperative model, as well as the postoperative model which also included pathological results, both outperformed conventional outcome prediction scores such as the Barcelona clinic liver cancer (BCLC) stage [67]. The models were also externally validated [67].

4.4. Transarterial chemoembolization

Transcatheter arterial chemoembolization (TACE) is the treatment of choice for intermediate stage B HCC in the BCLC classification [13−15,68]. Several studies have investigated the ability of AI methods to predict response to TACE in patients with advanced HCC. Morshid et al. used two CNNs to extract CT image features that were then used as RF inputs after radiomics processing. They achieved a prediction accuracy of 74.2% using a combination of the BCLC stage plus quantitative image features vs. 62.9% using the BCLC stage alone [69]. Using a residual CNN, Peng et al. trained (562 patients) and externally validated (89 and 138 patients) an algorithm yielding an AUC of at least 0.94 for prediction of complete or partial response and stable or progressive disease after TACE [70]. ML and DL techniques were both

Table 4
Summary of studies evaluating artificial intelligence approaches based on imaging for the prognostication and treatment of hepatocellular carcinoma.

| Author, year [Ref.] | Study outcome | Type and size of the training and/or test and/or validation set(s) (n) | Artificial intelligence approach(es) | Validation method(s) as described by each study | Performance value(s) (best values unless stated otherwise) | Sensitivity (%)/Specificity (%) (best values unless stated otherwise) |
| --- | --- | --- | --- | --- | --- | --- |
| Computed tomography | | | | | | |
| Guo D, 2019 [58] | Recurrence-free survival after liver transplantation | 130 patients; Training set: 93; Validation set: 40 | Radiomics; LASSO ± clinical data | Internal validation | Training: C-index of 0.785; Validation: C-index of 0.789; Results for combined model with clinical data | N.R. |
| Ma X, 2019 [61] | Microvascular invasion prediction | 157 patients with HCC (Training set, 110/Validation set, 47) | Nine different models with radiomics ± clinical data | Internal validation | Training: Accuracy: 67.3−83.6, AUC: 0.703−0.876; Validation: Accuracy: 46.8−80.9, AUC: 0.490−0.801 | Training: 51.4−97.3/57.5−93.2; Validation: 38.9−94.4/37.9−94.4 |
| Xu X, 2019 [62] | Microvascular invasion prediction and survival | 495 patients with operated HCC (Training set, 350/Test set, 145) | Radiomics with clinical data; LR | Internal validation and testing | Training: AUC: 0.909, Accuracy: 80%; Test: AUC: 0.889, Accuracy: 82.8% | Training: 88.0/76.8; Test: 89.8/79.2 |
| Ji GW, 2019 [67] | Prediction of HCC recurrence after resection (results of the preoperative model are shown) | Training set: 210 patients; Internal validation: 107 patients; External validation: 153 patients | RSF; MRMR | Internal and external validation | Training: C-statistic: 0.748; Internal validation: C-statistic: 0.781; External validation: C-statistic: 0.733 | N.R. |
| Morshid A, 2019 [69] | Prediction of response to TACE | 105 patients | CNN; RF (model combined with clinical data) | Internal validation | Accuracy: 74.2%, AUC: 0.73 | N.R. |
| Shan Q-Y, 2019 [79] | Prediction of early recurrence of HCC after curative resection or ablation | 156 patients (Training set, 109/External validation set, 47) | LASSO LR models | Internal and external validation | Validation: AUC: 0.61−0.79 | N.R. |
| Peng J, 2020 [70] | Prediction of response to TACE | Training set: 562 patients; Two external validation sets: 89+138 patients | Residual CNN | Internal and external validation | External validation sets: AUCs: >0.94, Accuracy: 82.8−85.1% | N.R. |
| Jiang Y-Q, 2020 [60] | Microvascular invasion prediction | 405 patients; Training set: 324; Validation set: 81 | Radiomics ± clinical parameters; Gradient boosting; CNN (4 models) | Internal validation | Training: AUC: 0.900−0.980; Validation: AUC: 0.875−0.906, Accuracy: 80.2−85.2% | Validation: 65.9−93.2/75.7−97.3 |
| Liu Q-P, 2020 [71] | 1/ Microvascular invasion prediction; 2/ Prediction of prognostic risk factors in HCC treated with TACE | 494 patients operated and 243 receiving TACE | RF; SVM | Internal validation and testing | 1/ Microvascular invasion prediction: Training: 0.84, Testing: 0.79; 2/ Prediction of prognostic risk factors in HCC treated with TACE: C-index of 0.733 for overall survival | N.R. |
| Zhang L, 2020 [72] | Prediction of response to TACE and sorafenib | 201 patients (Training set, 120/External validation set, 81 patients) | CNN | External validation | Training: 0.664−0.739; External validation: 0.679−0.730 | N.R. |
| An C, 2020 [78] | Local tumor progression after MWA | 141 patients with HCC | CNN | No validation | AUC: 0.728 | N.R. |
| Ultrasound imaging | | | | | | |
| Dong Y, 2020 [66] | Microvascular invasion prediction | 322 patients with operated HCC (Training set, 221/Test set, 101) | Radiomics +/- clinical data; RF (2 classifiers) | Internal validation | AUC: 0.680−0.806; AUC: 0.744 with clinical data and classifier 1 | 33.3−83.8/53.1−92.9 (without clinical data); 89.2−94.6/32.8−48.4 with clinical data and classifier 1 |
| Liu D, 2020 [76] | Prediction of response to TACE | 130 patients; Training set: 89 patients; Validation set: 41 patients | Radiomics; CNN | Internal validation | Training: AUC: 0.82−0.98, Accuracy: 78−98%; Validation: AUC: 0.80−0.93, Accuracy: 80−90% | Training: 78.6−98.2/74.2−96.7; Validation: 82.1−89.3/73.3−92.3 |
| Oezdemir I, 2020 [75] | Prediction of response to TACE | 36 patients | Distance weighted discrimination method | Internal validation | Accuracy: 86% | 89/82 |
| Magnetic resonance imaging | | | | | | |
| Zhang F, 2018 [74] | Liver tissue classification after TACE | 20 patients | Auto-context-based CNN | Internal validation | Dice score: 0.63−0.77 | N.R. |
| Abajian A, 2018 [73] | Prediction of response to TACE | 36 patients | LR; RF | Internal validation | Accuracy: 78% | 62.5/82.1 |
| Song D, 2021 [63] | Microvascular invasion prediction | 601 patients with HCC (Training set, 461 patients/Testing set, 140 patients) | CNN (models +/- clinical data) | Internal validation and testing | Training: AUC: 0.764−0.934, Accuracy: 0.727−0.871; Testing: AUC: 0.731−0.931, Accuracy: 0.671−0.886 | Training: 66.7−84.5/76.3−86.1; Testing: 64.7−88.2/68.5−88.8 |
| Zhang Y, 2021 [64] | Microvascular invasion prediction | 237 patients; Training set: 158 patients; Internal validation set: 79 patients | RF | Internal validation | Training: AUC: 0.81; Validation: AUC: 0.72 | Training: 69/79; Validation: 55/81 |
| Wang G, 2021 [65] | Microvascular invasion prediction | 97 patients with 100 HCC (Training set, 60 patients/Test set, 40 patients) | CNN | Internal validation and testing | Accuracy: 66.81−77.50, AUC: 0.6865−0.7969 | 56.56−76.4/64.35−79.13 |
| Aujay G, 2022 [80] | Prediction of response to TARE | 22 patients with HCC (evaluation of radiomics features associated with early response) | Radiomics: four features associated with early response | No model development | AUC: 0.86−1 across the four features | 75−100/86−100 across the four features |
AUC: Area under the curve; CNN: Convolutional neural network; HCC: Hepatocellular carcinoma; LASSO: Least absolute shrinkage and selection operator; LR: Logistic regression;
MRMR: Maximum relevance minimum redundancy; MWA: Microwave ablation; N.R.: Not reported; RF: Random forest; RSF: Random survival forest; SVM: Support vector
machine; TACE: Transcatheter arterial chemoembolization; TARE: Transarterial radioembolization
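
Several studies in Table 4 report a concordance index (C-index or C-statistic) for time-to-event outcomes such as recurrence-free or overall survival. As a rough sketch of Harrell's C-index, and not the code used in any of the cited studies: among comparable patient pairs, it is the fraction in which the patient with the higher predicted risk has the event earlier. The function name `concordance_index`, the toy follow-up data and the choice to skip pairs with tied times are assumptions made for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up time for each patient
    events : 1 if the event (e.g. recurrence) was observed, 0 if censored
    risks  : model output, where a higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's last follow-up time.
            # Assumption: pairs with tied times are skipped.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data for four patients (times in months).
times = [5, 10, 12, 20]
events = [1, 0, 1, 1]          # the second patient is censored at 10 months
risks = [0.9, 0.4, 0.1, 0.2]   # predicted risk scores

print(concordance_index(times, events, risks))  # 3 of 4 comparable pairs -> 0.75
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking, so the C-index values of roughly 0.73 to 0.79 listed in Table 4 indicate moderate to good discrimination.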

used on CT image radiomics features by Liu et al. to develop AI-based prognostic risk factors for overall survival in patients treated with TACE and to help predict MVI [71]. Interestingly, these factors were shown to be independently associated with survival, but the study lacked external validation, which may limit generalizability [71]. Similarly, Zhang et al. built a DL signature derived from CT images of patients with HCC treated with TACE and sorafenib [72]. The nomogram, built by combining the DL signature and clinical features, showed good performance for prediction of overall survival.
Abajian et al. developed predictive models by using pretherapeutic MR images from 36 patients in combination with clinical parameters [73]. The resulting models could predict TACE treatment response with an accuracy of 78% [73]. Interestingly, other researchers tested an auto-context-based deep neural network approach to detect and delineate different types of liver tissue after TACE (viable, necrosis) on multi-parameter MR images and demonstrated the feasibility of bypassing the time-consuming process of manually designing MRI-based features [74].
With a similar aim, Oezdemir et al. extracted handcrafted HCC microvascular features from CEUS images [75]. Although the model achieved an accuracy of 86%, these results require further evaluation due to the small sample size [75]. Finally, Liu et al. constructed and validated a DL radiomics-based model by using CEUS cine recordings. The model showed an AUC of 0.93 for prediction of response to TACE [76].

4.5. Thermal ablative therapies

Percutaneous thermal ablation is a therapeutic option in patients with early stage HCC (BCLC-0) [13−15,77]. Researchers evaluated treatment response to microwave ablation based on the ablative margin, by using MR images [78]. To reduce the registration errors due to breathing motion and heating-induced tissue deformation, they developed an unsupervised landmark-constrained CNN-based deformable image registration technique which could predict local tumor progression at two years with good accuracy [78]. Finally, another study showed that a CT-based peritumoral radiomics signature was more effective than a tumoral radiomics signature for predicting early recurrence of HCC after curative tumor resection or ablation [79].

4.6. Transarterial radioembolization

One study evaluated radiomics using MRI data in the assessment of early response to yttrium-90 transarterial radioembolization in patients with locally advanced HCC [80]. The authors identified four radiomics parameters that predicted early response with high sensitivities, outperforming other evaluation methods (mRECIST, RECIST) [80].

5. Current challenges for artificial intelligence applicability in healthcare

Although AI holds many promises for HCC management, and healthcare in general, deployment of AI algorithms in clinical settings remains very rare due to a number of challenges. First, there is the issue of regulation of AI models. AI tools, designed to assist in diagnosis and treatment of diseases, could be considered as medical devices and therefore should adhere to the respective regulations. Both the FDA and the European Commission have published plans on AI in medical device legislation to tackle this issue (https://futurium.ec.europa.eu/en/european-ai-alliance/document/artificial-intelligence-medical-device-legislation) [81]. There are also concerns regarding intellectual property, especially when changes or modifications are made over time once the device is marketed. This also implies potential safety issues if the device evolves from its original form [82]. The lack of accuracy of AI models is another issue for applicability. Available health datasets can be subject to biases and data quality issues, which can arise during the collection process, from wrong labeling, from a lack of standardization and from missing data. Indeed, large high-quality standardized health datasets are rare, and when using RCD, which are initially not collected for research purposes, there is a risk of the aforementioned limitations. Overfitting and spectrum biases are frequently encountered in AI-based models [83]. CNNs, which are extensively used in radiological imaging-based models, are particularly vulnerable to overfitting [83]. Furthermore, standardized methods for AI-based data analysis or interpretation are still needed, as well as universal approaches to address missing data.
As the performance of AI models highly depends on the amount of data used for training, the availability of large datasets is crucial and data sharing should be encouraged. In this regard, one major issue is the “real-world” performance of AI and the need for post-approval validation of AI applications [11,84]. Sharing of individual-participant data from trials and studies could assist in constructing datasets of sufficient size and detail to appropriately train and validate AI models [85]. However, access to RCD for development of AI models, as well as sharing of datasets, raises some ethical and privacy concerns for patients [86].
Another significant concern when integrating AI-based models in healthcare is the lack of interpretability. AI models are often described as “black boxes” since there is little insight into how the model reaches a decision from its input. To deal with this problem, “explainable AI”, which refers to a particular set of methods that allows users to interpret an AI model, is an active field of research. The main goal is to better interpret model outcomes. Interpretability is critical to build up the trust needed to convince doctors to rely on the CAD tools that might assist them in the future. This is all the more important because the future implementation of these AI-based tools will raise the issue of liability regarding clinical decisions, especially in case of discrepancies between the physician and the tool. This might also involve legal considerations.
Finally, in order to fully demonstrate the role of AI in HCC clinical care, tailored prospective clinical trials are still needed. In that sense, the widely used SPIRIT and CONSORT guidelines have been extended to integrate the concept of interventions involving AI [87,88].

6. Conclusion

Many AI-based models have been developed with the aim of changing and improving the way we care for patients with HCC. The utility of AI continues to evolve as an assistant for image interpretation, especially in the context of diagnosis and response to treatment. Although significant progress has been made during the last decade, improvements in HCC management are still critically needed. Existing studies involving imaging show that AI has a potential role for early detection of HCC, characterization of liver lesions and liver/tumor segmentation. Regarding HCC treatment, AI has the potential both to increase the efficiency and effectiveness of post-treatment evaluation and to support treatment decisions through pre-therapeutic prognostication. Now we must demonstrate that these approaches work in clinical settings, by comparing model performance to that of conventional staging systems, and further through the conduct of tailored prospective clinical trials involving AI-based interventions. There remains a great need to standardize and robustly evaluate AI algorithms in large-scale datasets that reflect “real-world” heterogeneity, and to improve interpretability. Even if AI will most certainly play a role in the future of patient care alongside the physician, only the latter is capable of understanding and considering patients’ needs, beliefs and wishes, to adjust the optimal individual treatment.

Human rights

The authors declare that the work described has been performed in accordance with the Declaration of Helsinki of the World Medical Association revised in 2013 for experiments involving humans.

Informed consent and patient details

The authors declare that this report does not contain any personal information that could lead to the identification of the patients.

Declaration of Competing Interest

The authors declare that they have no known competing financial or personal relationships that could be viewed as influencing the work reported in this paper.

Author contributions

All authors attest that they meet the current International Committee of Medical Journal Editors (ICMJE) criteria for Authorship. Anna Pellat: Conceptualization, Methodology, Formal analysis, Writing - Original draft preparation, Final draft approval; Maxime Barat: Conceptualization, Methodology, Formal analysis, Validation, Reviewing and Editing, Final draft approval; Romain Coriat: Conceptualization, Investigation, Supervision, Writing - Reviewing and Editing, Final draft approval; Philippe Soyer: Conceptualization, Supervision, Writing - Reviewing and Editing, Final draft approval; Anthony Dohan: Supervision, Conceptualization, Methodology, Writing - Reviewing and Editing, Final draft approval.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References

[1] Kaul V, Enslin S, Gross SA. History of artificial intelligence in medicine. Gastrointest Endosc 2020;92:807–12.


[2] Lassau N, Bousaid I, Chouzenoux E, Verdon A, Balleyguier C, Bidault F, et al. Three [31] Mokrane F-Z, Lu L, Vavasseur A, Otal P, Peron J-M, Luk L, et al. Radiomics machine-
artificial intelligence data challenges based on CT and ultrasound. Diagn Interv learning signature for diagnosis of hepatocellular carcinoma in cirrhotic patients
Imaging 2021;102:669–74. with indeterminate liver nodules. Eur Radiol 2020;30:558–70.
[3] Min JK, Kwak MS, Cha JM. Overview of deep learning in gastrointestinal endos- [32] Khan AA, Narejo GB. Analysis of abdominal computed tomography images for
copy. Gut Liver 2019;13:388–93. automatic liver cancer diagnosis using image processing algorithm. Curr Med
[4] Yasaka K, Akai H, Kunimatsu A, Kiryu S, Abe O. Deep learning with convolutional Imaging Rev 2019;15:972–82.
neural network in radiology. Jpn J Radiol 2018;36:257–72. [33] Krishan A, Mittal D. Ensembled liver cancer detection and classification using CT
[5] Milea D, Najjar RP, Zhubo J, Ting D, Vasseneix C, Xu X, et al. Artificial intelligence images. Proc Inst Mech Eng H 2021;235:232–44.
to detect papilledema from ocular fundus photographs. N Engl J Med [34] Mao B, Zhang L, Ning P, Ding F, Wu F, Lu G, et al. Preoperative prediction for path-
2020;382:1687–95. ological grade of hepatocellular carcinoma via machine learning-based radiomics.
[6] Noorbakhsh J, Farahmand S, Foroughi Pour A, Namburi S, Caruana D, Rimm D, Eur Radiol 2020;30:6924–32.
et al. Deep learning-based cross-classifications reveal conserved spatial behaviors [35] Yasaka K, Akai H, Abe O, Kiryu S. Deep learning with convolutional neural net-
within tumor histological images. Nat Commun 2020;11:6367. work for differentiation of liver masses at dynamic contrast-enhanced CT: a pre-
[7] Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, et al. Dermatologist- liminary study. Radiology 2018;286:887–96.
level classification of skin cancer with deep neural networks. Nature [36] Shi W, Kuang S, Cao S, Hu B, Xie S, Chen S, et al. Deep learning assisted differentia-
2017;542:115–8. tion of hepatocellular carcinoma from focal liver lesions: choice of four-phase and
[8] Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global three-phase CT imaging protocol. Abdom Radiol 2020;45:2688–97.
cancer statistics 2020: GLOBOCAN estimates of incidence and mortality world- [37] Li J, Wu Y, Shen N, Zhang J, Chen E, Sun J, et al. A fully automatic computer-aided
wide for 36 cancers in 185 countries. CA Cancer J Clin 2021;71:209–49. diagnosis system for hepatocellular carcinoma using convolutional neural net-
[9] Rumgay H, Ferlay J, de Martel C, Georges D, Ibrahim AS, Zheng R, et al. Global, works. Biocybern Biomed Eng 2020;40:238–48.
regional and national burden of primary liver cancer by subtype. Eur J Cancer [38] Zhou J, Wang W, Lei B, Ge W, Huang Y, Zhang L, et al. Automatic detection and
2022;161:108–18. classification of focal liver lesions based on deep convolutional neural networks:
[10] Nakaura T, Higaki T, Awai K, Ikeda O, Yamashita Y. A primer for understanding a preliminary study. Front Oncol 2020;10:581210.
radiology articles about machine learning and deep learning. Diagn Interv Imag- [39] Todoroki Y, Iwamoto Y, Lin L, Hu H, Chen YW. Automatic detection of focal liver
ing 2020;101:765–70. lesions in multi-phase CT images using a multi-channel & multi-scale CNN. Annu
[11] Soyer P, Fishman EK, Rowe SP, Patlas MN, Chassagnon G. Does artificial intelli- Int Conf IEEE Eng Med Biol Soc 2019;2019:872–5.
gence surpass the radiologist? Diagn Interv Imaging 2022;103:445–7. [40] Ponnoprat D, Inkeaw P, Chaijaruwanich J, Traisathit P, Sripan P, Inmutto N, et al.
[12] Liu Z, Wang S, Dong D, Wei J, Fang C, Zhou X, et al. The applications of radiomics Classification of hepatocellular carcinoma and intrahepatic cholangiocarcinoma
in precision diagnosis and treatment of oncology: opportunities and challenges. based on multi-phase CT scans. Med Biol Eng Comput 2020;58:2497–515.
Theranostics 2019;9:1303–22. [41] Li W, Jia F, Hu Q. Automatic segmentation of liver tumor in CT images with deep
[13] EASL clinical practice guidelines: management of hepatocellular carcinoma. J convolutional neural networks. J Comput Com 2015;03:146.
Hepatol 2018;69:182–236. [42] Sun C, Guo S, Zhang H, Li J, Chen M, Ma S, et al. Automatic segmentation of liver
[14] Marrero JA, Kulik LM, Sirlin CB, Zhu AX, Finn RS, Abecassis MM, et al. Diagnosis, tumors from multiphase contrast-enhanced CT images based on FCNs. Artif Intell
staging, and management of hepatocellular carcinoma: 2018 practice guidance Med 2017;83:58–66.
by the American association for the study of liver diseases. Hepatology [43] Chlebus G, Schenk A, Moltz JH, Ginneken BV, Hahn HK, Meine H. Automatic liver
2018;68:723–50. tumor segmentation in CT with fully convolutional neural networks and object-
[15] Heimbach JK, Kulik LM, Finn RS, Sirlin CB, Abecassis MM, Roberts LR, et al. AASLD based postprocessing. Sci Rep 2018;8:15497.
guidelines for the treatment of hepatocellular carcinoma. Hepatology [44] Chen L, Song H, Wang C, Cui Y, Yang J, Hu X, et al. Liver tumor segmentation in CT
2018;67:358–80. volumes using an adversarial densely connected network. BMC Bioinform
[16] Hanna RF, Miloushev VZ, Tang A, Finklestone LA, Brejt SZ, Sandhu RS, et al. Com- 2019;20:587.
parative 13-year meta-analysis of the sensitivity and positive predictive value of [45] Chen WF, Ou HY, Liu KH, Li ZY, Liao CC, Wang SY, et al. In-series U-net network to
ultrasound, CT, and MRI for detecting hepatocellular carcinoma. Abdom Radiol 3D tumor image reconstruction for liver hepatocellular carcinoma recognition.
2016;41:71–90. Diagnostics 2020;11:E11.
[17] Chou R, Cuevas C, Fu R, Devine B, Wasson N, Ginsburg A, et al. Imaging techniques for the diagnosis of hepatocellular carcinoma: a systematic review and meta-analysis. Ann Intern Med 2015;162:697–711.
[18] Bharti P, Mittal D, Ananthasivan R. Preliminary study of chronic liver classification on ultrasound images using an ensemble model. Ultrason Imaging 2018;40:357–79.
[19] Brehar R, Mitrea DA, Vancea F, Marita T, Nedevschi S, Lupsor-Platon M, et al. Comparison of deep-learning and conventional machine-learning methods for the automatic recognition of the hepatocellular carcinoma areas from ultrasound images. Sensors 2020;20:E3085.
[20] Virmani J, Kumar V, Kalra N, Khandelwal N. Neural network ensemble based CAD system for focal liver lesions from B-mode ultrasound. J Digit Imaging 2014;27:520–37.
[21] Schmauch B, Herent P, Jehanno P, Dehaene O, Saillard C, Aubé C, et al. Diagnosis of focal liver lesions from ultrasound using deep learning. Diagn Interv Imaging 2019;100:227–33.
[22] Yang Q, Wei J, Hao X, Kong S, Yu X, Jiang T, et al. Improving B-mode ultrasound diagnostic performance for focal liver lesions using deep learning: a multicentre study. EBioMedicine 2020;56:102777.
[23] Mao B, Ma J, Duan S, Xia Y, Tao Y, Zhang L. Preoperative classification of primary and metastatic liver cancer via machine learning-based ultrasound radiomics. Eur Radiol 2021;31:4576–86.
[24] Ren S, Qi Q, Liu S, Duan S, Mao B, Chang Z, et al. Preoperative prediction of pathological grading of hepatocellular carcinoma using machine learning-based ultrasomics: a multicenter study. Eur J Radiol 2021;143:109891.
[25] Ren S, Li Q, Liu S, Qi Q, Duan S, Mao B, et al. Clinical value of machine learning-based ultrasomics in preoperative differentiation between hepatocellular carcinoma and intrahepatic cholangiocarcinoma: a multicenter study. Front Oncol 2021;11:749137.
[26] Duron L, Savatovsky J, Fournier L, Lecler A. Can we use radiomics in ultrasound imaging? Impact of preprocessing on feature repeatability. Diagn Interv Imaging 2021;102:659–67.
[27] Ta CN, Kono Y, Eghtedari M, Taik Oh Y, Robbin ML, Barr RG, et al. Focal liver lesions: computer-aided diagnosis by using contrast-enhanced US cine recordings. Radiology 2018;286:1062–71.
[28] Huang Q, Pan F, Li W, Yuan F, Hu H, Huang J, et al. Differential diagnosis of atypical hepatocellular carcinoma in contrast-enhanced ultrasound using spatio-temporal diagnostic semantics. IEEE J Biomed Health Inform 2020;24:2860–9.
[29] Guo LH, Wang D, Qian YY, Zheng X, Zhao CK, Li XL, et al. A two-stage multi-view learning framework based computer-aided diagnosis of liver tumors with contrast enhanced ultrasound images. Clin Hemorheol Microcirc 2018;69:343–54.
[30] Wang W, Wu S-S, Zhang J-C, Xian M-F, Huang H, Li W, et al. Preoperative pathological grading of hepatocellular carcinoma using ultrasomics of contrast-enhanced ultrasound. Acad Radiol 2021;28:1094–101.
[46] Vivanti R, Szeskin A, Lev-Cohain N, Sosna J, Joskowicz L. Automatic detection of new tumors and tumor burden evaluation in longitudinal liver CT scan studies. Int J Comput Assist Radiol Surg 2017;12:1945–57.
[47] Kim J, Min JH, Kim SK, Shin SY, Lee MW. Detection of hepatocellular carcinoma in contrast-enhanced magnetic resonance imaging using deep learning classifier: a multi-center retrospective study. Sci Rep 2020;10:9458.
[48] Bousabarah K, Letzen B, Tefera J, Savic L, Schobert I, Schlachter T, et al. Automated detection and delineation of hepatocellular carcinoma on multiphasic contrast-enhanced MRI using deep learning. Abdom Radiol 2021;46:216–25.
[49] Jansen MJA, Kuijf HJ, Veldhuis WB, Wessels FJ, Viergever MA, Pluim JPW. Automatic classification of focal liver lesions based on MRI and risk factors. PLoS One 2019;14:e0217053.
[50] Hamm CA, Wang CJ, Savic LJ, Ferrante M, Schobert I, Schlachter T, et al. Deep learning for liver tumor diagnosis part I: development of a convolutional neural network classifier for multi-phasic MRI. Eur Radiol 2019;29:3338–47.
[51] Zhen SH, Cheng M, Tao YB, Wang YF, Juengpanich S, Jiang ZY, et al. Deep learning for accurate diagnosis of liver tumor based on magnetic resonance imaging and clinical data. Front Oncol 2020;10:680.
[52] Oestmann PM, Wang CJ, Savic LJ, Hamm CA, Stark S, Schobert I, et al. Deep learning-assisted differentiation of pathologically proven atypical and typical hepatocellular carcinoma (HCC) versus non-HCC on contrast-enhanced MRI of the liver. Eur Radiol 2021;31:4981–90.
[53] Wu Y, White GM, Cornelius T, Gowdar I, Ansari MH, Supanich MP, et al. Deep learning LI-RADS grading system based on contrast enhanced multiphase MRI for differentiation between LR-3 and LR-4/LR-5 liver tumors. Ann Transl Med 2020;8:701.
[54] Wang CJ, Hamm CA, Savic LJ, Ferrante M, Schobert I, Schlachter T, et al. Deep learning for liver tumor diagnosis part II: convolutional neural network interpretation using radiologic imaging features. Eur Radiol 2019;29:3348–57.
[55] Trivizakis E, Manikis GC, Nikiforaki K, Drevelegas K, Constantinides M, Drevelegas A, et al. Extending 2-D convolutional neural networks to 3-D for advancing deep learning cancer classification with application to MRI liver tumor differentiation. IEEE J Biomed Health Inform 2019;23:923–30.
[56] Oyama A, Hiraoka Y, Obayashi I, Saikawa Y, Furui S, Shiraishi K, et al. Hepatic tumor classification using texture and topology analysis of non-contrast-enhanced three-dimensional T1-weighted MR images with a radiomics approach. Sci Rep 2019;9:8764.
[57] Liang W, Shao J, Liu W, Ruan S, Tian W, Zhang X, et al. Differentiating hepatic epithelioid angiomyolipoma from hepatocellular carcinoma and focal nodular hyperplasia via radiomics models. Front Oncol 2020;10:564307.
[58] Guo D, Gu D, Wang H, Wei J, Wang Z, Hao X, et al. Radiomics analysis enables recurrence prediction for hepatocellular carcinoma after liver transplantation. Eur J Radiol 2019;117:33–40.
[59] Erstad DJ, Tanabe KK. Prognostic and therapeutic implications of microvascular invasion in hepatocellular carcinoma. Ann Surg Oncol 2019;26:1474–93.
[60] Jiang YQ, Cao SE, Cao S, Chen JN, Wang G-Y, Shi WQ, et al. Preoperative identification of microvascular invasion in hepatocellular carcinoma by XGBoost and deep learning. J Cancer Res Clin Oncol 2021;147:821–33.
[61] Ma X, Wei J, Gu D, Zhu Y, Feng B, Liang M, et al. Preoperative radiomics nomogram for microvascular invasion prediction in hepatocellular carcinoma using contrast-enhanced CT. Eur Radiol 2019;29:3595–605.
[62] Xu X, Zhang HL, Liu QP, Sun SW, Zhang J, Zhu FP, et al. Radiomic analysis of contrast-enhanced CT predicts microvascular invasion and outcome in hepatocellular carcinoma. J Hepatol 2019;70:1133–44.
[63] Song D, Wang Y, Wang W, Wang Y, Cai J, Zhu K, et al. Using deep learning to predict microvascular invasion in hepatocellular carcinoma based on dynamic contrast-enhanced MRI combined with clinical parameters. J Cancer Res Clin Oncol 2021;147:3757–67.
[64] Zhang Y, Lv X, Qiu J, Zhang B, Zhang L, Fang J, et al. Deep learning with 3D convolutional neural network for noninvasive prediction of microvascular invasion in hepatocellular carcinoma. J Magn Reson Imaging 2021;54:134–43.
[65] Wang G, Jian W, Cen X, Zhang L, Guo H, Liu Z, et al. Prediction of microvascular invasion of hepatocellular carcinoma based on preoperative diffusion-weighted MR using deep learning. Acad Radiol 2021;28(Suppl 1):S118–27.
[66] Dong Y, Zhou L, Xia W, Zhao XY, Zhang Q, Jian JM, et al. Preoperative prediction of microvascular invasion in hepatocellular carcinoma: initial application of a radiomic algorithm based on grayscale ultrasound images. Front Oncol 2020;10:353.
[67] Ji GW, Zhu FP, Xu Q, Wang K, Wu MY, Tang WW, et al. Machine-learning analysis of contrast-enhanced CT radiomics predicts recurrence of hepatocellular carcinoma after resection: a multi-institutional study. EBioMedicine 2019;50:156–65.
[68] Dohan A, Barat M, Coriat R, Soyer P. A step toward a better understanding of hepatocellular progression after transarterial embolization. Diagn Interv Imaging 2022;103:125–6.
[69] Morshid A, Elsayes KM, Khalaf AM, Elmohr MM, Yu J, Kaseb AO, et al. A machine learning model to predict hepatocellular carcinoma response to transcatheter arterial chemoembolization. Radiol Artif Intell 2019;1:e180021.
[70] Peng J, Kang S, Ning Z, Deng H, Shen J, Xu Y, et al. Residual convolutional neural network for predicting response of transarterial chemoembolization in hepatocellular carcinoma from CT imaging. Eur Radiol 2020;30:413–24.
[71] Liu QP, Xu X, Zhu FP, Zhang YD, Liu XS. Prediction of prognostic risk factors in hepatocellular carcinoma with transarterial chemoembolization using multi-modal multi-task deep learning. EClinicalMedicine 2020;23:100379.
[72] Zhang L, Xia W, Yan ZP, Sun JH, Zhong BY, Hou ZH, et al. Deep learning predicts overall survival of patients with unresectable hepatocellular carcinoma treated by transarterial chemoembolization plus sorafenib. Front Oncol 2020;10:593292.
[73] Abajian A, Murali N, Savic LJ, Laage-Gaupp FM, Nezami N, Duncan JS, et al. Predicting treatment response to intra-arterial therapies for hepatocellular carcinoma with the use of supervised machine learning: an artificial intelligence concept. J Vasc Interv Radiol 2018;29:850–857.e1.
[74] Zhang F, Yang J, Nezami N, Laage-Gaupp F, Chapiro J, De Lin M, et al. Liver tissue classification using an auto-context-based deep neural network with a multi-phase training framework. Patch Based Tech Med Imaging 2018;11075:59–66.
[75] Oezdemir I, Wessner CE, Shaw C, Eisenbrey JR, Hoyt K. Tumor vascular networks depicted in contrast-enhanced ultrasound images as a predictor for transarterial chemoembolization treatment response. Ultrasound Med Biol 2020;46:2276–86.
[76] Liu D, Liu F, Xie X, Su L, Liu M, Xie X, et al. Accurate prediction of responses to transarterial chemoembolization for patients with hepatocellular carcinoma by using artificial intelligence in contrast-enhanced ultrasound. Eur Radiol 2020;30:2365–76.
[77] Young S, Rivard M, Kimyon R, Sanghvi T. Accuracy of liver ablation zone prediction in a single 2450MHz 100 Watt generator model microwave ablation system: an in human study. Diagn Interv Imaging 2020;101:225–33.
[78] An C, Jiang Y, Huang Z, Gu Y, Zhang T, Ma L, et al. Assessment of ablative margin after microwave ablation for hepatocellular carcinoma using deep learning-based deformable image registration. Front Oncol 2020;10:573316.
[79] Shan QY, Hu HT, Feng ST, Peng ZP, Chen SL, Zhou Q, et al. CT-based peritumoral radiomics signatures to predict early recurrence in hepatocellular carcinoma after curative tumor resection or ablation. Cancer Imaging 2019;19:11.
[80] Aujay G, Etchegaray C, Blanc J-F, Lapuyade B, Papadopoulos P, Pey MA, et al. Comparison of MRI-based response criteria and radiomics for the prediction of early response to transarterial radioembolization in patients with hepatocellular carcinoma. Diagn Interv Imaging 2022;103:360–6.
[81] Artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) action plan. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device (2021, accessed 23 September 2022).
[82] Hwang TJ, Kesselheim AS, Vokinger KN. Lifecycle regulation of artificial intelligence- and machine learning-based software devices in medicine. JAMA 2019;322:2285–6.
[83] Park SH, Han K. Methodologic guide for evaluating clinical performance and effect of artificial intelligence technology for medical diagnosis and prediction. Radiology 2018;286:800–9.
[84] Dupuis M, Delbos L, Veil R, Adamsbaum C. External validation of a commercially available deep learning algorithm for fracture detection in children. Diagn Interv Imaging 2022;103:151–9.
[85] Chassagnon G, Dohan A. Artificial intelligence: from challenges to clinical implementation. Diagn Interv Imaging 2020;101:763–4.
[86] Krupinski EA. An ethics framework for clinical imaging data sharing and the greater good. Radiology 2020;295:683–4.
[87] Cruz Rivera S, Liu X, Chan AW, Denniston AK, Calvert MJ. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med 2020;26:1351–63.
[88] Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med 2020;26:1364–74.
