Application of An Improved U2-Net Model in Ultrasound Median Neural Image Segmentation
Copyright © 2022 The Author(s). Published by Elsevier Inc. on behalf of World Federation for Ultrasound in Medicine & Biology.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
https://doi.org/10.1016/j.ultrasmedbio.2022.08.003
Original Contribution
Abstract—To investigate whether an improved U2-Net model could be used to segment the median nerve and
improve segmentation performance, we performed a retrospective study with 402 nerve images from patients
who visited Huashan Hospital from October 2018 to July 2020; 249 images were from patients with carpal tunnel
syndrome, and 153 were from healthy volunteers. From these, 320 cases were selected as training sets, and 82
cases were selected as test sets. The improved U2-Net model was used to segment each image. Dice coefficients
(Dice), pixel accuracy (PA), mean intersection over union (MIoU) and average Hausdorff distance (AVD) were
used to evaluate segmentation performance. Results revealed that the Dice, MIoU, PA and AVD values of our
improved U2-Net were 72.85%, 79.66%, 95.92% and 51.37 mm, respectively, which were close to the ground
truth obtained from clinician labeling. In comparison, the Dice, MIoU, PA and
AVD values of U-Net were 43.19%, 65.57%, 86.22% and 74.82 mm, and those of Res-U-Net were 58.65%,
72.53%, 88.98% and 57.30 mm. Overall, our data suggest our improved U2-Net model might be used for segmen-
tation of ultrasound median neural images. (E-mail: gengdy@163.com) © 2022 The Author(s). Published by
Elsevier Inc. on behalf of World Federation for Ultrasound in Medicine & Biology. This is an open access article
under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Key Words: Image segmentation, Ultrasound, Median nerve, U2-Net model, Computer-aided diagnosis.
With improvements in resolution, an increase in the utilization of ultrasound in musculoskeletal system imaging has been observed; however, the correct segmentation of neural ultrasound images has remained a challenge (Shin et al. 2021). Automatic detection of the median nerve structure from medical images is a key step in the early diagnosis of carpal tunnel syndrome (CTS). Proper segmentation of the median nerve provides clinicians with information on the mechanism, diagnosis and treatment of carpal tunnel syndrome, and correct identification of the median nerve is the first step in the diagnosis of CTS. For beginners, it is very time consuming to identify the correct median nerve image, because it is difficult to distinguish the median nerve from the flexor tendons of the fingers. For experienced operators, it is time consuming to measure the median nerve. An artificial intelligence-based segmentation algorithm allows accurate localization of the median nerve and faster, more standardized measurements, eliminating operator dependence and inter-observer variability.

Algorithm updates, the strengthening of computing power and the availability of large-scale data are three factors that have facilitated the continuous improvement of image segmentation. At present, U-Net networks are used mainly for image segmentation, and various other networks based on U-Net have been proposed (Shelhamer et al. 2017). Fang et al. (2021) proposed a new learning conceptual model, the generalized linear model (GLM), which effectively overcomes the influence of poor image quality and further improves the accuracy of segmentation. Su et al. (2021) proposed a multiscale U-Net (MSU-Net) for medical image segmentation, which has been found to have the best performance on different data sets. U-Net-based segmentation methods have been used for median nerve segmentation (Festen et al. 2021) and can be reliably used to assess median nerve size obtained by ultrasound, thus greatly reducing labor. U-Net is a convolutional neural network created by O. Ronneberger, P. Fischer and T. Brox. In this model, feature maps are extracted by four downsampling steps and restored to the original size by four upsampling steps, which eventually yields the segmentation result. When Festen et al. (2021) used the U-Net network to segment the median nerve, the Dice coefficient reached 0.88. When Huang et al. (2019) used the U-Net network, its accuracy was 88.4%. Horng et al. (2020) proposed a new convolutional neural network framework based on the U-Net model to segment the median nerve, and the Dice coefficient reached 0.8912, which improved segmentation performance and generated satisfactory results. Although U-Net (Ibtehaz et al. 2020) networks have been used for median nerve image segmentation, the inherent noise of ultrasound images (Gerritsen et al. 1968; George et al. 1976; Yan and Zhuang 2003; Sites et al. 2007; Fu et al. 2019; Pissas et al. 2020) greatly impedes the clinician's ability to distinguish carpal tunnel syndrome and to segment and measure the median nerve. In addition to median nerve segmentation based on neural networks, there are also many non-neural-network algorithms, such as threshold-based methods (Rodrigues et al. 2011), which consider only gray-level statistics without considering spatial location information. The graph-based method (Huang et al. 2012) requires the operator to have rich inspection experience, and the active contour model (Huang et al. 2007) easily loses segmentation accuracy because of false edges and image noise in the ultrasound image. Because of the specific characteristics of ultrasound such as attenuation, penetration, uniformity, shadowing, real-time acquisition and operator dependence, as well as different image characteristics on different devices, it is necessary to develop a rapid, accurate and automated screening tool to identify the median nerve. The purpose of this study was to retrospectively study the role of the U2-Net model in median nerve image segmentation.

METHODS

Data set
From October 2018 to July 2020, a total of 249 patients with carpal tunnel syndrome diagnosed by electromyography and operated on by the Department of Hand Surgery, Huashan Hospital, Fudan University, were selected, including 51 men and 198 women ranging in age from 38 to 79 y (average: 55.39 ± 7.98 y). Another 153 healthy volunteers who visited during the same period also participated, including 45 men and 108 women ranging in age from 54 to 66 y (average: 56.20 ± 6.18 y) (patient demographics are summarized in Table 1). A total of 402 cases were included in our study. In addition, we cropped all images to 400 × 300 to eliminate unnecessary information such as device ID and acquisition time. A total of 320 and 82 cases were randomly assigned to the training and test sets, respectively.

An experienced ultrasound practitioner used an S15-4 linear array probe (EPIQ5 diagnostic ultrasound scanner; Philips, Bothell, WA, USA) to record a transverse 2-D ultrasound image of the median nerve at the level of the pisiform bone and to delineate the nerve boundary in the 2-D ultrasound image. This retrospective study was approved by the medical ethics committee of Huashan Hospital, Fudan University, and informed consent was obtained from each patient.

Table 1. Patient demographics
Group      Number   Age (y)        Sex (M/F)   Education (y)
Case       249      55.39 ± 7.98   51/198      15.00 ± 3.00
Volunteer  153      56.20 ± 6.18   45/108      15.60 ± 3.00
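As a simple illustration of the data preparation described above (cropping to 400 × 300 pixels and a random 320/82 split), one possible sketch is given below. The helper names and the use of a centre crop are assumptions for illustration; the paper does not state how the crop window was positioned.

```python
import random

def center_crop(img, width=400, height=300):
    """Crop a 400 x 300 region around the image centre (assumed) to remove
    overlay text such as device ID and acquisition time."""
    h, w = img.shape[:2]
    x0 = max((w - width) // 2, 0)
    y0 = max((h - height) // 2, 0)
    return img[y0:y0 + height, x0:x0 + width]

def split_cases(case_ids, n_train=320, seed=42):
    """Randomly assign 320 cases to training and the remaining 82 to testing."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]
```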
Pre-processing algorithms
Each image was enhanced with contrast limited adaptive histogram equalization (CLAHE) and then smoothed with a 2-D filter, in which R denotes the 3 × 3 image region and G is the convolution kernel. The convolution operation can eliminate the noise caused by the CLAHE algorithm and smooth the image.

Fig. 1. (a) Original image. (b) Contrast limited adaptive histogram equalization (CLAHE) image. (c) Image after 2-D filter. The arrows indicate the carpal tunnel syndrome lesions.
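A minimal OpenCV sketch of the pre-processing pipeline suggested by Figure 1 (CLAHE enhancement followed by a 3 × 3 smoothing convolution) might look like the following. The clip limit and tile size are illustrative assumptions, not values reported in the paper.

```python
import cv2
import numpy as np

def preprocess(gray_image: np.ndarray) -> np.ndarray:
    """CLAHE enhancement, then 3x3 mean filtering; expects an 8-bit gray image."""
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))  # assumed settings
    enhanced = clahe.apply(gray_image)

    # G: 3x3 averaging kernel convolved over each 3x3 region R of the image.
    kernel = np.ones((3, 3), np.float32) / 9.0
    smoothed = cv2.filter2D(enhanced, -1, kernel)
    return smoothed
```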
Fig. 3. Train loss through 250 training epochs.

Segmentation performance was evaluated with the Dice coefficient (Dice), pixel accuracy (PA), mean intersection over union (MIoU) and average Hausdorff distance (AVD). Dice is given by

Dice = 2 \sum_{i=1}^{N} p_i q_i / ( \sum_{i=1}^{N} p_i^2 + \sum_{i=1}^{N} q_i^2 )    (3)

where N is the total number of pixels on the prediction masks, and p_i and q_i are the pixels of the predicted segmentation result and the ground truth label, respectively. PA is given by the expression

PA = (TP + TN) / (TP + TN + FP + FN)    (4)

where TP, TN, FP and FN represent the numbers of true-positive, true-negative, false-positive and false-negative results, respectively. MIoU is the ratio of the intersection and union of the predicted results and the ground truth, and can be expressed as

MIoU = TP / (FN + FP + TP)    (5)

AVD describes the degree of similarity between two sets of points and is defined as

H(A, B) = max( h(A, B), h(B, A) )
h(A, B) = \max_{a \in A} \min_{b \in B} \| a - b \|    (6)
h(B, A) = \max_{b \in B} \min_{a \in A} \| b - a \|

where A and B are the sets of predicted and ground truth points, a and b are members of sets A and B, and \| a - b \| is the distance between a and b.

Statistical analysis
We used analysis of variance (ANOVA) tests and t-tests to verify whether the segmentation performance of our proposed method is significantly different from that of the other networks. These tests were performed using Python 3.8 and GraphPad Prism 8.

RESULTS

Because previous studies did not perform CTS segmentation, to verify the superiority of our model we compared it with classical segmentation networks such as U-Net (Ronneberger et al. 2015) and Res-U-Net (Xiao et al. 2018). Of these, U-Net is a semantic segmentation algorithm using fully convolutional networks; Res-U-Net is based on the original U-Net model and adds a weighted attention mechanism.

As outlined in Table 2, when compared with the other networks, our proposed modified U2-Net achieved the best segmentation performance, including the best Dice (72.85%), the best MIoU (74.36%), the best PA (87.92%) and the best AVD (113.65 mm), which are closest to the ground truth. In addition, the metrics of Res-U-Net were significantly better than those of U-Net. These findings indicate that local and global contextual information is important for segmentation tasks.

In Figure 4 are the boxplots of the different networks' evaluation metrics. The boxplot is a statistical chart that depicts the distribution of data, including the maximum, minimum, median and upper and lower quartiles of a set of data. The results of U-Net and Res-U-Net reveal a wide range of distribution, which indicates that their test results are largely influenced by the test samples. In contrast, the results of the modified U2-Net are more concentrated and have the best data distribution, which indicates that our model is more robust. As outlined in Table 3, the p values of the ANOVAs and t-tests are all <0.05, which indicates that our model has significantly better performance.

The segmentation results of the different networks are illustrated in Figure 5.
Fig. 4. Boxplots of different networks’ evaluation metrics. AVD = average Hausdorff distance; Dice = Dice coefficient;
MIoU = mean intersection over union; PA = pixel accuracy.
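For reference, the per-case metric values summarized in the boxplots of Figure 4 follow eqns (3)-(6); they can be computed from a pair of binary masks with a few lines of NumPy. The following is a generic sketch for illustration, not the authors' evaluation code, and it implements eqn (6) as printed (a symmetric Hausdorff-type distance).

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray):
    """Dice, PA and MIoU for binary masks; distance per eqn (6).
    Assumes both masks are non-empty; brute-force distances are for illustration."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)

    tp = np.sum(pred & gt)
    tn = np.sum(~pred & ~gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)

    dice = 2 * tp / (pred.sum() + gt.sum())   # eqn (3) for binary masks
    pa = (tp + tn) / (tp + tn + fp + fn)      # eqn (4)
    miou = tp / (tp + fp + fn)                # eqn (5)

    # eqn (6): symmetric distance between the two sets of foreground pixels.
    a = np.argwhere(pred)                     # coordinates of predicted pixels
    b = np.argwhere(gt)                       # coordinates of ground truth pixels
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    h_ab = d.min(axis=1).max()                # max over a of min over b
    h_ba = d.min(axis=0).max()                # max over b of min over a
    distance = max(h_ab, h_ba)

    return dice, pa, miou, distance
```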
Table 3. ANOVA test and t-test on metrics for the different segmentation networks (p values of Levene's test, ANOVA and t-test for U2-Net vs. U-Net and for U2-Net vs. Res-U-Net).
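Given per-test-case metric values for two networks, p values of the kind reported in Table 3 could be obtained with SciPy, as in the generic sketch below; the exact procedure used with GraphPad Prism 8 is not described in the text, so this is only an assumed equivalent.

```python
from scipy import stats

def compare_models(metric_u2net, metric_unet):
    """Levene's test, one-way ANOVA and independent t-test on per-case metrics."""
    levene_p = stats.levene(metric_u2net, metric_unet).pvalue    # equal variances?
    anova_p = stats.f_oneway(metric_u2net, metric_unet).pvalue   # one-way ANOVA
    ttest_p = stats.ttest_ind(metric_u2net, metric_unet).pvalue  # independent t-test
    return levene_p, anova_p, ttest_p
```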
Fig. 5. Columns from left to right are the segmentation results of carpal tunnel syndrome lesions by the different networks and the ground truth. (a) Original image. (b) U-Net. (c) Res-U-Net. (d) U2-Net. Rows from top to bottom are the samples (cases 1-3).
DISCUSSION

As illustrated in Figure 4, the U2-Net model mitigates the tendency of U-Net and Res-U-Net to be influenced by the test samples, and its results are more concentrated and show the best data distribution. These results indicate that the improved U2-Net model used in this study enhances the accuracy of median nerve segmentation.

The U-Net model is one of the most popular fully convolutional network models and has been widely used in medical image segmentation; it consists of an encoder and a decoder connected by skip connections between corresponding layers (Ronneberger et al. 2015). It can extract image features with limited samples and achieve high segmentation accuracy, exceeding the accuracy of traditional methods, and it illustrates the potential of efficient automatic segmentation. It can reduce delineation time and eliminate differences between and within observers (Young et al. 2011; Daisne and Blumhofer 2013). At present, there are many variant networks of U-Net, such as Res-U-Net, HDA-Res-U-Net (Wang et al. 2021a, 2021b), CS2-Net (Mou et al. 2021) and UV-Net (Zhang et al. 2021), all of which have achieved good segmentation results. U-Net can be realized in 2-D or 3-D form, both of which have their advantages and disadvantages. With 2-D U-Net, numerous samples can be learned, but because each image is processed independently, information in the third dimension is lost. With 3-D U-Net, the number of samples is decreased, but the amount of information in each sample is increased, so the 3-D information is enriched; however, compared with 2-D U-Net, 3-D U-Net requires a larger amount of computing resources and longer computing time (Nemoto et al. 2020). In this study, 2-D training through slices was adopted to make full use of the 2-D plane spatial information in each slice. In terms of the algorithm, parallel convolution kernels of different sizes are used to view image regions of interest at different scales in the initial block. Currently, 1 × 1, 3 × 3, 5 × 5 and 7 × 7 convolution operations are commonly used; the 1 × 1 convolution can reduce the number of parameters and broaden the network channels with the fewest parameters. In this study, we used 3 × 3 convolutions to reduce computational overhead and network parameters. We obtained output information from three convolution blocks and concatenated them together to extract spatial features from different scales, which generated good results.
The model we proposed not only retains the characteristics of the traditional U-Net, but also adds the L-RSU module to learn both low-dimensional and high-dimensional features. The convergence of the loss function can be accelerated by calculating the difference between the segmentation result and the ground truth at each layer.
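The per-layer supervision described above can be written as a sum of losses over the side outputs. The following is a minimal PyTorch-style sketch under the assumption that the network returns a fused output plus one side output per decoder stage; the function and variable names are illustrative, not taken from the authors' code, and the choice of binary cross-entropy is an assumption.

```python
import torch
import torch.nn.functional as F

def deep_supervision_loss(side_outputs, fused_output, target):
    """Sum binary cross-entropy over the fused map and every side output.

    side_outputs: list of tensors, one per decoder stage, shape (B, 1, H, W)
    fused_output: tensor of shape (B, 1, H, W)
    target:       float ground-truth mask of shape (B, 1, H, W), values 0.0 or 1.0
    """
    loss = F.binary_cross_entropy_with_logits(fused_output, target)
    for side in side_outputs:
        # Side outputs may be predicted at lower resolution; resize to the mask.
        if side.shape[-2:] != target.shape[-2:]:
            side = F.interpolate(side, size=target.shape[-2:],
                                 mode="bilinear", align_corners=False)
        loss = loss + F.binary_cross_entropy_with_logits(side, target)
    return loss
```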
The models proposed by Festen et al. (2021) and Horng et al. (2020) were simply applications of U-Net. In addition, our data set contains only 402 images, whereas those of Festen et al. and Horng et al. contained 5560 and 1680 images, respectively, yet our Dice coefficient is only about 0.1 lower than theirs. Therefore, our proposed model might perform better than their algorithms given the same amount of data.

Compared with manual and other automatic segmentation methods, the U2-Net model does have a performance advantage. However, there is room for improvement in terms of accuracy because of some limitations: (i) Our accuracy is based on the manual segmentation, which is influenced by the user's variability. (ii) The 3 × 3 convolution operation leads to relatively few feature scales and a lack of multiscale features. (iii) Because U2-Net uses many slices but ignores the information between slices, the segmentation accuracy is reduced. (iv) The design of the image segmentation network is usually severely limited by central processing unit and computer memory. (v) The sample size of the test set is limited, and the segmentation of medical images requires more annotated high-resolution images, which requires individuals with medical backgrounds to annotate a large number of training images. (vi) This study examined only the median nerve and thus excluded other peripheral and small nerves. (vii) This study could only identify the median nerve, not diagnose disease, and was unable to determine disease severity.

Ultrasound has been extensively used in routine peripheral nerve examination. Electrophysiological examination and magnetic resonance examination are two other important examination methods. Electrophysiological examination is considered the gold standard for the diagnosis of peripheral neuropathy, but it can cause trauma to the patient and may generate false-negative results. Although magnetic resonance imaging is currently the best imaging test for diagnosing peripheral neuropathy, it is expensive and unsuitable for patients with metal implants and claustrophobia. Manual segmentation is generally time consuming and laborious, and thus the demand for automatic neural segmentation has increased. Automatic or semi-automatic artificial intelligence-based segmentation exhibits potential advantages with respect to peripheral nerve segmentation because of its nearly instant evaluation, cost-effectiveness and high reproducibility. Many image segmentation models have been used to segment peripheral nerves with good results.

CONCLUSIONS

Image segmentation has been widely used in ultrasound. The main contribution of this study was development of an improved model called U2-Net, which was compared with U-Net and Res-U-Net. The results indicate that the segmentation accuracy of U2-Net is superior, thus providing evidence that our model may be used interchangeably with manual segmentation. In view of the good performance of U2-Net in median nerve segmentation, it would be productive to further study these methods and improve them for potential use in the image segmentation of other neural entities.

CONFLICT OF INTEREST DISCLOSURE

The authors declare that they have no conflicts of interest.

REFERENCES

Alfonso C, Jann S, Massa R, Torreggiani A. Diagnosis, treatment and follow-up of the carpal tunnel syndrome: A review. Neurol Sci 2010;31:243-252.
Bargsten L, Raschka S, Schlaefer A. Capsule networks for segmentation of small intravascular ultrasound image datasets. Int J Comput Assist Radiol Surg 2021;7:1861-1872.
Cartwright MS, Hobson-Webb LD, Boon AJ, Alter KE, Hunt CH, Flores VH, Werner RA, Shook SJ, Thomas TD, Primack SJ, Walker FO. Evidence-based guideline: Neuromuscular ultrasound for the diagnosis of carpal tunnel syndrome. Muscle Nerve 2012;46:287-293.
Daisne JF, Blumhofer A. Atlas-based automatic segmentation of head and neck organs at risk and nodal target volumes: A clinical validation. Radiat Oncol 2013;8:154.
de Krom MC, van Croonenborg JJ, Blaauw G, Scholten RJ, Spaans F. Guideline 'Diagnosis and treatment of carpal tunnel syndrome'. Ned Tijdschr Geneeskd 2008;152:76-81.
Fang L, Zhang L, Yao Y. Integrating a learned probabilistic model with energy functional for ultrasound image segmentation. Med Biol Eng Comput 2021;59:1917-1931.
Festen RT, Schrier VJMM, Amadio PC. Automated segmentation of the median nerve in the carpal tunnel using U-Net. Ultrasound Med Biol 2021;47:1964-1969.
Fu J, Liu J, Tian H. Dual attention network for scene segmentation. In: Proceedings, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 15-20. New York: IEEE; 2019. p. 3146-3154.
George N, Christensen CR, Bennett JS. Speckle noise in displays. J Opt Soc Am 1976;66:1282-1290.
Gerritsen HJ, Hannan WJ, Ramberg EG. Elimination of speckle noise in holograms with redundancy. Appl Opt 1968;7:2301-2311.
Horng MH, Yang CW, Sun YN, Yang TH. DeepNerve: A new convolutional neural network for the localization and segmentation of the median nerve in ultrasound image sequences. Ultrasound Med Biol 2020;46:2439-2452.
Huang YL, Jiang YR, Chen DR, Moon WK. Level set contouring for breast tumor in sonography. J Digit Imaging 2007;20:238-247.
Huang QH, Lee SY, Liu LZ, Lu MH, Jin LW, Li AH. A robust graph-based segmentation method for breast tumors in ultrasound images. Ultrasonics 2012;52:266-275.
Huang C, Zhou Y, Tan W. Applying deep learning in recognizing the femoral nerve block region on ultrasound images. Ann Transl Med 2019;7:453-460.
Ibtehaz N, Rahman MS. MultiRes U-Net: Rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Netw 2020;121:74-87.
Kaluarachchi T, Reis A, Nanayakkara S. A review of recent deep learning approaches in human-centered machine learning. Sensors (Basel) 2021;21:2514-2543.
Lang S, Xu Y, Li L, Wang B, Yang Y, Xue Y, Shi K. Joint detection of Tap and CEA based on deep learning medical image segmentation: Risk prediction of thyroid cancer. J Healthc Eng 2021;6:1-9.
Lee K, Kim JY, Lee MH, Choi CH, Hwang JY. Imbalanced loss-integrated deep-learning-based ultrasound image analysis for diagnosis of rotator-cuff tear. Sensors (Basel) 2021;21:2214-2228.
Lian J, Zhang M, Jiang N, Bi W, Dong X. Feature extraction of kidney tissue image based on ultrasound image segmentation. J Healthc Eng 2021;4:1155-1171.
Mendelsohn ML, Kolman WA, Perry B, Prewitt JM. Morphological analysis of cells and chromosomes by digital computer. Methods Inf Med 1965;4:163-167.
Mou L, Zhao Y, Fu H, Liu Y, Cheng J, Zheng Y, Su P, Yang J, Chen L, Frangi AF, Akiba M, Liu J. CS2-Net: Deep learning segmentation of curvilinear structures in medical imaging. Med Image Anal 2021;67:101874.
Nemoto T, Futakami N, Yagi M, Kumabe A, Takeda A, Kunieda E, Shigematsu N. Efficacy evaluation of 2D, 3D U-Net semantic segmentation and atlas-based segmentation of normal lungs excluding the trachea and main bronchi. J Radiat Res 2020;61:257-264.
Pissas T, Bloch E, Cardoso M. Deep iterative vessel segmentation in OCT angiography. Biomed Opt Express 2020;11:2490-2510.
Pizer SM, Amburn EP, Austin JD. Adaptive histogram equalization and its variations. Comput Vis Graph Image Process 1987;39:355-368.
Qin X, Zhang Z, Huang C. U2-Net: Going deeper with nested U-structure for salient object detection. Pattern Recognit 2020;106:107404.
Rempel D, Evanoff B, Amadio PC, de Krom M, Franklin G, Franzblau A, Gray R, Gerr F, Hagberg M, Hales T, Katz JN, Pransky G. Consensus criteria for the classification of carpal tunnel syndrome in epidemiologic studies. Am J Public Health 1998;88:1447-1451.
Rodrigues PS, Giraldi GA. Improving the non-extensive medical image segmentation based on Tsallis entropy. Pattern Anal Appl 2011;14:369-379.
Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation. In: Navab, Hornegger, Wells, Frangi (eds). Medical image computing and computer-assisted intervention—MICCAI 2015. Lecture Notes in Computer Science, Vol. 9351. Cham: Springer; 2015. p. 234-241.
Shelhamer E, Long J, Darrell T. Fully convolutional networks for semantic segmentation. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2017;39:640-651.
Shen YT, Chen L, Yue WW, Xu HX. Artificial intelligence in ultrasound. Eur J Radiol 2021;139:109717.
Shin Y, Yang J, Lee YH, Kim S. Artificial intelligence in musculoskeletal ultrasound imaging. Ultrasonography 2021;40:30-44.
Sites BD, Brull R, Chan VW. Artifacts and pitfall errors associated with ultrasound-guided regional anesthesia: Part I. Understanding the basic principles of ultrasound physics and machine operations. Reg Anesth Pain Med 2007;32:412-418.
Su R, Zhang D, Liu J, Cheng C. MSU-Net: Multi-scale U-Net for 2D medical image segmentation. Front Genet 2021;12:63993.
Taha AA, Hanbury A. Metrics for evaluating 3D medical image segmentation: Analysis, selection, and tool. BMC Med Imaging 2015;15:29-35.
Wang K, Liang S, Zhong S, Feng Q, Ning Z, Zhang Y. Breast ultrasound image segmentation: A coarse-to-fine fusion convolutional neural network. Med Phys 2021;3:2405-2435.
Wang Z, Zou Y, Liu PX. Hybrid dilation and attention residual U-Net for medical image segmentation. Comput Biol Med 2021;134:104449.
Xiao X, Lian S, Luo Z. Weighted Res-UNet for high-quality retina vessel segmentation. In: 2018 9th International Conference on Information Technology in Medicine and Education (ITME). New York: IEEE; 2018. p. 327-331.
Yan JY, Zhuang T. Applying improved fast marching method to endocardial boundary detection in echocardiographic images. Pattern Recognit Lett 2003;24:2777-2784.
Young AV, Wortham A, Wernick I, Ennis RD. Atlas-based segmentation improves consistency and decreases time required for contouring postoperative endometrial cancer nodal volumes. Int J Radiat Oncol Biol Phys 2011;79:943-947.
Zeng Y, Tsui PH, Wu W, Zhou Z, Wu S. Fetal ultrasound image segmentation for automatic head circumference biometry using deeply supervised attention-gated V-Net. J Digit Imaging 2021;34:134-148.
Zhang C, Hua Q, Chu Y, Wang P. Liver tumor segmentation using 2.5D UV-Net with multi-scale convolution. Comput Biol Med 2021;133:104424.
Zhuang S, Li F, Raj ANJ, Ding W, Zhou W, Zhuang Z. Automatic segmentation for ultrasound image of carotid intima-media based on improved superpixel generation algorithm and fractal theory. Comput Methods Programs Biomed 2021;205:106084.