Article
Industrial Product Surface Anomaly Detection with Realistic
Synthetic Anomalies Based on Defect Map Prediction
Tao Peng 1,† , Yu Zheng 2,† , Lin Zhao 1 and Enrang Zheng 1, *
1 School of Electrical and Control Engineering, Shaanxi University of Science and Technology,
Xi’an 710026, China; 210611013@sust.edu.cn (T.P.); 210612064@sust.edu.cn (L.Z.)
2 School of Cyber Engineering, Xidian University, Xi’an 710126, China; yuzheng.xidian@gmail.com
* Correspondence: zhenger@sust.edu.cn
† These authors contributed equally to this work.
Abstract: The occurrence of anomalies on the surface of industrial products can lead to issues such
as decreased product quality, reduced production efficiency, and safety hazards. Early detection
and resolution of these problems are crucial for ensuring the quality and efficiency of production.
The key challenge in applying deep learning to surface defect detection of industrial products is the
scarcity of defect samples, which makes supervised learning methods unsuitable for this task. Anomaly
detection methods are therefore a reasonable alternative. Among image-based anomaly detection
approaches, reconstruction-based methods
are the most commonly used. However, reconstruction-based approaches lack the involvement of
defect samples in the training process, posing the risk of a perfect reconstruction of defects by the
reconstruction network. In this paper, we propose a reconstruction-based defect detection algorithm
that addresses these challenges by utilizing more realistic synthetic anomalies for training. Our model
focuses on creating authentic synthetic defects and introduces an auto-encoder image reconstruction
network with deep feature consistency constraints, as well as a defect separation network with a large
receptive field. We conducted experiments on the challenging MVTec anomaly detection dataset
and our trained model achieved an AUROC score of 99.70% and an average precision (AP) score of
99.87%. Our method surpasses recently proposed defect detection algorithms, thereby enhancing the
accuracy of surface defect detection in industrial products.
Keywords: defect detection; image reconstruction; synthetic anomalies; defect separation

1. Introduction

Defects on the surface of industrial products refer to incomplete, irregular, or non-compliant areas or traces that occur during manufacturing, processing, or usage. These defects can be caused by physical, chemical, mechanical, or other factors, and they can affect the appearance, quality, and performance of the products. The presence of defective products has a significant impact on both businesses and users. In mature industrial production processes, defective products exhibit three main characteristics. Firstly, the number of defective products is extremely low compared to normal products. Secondly, the defects exhibit various forms and diverse types. Thirdly, the defect areas are relatively small and the defect images are similar in distribution to the normal images. Therefore, identifying the differences between normal and defective samples is a highly challenging task.

Traditional detection methods primarily rely on an increased allocation of human resources, where product quality inspectors visually discern the quality of products. This approach proves to be inefficient and incurs high costs. In addition, machine vision-based defect detection methods have also been widely explored, including techniques such as edge detection, threshold segmentation, and texture analysis. However, these techniques exhibit significant limitations when applied. For example, noise and variations in illumination can directly result in inaccurate edge detection, unstable threshold segmentation,
and interference with the texture analysis results. Moreover, these methods typically rely
on designed feature extraction, lacking good adaptability to different types of defects or
image scenes, requiring adjustments and optimizations specific to the problem at hand,
which further involves the challenge of parameter selection. In recent years, there has been
rapid progress in deep learning methods aimed at emulating human habits and capabilities,
with the objective of substituting humans in performing complex and high-risk tasks. With
the swift advancement of computer technology and the enhancement of computational
capabilities, the performance of deep learning-based anomaly detection techniques has
been continuously improving. These techniques have found extensive applications in
various domains, including agricultural production [1,2], industrial manufacturing [3,4],
aerospace [5,6], and computer network security [7,8].
Supervised anomaly detection based on image data is one of the commonly employed
methods in the field of deep learning. By being able to learn the distinctive features of
positive and negative samples, it typically achieves the desired task objectives. However,
the stable performance of supervised learning methods relies on a massive dataset with
a balanced distribution of positive and negative samples. The major challenge in surface
defect detection tasks lies in the extremely limited quantity of defect samples, which can
result in overfitting of the model during fully supervised learning and subsequently affects
the detection accuracy. In comparison, reconstruction-based semi-supervised anomaly
detection methods, which do not require labeled defect samples, have gained popularity
as an alternative approach. Among them, the two most classical categories are based
on Generative Adversarial Networks (GANs) and Autoencoders (AEs), two fundamental
techniques in the field of semi-supervised learning for image reconstruction. These methods
extensively train on a large number of normal samples, aiming to learn the close relationship
between the high-dimensional and low-dimensional distributions of images. This enables
the network to learn how to reconstruct output images that closely resemble the input
images. During testing, defect images are fed into the pre-trained network model, and due
to significant differences from the reconstructed images, they are effectively identified and
filtered out. Therefore, reconstruction-based anomaly detection methods have become an
effective means of accomplishing surface defect detection tasks in industrial products. However, when
the reconstruction network generalizes too well, it tends to reconstruct defect images faithfully as
well, allowing defects to evade detection.
However, this type of image reconstruction technique is trained only using normal
samples, and real defect images have never been involved in the entire process. This makes
the inference of the entire network somewhat biased. The reality is that the scarcity of real
defect images prevents their inclusion in the training process, and artificially synthesized
defects generally differ significantly from real defects. As a result, the trained network
exhibits poor generalization ability and fails to detect real defective products. Addition-
ally, the authenticity of the reconstructed images serves as a criterion for assessing the
performance of the reconstruction network. While autoencoders primarily focus on the
reconstruction effect on high-dimensional images without considering low-dimensional
features, Ganomaly [9] takes into account the reconstruction consistency of low-dimensional latent
vectors. However, training Ganomaly is often challenging, and it struggles to converge
to the global optimum.
In response to the aforementioned issues, this study was inspired by the DRAEM [10]
concept to create more realistic and plausible synthetic anomaly images. This approach
addresses the problem of defect images not being involved in the training process. An image
reconstruction network was designed with deep feature consistency, and the network’s
ability to separate defects was enhanced by utilizing the larger effective receptive field
provided by the use of oversized convolutional kernels. This resulted in the generation of
defect region prediction maps. By calculating the loss function using the predicted maps
and the real defect regions, the possibility of the network model directly reconstructing
defect images was eliminated, thus achieving more accurate surface defect detection in
industrial products. The main contributions of this study are as follows:
2. Related Work
2.1. The Study of Anomaly Synthesis
Obtaining a large amount of defect data is a very challenging issue in defect detection
tasks. Synthetic anomaly is a reverse solution approach that addresses this challenge by
artificially creating more anomalous situations and expanding the defect dataset. The Cut-
Paste method proposed by Chun-Liang Li et al. [11] has been validated on the MVTec [12]
dataset. This method involves cutting out patch blocks from images and pasting them
randomly onto the image to augment the dataset. This data augmentation strategy is simple
and effective, enabling the model to detect local irregularities of the target. However, this
random masking method for creating anomalies does not match actual situations. For
instance, in the bottle dataset, the edge of the bottle bottom may appear in the middle of
the bottle image, and in the toothbrush dataset, the top of the toothbrush head may appear
in the middle of the toothbrush head (as shown on the left in Figure 1). The FIP method
proposed by Jeremy Tan et al. [13] extracts the same patch area from two independent
samples, uses interpolation between the two patches to obtain a fused patch, and then
replaces it at the original patch position. The model trained with this method has stronger
generalization ability and can detect subtle irregularities, performing well on the MOOD
Challenge [14] dataset of medical images. NSA [15] uses Poisson image editing to make
the synthesized defects more natural and closer to real anomalies. DRAEM first uses Perlin
noise to cut regions from DTD [16] texture dataset images and then pastes them onto the
training images; its discriminative network is designed specifically to learn to separate
these synthesized anomalies. However, the Perlin noise mask is superimposed on the
entire image, extending beyond the foreground target (as shown on the right in Figure 1),
and the result differs significantly from real anomalies, leading to inaccurate defect localization.
Figure 1. The left-hand side of the figure presents an example of defect synthesis using the CutPaste
method, while the right-hand side shows an example of defect synthesis using the DRAEM approach.
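For illustration, the following is a minimal sketch of a CutPaste-style augmentation as described above; the patch-size range, aspect-ratio range, and placement logic are our own illustrative assumptions rather than the exact settings of [11].

```python
import numpy as np

def cutpaste(image: np.ndarray, rng: np.random.Generator,
             scale_range=(0.02, 0.15)) -> np.ndarray:
    """Cut a random rectangular patch and paste it at a random location.

    `image` is an H x W x C array; the patch-area and aspect-ratio ranges are
    illustrative choices, not the settings of the original CutPaste method.
    """
    h, w = image.shape[:2]
    area = rng.uniform(*scale_range) * h * w        # patch area as a fraction of the image
    aspect = rng.uniform(0.3, 3.3)                  # random aspect ratio
    ph = int(np.clip(np.sqrt(area * aspect), 1, h - 1))
    pw = int(np.clip(np.sqrt(area / aspect), 1, w - 1))

    # Source location of the patch.
    sy, sx = rng.integers(0, h - ph), rng.integers(0, w - pw)
    patch = image[sy:sy + ph, sx:sx + pw].copy()

    # Paste the patch at a different random location to create a local irregularity.
    ty, tx = rng.integers(0, h - ph), rng.integers(0, w - pw)
    out = image.copy()
    out[ty:ty + ph, tx:tx + pw] = patch
    return out
```

A call such as `cutpaste(img, np.random.default_rng(0))` yields one augmented sample per invocation.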
instances into latent space vectors during the training process and detects anomalies by cal-
culating the distance between the input image and the reconstructed image. Convolutional
Autoencoders are also widely used for data compression and dimensionality reduction.
Comprising of an encoder and a decoder, the network model must retain the essential
information of data instances to minimize the reconstruction error. DRAEM adopts a dual
autoencoder architecture and uses a re-embedding technique to directly learn the anomaly
distance function, achieving good performance in anomaly detection.
The flow-based method was initially used for network traffic analysis and security
monitoring. Recently, with the development of computer technology, the algorithm per-
formance has been significantly improved. Cflow [19], Csflow [20], and Fastflow [21]
determine anomalies by analyzing the characteristic patterns in data flows and using unsu-
pervised methods to learn anomaly patterns from the data. They have strong adaptability
to the data, but each has limitations: Cflow can only detect anomalies that differ significantly
from normal data; Csflow has weak processing ability for high-dimensional data, which can result in
false positives or false negatives; and Fastflow has limited effectiveness in industrial product
defect detection because it requires a large amount of training data and also handles
high-dimensional data poorly.
Using pre-trained models can greatly reduce training time and provides good feature
extraction capabilities. STFPM [22] and RDFOCE [23] are based on the teacher–student
network architecture and belong to a class of knowledge distillation methods that cooperate
with pre-trained models. They can be trained end-to-end, but RDFOCE requires a high
amount of training data, as insufficient training data can lead to performance degradation.
STFPM may perform poorly when dealing with large-sized images due to the large amount
of data needed.
Performing data feature extraction followed by processing the feature set is also a
good approach for anomaly detection. PatchCore [4] divides images into patches, extracts
features via convolutional networks, learns the similarity of nodes in the PatchCore graph,
and detects anomalies using clustering. PaDim [24] shares a similar patch-based approach with
PatchCore, but models the distribution of patch features and scores anomalies by their distance
from that distribution. DFM [25] also extracts
features to establish the probability distribution of normal samples in the feature space
and detects anomalies by calculating the likelihood of a new sample belonging to normal
samples. The commonality among these three methods is that they rely too much on the
accuracy of the feature extraction network. If there are few available normal samples for
learning, it may lead to problems such as feature learning bias. In addition, other methods
include CFA [26], which uses feature adaptation and coupled hypersphere methods for
anomaly detection, but consumes significant computational resources.
3. Method
The defect detection algorithm model proposed in this study, which is based on the
prediction of defect maps through the learning of abnormal distance function, is composed
of an image reconstruction network and an anomaly separation network (as shown in
Figure 2).
The image reconstruction network is trained to ensure that the reconstructed image
and the original normal image have highly similar high-level semantic information and
low-level semantic information, resulting in high visual similarity between the two. The
anomaly separation network takes the reconstructed image and the synthesized abnormal
image as inputs and aims to learn the distance function between the abnormal image and
the real image, thereby generating accurate abnormal segmentation images and completing
the defect detection task. The mechanism for synthesizing anomalies adopts a simple
cut-and-patch method to mimic real anomalies and add a large number of realistic defect
samples, thus compensating for the sample imbalance problem caused by the lack of defect
images in the training data of the image reconstruction method.
Figure 2. The model consists of a reconstruction network on the left and a defect prediction network
on the right. The reconstruction network comprises an autoencoder and a deep feature extractor,
while the defect prediction network employs an ultra-large kernel convolutional encoder and connects
the encoding and decoding components via a U-Net network.
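As a structural sketch only (the layer widths, depths, and exact encoder blocks of Figure 2 are not reproduced here and should be treated as placeholders), the two-branch pipeline can be outlined in PyTorch as follows: the reconstruction network maps a synthesized anomaly image back toward its normal appearance, and the defect prediction network produces a per-pixel defect map from the concatenation of the input and its reconstruction.

```python
import torch
import torch.nn as nn

class ReconstructionNet(nn.Module):
    """Toy autoencoder standing in for the reconstruction branch (widths are placeholders)."""
    def __init__(self, ch=3, base=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(ch, base, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(base * 2, base, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base, ch, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

class SeparationNet(nn.Module):
    """Toy defect prediction branch: takes [input, reconstruction] and outputs per-pixel logits."""
    def __init__(self, ch=3, base=64, kernel=31):
        super().__init__()
        self.net = nn.Sequential(
            # A single large-kernel layer as a placeholder for the large-receptive-field encoder.
            nn.Conv2d(2 * ch, base, kernel_size=kernel, padding=kernel // 2), nn.ReLU(inplace=True),
            nn.Conv2d(base, 2, kernel_size=1),   # normal/defect logits per pixel
        )

    def forward(self, x, x_rec):
        return self.net(torch.cat([x, x_rec], dim=1))

# Forward pass on a synthesized anomaly image I_a:
#   x_rec = ReconstructionNet()(i_a)
#   logits = SeparationNet()(i_a, x_rec)
```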
In the third stage, M2 is used to extract a portion of the region from sample A, and
similarly, M2 is used to extract the corresponding region from input image I, which is then
blended using random interpolation to obtain the final defect image. It is then combined
with the other regions (1 − M2 ) of the input image I to obtain the final synthesized anomaly
image. Therefore, the anomaly image Ia is defined as:
$$ I_a = (A \odot M_2)\,\beta + (I \odot M_2)(1 - \beta) + I \odot (1 - M_2) \tag{2} $$

where ⊙ denotes pixel-wise multiplication and β is a random interpolation coefficient with
β ∈ [0, 0.8). The defect region created using the random interpolation blending method
includes both the partial information of the original image I and the information from
the anomaly source image A, which makes the synthesized anomaly diverse and realistic.
Figure 4 presents a set of examples of synthesized anomaly images.
Figure 4. From left to right, the three columns are the anomaly source image A, the input image I,
and the synthesized anomaly image Ia .
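A direct NumPy rendering of Equation (2), assuming I, A, and the binary mask M2 are float arrays of matching spatial size with values in [0, 1]:

```python
import numpy as np

def synthesize_anomaly(i: np.ndarray, a: np.ndarray, m2: np.ndarray,
                       rng: np.random.Generator) -> np.ndarray:
    """Blend anomaly source A into input I inside mask M2, as in Eq. (2)."""
    beta = rng.uniform(0.0, 0.8)                 # beta in [0, 0.8)
    return (a * m2) * beta + (i * m2) * (1.0 - beta) + i * (1.0 - m2)
```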
Therefore, our synthetic anomaly method ensures that the anomaly cases appear only
on the foreground object, independent of the background, and the anomalies produced are
more realistic.
The SSIM [27] loss function can be used to measure the structural similarity between
the generated image and the original image and can compensate for the shortcomings of
the L2 loss function. The SSIM loss is defined as follows:

$$ L_{SSIM}(I, I_r) = \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \left( 1 - SSIM(I, I_r)_{(i,j)} \right) \tag{4} $$

The variables H and W in Equations (3) and (4) represent the height and width of the
input image I, respectively; Ir denotes the reconstructed image generated by the
network, and SSIM is the similarity function used to measure the similarity between
I and Ir.
The two loss functions are combined proportionally to form the visual image recon-
struction loss function Lvision , which is used to measure the loss of image reconstruction in
terms of visual perception.
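The following PyTorch sketch combines a pixel-wise L2 term with the SSIM term of Equation (4); the uniform local window (in place of the Gaussian window of the original SSIM [27]) and the weighting coefficient lam are simplifying assumptions rather than the exact settings of our implementation.

```python
import torch
import torch.nn.functional as F

def ssim_map(x, y, win=11, c1=0.01 ** 2, c2=0.03 ** 2):
    """Per-pixel SSIM between x and y (N x C x H x W tensors with values in [0, 1]).

    A uniform window approximates the Gaussian window of the original SSIM.
    """
    pad = win // 2
    mu_x = F.avg_pool2d(x, win, stride=1, padding=pad)
    mu_y = F.avg_pool2d(y, win, stride=1, padding=pad)
    var_x = F.avg_pool2d(x * x, win, stride=1, padding=pad) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, win, stride=1, padding=pad) - mu_y ** 2
    cov_xy = F.avg_pool2d(x * y, win, stride=1, padding=pad) - mu_x * mu_y
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

def vision_loss(i, i_r, lam=1.0):
    """L_vision sketch: L2(I, I_r) + lam * L_SSIM(I, I_r); lam is an assumed weighting."""
    l2 = F.mse_loss(i_r, i)
    l_ssim = (1.0 - ssim_map(i, i_r)).mean()     # Eq. (4): mean of 1 - SSIM over all pixels
    return l2 + lam * l_ssim
```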
where M2 is the final mask image, representing the ground truth, and Mp is the defect
prediction image.
$$ \eta = \max\left( M_p * f_{s_f \times s_f} \right) \tag{10} $$
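Read literally, Equation (10) smooths the defect prediction map Mp with a filter f of size sf × sf and takes the maximum response as the image-level anomaly score η. Assuming f is a mean (smoothing) filter, which is not specified in the text above, this can be sketched as:

```python
import torch
import torch.nn.functional as F

def image_level_score(m_p: torch.Tensor, sf: int = 21) -> torch.Tensor:
    """Eq. (10) sketch: eta = max(M_p * f_{sf x sf}), with f assumed to be a mean filter.

    m_p is an N x 1 x H x W defect prediction map; the filter size sf is an assumption.
    """
    smoothed = F.avg_pool2d(m_p, kernel_size=sf, stride=1, padding=sf // 2)
    return smoothed.amax(dim=(1, 2, 3))          # one anomaly score per image
```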
4. Experiments
The performance of this method was evaluated and compared with other advanced
methods in the field of defect detection. Furthermore, the effectiveness of each component
module of the proposed method was validated via ablation experiments.
Figure 5. Results of defect prediction for several categories. For each category, the four images from
left to right are the original image, the defect prediction image, the heat map, and the ground truth.
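For reference, the image-level AUROC and AP values reported in the tables below can be computed from per-image anomaly scores and binary labels with scikit-learn; the toy labels and scores here are purely illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# labels: 1 for defective images, 0 for normal; scores: the anomaly score eta per image.
labels = np.array([0, 0, 1, 1, 1])
scores = np.array([0.05, 0.10, 0.80, 0.65, 0.92])

auroc = roc_auc_score(labels, scores)
ap = average_precision_score(labels, scores)
print(f"AUROC = {auroc:.4f}, AP = {ap:.4f}")
```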
Table 1. Our method compared to defect detection algorithms based on optical flow and pre-trained
model-based methods: a comparison of AUROC values on the MVTecAD dataset.
Category Cflow [19] Csflow [20] Fastflow [21] STFPM [22] RDFOCE [23] Ours
bottle 100.0 99.4 100.0 99.8 93.2 99.5
cable 93.1 97.3 90.8 93.4 92.9 98.8
capsule 90.3 97.7 87.6 67.5 90.5 99.5
carpet 94.8 97.9 97.2 98.4 98.3 99.8
grid 86.5 99.3 98.3 93.8 94.7 100.0
hazelnut 99.3 93.2 81.0 99.1 100.0 100.0
leather 99.9 99.7 100.0 100.0 86.5 100.0
metal nut 97.9 94.6 95.7 98.5 97.4 100.0
pill 90.2 93.3 91.4 76.7 95.7 98.8
screw 91.0 98.1 72.4 79.5 88.6 100.0
tile 91.0 98.1 72.4 79.5 88.6 100.0
toothbrush 95.0 94.3 82.2 86.3 97.0 100.0
transistor 91.4 98.0 91.0 91.8 93.1 99.2
wood 99.6 98.7 96.8 98.7 99.2 100.0
zipper 92.1 98.6 94.0 84.6 92.7 99.9
Average 94.7 97.3 91.6 90.9 93.3 99.7
Table 2. Our method compared to defect detection algorithms based on feature extraction and image
reconstruction methods: a comparison of AUROC values on MVTecAD dataset.
Category PC * [4] PaDim [24] DFM [25] DRAEM [10] CFA [26] Ganomaly [9] Ours
bottle 100.0 99.4 100.0 99.2 99.8 54.6 99.5
cable 98.7 84.3 95.6 91.8 97.2 56.6 98.8
capsule 97.2 90.1 94.4 98.5 90.7 66.6 99.5
carpet 98.1 94.5 81.7 97.0 97.3 55.8 99.8
grid 97.0 85.7 73.6 99.9 95.0 86.0 100.0
hazelnut 100.0 75.0 99.4 100.0 100.0 88.5 100.0
leather 100.0 98.2 99.3 100.0 100.0 43.8 100.0
metal nut 99.6 96.1 92.2 98.7 99.1 48.7 100.0
pill 94.2 86.3 96.1 98.9 94.9 66.7 98.8
screw 97.3 75.9 89.0 93.9 70.8 44.3 100.0
tile 98.7 95.0 96.6 99.6 99.8 59.3 100.0
toothbrush 100.0 88.9 96.9 100.0 100.0 41.9 100.0
transistor 100.0 92.0 93.9 93.1 96.5 58.2 99.2
wood 99.4 97.6 97.7 99.1 99.5 86.9 100.0
zipper 99.4 77.9 96.9 100.0 96.7 56.2 99.9
Average 98.6 89.1 93.6 98.0 95.8 60.9 99.7
* PC refers to PatchCore.
Figure 6. The ROC curves for the cable dataset, shown in the upper and lower halves of the figure.
Figure 7. Visualizations of the box plots for Tables 1 and 2 show the distribution of results for
each method.
Figure 8. Several examples of comparisons between predicted results from different methods and
ground truth images.
Figure 9. The AP curves for the cable dataset, shown in the upper and lower halves of the figure.
Next, we fixed the existing autoencoder reconstruction network and conducted com-
parative experiments on the encoding part of the defect prediction network using the
RepLKNet structure, which showed significant improvement in performance compared
to the baseline model. This is because the actual receptive field of the larger convolution
kernel is larger than the effective receptive field of the stacked small convolution kernels,
as proven in the RepLKNet paper [28]. A larger receptive field allows the network to better
understand the global structure and contextual information in the image, avoiding overfitting
during network training and thus learning more general features in the image.
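A minimal sketch of a large-kernel block in the spirit of RepLKNet [28]: an oversized depthwise convolution enlarges the effective receptive field, and a parallel small-kernel branch stabilizes optimization. The channel counts and kernel sizes are placeholders, not the exact configuration of our defect prediction encoder.

```python
import torch
import torch.nn as nn

class LargeKernelBlock(nn.Module):
    """Depthwise large-kernel convolution with a parallel small-kernel branch (RepLKNet-style sketch)."""
    def __init__(self, channels: int, big_kernel: int = 31, small_kernel: int = 5):
        super().__init__()
        self.big = nn.Conv2d(channels, channels, big_kernel,
                             padding=big_kernel // 2, groups=channels)
        self.small = nn.Conv2d(channels, channels, small_kernel,
                               padding=small_kernel // 2, groups=channels)
        self.bn_big = nn.BatchNorm2d(channels)
        self.bn_small = nn.BatchNorm2d(channels)
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)
        self.act = nn.GELU()

    def forward(self, x):
        # The two depthwise branches are summed, then mixed across channels by a 1x1 conv.
        y = self.bn_big(self.big(x)) + self.bn_small(self.small(x))
        return x + self.act(self.pointwise(y))   # residual connection
```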
Figure 10. The first row to the last row in the figure are baseline method (item 4 in Table 4), ablation
experiment of generative network (item 6 in Table 4), ablation experiment of defect prediction network
(item 5 in Table 4), and our method (item 7 in Table 4), respectively. The rightmost column in the
image is the ground truth.
5. Conclusions
A semi-supervised defect detection algorithm based on defect map prediction with
realistic synthetic anomalies is proposed in this paper. Our method demonstrates excellent
performance in industrial product defect detection tasks. After conducting experiments
on the MVTec dataset, which consists of 15 different categories, our method outperformed
other recent detection methods by 1.1 percentage points on the AUROC evaluation metric,
showcasing its strong generalization capability. Furthermore, our method surpassed
the best-performing DRAEM by 31.5% on the defect localization evaluation metric AP,
indicating a significant improvement in localization accuracy. This is because we only
learn the distance function between normal and abnormal samples, rather than directly
learning the features of anomalies. By employing various data preprocessing techniques
such as affine transformations and image enhancement, combined with the utilization
of synthetically generated realistic abnormal images as input samples for training, the
network has acquired enhanced resistance to interference and robustness. We discussed the
design of the two sub-modules, analyzed the benefits of parameter copying in the feature
extractor, and demonstrated the effectiveness of large kernel convolution in expanding the
receptive field in practical applications via experiments.
Author Contributions: All authors participated in some part of the work for this article. Investigation,
T.P.; methodology, Y.Z.; software, T.P.; supervision, E.Z.; writing—original draft preparation, T.P.
and Y.Z.; writing—review and editing, Y.Z., L.Z. and E.Z. All authors have read and agreed to the
published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: This study analyzed the MVTec anomaly detection public dataset,
which can be found at https://www.mvtec.com/company/research/datasets/mvtec-ad (accessed
on 17 July 2023).
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Catalano, C.; Paiano, L.; Calabrese, F.; Cataldo, M.; Mancarella, L.; Tommasi, F. Anomaly detection in smart agriculture systems.
Comput. Ind. 2022, 143, 103750. [CrossRef]
2. Staar, B.; Lütjen, M.; Freitag, M. Anomaly detection with convolutional neural networks for industrial surface inspection. Procedia
CIRP 2019, 79, 484–489. [CrossRef]
3. Moso, J.C.; Cormier, S.; de Runz, C.; Fouchal, H.; Wandeto, J.M. Anomaly detection on data streams for smart agriculture.
Agriculture 2021, 11, 1083. [CrossRef]
4. Roth, K.; Pemula, L.; Zepeda, J.; Schölkopf, B.; Brox, T.; Gehler, P. Towards total recall in industrial anomaly detection. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022;
pp. 14318–14328.
5. Qin, K.; Wang, Q.; Lu, B.; Sun, H.; Shu, P. Flight anomaly detection via a deep hybrid model. Aerospace 2022, 9, 329. [CrossRef]
6. Memarzadeh, M.; Akbari Asanjan, A.; Matthews, B. Robust and Explainable Semi-Supervised Deep Learning Model for Anomaly
Detection in Aviation. Aerospace 2022, 9, 437. [CrossRef]
7. Albasheer, H.; Md Siraj, M.; Mubarakali, A.; Elsier Tayfour, O.; Salih, S.; Hamdan, M.; Khan, S.; Zainal, A.; Kamarudeen, S.
Cyber-attack prediction based on network intrusion detection systems for alert correlation techniques: A survey. Sensors 2022,
22, 1494. [CrossRef] [PubMed]
8. Yang, Z.; Liu, X.; Li, T.; Wu, D.; Wang, J.; Zhao, Y.; Han, H. A systematic literature review of methods and datasets for
anomaly-based network intrusion detection. Comput. Secur. 2022, 116, 102675. [CrossRef]
9. Akcay, S.; Atapour-Abarghouei, A.; Breckon, T.P. Ganomaly: Semi-supervised anomaly detection via adversarial training. In
Proceedings of the Computer Vision—ACCV 2018: 14th Asian Conference on Computer Vision, Perth, Australia, 2–6 December
2018; Revised Selected Papers, Part III 14; Springer: Berlin/Heidelberg, Germany, 2019; pp. 622–637.
10. Zavrtanik, V.; Kristan, M.; Skočaj, D. Draem-a discriminatively trained reconstruction embedding for surface anomaly detec-
tion. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Nashville, TN, USA, 20–25 June 2021;
pp. 8330–8339.
11. Li, C.L.; Sohn, K.; Yoon, J.; Pfister, T. Cutpaste: Self-supervised learning for anomaly detection and localization. In Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 9664–9674.
12. Bergmann, P.; Fauser, M.; Sattlegger, D.; Steger, C. MVTec AD–A comprehensive real-world dataset for unsupervised anomaly
detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA,
15–20 June 2019; pp. 9592–9600.
13. Tan, J.; Hou, B.; Batten, J.; Qiu, H.; Kainz, B. Detecting outliers with foreign patch interpolation. arXiv 2020, arXiv:2011.04197.
14. Zimmerer, D.; Petersen, J.; Köhler, G.; Jäger, P.; Full, P.; Roß, T.; Adler, T.; Reinke, A.; Maier-Hein, L.; Maier-Hein, K. Medical
out-of-distribution analysis challenge. Zenodo 2020. [CrossRef]
15. Schlüter, H.M.; Tan, J.; Hou, B.; Kainz, B. Natural synthetic anomalies for self-supervised anomaly detection and localization. In
Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg,
Germany, 2022; pp. 474–489.
16. Cimpoi, M.; Maji, S.; Kokkinos, I.; Mohamed, S.; Vedaldi, A. Describing textures in the wild. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 3606–3613.
17. Schlegl, T.; Seeböck, P.; Waldstein, S.M.; Schmidt-Erfurth, U.; Langs, G. Unsupervised anomaly detection with generative
adversarial networks to guide marker discovery. In Proceedings of the International Conference on Information Processing in
Medical Imaging, Boone, NC, USA, 25–30 June 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 146–157.
18. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial
networks. Commun. ACM 2020, 63, 139–144. [CrossRef]
19. Gudovskiy, D.; Ishizaka, S.; Kozuka, K. Cflow-ad: Real-time unsupervised anomaly detection with localization via conditional
normalizing flows. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA,
3–8 January 2022; pp. 98–107.
20. Rudolph, M.; Wehrbein, T.; Rosenhahn, B.; Wandt, B. Fully convolutional cross-scale-flows for image-based defect detection. In
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2022;
pp. 1088–1097.
21. Yu, J.; Zheng, Y.; Wang, X.; Li, W.; Wu, Y.; Zhao, R.; Wu, L. Fastflow: Unsupervised anomaly detection and localization via 2d
normalizing flows. arXiv 2021, arXiv:2111.07677.
22. Wang, G.; Han, S.; Ding, E.; Huang, D. Student-teacher feature pyramid matching for unsupervised anomaly detection. arXiv
2021, arXiv:2103.04257.
23. Deng, H.; Li, X. Anomaly detection via reverse distillation from one-class embedding. In Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 9737–9746.
24. Defard, T.; Setkov, A.; Loesch, A.; Audigier, R. Padim: A patch distribution modeling framework for anomaly detection and
localization. In Proceedings of the International Conference on Pattern Recognition, Bangkok, Thailand, 28–30 July 2021; Springer:
Berlin/Heidelberg, Germany, 2021; pp. 475–489.
25. Ahuja, N.A.; Ndiour, I.; Kalyanpur, T.; Tickoo, O. Probabilistic modeling of deep features for out-of-distribution and adversarial
detection. arXiv 2019, arXiv:1909.11786.
26. Lee, S.; Lee, S.; Song, B.C. Cfa: Coupled-hypersphere-based feature adaptation for target-oriented anomaly localization. IEEE
Access 2022, 10, 78446–78454. [CrossRef]
27. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE
Trans. Image Process. 2004, 13, 600–612. [CrossRef] [PubMed]
28. Ding, X.; Zhang, X.; Han, J.; Ding, G. Scaling up your kernels to 31 × 31: Revisiting large kernel design in cnns. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022;
pp. 11963–11975.
29. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of
the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
30. Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft coco: Common objects in
context. In Proceedings of the Computer Vision—ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September
2014; Proceedings, Part V 13; Springer: Berlin/Heidelberg, Germany, 2014; pp. 740–755.
31. Zhou, B.; Zhao, H.; Puig, X.; Xiao, T.; Fidler, S.; Barriuso, A.; Torralba, A. Semantic understanding of scenes through the ade20k
dataset. Int. J. Comput. Vis. 2019, 127, 302–321. [CrossRef]
32. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of
the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015: 18th International Conference, Munich,
Germany, 5–9 October 2015; Proceedings, Part III 18; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
33. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International
Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
34. Akcay, S.; Ameln, D.; Vaidya, A.; Lakshmanan, B.; Ahuja, N.; Genc, U. Anomalib: A deep learning library for anomaly detection.
In Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16–19 October 2022;
pp. 1706–1710.
35. Bergmann, P.; Fauser, M.; Sattlegger, D.; Steger, C. Uninformed students: Student-teacher anomaly detection with discriminative
latent embeddings. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA,
13–19 June 2020; pp. 4183–4192.
36. Zavrtanik, V.; Kristan, M.; Skočaj, D. Reconstruction by inpainting for visual anomaly detection. Pattern Recognit. 2021,
112, 107706. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.