Multi-Block Color-Binarized Statistical Images For Single Sample Face Recognition
Article
a.benzaoui@univ-bouira.dz (A.B.)
3 Polytech Tours, Imaging and Brain, INSERM U930, University of Tours, 37200 Tours, France;
abdeldjalil.ouahabi@univ-tours.fr (A.O.)
4 GREMAN UMR 7347, University of Tours, CNRS, INSA Centre Val-de-Loire, 37200 Tours, France;
sebastien.jacques@univ-tours.fr (S.J.)
* Correspondence: abdeldjalil.ouahabi@univ-tours.fr; Tel.: +33-2-4736-1323
Abstract: Single sample face recognition (SSFR) is a computer vision challenge. In this scenario,
there is only one example from each individual on which to train the system, making it difficult to
identify persons in unconstrained environments, particularly when dealing with changes in facial
expression, posture, lighting, and occlusion. This paper suggests a different method based on a
variant of the Binarized Statistical Image Features (BSIF) descriptor called Multi-Block Color-
Binarized Statistical Image Features (MB-C-BSIF) to resolve the SSFR problem. First, the MB-C-BSIF
method decomposes a facial image into three channels (e.g., red, green, and blue), then it divides
each channel into equal non-overlapping blocks to select the local facial characteristics that are
subsequently employed in the classification phase. Finally, the identity is determined by computing the similarities among the feature vectors using the distance measure of the k-nearest neighbors (K-NN) classifier. Extensive experiments on several subsets of the unconstrained Alex &
Robert (AR) and Labeled Faces in the Wild (LFW) databases show that the MB-C-BSIF achieves
superior results in unconstrained situations when compared to current state-of-the-art methods,
especially when dealing with changes in facial expression, lighting, and occlusion. Furthermore, the
suggested method employs algorithms with lower computational cost, making it ideal for real-time
applications.
Keywords: Biometrics, Face Recognition, Single Sample Face Recognition, Binarized Statistical
Image Features, K-Nearest Neighbors.
1. Introduction
Generally speaking, biometrics aims to identify or verify an individual’s identity according to
some physical or behavioral characteristics [1]. Biometric practices are replacing conventional
knowledge-based solutions, such as passwords or PINs, and possession-based strategies, such as ID
cards or badges [2]. Several biometric methods have been developed to varying degrees and are being
implemented and used in numerous commercial applications [3]. Biometric recognition is a
continually growing area of study and one that is continually searching for new high-performance
methods.
Fingerprints are the biometric features most commonly used to identify criminals [4]. The first
automated fingerprint authentication device was commercialized in the early 1960s. In actual fact,
multiple studies have shown that the iris of the eye is the most accurate modality since its texture
remains stable throughout a person’s life [5]. However, those techniques have the significant
drawback of being invasive, which significantly restricts their applications. In addition, use of iris
recognition remains problematic for users who do not wish to put their eyes in front of a sensor. On
the contrary, biometric recognition based on facial analysis does not pose any such user constraints:
in contrast to other biometric modalities, face recognition is a modality that can be employed without
any user-sensor cooperation and can be applied discreetly in surveillance applications. Face
recognition has a number of advantages: the sensor device (i.e., the camera) is simple to mount; it is
not costly; it does not require subject cooperation; there are no hygiene issues; and, being passive,
people much prefer this modality [6].
2D face recognition, including single sample face recognition (SSFR) (i.e., using a single sample per person (SSPP) in the training set), has already matured as a technology. Although the latest studies on the Face Recognition Grand Challenge (FRGC) [7] project have shown that computer-vision systems [8] offer better performance than human-visual systems in controlled conditions [9], research into face recognition needs to be geared towards more realistic, uncontrolled conditions.
In an uncontrolled scenario, human visual systems are more robust when dealing with the numerous
possibilities that can impact the recognition process [10], such as variations in lighting, facial
orientation, facial expression, and facial appearance due to the presence of sunglasses, a scarf, a beard,
or makeup. Solving such challenges will make 2D face recognition techniques a much more important
technology for the identification or verification of an identity.
Several methods and algorithms have been suggested in the face recognition literature. They can
be subdivided into four fundamental approaches [11] depending on the method used for feature
extraction and classification: holistic, local, hybrid, and deep learning-based approaches. The deep
learning class [12], which applies consecutive layers of information processing arranged
hierarchically for representation, learning, and classification, has dramatically increased state-of-the-
art performance, especially with unconstrained large-scale databases, and encouraged real-world
applications [13].
Most current methods in the literature use several facial images (samples) per person in the
training set. Nevertheless, in real-world systems (e.g., in fugitive tracing, ID cards, immigration
management, or passports), only SSFR systems are used, which employ a single sample per person
in the training stage, i.e., just one example of the person to be recognized is recorded in the database
and accessible for the recognition task [14]. Since there are insufficient data (i.e., we do not have
several samples per person) to perform supervised learning, many well-known algorithms may not
work particularly well. For instance, deep neural networks (DNNs) [13] can be used in powerful face
recognition techniques. Nonetheless, they necessitate a considerable volume of training data to work
well. In their statistical learning theory, Vapnik and Chervonenkis [15] showed that vast amounts of training data are required to ensure the generalization of learning systems. We can therefore infer that, despite the great efforts being made and the rapid growth of face recognition, SSFR remains an unsolved problem in both academia and industry.
In this paper, we tackle the SSFR issue in unconstrained conditions by proposing an efficient
method based on a variant of the local texture operator Binarized Statistical Image Features (BSIF)
[16] called Multi-Block Color-Binarized Statistical Image Features (MB-C-BSIF). It employs local color texture information to obtain a faithful and precise representation. The BSIF descriptor has been widely
used in texture analysis [17, 18] and has proven its utility in many computer vision tasks. In the first
step, the proposed method uses preprocessing to enhance the quality of facial photographs and
remove noise [19-21]. The color image is then decomposed into three channels (e.g., red, green, and
blue for the RGB color-space). Next, to find the optimum configuration, several multi-block
decompositions are checked and examined under various color-spaces (i.e., we tested the RGB, HSV, and YCbCr color-spaces). Finally, classification is undertaken using the distance
measurement of the k-nearest neighbors (K-NN) classifier.
The rest of the paper is structured as follows. We discuss relevant research about SSFR in
Section 2. Section 3 describes our suggested method. In Section 4, the experimental study, key
findings, and comparisons are performed and presented to show our method’s superiority. Section 5
of the paper presents some conclusions.
2. Related Work
Current methods designed to resolve the issue of SSFR can be categorized into four fundamental
classes [22], namely: virtual sample generating, generic learning, image partitioning, and deep
learning methods.
intra-class variations of the test images. Zhang et al. (2020) [36] introduced a novel nearest neighbor
classifier (NNC) distance measurement to resolve SSFR problems. The suggested technique, entitled
Dissimilarity-based Nearest Neighbor Classifier (DNNC), divides all images into equal non-
overlapping blocks and produces an organized image block-set. The dissimilarities among the given
query image block-set and the training image block-sets are calculated and used as the NNC distance metric.
3. Proposed Method
The feature extraction and classification steps that compose our proposed SSFR method are described in this section. The facial image is first enhanced by applying histogram
normalization and then filtered with a non-linear filter. The median filter [42] was adopted to
minimize noise while preserving facial appearance and enhancing the operational outcomes [43].
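For illustration, a minimal Python sketch of this preprocessing step is given below; it assumes the OpenCV library, and the 3 × 3 median kernel size is our assumption rather than a value specified in this paper.

import cv2

def preprocess_face(image_bgr, median_kernel=3):
    # Histogram normalization applied to each color channel independently.
    channels = [cv2.equalizeHist(c) for c in cv2.split(image_bgr)]
    enhanced = cv2.merge(channels)
    # Non-linear median filter: suppresses noise while preserving facial appearance.
    return cv2.medianBlur(enhanced, median_kernel)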
Descriptors that generate binary codes, such as the local binary pattern (LBP) [46] and the local phase quantization (LPQ) [47], have inspired the BSIF descriptor. However, BSIF is based on natural image
statistics, rather than heuristic or handcrafted code constructions, enhancing its modeling
capabilities.
Technically speaking, for a given image patch X of size l × l pixels and a linear filter W_i of the same size, the filter response s_i is calculated by:

s_i = Σ_{u,v} W_i(u,v) X(u,v),

and the corresponding binarized feature is obtained by thresholding the response at zero: b_i = 1 if s_i > 0 and b_i = 0 otherwise. The n bits b_1, …, b_n obtained from the n filters are concatenated to form the BSIF code of the pixel.
The BSIF descriptor has two key parameters: the filter size l × l and the bit string length n. Using independent component analysis (ICA), the filters W_i are learned by maximizing the statistical independence of the responses s_i. Filter sets are trained for different choices of these parameter values; in particular, each filter set was trained using 50,000 natural image patches. Figure 1 displays some examples of the filters obtained with l × l = 7 × 7 and n = 8. Figure 2 provides some examples of facial images and their respective BSIF representations (with l × l = 7 × 7 and n = 8).
Figure 1. Examples of 7×7 BSIF filter banks learned from natural pictures.
Figure 2. (a) Examples of facial images, and (b) their parallel BSIF representations.
As in the LBP and LPQ methodologies, the occurrences of the BSIF codes are accumulated in a histogram H1, which is employed as a feature vector.
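As an illustrative, non-authoritative sketch of the computation described above, the following Python code derives the BSIF codes and the histogram H1 of a grayscale image; it assumes NumPy and SciPy, and that the ICA-learned filters are already available (e.g., the filter banks distributed by the authors of [16]).

import numpy as np
from scipy.signal import convolve2d

def bsif_codes(image, filters):
    # filters: array of shape (n, l, l) holding the ICA-learned linear filters W_i.
    n = filters.shape[0]
    codes = np.zeros(image.shape, dtype=np.int64)
    for i in range(n):
        s_i = convolve2d(image, filters[i], mode='same', boundary='wrap')
        codes += (s_i > 0).astype(np.int64) << i  # b_i = 1 if s_i > 0, else 0
    return codes

def bsif_histogram(image, filters):
    # Histogram H1 of the BSIF code occurrences, used as the feature vector.
    n = filters.shape[0]
    hist, _ = np.histogram(bsif_codes(image, filters), bins=2 ** n, range=(0, 2 ** n))
    return hist.astype(np.float64)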
However, the simple BSIF operator applied to the whole image as a single block does not capture the spatial distribution of the texture characteristics, and it is not robust to occlusion and rotation of the image. To address these limitations, an extension of the basic BSIF, the Multi-Block BSIF (MB-BSIF), is used. The concept is based on partitioning the original image into non-overlapping blocks. A given facial image may be split equally along the horizontal and vertical directions. As an illustration, we can
derive 1, 4, or 16 blocks by segmenting the image into grids of 1×1, 2×2, or 4×4, as shown in Figure 3.
Each block possesses details about its composition, such as the nose, eyes, or eyebrows. Overall, these
blocks provide information about position relationships, such as nose to mouth and eye to eye. The
blocks and the data between them are thus very important for SSFR tasks.
Figure 3. Examples of multi-block (MB) image decomposition: (a) 1×1, (b) 2×2, and (c) 4×4.
Our idea was to segment the image into equal non-overlapping blocks and calculate the BSIF
operator’s histograms related to the different blocks. The histogram H2 represents the fusion of the individual histograms calculated for the different blocks, as shown in Figure 4.
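A minimal sketch of this multi-block step, building on the bsif_histogram function sketched earlier (the 4×4 grid is only one of the decompositions examined later), could be:

import numpy as np

def mb_bsif_histogram(gray, filters, grid=(4, 4)):
    # Split the image into equal non-overlapping blocks and concatenate
    # the per-block BSIF histograms into the fused histogram H2.
    rows, cols = grid
    block_h, block_w = gray.shape[0] // rows, gray.shape[1] // cols
    histograms = []
    for r in range(rows):
        for c in range(cols):
            block = gray[r * block_h:(r + 1) * block_h, c * block_w:(c + 1) * block_w]
            histograms.append(bsif_histogram(block, filters))
    return np.concatenate(histograms)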
In the face recognition literature, a number of works have concentrated solely on analyzing the
luminance details of facial images (i.e., grayscale). This paper suggests a different and exciting
technique that exploits color texture information and shows that analysis of chrominance can be
beneficial to SSFR systems. To prove this idea, we can separate the RGB facial image into three
channels (i.e., red, green, and blue) and then compute the MB-BSIF separately for each channel. The
final feature vector is the concatenation of their histograms in a global histogram H3. This approach
is called Multi-Block Color BSIF (MB-C-BSIF). Figure 4 provides a schematic illustration of the
proposed MB-C-BSIF framework.
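Under the same assumptions as the earlier sketches (OpenCV for channel splitting, mb_bsif_histogram as defined above), the construction of H3 can be illustrated as follows:

import cv2
import numpy as np

def mb_c_bsif_histogram(image, filters, grid=(4, 4)):
    # Decompose the color image into its three channels, apply MB-BSIF to
    # each channel separately, and concatenate the results into H3.
    channels = cv2.split(image)
    return np.concatenate([mb_bsif_histogram(ch, filters, grid) for ch in channels])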
We note that the RGB is the most commonly employed color-space for detecting, modeling and
displaying color images. Nevertheless, its use in image interpretation is restricted due to the strong correlation between the three color channels (i.e., red, green, and blue) and the inadequate separation
of details in terms of luminance and chrominance. To identify captured objects, the various color
channels can be highly discriminative and offer excellent contrast for several visual indicators from
natural skin tones. In addition to the RGB, we studied and tested two additional color-spaces—HSV
and YCbCr—to exploit color texture details. These color-spaces are based on separating components
of the chrominance and luminance. For the HSV color-space, the dimensions of hue and saturation
determine the image’s chrominance, while the value dimension (V) corresponds to the luminance.
The YCbCr color-space divides the components of the RGB into luminance (Y), chrominance blue
(Cb), and chrominance red (Cr). We should note that the representation of chrominance components
in the HSV and YCbCr domains is dissimilar and, consequently, they can offer additional color
texture descriptions for SSFR systems.
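For illustration, and assuming OpenCV (whose images are loaded in BGR order), switching the color-space fed to the descriptor could be sketched as:

import cv2

def to_color_space(image_bgr, space='RGB'):
    # Convert a BGR image to the color-space whose three channels feed MB-C-BSIF.
    conversions = {
        'RGB': cv2.COLOR_BGR2RGB,
        'HSV': cv2.COLOR_BGR2HSV,
        'YCbCr': cv2.COLOR_BGR2YCrCb,  # OpenCV orders the channels as Y, Cr, Cb
    }
    return cv2.cvtColor(image_bgr, conversions[space])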
8. end for
9. Concatenate the computed MB-BSIF features of the component C_n:
   H2^(n) = H1^(n)_(1) + H1^(n)_(2) + ⋯ + H1^(n)_(K), where K is the number of blocks.
4. Experimental Analysis
The proposed SSFR was evaluated using the unconstrained Alex & Robert (AR) [48] and Labeled
Faces in the Wild (LFW) [49] databases. In this section, we present the specifications of each utilized
database and their experimental setups. Furthermore, we analyze the findings obtained from our
proposed SSFR method and compare the accuracy of recognition with other current state-of-the-art
approaches.
4.1.2. Setups
To determine the efficiency of the proposed MB-C-BSIF in dealing with changes in facial
expression, subset A (normal-1) was used as the training set and subsets B (smiling-1), C (angry-1),
D (screaming-1), N (normal-2), O (smiling-2), P (angry-2), and Q (screaming-2) were employed for
the test set. The facial images from these eight subsets display different facial expressions and were acquired in
two different sessions. For the training set, we employed 100 images of the normal-1 type (100 images
for 100 persons, i.e., one image per person). Moreover, we employed 700 images in the test set
(smiling-1, angry-1, screaming-1, normal-2, smiling-2, angry-2, and screaming-2). These 700 images
were divided into 7 subsets for testing, with each subset containing 100 images.
As shown in Figure 5, two forms of occlusion are found in 12 subsets. The first is occlusion by
sunglasses, as seen in subsets H, I, J, U, V, and W; while the second is occlusion by scarf in subsets K,
L, M, X, Y, and Z. In these 12 subsets, each individual’s photographs have various illumination
conditions and were acquired in two distinct sessions. There are 100 images in each subset and the total number of facial photographs used in the test set was 1,200. To examine the performance of the suggested MB-C-BSIF under conditions of occlusion by objects, we considered subset A as the training set and the 12 occlusion subsets as the test set, similar to the initial setup.
Figure 5. The 26 facial images (subsets A–Z, acquired in two sessions) of the first individual from the AR database and their detailed descriptions.
The implemented configuration can achieve better accuracy for changes in facial expression with all
7 subsets. However, for subset Q, which is characterized by considerable variation in facial
expression, the accuracy of recognition was very low (71 %). Lastly, the performance of this
implemented configuration under conditions of occlusion by an object is unsatisfactory, especially
with occlusion by scarf, and needs further improvement.
Table 1. Comparison of the results obtained using six BSIF configurations with changes in facial
expression.
Table 2. Comparison of the results obtained using six BSIF configurations with occlusion by
sunglasses.
Table 3. Comparison of the results obtained using six BSIF configurations with occlusion by scarf.
We note that the city block distance produced the most reliable recognition performance
compared to the other distances analyzed in this test, such as the Hamming and Euclidean distances.
As such, we can say that the city block distance is the most suitable for our method.
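A minimal sketch of this classification rule (plain NumPy; in the SSFR setting the gallery holds exactly one feature vector per enrolled identity) is:

import numpy as np

def identify(query_h3, gallery_h3, gallery_labels):
    # 1-NN decision with the city block (L1) distance: the query is assigned
    # the label of the closest gallery feature vector.
    distances = [np.sum(np.abs(query_h3 - g)) for g in gallery_h3]
    return gallery_labels[int(np.argmin(distances))]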
Table 4. Comparison of the results obtained using different distances with changes in facial
expression.
Table 5. Comparison of the results obtained using different distances with occlusion by sunglasses.
Table 6. Comparison of the results obtained using different distances with occlusion by scarf.
- As such, in the case of partial occlusion, we may claim that local information is essential. It allows deeper extraction of relevant information from the face, such as details about the facial structure (e.g., the nose, eyes, or mouth) and information about position relationships (e.g., nose to mouth or eye to eye).
- Finally, we note that the optimum configuration with the best accuracy was provided by
4×4 blocks for subsets of facial expression, occlusion by sunglasses, and occlusion by scarf.
Table 7. Comparison of the results obtained using different divided blocks with changes in facial
expression.
Table 8. Comparison of the results obtained using different divided blocks with occlusion by
sunglasses.
Table 9. Comparison of the results obtained using different divided blocks with occlusion by scarf.
- HSV shows some regression for scarf occlusion subsets, but both the RGB and YCbCr color-
spaces display some progress compared to the grayscale norm. Additionally, RGB remains the color-
space with the highest output.
- The most significant observation is that the RGB color-space saw significantly improved
performance in the V, W, Y, and Z subsets (from 81 % to 85 % with V; 79 % to 84 % with W; 84 % to
88 % with Y; and 77 % to 87 % with Z). Note that images of these occluded subsets are characterized
by light degradation (either to the right or left, as shown in Figure 5).
- Finally, we note that the optimum color-space, providing the best balance between robustness to lighting variations and improvement in identification, was RGB.
Table 10. Comparison of the results obtained using different color-spaces with changes in facial
expression.
Table 11. Comparison of the results obtained using different color-spaces with occlusion by
sunglasses.
Table 12. Comparison of the results obtained using different color-spaces with occlusion by scarf.
4.1.7. Comparison #1
To confirm that our suggested method produces superior recognition performance with
variations in facial expression, we compared the collected results with several state-of-the-art
methods recently employed to tackle the SSFR issue. Table 13 presents the highest accuracies
obtained using the same subsets and the same assessment protocol with Subset A as the training set
and subsets of facial expression variations B, C, D, N, O, and P constituting the test set. The results
presented in Table 13 are taken from a number of references [30, 33, 45, and 46]. “- -” signifies that
the considered method has no experimental results. The best results are in bold.
The outcomes obtained validate the robustness and reliability of our proposed SSFR system
compared to state-of-the-art methods when assessed with identical subsets. We suggest that this is a
competitive technique that has achieved a desirable level of identification accuracy with the six
subsets of up to: 100.00 % for B and C; 95.00 % for D; 97.00 % for N; 92.00 % for O; and 93.00 % for P.
For all subsets, our suggested technique surpasses the state-of-the-art methods analyzed in this
paper, i.e., the proposed MB-C-BSIF can achieve excellent identification performance under the
condition of variation in facial expression.
Table 13. Comparison with state-of-the-art methods under variations in facial expression (identification accuracy, %).

Reference                    Year  Method     B       C       D      N      O      P      Average (%)
Turk and Pentland [52]       1991  PCA        97.00   87.00   60.00  77.00  76.00  67.00  77.33
Wu and Zhou [53]             2002  (PC)2A     97.00   87.00   62.00  77.00  74.00  67.00  77.33
Chen et al. [54]             2004  E(PC)2A    97.00   87.00   63.00  77.00  75.00  68.00  77.83
Yang et al. [55]             2004  2DPCA      97.00   87.00   60.00  76.00  76.00  67.00  77.17
Gottumukkal and Asari [56]   2004  Block-PCA  97.00   87.00   60.00  77.00  76.00  67.00  77.33
Chen et al. [57]             2004  Block-LDA  85.00   79.00   29.00  73.00  59.00  59.00  64.00
Zhang and Zhou [58]          2005  (2D)2PCA   98.00   89.00   60.00  71.00  76.00  66.00  76.70
Tan et al. [59]              2005  SOM        98.00   88.00   64.00  73.00  77.00  70.00  78.30
He et al. [60]               2005  LPP        94.00   87.00   36.00  86.00  74.00  78.00  75.83
Zhang et al. [24]            2005  SVD-LDA    73.00   75.00   29.00  75.00  56.00  58.00  61.00
Deng et al. [61]             2010  UP         98.00   88.00   59.00  77.00  74.00  66.00  77.00
Lu et al. [33]               2013  DMMA       99.00   93.00   69.00  88.00  85.00  85.50  79.00
Ji et al. [51]               2017  CPL        92.22   88.06   83.61  83.59  77.95  72.82  83.04
Chu et al. [62]              2019  MFSA+      100.00  100.00  74.00  93.00  85.00  86.00  89.66
Zhang et al. [36]            2020  DNNC       100.00  98.00   69.00  92.00  76.00  85.00  86.67
Our method                   2020  MB-C-BSIF  100.00  100.00  95.00  97.00  92.00  93.00  96.17
4.1.8. Comparison #2
To further demonstrate the efficacy of our proposed SSFR system, we also compared the best
configuration of the MB-C-BSIF (i.e., RGB color-space, segmentation of the image into 4×4 blocks, city
block distance, l × l = 17 × 17, and n = 12) with recently published work under unconstrained
conditions. We followed the same experimental protocol described in [30, 36]. Table 14 displays the
accuracies of the works compared on the tested subsets H + K (i.e., occlusion by sunglasses and scarf)
and subsets J + M (i.e., occlusion by sunglasses and scarf with variations in lighting). The best results
are in bold.
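For readability, this retained configuration can be summarized as follows (the parameter names are ours and purely illustrative):

BEST_CONFIG = {
    'color_space': 'RGB',      # best-performing color-space
    'grid': (4, 4),            # 4x4 equal non-overlapping blocks
    'filter_size': 17,         # l x l = 17 x 17 BSIF filters
    'bit_length': 12,          # n = 12 bits per BSIF code
    'distance': 'city block',  # metric used by the nearest-neighbor classifier
}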
In Table 14, we can observe that the work presented by Zhu et al. [30], called LGR, shows a
comparable level, but the identification accuracy of our MB-C-BSIF procedure is much higher than
all the methods considered for both test sessions.
Compared to related SSFRs, which can be categorized as either generic learning methods (e.g.,
ESRC [28], SVDL [29], and LGR [30]), image partitioning methods (e.g., CRC [29], PCRC [30], and
DNNC [36]), or deep learning methods (e.g., DCNN [38] and BDL [41]), the capabilities of our method
can be explained in terms of its exploitation of different forms of information. This can be
summarized as follows:
- The BSIF descriptor scans the image pixel by pixel, i.e., we consider the benefits of local
information.
- The image is decomposed into several blocks, i.e., we exploit regional information.
- BSIF descriptor occurrences are accumulated in a global histogram, i.e., we manipulate
global information.
- The MB-BSIF is applied to all RGB image components, i.e., color texture information is
exploited.
Figure 6. Examples of two different subjects from the LFW-a database [44].
In a recent work by Zeng et al. [67], the authors combined traditional (handcrafted) and deep
learning (TDL) features to overcome the limitations of each class. They reached an identification accuracy of nearly 74 %, which represents a significant advance on this challenging topic.
In the comparative study presented in [68], we can see that current face recognition systems
employing several examples in the training set achieve very high accuracy with the LFW database,
especially with deep-learning-based methods. However, SSFR systems suffer considerably when
using the challenging LFW database and further research is required to improve their reliability.
In situations where the learning stage is based on millions of images, the proposed SSFR technique is not applicable. In such situations, the methods of references [12] and [67], which use deep learning techniques, achieve better accuracy.
Finally, the proposed SSFR method is intended for the case where only one sample per person is available, which is the most common case in real-world applications such as remote surveillance or images captured by unmanned aerial vehicles. In these applications, faces are most often captured under harsh conditions, such as changing lighting or posture, or when the person is wearing accessories such as glasses, masks, or disguises. In these cases, the method proposed here is by far the most accurate.
challenging LFW database, the results outperform traditional state-of-the-art methods and validate
this work's effectiveness.
Even if the proposed method is used in applications for which it was not initially intended, such as in
cases where hundreds of samples are available, or where the training phase is based on millions of
samples, it still has higher facial recognition accuracies than other SSFR techniques described in the
literature.
In the future, we aim to explore the effectiveness of combining both deep learning and
traditional methods in addressing the SSFR issue. We also aim to investigate and analyze the SSFR
issue in unconstrained environments using large-scale databases that hold millions of facial images.
Author Contributions: Investigation, Software, writing original draft, I.A.; methodology, validation, writing, review and
editing, A.B.; project administration, supervision, validation, writing, review and editing, A.O.; validation, writing, review
and editing, S.J. All authors have read and agreed to the published version of the manuscript.
References
1. Alay, N.; Al-Baity, H.H. Deep Learning Approach for Multimodal Biometric Recognition System Based on
Fusion of Iris, Face, and Finger Vein Traits. Sensors 2020, 20, 5523.
2. Pagnin, E.; Mitrokotsa, A. Privacy-Preserving Biometric Authentication: Challenges and Directions.
Security and Communication Networks 2017, 1-9.
3. Mahfouz, A.; Mahmoud, T.M.; Sharaf Eldin, A. A survey on behavioral biometric authentication on
smartphones. Journal of Information Security and Applications 2017, 37, 28-37.
4. Ferrara, M.; Cappelli, R.; Maltoni, D. On the Feasibility of Creating Double-Identity Fingerprints. IEEE
Transactions on Information Forensics and Security 2017, 12(4), 892-900.
5. Thompson, J.; Flynn, P.; Boehnen, C.; Santos-Villalobos, H. Assessing the Impact of Corneal Refraction and
Iris Tissue Non-Planarity on Iris Recognition. IEEE Transactions on Information Forensics and Security 2019,
14(8), 2102-2112.
6. Benzaoui, A.; Bourouba, H.; Boukrouche, A. System for automatic faces detection. In Proceedings of the 3rd
International Conference on Image Processing, Theory, Tools, and Applications (IPTA), Istanbul, Turkey,
2012, pp. 354-358.
7. Phillips, P.J.; Flynn, P.J.; Scruggs, T.; Bowyer, K.W.; Chang, J.; Hoffman, K.; Marques, J.; Min, J.; Worek, W.
Overview of the face recognition grand challenge. In Proceedings of the IEEE Computer Society Conference
on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, USA, 2005, pp. 947-954.
8. Femmam, S.; M'Sirdi, N. K.; Ouahabi, A. Perception and characterization of materials using signal
processing techniques. IEEE Transactions on Instrumentation and Measurement 2001, 50(5), 1203-1211.
9. Ring, T. Humans vs machines: the future of facial recognition. Biometric Technology Today 2016, 4, 5-8.
10. Phillips, P.J.; Yates, A.N.; Hu, Y.; Hahn, A.C.; Noyes, E.; Jackson, K.; Cavazos, J.G.; Jeckeln, G.; Ranjan,
R.; Sankaranarayanan, S.; Chen, J.C.; Castillo, C.D.; Chellappa, R.; White, D.; O’Toole, A.J. Face recognition
accuracy of forensic examiners, superrecognizers, and face recognition algorithms. Proceedings of the
National Academy of Sciences 2018, 115(24), 6171-6176.
11. Kortli, Y.; Jridi, M.; Al Falou, A.; Atri, M. Face Recognition Systems: A Survey. Sensors 2020, 20, 342.
12. Parkhi, O.M.; Vedaldi, A.; Zisserman, A. Deep Face Recognition. In Proceedings of the British Machine
Vision Conference (BMVC), Swansea, UK, 2015, pp. 1-12.
13. Rahman, J.U.; Chen, Q.; Yang, Z. Additive Parameter for Deep Face Recognition. Communications in
Mathematics and Statistics 2019, 8, 203-217.
14. Benzaoui, A.; Boukrouche, A. Ear recognition using local color texture descriptors from one sample image
per person. In Proceedings of the 4th International Conference on Control, Decision and Information
Technologies (CoDIT), Barcelona, Spain, 2017, pp. 0827-0832.
15. Vapnik, V.N.; Chervonenkis, A. Learning theory and its applications. IEEE Transactions on Neural Networks
1999, 10(5): 985-987.
16. Kannala, J.; Rahtu, E. BSIF: binarized statistical image features. In Proceedings of the 21st International
Conference on Pattern Recognition (ICPR), Tsukuba, Japan, 2012, pp. 1363-1366.
17. Djeddi, M.; Ouahabi, A.; Batatia, H.; Basarab, A.; Kouamé, D. Discrete wavelet for multifractal texture classification: application to medical ultrasound imaging. In Proceedings of the 2010 IEEE International Conference on Image Processing (ICIP), Hong Kong, 2010, pp. 637-640.
18. Ouahabi, A. Multifractal analysis for texture characterization: A new approach based on DWT. 10th
International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010),
Kuala Lumpur, 2010, 698-703.
19. Ouahabi, A. Signal and Image Multiresolution Analysis. 1st ed.; ISTE-Wiley, London, 2012.
20. Ouahabi, A. A review of wavelet denoising in medical imaging. In Proceedings of the 8th International
Workshop on Systems, Signal Processing and their Applications (WoSSPA), Tipaza, Algeria, 2013, pp. 19-
26.
21. Sidahmed, S.; Messali, Z.; Ouahabi, A.; Trépout, S.; Messaoudi, C.; Marco, S. Nonparametric denoising
methods based on contourlet transform with sharp frequency localization: Application to low exposure
time electron microscopy images. Entropy 2015, 17(5), 3461-3478.
22. Ding, C.; Bao, T.; Karmoshi, S.; Zhu, M. Single sample per person face recognition with KPCANet and a
weighted voting scheme. Signal, Image and Video Processing 2017, 11, 1213-1220.
23. Vetter, T. Synthesis of novel views from a single face image. International Journal of Computer Vision 1998,
28(2), 103-116.
24. Zhang, D.; Chen, S.; Zhou, Z.H. A new face recognition method based on SVD perturbation for single
example image per person. Applied Mathematics and Computation 2005, 163(2), 895-907.
25. Gao, Q.X.; Zhang, L.; Zhang, D. Face recognition using FLDA with single training image per person. Applied
Mathematics and Computation 2008, 205(2), 726-734.
26. Hu, C.; Ye, M.; Ji, S.; Zeng, W.; Lu, X. A new face recognition method based on image decomposition for
single sample per person problem. Neurocomputing 2015, 160, 287–299.
27. Dong, X.; Wu, F.; Jing, X.Y. Generic Training Set based Multimanifold Discriminant Learning for Single
Sample Face Recognition. KSII Transactions on Internet and Information Systems 2018, 12(1), 368-391.
28. Deng, W.; Hu, J.; Guo, J. Extended SRC: undersampled face recognition via intraclass variant dictionary.
IEEE Transactions on Pattern Analysis and Machine Intelligence 2012, 34(9), 1864-1870.
29. Yang, M.; Van, L.V.; Zhang, L. Sparse variation dictionary learning for face recognition with a single
training sample per person. In Proceedings of the IEEE International Conference on Computer Vision
(ICCV), Sydney, Australia, 2013, pp. 689–696.
30. Zhu, P.; Yang, M.; Zhang, L.; Lee, L. Local Generic Representation for Face Recognition with Single Sample
per Person. In Proceedings of the Asian Conference on Computer Vision (ACCV), Singapore, Singapore,
2014, pp. 34-50.
28. Zhu, P.; Zhang, L.; Hu, Q.; Shiu, S.C.K. Multi-scale patch based collaborative representation for face
recognition with margin distribution optimization. In Proceedings of the European Conference on
Computer Vision (ECCV), Florence, Italy, 2012, pp. 822-835.
29. Zhang, L.; Yang, M.; Feng, X. Sparse representation or collaborative representation: which helps face
recognition? In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain,
2011, pp. 471-478.
30. Lu, J.; Tan, Y.P.; Wang, G. Discriminative multimanifold analysis for face recognition from a single training
sample per person. IEEE Transactions on Pattern Analysis and Machine Intelligence 2012, 35(1), 39–51.
31. Zhang, W.; Xu, Z.; Wang, Y.; Lu, Z.; Li, W.; Liao, Q. Binarized features with discriminant manifold filters
for robust single-sample face recognition. Signal Processing: Image Communication 2018, 65, 1-10.
32. Gu, J.; Hu, H.; Li, H. Local robust sparse representation for face recognition with single sample per person.
IEEE/CAA Journal of Automatica Sinica 2018, 5(2), 547-554.
33. Zhang, Z.; Zhang, L.; Zhang, M. Dissimilarity-based nearest neighbor classifier for single-sample face
recognition. The Visual Computer 2020, https://doi.org/10.1007/s00371-020-01827-3.
34. Zeng, J.; Zhao, X.; Qin, C.; Lin, Z. Single sample per person face recognition based on deep convolutional
neural network. In Proceedings of the 3rd IEEE International Conference on Computer and
Communications (ICCC), Chengdu, China, 2017, 1647-1651.
35. Ding, C.; Bao, T.; Karmoshi, S.; Zhu, M. Single sample per person face recognition with KPCANet and a
weighted voting scheme. Signal, Image and Video Processing 2017, 11, 1213-1220.
36. Zhang, Y.; Peng, H. Sample reconstruction with deep autoencoder for one sample per person face
recognition. IET Computer Vision 2018, 11(6), 471-478.
37. Mimouna, A.; Alouani, I.; Ben Khalifa, A.; El Hillali, Y.; Taleb-Ahmed, A.; Menhaj, A.; Ouahabi, A.; Ben
Amara, N.E. OLIMP: A Heterogeneous Multimodal Dataset for Advanced Environment
Perception. Electronics 2020, 9, 560
38. Du, Q.; Da, F. Block dictionary learning-driven convolutional neural networks for few-shot face
recognition. The Visual Computer 2020, https://doi.org/10.1007/s00371-020-01802-y.
39. Ataman, E.; Aatre, V.; Wong, K. A fast method for real-time median filtering. IEEE Transactions on Acoustics,
Speech, and Signal Processing 1980, 28(4), 415-421.
40. Benzaoui, A.; Hadid, A.; Boukrouche, A. Ear biometric recognition using local texture descriptors. Journal
of Electronic Imaging 2014, 23(5), 053008.
41. Stone, J.V. Independent component analysis: An introduction. Trends in Cognitive Sciences 2002, 6(2), 59-64.
42. Zehani, S.; Ouahabi, A.; Oussalah, M.; Mimi, M.; Taleb-Ahmed, A. Bone microarchitecture characterization based on fractal analysis in spatial frequency domain imaging. International Journal of Imaging Systems and Technology 2020, 1-19.
43. Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture
classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 2002,
24(7): 971-987.
44. Ojansivu, V.; Heikkilä, J. Blur insensitive texture classification using local phase quantization. In Proceedings
of the 3rd International Conference on Image and Signal Processing (ICSIP), Paris, France, 2012, pp. 236-
243.
45. Martinez, A.M.; Benavente, R. The AR face database. CVC Technical Report 1998, 24, 1-10.
46. Huang, G.B.; Mattar, M.; Berg, T.; Learned-Miller, E. Labeled Faces in the Wild: A Database for Studying
Face Recognition in Unconstrained Environments; Technical Report; University of Massachusetts:
Amherst, MA, USA, 2007; pp. 7–49.
47. Mehrasa, N.; Ali, A.; Homayun, M. A supervised multimanifold method with locality preserving for face
recognition using single sample per person. Journal of Center South University 2017, 24, 2853-2861.
48. Ji, H.K.; Sun, Q.S.; Ji, Z.X.; Yuan, Y.H.; Zhang, G.Q. Collaborative probabilistic labels for face recognition
from single sample per person. Pattern Recognition 2017, 62, 125–134.
49. Turk, M.; Pentland, A. Eigenfaces for recognition. Journal of Cognitive Neuroscience 1991, 3(1), 71-86.
48. Wu, J.; Zhou, Z.H. Face recognition with one training image per person. Pattern Recognition Letters 2002,
23(14), 1711-1719.
49. Chen, S.; Zhang, D.; Zhou, Z.H. Enhanced (PC)2A for face recognition with one training image per person.
Pattern Recognition Letters 2004, 25(10), 1173-1181.
50. Yang, J.; Zhang, D.; Frangi, A.F.; Yang, J.Y. Two-dimensional PCA: a new approach to appearance-based
face representation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2004,
26(1), 131–137.
51. Gottumukkal, R.; Asari, V.K. An improved face recognition technique based on modular PCA approach.
Pattern Recognition Letters 2004, 25(4), 429–436.
52. Chen, S.; Liu, J.; Zhou, Z.H. Making FLDA applicable to face recognition with one sample per person.
Pattern Recognition 2004, 37(7), 1553–1555.
53. Zhang, D.; Zhou, Z.H. (2D)2PCA: two-directional two-dimensional PCA for efficient face representation
and recognition. Neurocomputing 2005, 69(1–3), 224–231.
54. Tan, X.; Chen, S.; Zhou, Z.H.; Zhang, F. Recognizing partially occluded, expression variant faces from single
training image per person with SOM and soft K-NN ensemble. IEEE Transactions on Neural Networks 2005,
16(4), 875–886.
55. He, X.; Yan, S.; Hu, Y.; Niyogi, P.; Zhang, H.J. Face recognition using Laplacian faces. IEEE Transactions on
Pattern Analysis and Machine Intelligence 2005, 27(3), 328-340.
56. Deng, W.; Hu, J.; Guo, J.; Cai, W.; Fenf, D. Robust, accurate and efficient face recognition from a single
training image: a uniform pursuit approach. Pattern Recognition 2010, 43(5), 1748-1762.
57. Chu, Y.; Zhao, L.; Ahmad, T. Multiple feature subspaces analysis for single sample per person face
recognition. The Visual Computer 2019, 35, 239-256.
58. Seetafaceengine, 2016, https:// github.com/ seetaface/ SeetaFaceEngine.
59. Cuculo, V.; D’Amelio, A.; Grossi, G.; Lanzarotti, R.; Lin, J. Robust Single-Sample Face Recognition by
Sparsity-Driven Sub-Dictionary Learning Using Deep Features. Sensors 2019, 19, 146.
60. Wright, J.; Yang, A.Y.; Ganesh, A.; Sastry, S.S.; Ma, Y. Robust Face Recognition via Sparse Representation.
IEEE Transactions on Pattern Analysis and Machine Intelligence 2009, 31(2), 210-227.
61. Su, Y.; Shan, S.; Chen, X.; Gao, W. Adaptive generic learning for face recognition from a single sample per
person. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern
Recognition, San Francisco, CA, 2010, pp. 2699-2706.
62. Zeng, J.; Zhao, X.; Gan, J.; Mai, C.; Zhai, Y.; Wang, F. Deep Convolutional Neural Network Used in Single
Sample per Person Face Recognition. Computational Intelligence and Neuroscience 2018, 2018, 3803627.
63. Adjabi, I.; Ouahabi, A.; Benzaoui, A.; Taleb-Ahmed, A. Past, Present, and Future of Face Recognition: A
Review. Electronics 2020, 9, 1188.