Papers by Ioannis Pratikakis
2020 IEEE International Joint Conference on Biometrics (IJCB), 2020
The paper presents a summary of the 2020 Sclera Segmentation Benchmarking Competition (SSBC), the 7th in the series of group benchmarking efforts centred around the problem of sclera segmentation. Different from previous editions, the goal of SSBC 2020 was to evaluate the performance of sclera-segmentation models on images captured with mobile devices. The competition was used as a platform to assess the sensitivity of existing models to i) differences in mobile devices used for image capture and ii) changes in the ambient acquisition conditions. 26 research groups registered for SSBC 2020, out of which 13 took part in the final round and submitted a total of 16 segmentation models for scoring. These included a wide variety of deep-learning solutions as well as one approach based on standard image processing techniques. Experiments were conducted with three recent datasets. Most of the segmentation models achieved relatively consistent performance across images captured with different mobile devices (with slight differences across devices), but struggled most with low-quality images captured in challenging ambient conditions, i.e., in an indoor environment and with poor lighting.
IEEE Access
Skeleton-based human action recognition with Graph Convolutional Networks is an active research field that has gained increased popularity over the last few years. A challenge in skeleton-based action recognition is designing a model that captures fine-grained motions and the relations between the movements of different parts of the skeleton towards the recognition of specific actions. In this paper, the use of a set of part-aware graphs for the skeleton representation is proposed, aiming to enhance discrimination between actions in the recognition task, since each action puts emphasis on specific parts of the skeleton. Extensive experimental work has been carried out in a consistent evaluation framework, taking into account different combinations of part-aware graphs and feature representations, leading to a configuration that achieves the optimal balance. Based upon two well-established datasets, namely NTU RGB+D and NTU RGB+D 120, we demonstrate that the proposed methodology compares favourably with the state-of-the-art. Code is publicly available at: https://github.com/joyios1/Improving-skeleton-based-action-recognitionusing-part-aware-graphs-in-a-multi-stream-fusion-context. Index terms: graph convolutional networks, skeleton-based action recognition, part-aware graphs.
IEEE Transactions on Information Forensics and Security
Bias and fairness of biometric algorithms have been key topics of research in recent years, mainly due to the societal, legal and ethical implications of potentially unfair decisions made by automated decision-making models. A considerable amount of work has been done on this topic across different biometric modalities, aiming at better understanding the main sources of algorithmic bias or devising mitigation measures. In this work, we contribute to these efforts and present the first study investigating bias and fairness of sclera segmentation models. Although sclera segmentation techniques represent a key component of sclera-based biometric systems with a considerable impact on the overall recognition performance, the presence of different types of biases in sclera segmentation methods is still underexplored. To address this limitation, we describe the results of a group evaluation effort (involving seven research groups), organized to explore the performance of recent sclera segmentation models within a common experimental framework and study performance differences (and bias), originating from various demographic as well as environmental factors. Using five diverse datasets, we analyze seven independently developed sclera segmentation models in different experimental configurations. The results of our experiments suggest that there are significant differences in the overall segmentation performance across the seven models and that among the considered factors, ethnicity appears to be the biggest cause of bias. Additionally, we observe that training with representative and balanced data does not necessarily lead to less biased results. Finally, we find that in general there appears to be a negative correlation between
2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 2017
DIBCO 2017 is the international Competition on Document Image Binarization organized in conjunction with the ICDAR 2017 conference. The general objective of the contest is to identify current advances in document image binarization of machine-printed and handwritten document images using performance evaluation measures that are motivated by document image analysis and recognition requirements. This paper describes the competition details including the evaluation measures used as well as the performance of the 26 submitted methods along with a brief description of each method.
Pattern Recognition, Nov 1, 2017
In this work, the incorporation of content-based image retrieval (CBIR) into computer-aided diagnosis (CADx) is investigated, in order to contribute to the decision-making process of radiologists in the characterization of mammographic masses. The proposed scheme comprises two stages: a margin-specific supervised CBIR stage that retrieves images from reference cases, along with a decision stage that is based on the retrieved items. The feature set utilized exploits state-of-the-art features along with a newly proposed texture descriptor, namely mHOG, targeted at capturing margin- and core-specific mass properties. Performance evaluation considers the CBIR and diagnosis stages separately and is addressed by using standard measures on an enhanced version of the widely adopted digital database for screening mammography (DDSM). In experiments, the proposed scheme achieved improved CADx performance for masses in X-ray mammography compared to the state-of-the-art.
Despite numerous recent efforts, 3D object retrieval based on partial shape queries remains a challenging problem, far from being solved. The problem can be defined as: given a partial view of a shape as query, retrieve all partially similar 3D models from a repository. The objective of this track is to evaluate the performance of partial 3D object retrieval methods, for partial shape queries of various scan qualities and degrees of partiality. The retrieval problem is often found in cultural heritage applications, for which partial scans of objects query a dataset of geometrically distinct classes.
A novel 3D model classification and retrieval method, based on the PANORAMA representation and Convolutional Neural Networks, is presented. Initially, the 3D models are pose normalized using the SYMPAN method; subsequently, the PANORAMA representation is extracted and used to train a convolutional neural network. The training is based on an augmented view of the extracted panoramic representation views. The proposed method is tested in terms of classification and retrieval accuracy on standard large scale datasets.
International Journal of Image and Graphics, 2019
Two novel methods for fully unsupervised human action retrieval using 3D mesh sequences are presented. The first achieves high accuracy but is suitable for sequences consisting of clean meshes, such as artificial sequences or highly post-processed real sequences, while the second one is robust and suitable for noisy meshes, such as those that often result from unprocessed scanning or 3D surface reconstruction errors. The first method uses a spatio-temporal descriptor based on the trajectories of 6 salient points of the human body (i.e. the centroid, the top of the head and the ends of the two upper and two lower limbs) from which a set of kinematic features are extracted. The resulting features are transformed using the wavelet transformation in different scales and a set of statistics are used to obtain the descriptor. An important characteristic of this descriptor is that its length is constant independent of the number of frames in the sequence. The second descriptor consists of ...
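The constant-length property mentioned in this abstract can be illustrated with a small sketch. It is not the authors' descriptor: the unnormalised Haar step, the choice of mean and standard deviation as the statistics, the number of levels, and the function names `haar_step` and `fixed_length_descriptor` are all assumptions made for illustration. The point is only that summarising each wavelet scale by a fixed set of statistics yields a vector whose length does not depend on the number of frames.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of an unnormalised Haar transform (assumed wavelet)."""
    values = list(signal)
    if len(values) % 2:
        # Pad odd-length input by repeating the last sample.
        values.append(values[-1])
    approx = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return approx, detail


def fixed_length_descriptor(signal: Sequence[float], levels: int = 3) -> List[float]:
    """Summarise each scale by (mean, std) of its detail coefficients.

    The output always has 2 * levels entries, whatever the input length.
    """
    descriptor: List[float] = []
    current = list(signal)
    for _ in range(levels):
        if len(current) < 2:
            # Too few samples left at this scale: emit neutral statistics.
            descriptor.extend([0.0, 0.0])
            continue
        current, detail = haar_step(current)
        descriptor.extend([mean(detail), pstdev(detail)])
    return descriptor


if __name__ == "__main__":
    short_track = [float(t) for t in range(10)]   # e.g. one coordinate over 10 frames
    long_track = [float(t) for t in range(37)]    # same feature over 37 frames
    print(len(fixed_length_descriptor(short_track)))
    print(len(fixed_length_descriptor(long_track)))
```

In this sketch a single kinematic feature trajectory is treated as a 1D signal; a full descriptor would presumably concatenate such vectors over several features and salient points.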
Computers & Graphics, 2019
IEEE Transactions on Multimedia, 2017
Performance evaluation is one of the main research topics in information retrieval. Evaluation metrics are used to quantify various performance aspects of a retrieval method. These metrics assist in identifying the optimum method for a specific retrieval challenge and also allow fine-tuning of its parameters in order to achieve robust operation for a given requirements specification. In this work, we present RETRIEVAL, a Web-based integrated information retrieval performance evaluation platform. It offers a number of metrics that are popular within the scientific community, so as to compose an efficient framework for implementing performance evaluation. We discuss the functionality of RETRIEVAL by citing important aspects such as the data input approaches, the user-level performance metrics parameterization, the evaluation scenarios, the interactive plots, and the performance reports repository that offers both archiving and download functionalities.
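As a rough illustration of the kind of metric such a platform computes, the sketch below implements precision at k, recall at k, and average precision for a single ranked result list. The abstract does not say which metrics RETRIEVAL offers, so this selection is an assumption, and the function names are hypothetical rather than part of the platform.

```python
from typing import Sequence


def precision_at_k(relevant: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(relevant[:k]) / k


def recall_at_k(relevant: Sequence[bool], k: int, total_relevant: int) -> float:
    """Fraction of all relevant items that appear in the top-k results."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return sum(relevant[:k]) / total_relevant


def average_precision(relevant: Sequence[bool], total_relevant: int) -> float:
    """Mean of the precision values measured at each relevant hit."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    hits = 0
    score = 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            score += hits / rank
    return score / total_relevant


if __name__ == "__main__":
    # Relevance flags of a ranked list, best match first (made-up example data).
    ranked = [True, False, True, False]
    print(precision_at_k(ranked, 2))
    print(recall_at_k(ranked, 3, total_relevant=2))
    print(average_precision(ranked, total_relevant=2))
```

Averaging `average_precision` over all queries gives mean average precision, a common way to compare retrieval methods with a single number.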
2016 Digital Media Industry & Academic Forum (DMIAF), 2016
The aim of this research is to achieve spatial consistency of the UV map. We present an approach to produce a fully spatially consistent UV mapping based on the planar parameterisation of the mesh. We apply our method on a 3D digital replica of an ancient Greek Lekythos vessel. We parameterise the mesh of a 3D model onto a unit square 2D plane using computational conformal geometry techniques. The proposed method is genus independent, due to an iterative 3D mesh cutting procedure. Having the texture of a 3D model depicted on a spatially continuous two-dimensional structure enables us to efficiently apply a vast range of image-processing-based techniques and algorithms.
The Visual Computer, 2016
Human emotions are often expressed by facial expressions and are generated by facial muscle movements. In recent years, the analysis of facial expressions has emerged as an active research area due to its various applications such as human-computer interaction, human behavior understanding, biometrics, emotion recognition, computer graphics, driver fatigue detection, and psychology. A novel analysis of dynamic 3D facial expressions using the positional information of automatically detected facial landmarks and the wavelet transformation is presented, which results in the proposed spatio-temporal descriptor. This descriptor is employed within the current paper in a retrieval scheme for dynamic 3D facial expression datasets and is thoroughly evaluated. Experiments have been conducted using the six prototypical expressions of the publicly available BU-4DFE dataset as well as the eight expressions included in the newly released publicly available BP4D-Spontaneous dataset. The obtained retrieval results outperform the retrieval results of the state-of-the-art methodologies. Furthermore, the retrieval results are exploited to achieve unsupervised dynamic 3D facial expression recognition. The aforementioned unsupervised procedure achieves better recognition accuracy compared to supervised dynamic 3D facial expression recognition state-of-the-art techniques.
Pattern Recognition, 2016
Partial 3D object retrieval has attracted intense research efforts due to its potential for a wide range of applications, such as 3D object repair and predictive digitization. This work introduces a partial 3D object retrieval method, applicable on both point clouds and structured 3D models, which is based on a shape matching scheme combining local shape descriptors with their Fisher encodings. Experiments on the SHREC 2013 large-scale benchmark dataset for partial object retrieval, as well as on the publicly available Hampson pottery dataset, demonstrate that the proposed method outperforms seven recently evaluated partial retrieval methods.
Pattern Recognition, 2016
The problem of facial expression recognition in dynamic sequences of 3D face scans has received a significant amount of attention in the recent past whereas the problem of retrieval in this type of data has not. A novel retrieval methodology for such data is introduced in this paper. The proposed methodology automatically detects specific facial landmarks and uses them to create a descriptor. This descriptor is the concatenation of three sub-descriptors which capture topological as well as geometric information of the 3D face scans. The motivation behind the proposed hybrid facial expression descriptor is the fact that some facial expressions, like happiness and surprise, are characterized by obvious changes in the mouth topology while others, like anger, fear and sadness, produce geometric but no significant topological changes. The proposed retrieval scheme exploits the Dynamic Time Warping technique in order to compare descriptors corresponding to different 3D facial sequences. A detailed evaluation of the introduced retrieval scheme is presented showing that it outperforms previous state-of-the-art retrieval schemes. Experiments have been conducted using the six prototypical expressions of the standard dataset BU-4DFE and the eight prototypical expressions of the recently available dataset BP4D-Spontaneous. Finally, a majority voting scheme based on the retrieval results is used to achieve unsupervised dynamic 3D facial expression recognition. The achieved classification accuracy is comparable to the state-of-the-art supervised dynamic 3D facial expression recognition techniques.
Highlights: We illustrate a novel retrieval methodology for dynamic sequences of 3D face scans. We present a detailed evaluation of the introduced retrieval scheme. BU-4DFE and BP4D-Spontaneous data sets were used for experiments. Retrieval results are used to achieve unsupervised facial expression recognition. The presented results outperform state-of-the-art retrieval schemes.
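Dynamic Time Warping, which this paper uses to compare descriptors of sequences with different numbers of frames, can be sketched as follows. This is the textbook dynamic-programming formulation, not the paper's implementation: the Euclidean per-frame cost, the absence of a warping window, and the names `euclidean` and `dtw_distance` are assumptions.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Per-frame cost between two descriptor vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a: Sequence[Vector], seq_b: Sequence[Vector]) -> float:
    """Cumulative cost of the cheapest monotonic alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # cost[i][j] holds the best alignment cost of the first i and j frames.
    cost: List[List[float]] = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # frame of seq_a repeated against seq_b[j-1]
                cost[i][j - 1],      # frame of seq_b repeated against seq_a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # Two toy sequences of 1D per-frame descriptors; the second lingers on one frame.
    fast = [[0.0], [1.0], [2.0]]
    slow = [[0.0], [1.0], [1.0], [2.0]]
    print(dtw_distance(fast, slow))
```

Because the alignment may repeat frames, two sequences showing the same expression at different speeds can still receive a small distance, which is why the technique suits sequences of unequal length.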
The recent availability of dynamic 3D facial scans has spawned research activity in recognition based on such data. However, the problem of facial expression retrieval based on dynamic 3D facial data has hardly been addressed and is the subject of this paper. A novel descriptor is created, capturing the spatio-temporal deformation of the 3D facial mesh sequence. Experiments have been implemented using the standard BU-4DFE dataset. The obtained retrieval results exceed the state-of-the-art results and the new descriptor is much more frugal in terms of space requirements. Furthermore, a methodology that exploits the retrieval results to achieve unsupervised dynamic 3D facial expression recognition is presented, in order to directly compare the proposed descriptor against the wealth of works in recognition. The aforementioned unsupervised methodology outperforms the supervised dynamic 3D facial expression recognition state-of-the-art techniques in terms of classification ...
Computer Vision and Image Understanding, 2018
Range-based pedestrian recognition is instrumental towards the development of autonomous driving and driving assistance systems. This work introduces encoding methods for pedestrian recognition, based on statistical shape analysis of 3D LIDAR data. The proposed approach has two variants, based on the encoding of local shape descriptors either in a spatially agnostic or spatially sensitive fashion. The latter method derives more detailed cues, by enriching the 'gross' information reflected by overall statistics of local shape descriptors, with 'fine-grained' information reflected by statistics associated with spatial clusters. Experiments on artificial LIDAR datasets, which include challenging samples, as well as on a large scale dataset of real LIDAR data, lead to the conclusion that both variants of the proposed approach (i) obtain high recognition accuracy, (ii) are robust against low-resolution sampling, (iii) are robust against increasing distance, and (iv) are robust against non-standard shapes and poses. On the other hand, the spatially-sensitive variant is more robust against partial occlusion and bad clustering.
The Visual Computer, 2015
The problem of facial expression recognition in dynamic sequences of 3D face scans has received a significant amount of attention in the recent past whereas the problem of retrieval in this type of data has not. A novel retrieval scheme for such data is introduced in this paper. It is the first spatio-temporal retrieval scheme ever used for retrieval in dynamic sequences of 3D face scans. The proposed scheme automatically detects specific facial landmarks and uses them to create a spatio-temporal descriptor. At first, geometric as well as topological information of the 3D face scans is captured by using the detected landmarks. Subsequently, this spatial information is filtered by using the wavelet transformation, resulting in our final spatio-temporal descriptor. Our descriptor is invariant to the number of 3D face scans in a facial expression sequence. The proposed retrieval scheme exploits the squared Euclidean distance in order to compare descriptors corresponding to different 3D facial sequences. A detailed evaluation of the introduced retrieval scheme is presented showing that it outperforms previous state-of-the-art retrieval schemes. Experiments have been conducted using the six prototypical expressions of the standard data set BU-4DFE. Finally, a majority voting methodology based on the retrieval results is used to achieve unsupervised dynamic 3D facial expression recognition. The achieved classification accuracy outperforms the state-of-the-art supervised dynamic 3D facial expression recognition techniques.
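The two steps named in this abstract, ranking by squared Euclidean distance and then taking a majority vote over the retrieved items, can be sketched as below. The gallery contents, the value of k, the tie-breaking behaviour, and the function names are assumptions for illustration; the paper's actual voting rule is not given in the abstract.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def squared_euclidean(a: Descriptor, b: Descriptor) -> float:
    """Squared Euclidean distance between two fixed-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def retrieve(query: Descriptor, gallery: Sequence[Tuple[Descriptor, str]], k: int) -> List[str]:
    """Return the labels of the k gallery items closest to the query."""
    ranked = sorted(gallery, key=lambda item: squared_euclidean(query, item[0]))
    return [label for _, label in ranked[:k]]


def majority_vote(labels: Sequence[str]) -> str:
    """Most frequent label; ties go to the label that was retrieved first."""
    if not labels:
        raise ValueError("labels must be non-empty")
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up 2D descriptors with expression labels, for illustration only.
    gallery = [
        ([0.0, 0.0], "happy"),
        ([0.1, 0.0], "happy"),
        ([5.0, 5.0], "angry"),
        ([0.2, 0.1], "surprise"),
    ]
    retrieved = retrieve([0.0, 0.0], gallery, k=3)
    print(retrieved)
    print(majority_vote(retrieved))
```

No classifier is trained here: the predicted expression comes only from the labels of the nearest retrieved sequences. Labels are still needed for the gallery items that vote.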
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017
A fully automatic mesh segmentation scheme using heterogeneous graphs is presented. We introduce a spectral framework where local geometry affinities are coupled with surface patch affinities. A heterogeneous graph is constructed combining two distinct graphs: a weighted graph based on adjacency of patches of an initial over-segmentation, and the weighted dual mesh graph. The partitioning relies on processing each eigenvector of the heterogeneous graph Laplacian individually, taking into account the nodal set and nodal domain theory. Experiments on standard datasets show that the proposed unsupervised approach outperforms the state-of-the-art unsupervised methodologies and is comparable to the best supervised approaches.
Journal of Imaging
Word spotting strategies employed in historical handwritten documents face many challenges due to variation in the writing style and intense degradation. In this paper, a new method that permits efficient and effective word spotting in handwritten documents is presented that relies upon document-oriented local features that take into account information around representative keypoints and a matching process that incorporates a spatial context in a local proximity search without using any training data. The method relies on a document-oriented keypoint and feature extraction, along with a fast feature matching method. This enables the corresponding methodological pipeline to be both effectively and efficiently employed in the cloud so that word spotting can be realised as a service in modern mobile devices. The effectiveness and efficiency of the proposed method in terms of its matching accuracy, along with its fast retrieval time, respectively, are shown after a consistent evaluatio...