Papers by Gianluigi Ciocca
Journal of Electronic Imaging, 2016
We adopt genetic programming (GP) to define a measure that can predict complexity perception of texture images. We perform psychophysical experiments on three different datasets to collect data on the perceived complexity. The subjective data are used for training, validation, and testing of the proposed measure. These data are also used to evaluate several possible candidate measures of texture complexity related to both low-level and high-level image features. We select four of them (namely roughness, number of regions, chroma variance, and memorability) to be combined in a GP framework. This approach allows a nonlinear combination of the measures and could give hints on how the related image features interact in complexity perception. The proposed complexity measure MGP exhibits Pearson correlation coefficients of 0.890 on the training set, 0.728 on the validation set, and 0.724 on the test set. MGP outperforms each of the single measures considered. From the statistical analysis of different GP candidate solutions, we found that the roughness measure evaluated on the gray-level image is the most dominant one, followed by the memorability, the number of regions, and finally the chroma variance.
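The kind of nonlinear combination a GP framework evolves can be sketched as an expression over the four selected measures. The expression below is purely illustrative (the paper's actual evolved formula is not reproduced here), and the feature values are invented for the example:

```python
# Illustrative sketch only: a GP-style nonlinear expression over the four
# measures selected in the paper (roughness, number of regions, chroma
# variance, memorability). The exact form and constants are assumptions.
import math

def m_gp(roughness, n_regions, chroma_var, memorability):
    """Hypothetical nonlinear combination of the four selected measures.
    GP trees typically mix arithmetic with protected operators like log."""
    return roughness * math.log1p(n_regions) + memorability - 0.1 * chroma_var

# Feature values for one texture image (made up for illustration).
score = m_gp(roughness=0.8, n_regions=120, chroma_var=2.5, memorability=0.6)
print(round(score, 3))
```

A GP search would evolve both the tree structure and the constants, scoring each candidate by its correlation with the subjective complexity ratings.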
Image Analysis and Processing - ICIAP 2017
Given the existence of many change detection algorithms, each with its own peculiarities and strengths, we propose a combination strategy, termed IUTIS (In Unity There Is Strength), based on a genetic programming framework. This combination strategy aims to leverage the strengths of the algorithms and compensate for their weaknesses. In this paper we show our findings in applying the proposed strategy in two different scenarios. The first scenario is purely performance-based; in the second, performance and efficiency must be balanced. Results demonstrate that, starting from simple algorithms, we can achieve results comparable with more complex state-of-the-art change detection algorithms, while keeping the computational complexity affordable for real-time applications.
IEEE Access
State recognition of food images is a recent topic that is gaining huge interest in the Computer Vision community. Recently, researchers presented a dataset of food images at different states where, unfortunately, no information regarding the food category was included. In practical food monitoring applications it is important to be able to recognize a peeled tomato rather than a generic peeled item. To this end, in this paper we introduce a new dataset containing 20 different food categories taken from fruits and vegetables at 11 different states, ranging from solid and sliced to creamy paste. We experiment with the most common Convolutional Neural Network (CNN) architectures on three different recognition tasks: food categories, food states, and both food categories and states. Since lack of labeled data is a common situation in practical applications, we exploit deep features extracted from CNNs combined with Support Vector Machines (SVMs) as an alternative to end-to-end classification. We also compare deep features with several hand-crafted features. These experiments confirm that deep features outperform hand-crafted features on all three classification tasks, regardless of the food category or food state considered. Finally, we test the generalization capability of the best performing deep features by using another, publicly available, dataset of food states. This last experiment shows that the features extracted from a CNN trained on our proposed dataset achieve performance quite close to that of the state-of-the-art method. This confirms that our deep features are robust with respect to data never seen by the CNN.
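The "deep features + classifier" pipeline described above can be sketched in a few lines. In this dependency-free illustration, a placeholder function stands in for the CNN forward pass, and a tiny perceptron stands in for the SVM; the data are invented 2-D "feature vectors":

```python
# Sketch of the deep-features pipeline. Assumptions: `extract_deep_features`
# stands in for the penultimate-layer activations of a pretrained CNN, and
# a simple perceptron replaces the SVM so the example stays dependency-free.
def extract_deep_features(image):
    # Placeholder: in practice this would be a CNN forward pass with the
    # final classification layer removed.
    return image  # here the "images" are already feature vectors

def train_linear_classifier(X, y, epochs=20, lr=0.1):
    """Tiny perceptron as a stand-in for the SVM used on deep features."""
    w = [0.0] * len(X[0]); b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != t:  # mistake-driven update
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
    return w, b

X = [extract_deep_features(v)
     for v in [[2.0, 1.0], [1.5, 2.0], [-1.0, -2.0], [-2.0, -1.5]]]
y = [1, 1, -1, -1]  # two food states, e.g. "whole" vs "sliced"
w, b = train_linear_classifier(X, y)
predict = lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
print(predict([1.8, 1.2]), predict([-1.7, -1.9]))
```

The appeal of this two-stage scheme is that only the light classifier needs labeled data from the target domain; the feature extractor is reused as-is.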
Applied Sciences
Assisted living technologies can be of great importance for taking care of elderly people and helping them to live independently. In this work, we propose a monitoring system designed to be as unobtrusive as possible, by exploiting computer vision techniques and visual sensors such as RGB cameras. We perform a thorough analysis of existing video datasets for action recognition, and show that no single dataset can be considered adequate in terms of classes or cardinality. We subsequently curate a taxonomy of human actions, derived from different sources in the literature, and provide the scientific community with considerations about the mutual exclusivity and commonalities of said actions. This leads us to collecting and publishing an aggregated dataset, called ALMOND (Assisted Living MONitoring Dataset), which we use as the training set for a vision-based monitoring approach. We rigorously evaluate our solution in terms of recognition accuracy using different state-of-the-art archit...
Journal of Imaging
Structure from Motion (SfM) is a pipeline that allows three-dimensional reconstruction starting from a collection of images. A typical SfM pipeline comprises different processing steps, each of which tackles a different problem in the reconstruction. Each step can exploit different algorithms to solve the problem at hand, and thus many different SfM pipelines can be built. How to choose the SfM pipeline best suited for a given task is an important question. In this paper we report a comparison of different state-of-the-art SfM pipelines in terms of their ability to reconstruct different scenes. We also propose an evaluation procedure that stresses the SfM pipelines using real datasets acquired with high-end devices as well as realistic synthetic datasets. To this end, we created a plug-in module for the Blender software to support the creation of synthetic datasets and the evaluation of the SfM pipeline. The use of synthetic data allows us to easily have arbitrarily large and d...
PLOS ONE, 2016
The aim of this work is to predict the complexity perception of real world images. We propose a new complexity measure where different image features, based on spatial, frequency and color properties, are linearly combined. In order to find the optimal set of weighting coefficients we have applied Particle Swarm Optimization. The optimal linear combination is the one that best fits the subjective data obtained in an experiment where observers evaluate the complexity of real world scenes on a web-based interface. To test the proposed complexity measure we have performed a second experiment on a different database of real world scenes, where the linear combination previously obtained is correlated with the new subjective data. Our complexity measure outperforms not only each single visual feature but also two visual clutter measures frequently used in the literature to predict image complexity. To analyze the usefulness of our proposal, we have also considered two different sets of stimuli composed of real texture images. Tuning the parameters of our measure for this kind of stimuli, we have obtained a linear combination that still outperforms the single measures. In conclusion, our measure, properly tuned, can predict the complexity perception of different kinds of images.
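The objective that the Particle Swarm Optimization explores can be sketched directly: score a candidate weight vector by the Pearson correlation between the weighted feature sum and the subjective ratings. The feature values, ratings, and weights below are invented for illustration; a real PSO would iterate over many such weight vectors:

```python
# Sketch of the PSO objective: Pearson correlation between a linear
# combination of image features and subjective complexity ratings.
# All numeric values below are made up for illustration.
def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def combined_measure(features, weights):
    """Linear combination of per-image spatial/frequency/color features."""
    return [sum(w * f for w, f in zip(weights, row)) for row in features]

features = [[0.2, 0.5, 0.1], [0.7, 0.6, 0.3], [0.9, 0.8, 0.5], [0.4, 0.3, 0.2]]
ratings  = [1.0, 2.5, 3.5, 1.5]   # subjective complexity scores
weights  = [0.5, 0.3, 0.2]        # one PSO particle's position
print(round(pearson(combined_measure(features, weights), ratings), 3))
```

PSO would move a swarm of such weight vectors through the search space, each particle attracted toward its own best correlation and the swarm's best.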
E-commerce is one of the most challenging fields of application of the new Internet technologies. It is clear that the larger the number of items available to be presented, the more difficult it is to guide the user towards the product he is looking for. In this article we present a prototype for the interactive search of images in high-quality electronic catalogues. The system is based on a visual information search engine, and integrates a Color Management System for the faithful display of images.
Proceedings of Spie the International Society For Optical Engineering, 2009
In the framework of multimedia applications, image quality may have different meanings and interpretations. In this paper, considering the quality of an image as the degree of adequacy to its function/goal within a specific application field, we provide an organized overview of image quality assessment methods, putting in evidence their applicability and limitations in different application domains. Three scenarios have been chosen, representing three typical applications with different degrees of constraints in their image workflow chains and requiring different image quality assessment methodologies.
Proceedings of Spie the International Society For Optical Engineering, 2003
The paper addresses the problem of distinguishing between pornographic and non-pornographic photographs, for the design of semantic filters for the web. Both decision forests of trees built according to the CART (Classification And Regression Trees) methodology and Support Vector Machines (SVMs) have been used to perform the classification. The photographs are described by a set of low-level features that can be computed automatically from the gray-level and color representations of the image. The database used in our experiments contained 1500 photographs, 750 of which were labeled as pornographic on the basis of the independent judgement of several viewers.
In this paper we propose a strategy for semi-supervised image classification that leverages unsupervised representation learning and co-training. The strategy, called CURL (Co-trained Unsupervised Representation Learning), iteratively builds two classifiers on two different views of the data. The two views correspond to different representations learned from both labeled and unlabeled data and differ in the fusion scheme used to combine the image features. To assess the performance of our proposal, we conducted several experiments on widely used data sets for scene and object recognition. We considered three scenarios (inductive, transductive and self-taught learning) that differ in the strategy followed to exploit the unlabeled data. As image features we considered a combination of GIST, PHOG, and LBP, as well as features extracted from a Convolutional Neural Network. Moreover, two embodiments of CURL are investigated: one using Ensemble Projection as unsupervised representation learning coupled with Logistic Regression, and one based on LapSVM. The results show that CURL clearly outperforms other supervised and semi-supervised learning methods in the state of the art.
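The co-training loop at the heart of this family of methods can be sketched compactly. In this minimal illustration, nearest-centroid classifiers stand in for the Ensemble Projection + Logistic Regression or LapSVM embodiments, and the two "views" are simply two different feature vectors per sample; all data are invented:

```python
# Minimal co-training sketch (assumptions: nearest-centroid classifiers
# replace the paper's learners; each sample has two feature "views").
def centroid_fit(X, y):
    """Per-class mean feature vectors."""
    groups = {}
    for x, t in zip(X, y):
        groups.setdefault(t, []).append(x)
    return {t: [sum(c) / len(c) for c in zip(*xs)] for t, xs in groups.items()}

def centroid_predict(cents, x):
    dist = lambda a, b: sum((p - q) ** 2 for p, q in zip(a, b))
    return min(cents, key=lambda t: dist(cents[t], x))

def co_train(view1, view2, labels, unlabeled1, unlabeled2, rounds=2):
    X1, y1 = list(view1), list(labels)
    X2, y2 = list(view2), list(labels)
    for _ in range(rounds):
        c1, c2 = centroid_fit(X1, y1), centroid_fit(X2, y2)
        # Each classifier labels the unlabeled pool for the *other* view.
        for u1, u2 in zip(unlabeled1, unlabeled2):
            X2.append(u2); y2.append(centroid_predict(c1, u1))
            X1.append(u1); y1.append(centroid_predict(c2, u2))
    return centroid_fit(X1, y1), centroid_fit(X2, y2)

labels = [0, 0, 1, 1]
v1 = [[0.0], [0.1], [1.0], [0.9]]   # view 1 of the labeled samples
v2 = [[0.0], [0.2], [1.1], [1.0]]   # view 2 of the same samples
c1, c2 = co_train(v1, v2, labels, [[0.05], [0.95]], [[0.1], [1.05]])
print(centroid_predict(c1, [0.02]), centroid_predict(c1, [0.98]))
```

The key design point is that the two classifiers exchange pseudo-labels across views, so each benefits from structure in the unlabeled data that the other view exposes.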
Lecture Notes in Computer Science, 2000
This paper describes the main features of the multimedia information retrieval engine of Quicklook 2. Quicklook 2 allows the user to query image and multimedia databases with the aid of sample images, or a user-made sketch and/or textual descriptions, and ...
Iceis, 2001
Mining for association rules is one of the fundamental data mining methods. In this paper we describe how to efficiently integrate association rule mining algorithms with relational database systems. From our point of view, direct access of the algorithms to the database system is a basic requirement when transferring data mining technology into daily operation. This is especially true in the context of large data warehouses, where exporting the mining data and preparing it outside the database system becomes annoying or even infeasible. The development of our own approach is mainly motivated by shortcomings of current solutions. We investigate the most challenging problems by contrasting the prototypical but somewhat academic association mining scenario from basket analysis with a real-world application. We thoroughly compile the requirements arising from mining an operative data warehouse at DaimlerChrysler. We generalize the requirements and address them by developing our own approach. We explain its basic design and give the details behind our implementation. Based on the warehouse, we evaluate our own approach together with commercial mining solutions. It turns out that, regarding runtime and scalability, we clearly outperform the commercial tools accessible to us. More importantly, our new approach supports mining tasks that are not directly addressable by commercial mining solutions.
We propose an innovative approach to the selection of representative frames of a video shot for video summarization. By analyzing the differences between two consecutive frames of a video sequence, the algorithm determines the complexity of the sequence in terms of visual content changes. Three descriptors are used to express the frame's visual content: a color histogram, wavelet statistics, and an edge direction histogram. Similarity measures are computed for each descriptor and combined to form a frame difference measure. The use of multiple descriptors provides a more precise representation, capturing even small variations in the frame sequence. This method can dynamically and rapidly select a variable number of key frames within each shot, and does not exhibit the complexity of existing methods based on clustering strategies.
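The combined frame-difference idea can be sketched as a weighted sum of per-descriptor distances. The L1 histogram distance and the weights below are illustrative choices, not the paper's exact ones, and the toy two-bin descriptors are invented:

```python
# Sketch of a multi-descriptor frame-difference measure. The distance
# function, weights, and toy descriptors are assumptions for illustration.
def l1_distance(h1, h2):
    """L1 distance between two normalized descriptor histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def frame_difference(desc1, desc2, weights=(0.4, 0.3, 0.3)):
    """Combine per-descriptor distances (color histogram, wavelet
    statistics, edge-direction histogram) into one difference measure."""
    return sum(w * l1_distance(d1, d2)
               for w, d1, d2 in zip(weights, desc1, desc2))

frame_a = ([0.5, 0.5], [0.2, 0.8], [0.9, 0.1])   # three descriptors per frame
frame_b = ([0.4, 0.6], [0.2, 0.8], [0.7, 0.3])
print(round(frame_difference(frame_a, frame_b), 3))
```

Accumulating this difference along the shot gives the content-change curve from which a variable number of key frames can then be picked.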