Papers by Arsalane Zarghili
International Journal of Power Electronics and Drive Systems, Jan 31, 2024
Optical character recognition (OCR) is one of the most widely used pattern recognition systems. However, research on ancient Arabic writing recognition has suffered from a lack of interest for decades, despite the availability of thousands of historical documents. One reason for this lack of interest is the absence of a standard dataset, which is fundamental for building and evaluating an OCR system. In 2022, we published a database of ancient Arabic words as the only public dataset of characters written in Al-Mojawhar Moroccan calligraphy. Such a database therefore needs to be studied and evaluated. In this paper, we explored the proposed database and investigated the recognition of Al-Mojawhar Arabic characters. We studied feature extraction using the most popular descriptors in Arabic OCR. The studied descriptors were combined with different machine learning classifiers to build recognition models and verify their performance. To compare learned and handcrafted features on the proposed dataset, we proposed a deep convolutional neural network for character recognition. Given the complexity of the character shapes, the results obtained were very promising, especially with the convolutional neural network model, which gave the highest accuracy score.
2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583)
The current paper addresses aspects of the development of an automatic probabilistic recognition system for facial expressions in video streams. The face analysis component integrates an eye tracking mechanism based on a Kalman filter. The visual feature detection includes PCA-oriented recognition for ranking the activity in certain facial areas. The description of the facial expressions is given according to sets of atomic Action Units (AU) from the Facial Action Coding System (FACS). The expression recognition engine is built on a BBN model that also handles the time behavior of the visual features.
Large lexical resources are becoming increasingly relevant for information systems, and the need for lexical resources in Natural Language Processing (NLP) is paramount. To help meet these needs, we build a lexical resource from the famous dictionary al=qāmūs al=muḥīṭ (AQAM). Using a rule-based approach, we have designed a system that extracts morpho-syntactic, semantic, and lexical information from the dictionary. We thus obtained a digitized and structured version of AQAM, enriched with explicit morpho-syntactic and lexical information. In addition, the resource is enriched with English translations of lemmas and their accompanying senses using a bilingual English-Arabic dictionary. We then present an overview of an experimental alignment of the section of the letter bā’ with Princeton’s WordNet (PWN) and the Suggested Upper Merged Ontology (SUMO). This experiment turned out to be interesting because it revealed that mapping an Arabic lexical resource onto an English resource shows commonalities between the two languages but, above all, highlights the non-equivalences between them. All resources obtained are represented in XML format and distributed under a free license.
IEEE Conference Proceedings, 2016
Journal of King Saud University - Computer and Information Sciences, Apr 1, 2017
At present, language technologies are instrumental to millions of people, who use them every day with little if any awareness of their existence and role. Popular machine translation systems or web search engines rely more and more on levels of linguistic information automatically overlaid by batch processing tools using language technologies. This development has not only had a huge impact on our daily life, but has also deeply affected the way we think about language as an object of scientific inquiry. The present special issue of JKSU is intended to explore the contribution of computational language models to a better understanding of linguistic, psycholinguistic, sociolinguistic, and literary issues of the Arabic language and culture. The wide array of contributions offered here, ranging from text diacriticisation to psycho-computational modelling of Arabic lexical organisation, bears witness to the maturity of the field and highlights a few general lessons we can learn from current research on Arabic Natural Language Processing. Computational models of language are, primarily, models of language usage. They focus on those aspects of language performance that are involved in, but are not limited to, language acquisition, lexical access, speech and optical character recognition, text translation, text reading, text understanding, knowledge and ontology extraction, and sentiment analysis. What all these tasks have in common is their use of language as a means of conveying information to respond to specific communicative needs and goals. Dealing with language performance ultimately requires bringing language variety and subjectivity to inter-subjective invariance, to a shared representation of its content and structure.
From this perspective, emotional undertones, variations in style, speed, pitch, handwriting, topic or dialect are superadded signal complications, which are nonetheless inseparable from language performance. On reflection, in real language-based communication, noise is not simply overlaid on the message, but is actually PART of the message. Upon hearing the voice of a speaker or reading a text, one can get a lot of information about the speaker/writer's own sex, age, bodily features, cultural level, social status, personal attitude, political bias and even ethnicity. Arabic language processing happens to throw all these performance-related aspects into sharp relief. A first, significant level of variation can already be found in spelling, where underspecified (i.e. non-diacriticized) written words are amenable to a large vari
Face recognition can be defined as the ability of a system to classify or describe a human face. The motivation for such a system is to enable computers to do things as humans do and to apply computers to solve problems that involve analysis and classification. Face recognition systems require less user cooperation than systems based on other biometrics (e.g. fingerprints
Advances in intelligent systems and computing, 2017
In this paper, a new model-based clustering algorithm is introduced for optimal speaker modeling in speaker identification systems. The algorithm can estimate the optimal number of mixture components using a cross-validation methodology, and it overcomes the initialization sensitivity and local maxima problems of the classical EM algorithm using a split-and-merge incremental learning approach. Experiments on a speaker identification task demonstrate the efficiency and effectiveness of the proposed algorithm compared to the commonly used Expectation-Maximization (EM) algorithm.
The objective of this article is to propose a new method for detecting moving objects in a video based on the background subtraction approach. Background subtraction is, in general, a challenging moving object detection problem. Solving this problem is very important for many applications: statistical analysis of attendance, access security, monitoring epileptic patients in hospitals, and so on. In this paper, we propose a method to detect the changing regions of an image sequence using the background subtraction approach. This approach is based on estimating the similarity between the values that a pixel assumes in two consecutive frames, using radiometric similarity with a dynamic threshold.
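As a rough illustration of comparing a pixel's values in two consecutive frames against a data-dependent threshold, here is a minimal Python sketch. It is not the authors' method: plain absolute difference stands in for the radiometric similarity measure, and the threshold rule (mean plus `k` standard deviations of the frame difference), the function name `changed_pixels`, and the toy frames are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def changed_pixels(prev_frame, curr_frame, k=2.0):
    """Return a binary mask (1 = changed) for two grayscale frames.

    Assumption: absolute difference replaces the paper's radiometric
    similarity, and the threshold is recomputed for every frame pair.
    """
    # Per-pixel absolute difference between the consecutive frames.
    diffs = [
        [abs(c - p) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]
    flat = [d for row in diffs for d in row]
    # Hypothetical dynamic threshold: mean + k * standard deviation.
    threshold = mean(flat) + k * pstdev(flat)
    return [[1 if d > threshold else 0 for d in row] for row in diffs]


# Toy 4x4 frames: one pixel changes strongly, one only by sensor noise.
prev_frame = [[10, 10, 10, 10] for _ in range(4)]
curr_frame = [
    [10, 11, 10, 10],
    [10, 200, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
mask = changed_pixels(prev_frame, curr_frame)
```

Because the threshold is derived from each frame pair, small global fluctuations raise the threshold instead of being flagged as motion, which is the general motivation for a dynamic rather than fixed threshold.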
Over the past several years, Mel-Frequency Cepstral Coefficients (MFCCs) have become the state-of-the-art approach for feature extraction in text-independent speaker recognition applications. However, the recently introduced Gammatone Frequency Cepstral Coefficients (GFCCs) have shown promising recognition performance in such applications, especially in noisy acoustic environments. In this paper, the Gammatone Frequency Cepstral Coefficients are studied and evaluated on a text-independent speaker identification task over VoIP networks. The study explores the various parameters involved in calculating the Gammatone Frequency Cepstral Coefficients. The GFCC features were tested and assessed under a Gaussian mixture model (GMM)-based speaker identification system, which represents the state-of-the-art speaker modeling approach in contemporary text-independent speaker identification systems.
A Backpropagation Neural Network (BPNN) is one of the most widely used methods in the domain of face recognition. BPNNs need supervised training to learn how to predict results from the desired data, and many studies have proven their robustness in doing so. In this paper, we propose a hybrid method to achieve face recognition purpose using semi supervised
In this paper, we compare classification results for six facial expressions (joy, surprise, sadness, anger, disgust, and fear), relying on two different methods of computing distances between 121 landmark points on the face. Facial features were computed using the L1 norm (Manhattan distance) in the first case and the L2 norm (Euclidean distance) in the second case. Training and test data were collected using a Kinect sensor. The labelled dataset contains sequences of 121 landmark points extracted from the face of each subject while displaying the six facial expressions. Classification was performed using a multi-layer feed-forward neural network with one hidden layer. Good recognition rates were achieved in the early stages of training with Euclidean facial distances.
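The difference between the two distance computations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say which pairs of landmarks are compared, so the sketch uses all pairs, and the helper names and the three toy 2D points are hypothetical.

```python
import math
from itertools import combinations


def manhattan(p, q):
    """L1 norm of the difference between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """L2 norm of the difference between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def pairwise_features(landmarks, dist):
    """Build a feature vector from the distance between every landmark pair.

    Assumption: all pairs are used; the paper may select a subset.
    """
    return [dist(p, q) for p, q in combinations(landmarks, 2)]


# Three made-up landmark coordinates (a real frame would have 121).
landmarks = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
l1_features = pairwise_features(landmarks, manhattan)  # [7.0, 3.0, 4.0]
l2_features = pairwise_features(landmarks, euclidean)  # [5.0, 3.0, 4.0]
```

The two feature vectors differ only for point pairs that are offset along more than one axis, which is where the choice of norm can change what the classifier sees.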
Over the past several years, Mel-Frequency Cepstral Coefficients (MFCCs) and Gaussian mixture models (GMMs) trained with the well-known EM algorithm have become the state-of-the-art approach in text-independent speaker recognition applications. However, in recent years, Self-Organizing Mixture Models, which combine the strengths of Self-Organizing Maps and Mixture Models, have been proposed in the literature and have yielded better results than classical GMM training in many applications. In this paper, we first implement and compare the most popular MFCC variants in order to find the best implementation for our speaker identification system. Then, Self-Organizing Mixture Models are introduced for speaker modeling in text-independent speaker identification. The performance of the Self-Organizing Mixture Models is assessed and compared with that of classical Gaussian mixture models using the EM algorithm.
Social Science Research Network, 2021
Communications in computer and information science, 2019
Optical character recognition for Arabic text has not received as much attention from researchers as recognition of Latin text. Only in the last two decades has this field been explored, owing to the complexity of Arabic writing and the fact that it demands a critical step, segmentation: first from text to lines, then from lines to words, and finally from words to characters. In the case of historical documents, segmentation is more complicated because of the absence of writing rules and the poor quality of the documents. In this paper we present a projection-based technique for segmenting the text of ancient Arabic documents into lines. To overcome the problem of overlapping and touching lines, which is the most challenging problem facing segmentation systems, pre-processing operations are first applied for binarization and noise reduction. Second, a skew correction technique is proposed alongside a space-following algorithm, which is performed to separate lines from each other. The segmentation method is applied to four representations of the text image: the original binary image and three further representations obtained by transforming the input image into (1) a smeared image with the RLSA algorithm, (2) up-to-down transitions, and (3) an image smoothed by a Gaussian filter. The results obtained are promising and are compared in terms of accuracy and time cost. These methods are evaluated on a private set of 129 historical document images provided by Al-Qaraouiyine Library.
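The core idea of projection-based line segmentation can be sketched as follows, assuming a clean, already deskewed binary image in which ink pixels are 1. This is a textbook horizontal projection profile, not the paper's full pipeline: it does not handle overlapping or touching lines, and the function name `segment_lines`, the `min_ink` parameter, and the toy page are assumptions for illustration.

```python
def segment_lines(binary_image, min_ink=1):
    """Return (first_row, last_row) for each text line in a binary image.

    The horizontal projection profile counts ink pixels per row; a text
    line is a maximal run of rows whose count reaches min_ink.
    """
    profile = [sum(row) for row in binary_image]
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y  # a new text line begins on this row
        elif ink < min_ink and start is not None:
            lines.append((start, y - 1))  # the line ended on the previous row
            start = None
    if start is not None:
        # The last line runs to the bottom edge of the image.
        lines.append((start, len(profile) - 1))
    return lines


# Toy page: two text lines separated by two blank rows.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
line_spans = segment_lines(page)  # [(1, 2), (5, 5)]
```

On real historical pages the valleys of the profile rarely reach zero, which is one reason transformed representations such as smeared or smoothed images are worth comparing.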
IET Image Processing, May 22, 2019
Facial landmark detection is an important and basic step in many face analysis applications. It is considered a challenging task because the final results of the analysis depend on the accuracy of the landmark detection. Decades of research have investigated approaches for two-dimensional (2D) facial landmark detection; however, despite the good results obtained, these approaches still suffer from weaknesses regarding pose and illumination variations. Recently, the wide availability of 3D scans has made the use of 3D face models easier, helping to overcome the problems caused by using 2D images. Many papers have studied the problem of 3D facial landmark detection; nevertheless, there is a lack of literature reviews giving an overview of the studies related to 3D face landmark detection. In this study, the authors present a detailed survey of the latest (2010-2018) approaches based on geometric information for 3D face landmark detection, including the limitations and strengths of each work.
International Journal of Image, Graphics and Signal Processing, Aug 1, 2012
In this paper, a color face recognition system is developed to identify human faces using a backpropagation neural network. The architecture we adopt is All-Class-in-One-Network, where all the classes are placed in a single network. To accelerate the learning process, we propose using the Bhattacharyya distance as the total error to train the network. In the experimental section, we compare how the algorithm converges using the mean square error and the Bhattacharyya distance. Experimental results indicate that face images can be recognized by the proposed system effectively and swiftly.
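To make the two error measures concrete, the sketch below computes the mean square error and the Bhattacharyya distance between a network output and its target. It assumes both vectors are discrete probability distributions (non-negative, summing to 1) and uses the standard definition D_B = -ln(sum_i sqrt(p_i * q_i)); how the paper integrates this into backpropagation is not stated in the abstract, and the function names and example vectors are hypothetical.

```python
import math


def mean_square_error(output, target):
    """Average squared difference between output and target."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)


def bhattacharyya_distance(p, q, eps=1e-12):
    """Bhattacharyya distance between two discrete distributions.

    The coefficient sum(sqrt(p_i * q_i)) is 1 for identical distributions
    and 0 for non-overlapping ones; eps guards against log(0).
    """
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(max(coefficient, eps))


# Hypothetical one-hot target and a normalized network output.
target = [0.0, 1.0, 0.0]
output = [0.25, 0.5, 0.25]
mse = mean_square_error(output, target)      # 0.125
bd = bhattacharyya_distance(output, target)  # 0.5 * ln(2)
```

With a one-hot target the distance reduces to -0.5 * ln(p_correct), so it grows without bound as the probability assigned to the correct class approaches zero, whereas the mean square error stays bounded.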
The aim of this paper is to evaluate, analyze and compare the performance of the most popular MFCC variants for feature extraction in text-independent speaker identification over VoIP networks. The MFCC variants were tested and evaluated under a Gaussian mixture model (GMM)-based speaker identification system, which represents the state-of-the-art speaker modeling approach in contemporary text-independent speaker identification systems.
In order to enrich the digital content of Classical Arabic, we aim to represent the Arabic dictionary “'Al-Qamiis Al-Muhit” in the standard format LEMON. The transition from print to digital format requires several steps. This article describes the procedures that we followed to convert the dictionary into a digitized and encoded format, apply automatic extraction, and obtain the LEMON format used in the semantic web. Furthermore, due to the complexity of the Arabic dictionary, formalizing lexical and semantic information involves morphosyntactic and derivational knowledge, which we try to explain.
This is the parent item and links to 29 child items. Specifically, this item together with the 29 children is the famous Arabic medieval dictionary al-qāmūs l-muḥīṭ. It was compiled by the Persian lexicographer Al-fīrūz’ābādī. It belongs to the lexicographical tradition and is divided into sections (in Arabic bāb). Each section is devoted to an alphabetical consonant constituting the last radical consonant and is divided into chapters (in Arabic faṣl) ordered according to the first radical consonant. Each chapter (faṣl) is also divided into various parts gathering root families, i.e. all lexical entries that have the same root. In each chapter, roots are listed alphabetically according to the second radical consonant. Finally, lexical entries are grouped together under the root from which they are derived. In the uploaded version, we adopted the original division based upon sections. Each section contains a text file and various XML files. The text file is the original plain text along with the macro (and micro) structure of the medieval dictionary: sections, chapters, roots, lexical entries and so on. There are two main types of XML files. One type contains the conversion of the plain text into a well-formed XML document arranged according to the part of speech (verbs, nouns, adjectives, proper nouns) of the lexical entry, while the other adds English translations of the lexical entries.
For a quick navigation:
hamza http://hdl.handle.net/20.500.11752/ILC-98
bāʾ http://hdl.handle.net/20.500.11752/ILC-101
tāʾ http://hdl.handle.net/20.500.11752/ILC-102
ṯāʾ http://hdl.handle.net/20.500.11752/ILC-103
jīm http://hdl.handle.net/20.500.11752/ILC-104
ḥāʾ http://hdl.handle.net/20.500.11752/ILC-105
ḫāʾ http://hdl.handle.net/20.500.11752/ILC-106
dāl http://hdl.handle.net/20.500.11752/ILC-107
rāʾ http://hdl.handle.net/20.500.11752/ILC-108
zāy http://hdl.handle.net/20.500.11752/ILC-109
sīn http://hdl.handle.net/20.500.11752/ILC-110
thāʾ http://hdl.handle.net/20.500.11752/ILC-111
šīn http://hdl.handle.net/20.500.11752/ILC-112
ṣād http://hdl.handle.net/20.500.11752/ILC-113
ḍād http://hdl.handle.net/20.500.11752/ILC-114
wāw http://hdl.handle.net/20.500.11752/ILC-115
ḏāl http://hdl.handle.net/20.500.11752/ILC-116
ṭāʾ http://hdl.handle.net/20.500.11752/ILC-117
ẓāʾ http://hdl.handle.net/20.500.11752/ILC-118
ʿayn http://hdl.handle.net/20.500.11752/ILC-119
ġayn http://hdl.handle.net/20.500.11752/ILC-120
fāʾ http://hdl.handle.net/20.500.11752/ILC-121
qāf http://hdl.handle.net/20.500.11752/ILC-122
kāf http://hdl.handle.net/20.500.11752/ILC-123
lām http://hdl.handle.net/20.500.11752/ILC-124
mīm http://hdl.handle.net/20.500.11752/ILC-125
nūn http://hdl.handle.net/20.500.11752/ILC-126
hāʾ http://hdl.handle.net/20.500.11752/ILC-127
yāʾ http://hdl.handle.net/20.500.11752/ILC-128
Each sub item is Publicly Available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) -- https://creativecommons.org/licenses/by-sa/4.0
Pattern Recognition Letters, Jul 1, 2022