Papers by Ignacio Sánchez Gendriz
Computers in biology and medicine, May 1, 2024

medRxiv (Cold Spring Harbor Laboratory), Mar 13, 2024
Background: DNA sequences harbor vital information regarding various organisms and viruses. The ability to analyze extensive DNA sequences using methods amenable to conventional computer hardware has proven invaluable, especially in timely response to global pandemics such as COVID-19. Objectives: This study introduces a new representation that encodes DNA sequences as unit vector transitions in a 2D space, applied to sequences extracted from the 2019 Novel Coronavirus Resource (2019nCoVR) repository. The main objective is to elucidate the potential of this method to facilitate virus classification using minimal hardware resources. It also aims to demonstrate the feasibility of the technique through dimensionality reduction and the application of machine learning models. Methods: DNA sequences were transformed into two-nucleotide base transitions (referred to as 'transitions'). Each transition was represented as a corresponding unit vector in 2D space. This coding scheme allowed DNA sequences to be efficiently represented as dynamic transitions. After a moving average and resampling were applied, these transitions underwent dimensionality reduction with techniques such as Principal Component Analysis (PCA). Conventional machine learning approaches were then applied to the reduced data, producing a multiclass classification among six species of viruses belonging to the Coronaviridae family, including SARS-CoV-2. Results and Discussions: The implemented method produced a faithful representation of the sequences, allowing visual differentiation between six types of viruses from the Coronaviridae family through direct plotting. The model, evaluated with stratified cross-validation, achieved accuracy, sensitivity, specificity, and F1-score values equal to or greater than 99%. This performance was comparable, if not superior, to that of the computationally intensive methods discussed in the state of the art. Conclusions: The proposed coding method appears as a computationally efficient and promising addition to contemporary DNA sequence coding techniques. Its merits lie in its simplicity, visual interpretability, and ease of implementation, making it a potential resource in complementing existing strategies in the field.
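The transition coding described in the Methods above can be illustrated with a minimal Python sketch. The abstract does not say which unit vector is assigned to which transition, so the evenly spaced angles in `TRANSITION_ANGLE` are an assumption made only for illustration, as are the function names `transition_vectors` and `moving_average`. The resampling, PCA, and classification steps are not shown.

```python
import math
from itertools import product

BASES = "ACGT"

# Assumption: the 16 ordered base pairs (AA, AC, ..., TT) are given evenly
# spaced angles on the unit circle. The source does not state the real mapping.
TRANSITION_ANGLE = {
    a + b: 2 * math.pi * k / 16
    for k, (a, b) in enumerate(product(BASES, repeat=2))
}


def transition_vectors(seq):
    """Return one 2D unit vector per consecutive base pair in seq."""
    seq = seq.upper()
    vectors = []
    for a, b in zip(seq, seq[1:]):
        angle = TRANSITION_ANGLE.get(a + b)
        if angle is None:
            # Skip pairs containing symbols outside A, C, G, T (for example N).
            continue
        vectors.append((math.cos(angle), math.sin(angle)))
    return vectors


def moving_average(vectors, window):
    """Average each coordinate over a sliding window of the given size."""
    smoothed = []
    for i in range(len(vectors) - window + 1):
        chunk = vectors[i:i + window]
        smoothed.append((
            sum(x for x, _ in chunk) / window,
            sum(y for _, y in chunk) / window,
        ))
    return smoothed
```

For example, `moving_average(transition_vectors("ACGTTGCA"), 3)` yields a short smoothed 2D trajectory that could then be resampled to a fixed length before dimensionality reduction.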

Machine learning and knowledge extraction, Feb 5, 2024
This study introduces an efficient methodology for addressing fault detection, classification, and severity estimation in rolling element bearings. The methodology is structured into three sequential phases, each dedicated to generating distinct machine-learning-based models for the tasks of fault detection, classification, and severity estimation. To enhance the effectiveness of fault diagnosis, information acquired in one phase is leveraged in the subsequent phase. Additionally, in the pursuit of attaining models that are both compact and efficient, an explainable artificial intelligence (XAI) technique is incorporated to meticulously select optimal features for the machine learning (ML) models. The chosen ML technique for the tasks of fault detection, classification, and severity estimation is the support vector machine (SVM). To validate the approach, the widely recognized Case Western Reserve University benchmark is utilized. The results obtained emphasize the efficiency and efficacy of the proposal. Remarkably, even with a highly limited number of features, evaluation metrics consistently indicate an accuracy of over 90% in the majority of cases when employing this approach.

Dengue is a significant global health issue, affecting millions of people annually and imposing substantial socioeconomic burdens. Effective disease control relies on monitoring the population of Aedes aegypti mosquitoes, the primary vector of dengue. One surveillance method involves counting the eggs laid by these mosquitoes in spatially distributed ovitraps. This study focuses on the application of computational methods to forecast dengue vector populations. We analyze a four-year (2016-2019) database from 397 ovitraps distributed across Natal, RN-Brazil, with a weekly sampling frequency. Our objective is to develop accurate machine learning (ML) models that can predict the egg density index (EDI) at a fine-grained spatial resolution, aligned with zoonosis interventions. To preprocess the dataset obtained from the ovitraps, we employ spatial smoothing techniques and aggregation. The preprocessed data is then used to train ML models, including recurrent deep learning (DL) models, enabling accurate forecasting of the EDI. This approach shows promise for monitoring and preventing arbovirus outbreaks. Our findings demonstrate the effectiveness of spatial smoothing and aggregation as preprocessing steps for reducing randomness and noise in the dataset. The recurrent DL models exhibit high forecasting accuracy, thereby validating their utility in arbovirus monitoring and prevention efforts.

Frontiers in artificial intelligence, Dec 7, 2023
The COVID-19 pandemic is already considered one of the biggest global health crises. In Rio Grande do Norte, a Brazilian state, the RegulaRN platform was the health information system used to regulate beds for patients with COVID-19. This article explored machine learning and deep learning techniques with RegulaRN data in order to identify the best models and parameters to predict the outcome of a hospitalized patient. A total of , bed regulations for COVID-19 patients were analyzed. The data analyzed come from the RegulaRN Platform database from April to August . From these data, the nine most pertinent characteristics were selected from the twenty available, and blank or inconclusive data were excluded. The following steps were then carried out: data pre-processing, database balancing, training, and testing. The results showed better performance in terms of accuracy ( . %), precision ( . %), and F1-score ( . %) for the Multilayer Perceptron model with Stochastic Gradient Descent optimizer. The best results for recall ( . %), specificity ( . %), and ROC-AUC ( . %) were achieved by Root Mean Squared Propagation. This study compared different machine learning and deep learning methods for classifying bed regulation data for patients with COVID-19 from the RegulaRN Platform. The results have made it possible to identify the best model to help health professionals during the process of regulating beds for patients with COVID-19. The scientific findings of this article demonstrate that the computational methods used, applied through a digital health solution, can assist in the decision-making of medical regulators and government institutions in situations of public health crisis.

Scientific Reports, Aug 8, 2023
Osteoporosis is a disease characterized by impairment of bone microarchitecture that causes high socioeconomic impacts in the world because of fractures and hospitalizations. Although dual-energy X-ray absorptiometry (DXA) is the gold standard for diagnosing the disease, access to DXA in developing countries is still limited due to its high cost, being present only in specialized hospitals. In this paper, we analyze the performance of Osseus, a low-cost portable device based on electromagnetic waves that measures the attenuation of the signal that crosses the medial phalanx of a patient's middle finger and was developed for osteoporosis screening. The analysis is carried out by predicting changes in bone mineral density using Osseus measurements and additional common risk factors used as input features to a set of supervised classification models, while the results from DXA are taken as target (real) values during the training of the machine learning algorithms. The dataset consisted of 505 patients who underwent osteoporosis screening with both devices (DXA and Osseus), of whom 21.8% were healthy and 78.2% had low bone mineral density or osteoporosis. Five-fold cross-validation was used in model training, while 20% of the whole dataset was used for testing. The best model (Random Forest) achieved a sensitivity of 0.853, a specificity of 0.879, and an F1 of 0.859. Since the Random Forest (RF) algorithm allows some interpretability of its results (through the impurity check), we were able to identify the most important variables in the classification of osteoporosis. The results showed that the most important variables were age, body mass index, and the signal attenuation provided by Osseus. 
The RF model, when used together with Osseus measurements, is effective in screening patients and facilitates the early diagnosis of osteoporosis. The main advantages of such early screening are the reduction of costs associated with exams, surgeries, treatments, and hospitalizations, as well as improved quality of life for patients. Osteoporosis is characterized by impaired bone strength [1] and affects approximately 6.3% of men over the age of 50 and 21.2% of women over the same age range globally, i.e., approximately 500 million men and women worldwide [2]. The disease causes more than 8.9 million fractures annually, resulting in one osteoporotic fracture every 3 s [3]. In the US, the annual direct medical cost of osteoporosis in 2005 was 17 billion USD and is projected to rise to 25 billion USD by 2025 [4]. In the 27 countries of the European Union, the estimated cost was 34.5 billion dollars in 2010. In four Latin American countries (Brazil, Mexico, Colombia, and Argentina), the burden of the disease in 2018 was estimated at 1.17 billion dollars [5].
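The sensitivity, specificity, and F1 figures quoted in the abstract above follow the standard definitions over a binary confusion matrix. The sketch below shows those definitions only. The function name `screening_metrics` and the counts in the usage note are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, F1) from confusion-matrix counts.

    tp/fn count patients with the condition who were flagged/missed;
    tn/fp count healthy patients who were cleared/flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    return sensitivity, specificity, f1
```

With illustrative counts `screening_metrics(tp=8, fp=1, tn=9, fn=2)`, the sensitivity is 8/10, the specificity is 9/10, and the F1 is 16/19.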

Scientific Reports
Dengue is recognized as a health problem that causes significant socioeconomic impacts throughout the world, affecting millions of people each year. A commonly used method for monitoring the dengue vector is to count the eggs that Aedes aegypti mosquitoes have laid in spatially distributed ovitraps. Given this approach, the present study uses a database collected from 397 ovitraps allocated across the city of Natal, RN-Brazil. The Egg Density Index for each neighborhood was computed weekly, over four complete years (from 2016 to 2019), and simultaneously analyzed with the dengue case incidence. Our results illustrate that the incidence of dengue is related to the socioeconomic level of the neighborhoods in the city of Natal. A deep learning algorithm was used to predict future dengue case incidence, either based on the previous weeks of dengue incidence or the number of eggs present in the ovitraps. The analysis reveals that ovitrap data allows earlier prediction (four to six weeks)...

Despite the extent of the Brazilian coast, little is known about underwater acoustic environments in Brazil. Acoustic environments (or soundscapes) are composed of biological, geological, and man-made sound sources. Soundscapes are strongly linked to ecosystem dynamics and follow temporal patterns that can vary at daily and seasonal scales. Thus, soundscape characterization requires sound recordings over long periods, which demands innovative analysis methods. Accordingly, the present research focuses on two principal objectives: (1) to develop methods for analyzing long-term acoustic recordings and (2) to characterize marine soundscapes of selected points in São Paulo State. Four deployment sites were selected for the underwater acoustic monitoring: a point located at the channel entrance of the Santos Harbor, and three marine Protected Areas (PAs) in São Paulo State. As a result, the largest underwater acoustic database from Brazilian seas was acquired. The...

The role of sleep on memory consolidation is thought to involve experience-dependent changes in spindle oscillations and protein phosphorylation, but how these phenomena are related remains poorly understood. To gain insight into this relationship, we used electrophysiological recordings and quantitative phosphoproteomic analysis to assess spindle oscillations and phosphoprotein levels in the hippocampus (HP) and primary somatosensory cortex (S1) of adult male rats recorded across the sleep cycle. Animals were surgically implanted with multielectrode probes and after recovery were exposed or unexposed to novel objects (+ and – groups, respectively). HP and S1 samples were obtained after periods rich in either slow-wave sleep (SWS) or rapid-eye-movement sleep. Bottom-up shotgun mass spectrometry in a two-dimensional liquid chromatography-tandem mass spectrometry setup (MSE mode with label-free quantification) showed that the proteomes differed in the numbers of phosphoproteins identi...

Confins
Smart cities, as a bet for urban management, have attracted the attention of researchers and public managers around the world. In 2020, in the Northeast Region of Brazil, 36 municipalities are categorized as “smart cities”. This article describes the methods used to analyze, visualize, and discuss some information about these 36 cities. The use of thematic maps made it possible to show, from a spatial perspective, some characteristics of these municipalities, and combining the maps with data mining methods and a critical theoretical perspective proved useful for understanding the territorial dynamics of smart cities. By combining variables of the cities studied, principal component analysis and hierarchical clustering automatically identified similarities and differences among them. In comparison with the metropolises and capitals included in this study, and taking REGIC 2007 as a reference, the results prove useful for identifying the cities with the potential to better promote and/or host territorial dynamics characteristic of smart cities, namely: Jaboatão dos Guararapes-PE, Olinda-PE, Igarassu-PE, São Lourenço da Mata-PE, Moreno-PE, and Itapissuma-PE, which are part of the Recife Metropolitan Region; and Parnamirim-RN, which belongs to the Natal Metropolitan Region.
Expert Systems with Applications
SÁNCHEZ-GENDRIZ, I. A Methodology for analyzing data from long-term passive acoustic monitoring. 2017. 97 f. Doctoral dissertation (Doctorate Candidate in Science.

Brazilian Archives of Biology and Technology
Venous refilling time (VRT) can be used to diagnose venous diseases in the lower limbs. To calculate VRT, it is necessary to determine the End of the Emptying Maneuvers (EEM). The First Derivative Method (FDM) can be employed for automatic detection of the EEM, but its sensitivity to artifacts and noise can degrade its performance. In contrast, studies report that the Area Triangulation Method (ATM) is effective at locating points of interest in biosignals. This work compares the accuracy of ATM and FDM in detecting the EEM. The annotations made by 3 trained human observers on 37 photoplethysmography records were used as a reference. Bland-Altman plots supported the analysis of agreement among human observers and methods, which was complemented with analysis of variance and multiple comparisons statistical tests. Results showed that ATM is more accurate than FDM for automatic detection of the EEM, with statistically significant differences (p-value < 0.01).
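The Bland-Altman agreement analysis mentioned in the abstract above rests on two quantities: the mean difference between paired measurements (the bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the differences. The sketch below computes them with the Python standard library. The function name `bland_altman` and the sample values in the usage note are hypothetical, and the abstract does not state how the authors computed their limits.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the pairwise differences a - b; the limits of
    agreement are bias -/+ 1.96 times the sample standard deviation.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally sized series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread
```

For instance, `bland_altman(detected_times, observer_times)` would take the EEM instants found by an automatic method and those annotated by a human observer on the same records, with both list names being placeholders.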
2012 XVII Symposium of Image, Signal Processing, and Artificial Vision (STSIVA), 2012