Papers by Francesca Greselin
"The rich are getting richer" implies that population income distributions are getting more right-skewed and heavy-tailed. For such distributions, the mean is not the best measure of the center, but the classical indices of income inequality, including the celebrated Gini index, are all mean-based. In view of this, Professor Gastwirth sounded an alarm back in 2014 by suggesting that the median be incorporated into the definition of the Gini index, although he noted a few shortcomings of his proposed index. In the present paper we take a further step in the modification of classical indices and, to acknowledge the possibility of differing viewpoints, arrive at three median-based indices of inequality. They avoid the shortcomings of the previous indices and can be used even when populations are ultra-heavy-tailed, that is, when their first moments are infinite. The new indices are illustrated both analytically and numerically using parametric families of income distributions, and further illustrated using a real data set of capital incomes of fifteen countries. We also discuss the performance of the indices from the perspective of the Pigou-Dalton principle of transfers.
Summary: The aim of this paper is to propose a new operational measure for evaluating the degree of dependence between two nominal categorical variables. Given an r×c table T representing bivariate statistical data, our approach to measuring the strength of this relation is based on the class of all contingency tables with the same margins as T. Once a partial or total ordering of dependence in this class (as defined in Greselin and Zenga [2004b]) has been given, the relative position assumed by T in the class can be a meaningful measure of dependence. Some desirable properties of these indices are presented: by construction, they are normalized, coherent with each level of ordering, and attain extreme values in extreme dependence situations. They are invariant to permutation of rows and columns in the table and to transposition (as the classification of qualitative variables requires), and, finally, they show a sort of stability with respect to similar populations. Further...
This book is the collection of the Abstracts / Short Papers submitted by the authors of the International Conference of The CLAssification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS), held in Milan (Italy) on September 13-15, 201
In this study, we address the problem of fuzzy material deprivation measurement and propose a new procedure to compute deprivation degrees at individual level, through concepts and tools from partially ordered set theory (poset theory, for short). We describe material deprivation data in terms of "material deprivation states" (i.e. configurations of deprivation scores) rather than material deprivation variables. The set of deprivation states can be naturally described as a partial order, whose structure contains a great deal of information about the relative deprivation degree of each statistical unit. By means of poset tools, such a structure can be represented and exploited so as to compute individual deprivation degrees, fully respecting the ordinal nature of the variables. The proposed methodology is then applied to data pertaining to Italian macroregions (North, Centre, South), extracted from the EU-SILC datasets for years 2005-2007, so as to give an account of both temporal ...
Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For at least a century academics and governmental researchers have been developing measures that would aid them in understanding income distributions, their differences with respect to geographic regions, and changes over time periods. It is a fascinating area for a number of reasons, one of them being that different measures, or indices, are needed to reveal different features of income distributions. Keeping also in mind that the notions of poor and rich are relative to each other, Zenga (2007) proposed a new index of economic inequality. The index is remarkably insightful and useful, but deriving statistical inferential results has been a challenge. For example, unlike many other indices, Zenga's new index does not fall into the classes of L-, U-, and V-
Stats
Statistical inference based on the cluster weighted model often requires some subjective judgment from the modeler. Many features influence the final solution, such as the number of mixture components, the shape of the clusters in the explanatory variables, and the degree of heteroscedasticity of the errors around the regression lines. Moreover, to deal with outliers and contamination that may appear in the data, hyper-parameter values ensuring robust estimation are also needed. In principle, this freedom gives rise to a variety of "legitimate" solutions, each derived by a specific set of choices and their implications in modeling. Here we introduce a method for identifying a "set of good models" to cluster a dataset, considering the whole panorama of choices. In this way, we enable the practitioner, or the scientist who needs to cluster the data, to make an educated choice. They will be able to identify the most appropriate solutions for the purposes of their own analysis, in light...
In this paper we introduce a new methodology to study the degree of progression as well as the redistributive and re-ranking effects of a personal income tax system by employing and extending the new inequality curve (and index) proposed by Michele Zenga. Given an income distribution, the Zenga curve compares the economic conditions of two exhaustive groups of the population obtained by dividing the overall population at all possible percentiles, from the bottom to the top observed income. Since the recent literature underlines that the Zenga curve shows features that differ from the standard approach based on the Lorenz curve, we show the potential of the new curve when studying the effects exerted by a personal income tax. This new methodology is compared to the classical one through a stylized example and an application to Italian personal income tax data.
Studies in Classification, Data Analysis, and Knowledge Organization, 2021
International Institute of Social and Economic Sciences (IISES) Kamerunska 607/1, 160 00, Prague 6 – Vokovice, Czech Republic
Detecting the emergence of abrupt property changes in time series is a challenging problem. The kernel two-sample test has been studied for this task because it makes fewer assumptions on the distributions than traditional parametric approaches. However, selecting kernels is non-trivial in practice. Although kernel selection for the two-sample test has been studied, the scarcity of samples in the change-point detection (CPD) problem hinders the success of the kernel selection algorithms developed so far. In this paper, we propose KL-CPD, a novel kernel learning framework for time series CPD that optimizes a lower bound of test power via an auxiliary generative model. With deep kernel parameterization, KL-CPD endows the kernel two-sample test with a data-driven kernel to detect different types of change-points in real-world applications. The proposed approach significantly outperformed other state-of-the-art methods in our comparative evaluation on benchmark datasets and simulation studies.
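As a rough illustration of the kernel two-sample idea that this abstract builds on, the sketch below scores each time point by an unbiased estimate of the squared maximum mean discrepancy (MMD) between the window before it and the window after it. It uses a fixed Gaussian kernel, not the learned deep kernel or the auxiliary generative model of KL-CPD; the function names, window size, bandwidth, and synthetic series are all assumptions made for illustration.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of the squared MMD between samples xs and ys."""
    m, n = len(xs), len(ys)
    # Within-sample terms exclude the diagonal (i == j).
    k_xx = sum(rbf_kernel(a, b, bandwidth)
               for i, a in enumerate(xs) for j, b in enumerate(xs) if i != j)
    k_yy = sum(rbf_kernel(a, b, bandwidth)
               for i, a in enumerate(ys) for j, b in enumerate(ys) if i != j)
    k_xy = sum(rbf_kernel(a, b, bandwidth) for a in xs for b in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


def change_scores(series, window=20, bandwidth=1.0):
    """Map each time t to the MMD^2 between the windows before and after t."""
    scores = {}
    for t in range(window, len(series) - window + 1):
        past = series[t - window:t]
        future = series[t:t + window]
        scores[t] = mmd2_unbiased(past, future, bandwidth)
    return scores


# Hypothetical series: the mean shifts from 0 to 3 at t = 100.
rng = random.Random(0)
series = ([rng.gauss(0.0, 1.0) for _ in range(100)]
          + [rng.gauss(3.0, 1.0) for _ in range(100)])
scores = change_scores(series)
best_t = max(scores, key=scores.get)  # time point with the largest score
```

The score is close to zero while both windows come from the same distribution and peaks when the split point sits at the change. With so few points per window the choice of bandwidth matters a great deal, which is the kernel selection difficulty the abstract refers to.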
BASIC DOCUMENTATION CARD
TI (title) Genetic diversity of pedunculate oak (Quercus robur L.) in field trials with progeny from selected seed stands
AU (author) Maja Morić
AD (address) 10 450 Jastrebarsko, Cvjetno naselje 41
SO (source) Library of the Faculty of Forestry, University of Zagreb, Svetošimunska 25; Library of the Forest Research Institute, Jastrebarsko, Cvjetno naselje 41
PY (year of publication) 2016
LA (original language) Croatian
LS (language of summary) English
DE (keywords) Genetic diversity, quantitative phenotypic traits, within-population genetic variability, level and pattern of quantitative genetic differentiation, genotype-by-environment interaction, molecular genetic analyses, nuclear and chloroplast microsatellite DNA markers
GE (country of publication) Croatia
PT (publication type) doctoral thesis
VO (volume) I-XXI + 243 pages + 76 tables + 42 figures + 257 references
AB (abstract) Genetic diversity (variability) is a fundamental part of overall biological diversity and represents the richness of different alleles, or genes, at the levels of individuals, populations, and species. A higher level of genetic diversity gives populations a greater capacity to adapt through natural selection and is therefore an important precondition for their survival in a changing environment. Understanding and conserving genetic diversity are thus extremely important for the long-term survival of forest tree species, especially those that underpin ecosystems. The level, pattern, and causes of genetic diversity in forest tree species can be determined by two methods: 1. analysis of quantitative phenotypic traits in genetic tests, and 2. analysis of DNA markers. The main aim of this thesis was to determine the level and pattern of genetic diversity of pedunculate oak populations in Croatia using both available methods.
Genetic tests were established at three different sites with progeny of pedunculate oak from 16 seed stands and one commercial stand, which together represent the entire range of the species in Croatia. Various quantitative phenotypic traits were measured and scored: height growth, height increment, survival, winter leaf retention, intensity of oak powdery mildew infection, intensity of damage from late spring frost, and leaf-flushing phenology. Analysis of variance was used to establish the statistical significance of the examined sources of variability (blocks, populations, families within populations, and interactions of blocks with populations and families), and the quantitative genetic parameters were calculated: heritability (individual, h²_i, and family, h²_f), the coefficient of additive genetic variation (CV_A), and the parameter of quantitative genetic differentiation (Q_ST). The pattern of genetic differentiation was determined by Multivariate Regression Tree (MRT) analysis. CV_A and heritability values were low for most traits, which indicates a low level of within-population genetic variability in the studied populations. However, the low values of the genetic parameters were probably caused by high residual variance, i.e. non-additive genetic and environmental variance, including the variance of experimental error. Therefore, the calculated genetic parameters are not ...
LIST OF FIGURES
Figure 1. Distribution of pedunculate oak in Europe (DUCOUSSO and BORDACS EUFORGEN 2004).
Figure 2. Distribution of pedunculate oak in Croatia (NN 114/15).
Figure 3. Seed regions of pedunculate oak (Q. robur L.) in Croatia (NN 114/15).
Figure 4. Pedunculate oak field trial Jastrebarsko, April 2011, plants in polypropylene shelters, so-called Tuley tubes.
Figure 5. Pedunculate oak field trial Koška, May 2011.
Figure 6. Pedunculate oak field trial Vrbanja, November 2012.
Figure 7. Layout of the Jastrebarsko field trial, with provenances (populations) and replications marked (in different colours).
Figure 8. Height measurements in the pedunculate oak field trial Jastrebarsko.
Figure 9. Categories of winter leaf retention (retention of the previous year's dead leaves until new flushing), in order from 0 (plants with no dead leaves on the branches), through 1 and 2, to 3 (plant completely covered with dead leaves).
Figure 10. Scores for the intensity of leaf infection by powdery mildew mycelium (from LIOVIĆ and ŽUPANIĆ 2006).
Figure 11. a), b) Example plants from the Jastrebarsko field trial: a) the highest degree of powdery mildew infection, b) an uninfected plant.
Figure 12. Leaf-flushing phases of pedunculate oak.
Figure 13. Plants damaged by frost.
Figure 14. a, b: Collection of plant material for DNA isolation in the pedunculate oak field trial, Jastrebarsko.
Figure 15. Weighing of leaf tissue, placement in a microtube with a steel bead, and the machine for grinding the material (TissueLyser).
Figure 16. DNA isolation steps: adding chemicals, vortexing, incubation, separation of solutions, centrifugation, and storage of the final DNA isolate.
Figure 17. Mean heights in the Jastrebarsko field trial from 2010 to 2012.
Figure 18. Mean heights of pedunculate oak populations in all three field trials for 2012 (age 6 years).
Figure 19. Mean survival in the Jastrebarsko field trial for three consecutive years (2010-2012).
Figure 20. Mean survival in all three field trials for 2012.
Figure 21. Mean winter leaf retention score of pedunculate oak populations in the Jastrebarsko field trial, at ages 4 to 6 years.
Figure 22. Mean intensity of oak powdery mildew infection of pedunculate oak populations in the Jastrebarsko field trial at ages 5 and 6 years.
Figure 23. Mean onset of leaf flushing at ages 5 to 7 years in the Jastrebarsko field trial.
Figure 24. Mean score of damage to plants by late spring frost at age 6 years in the Jastrebarsko field trial.
Review of Income and Wealth, 2020
We adopt and extend the new Zenga inequality curve to study the degree of progressivity as well as the redistributive and re-ranking effects of a personal income tax system. Moreover, we establish the social welfare implications of these new inequality measures and compare them with the classical approach based on the Lorenz curve and the Gini coefficient. The Zenga methodology is based on comparing the mean income of the poorest income earners with the mean income of the remaining richest part of the population. To the best of our knowledge, this approach has never been applied to study the effects produced by a personal income tax. To fill this gap in the literature, we prove that the elasticity of the Zenga uniformity curve with respect to the Lorenz curve is always greater than 1, thus recasting, within the new paradigm, the most important curves and the corresponding tax indices, such as the Reynolds–Smolensky, the Kakwani, and the Atkinson–Plotnick–Kakwani indices. We then derive three important inequalities for the newly developed measures, inspired by the well-known properties of the classical approach. Finally, we show how some information, which could remain unnoticed by the cumulative approach inherent to the Lorenz curve, is instead highlighted by the new methodology. The advantages of complementing the classic indices with the new ones are discussed through an application to the Italian tax system.
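The comparison of lower and upper group means described in this abstract can be sketched on a small sample. The code below assumes the usual form of Zenga's (2007) inequality curve, I(p) = 1 - M⁻(p)/M⁺(p), where M⁻(p) is the mean income of the poorest share p and M⁺(p) the mean of the rest; this formula, the plain average used as an index, and the tax schedule are illustrative assumptions, not the estimators or data of the paper.

```python
def zenga_curve(incomes):
    """Return (p, I(p)) pairs, I(p) = 1 - lower mean / upper mean.

    Assumes strictly positive incomes. The split runs over i = 1..n-1 so
    that both groups are non-empty (one of several discrete conventions).
    """
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    points = []
    cumulative = 0.0
    for i in range(1, n):
        cumulative += xs[i - 1]
        lower_mean = cumulative / i                 # mean of the i poorest
        upper_mean = (total - cumulative) / (n - i)  # mean of the rest
        points.append((i / n, 1.0 - lower_mean / upper_mean))
    return points


def zenga_index(incomes):
    """Plain average of the curve ordinates (illustrative convention)."""
    points = zenga_curve(incomes)
    return sum(value for _, value in points) / len(points)


def hypothetical_progressive_tax(income, threshold=2.0, rate=0.3):
    """Made-up schedule: tax `rate` on income above `threshold`."""
    return income - rate * max(0.0, income - threshold)


pre_tax = [1.0, 1.0, 1.0, 9.0]
post_tax = [hypothetical_progressive_tax(x) for x in pre_tax]
flat_taxed = [0.8 * x for x in pre_tax]  # proportional 20% tax

index_pre = zenga_index(pre_tax)
index_post = zenga_index(post_tax)
index_flat = zenga_index(flat_taxed)
```

In this toy case a proportional tax leaves the index unchanged, because the ratio of group means is scale invariant, while the progressive schedule lowers it. That difference is the kind of redistributive effect a tax analysis built on the curve would try to measure.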
Advances in Data Analysis and Classification, 2017
This paper presents a review of the use of eigenvalue restrictions for constrained parameter estimation in mixtures of elliptical distributions according to the likelihood approach. These restrictions serve a twofold purpose: to avoid convergence to degenerate solutions and to reduce the onset of non-interesting (spurious) maximizers, related to complex likelihood surfaces. The paper shows how the constraints may play a key role in the theory of Euclidean data clustering. The aim here is to provide a reasoned review of the constraints and their applications, following the contributions of many authors, spanning the literature of the last thirty years.
SUMMARY The sample mean difference $\hat{\Delta}$ is an unbiased estimator of Gini's mean difference $\Delta$. It is well known that $\hat{\Delta}$ is asymptotically normally distributed (Hoeffding, 1948). In order to obtain confidence intervals for $\Delta$, $\hat{\Delta}$ must be standardized and hence its variance $\mathrm{Var}(\hat{\Delta})$ must be estimated. In this paper we study the effective coverage of the confidence intervals for $\Delta$, when using a specific unbiased estimator $\widehat{\mathrm{Var}}(\hat{\Delta})$ for the variance of $\hat{\Delta}$, in a non-parametric framework. The empirical determination of the minimum sample size required to reach a good approximation of the nominal coverage is analyzed through a new approach. The reported results show that this threshold is critically related to the asymmetry and the tail heaviness in the underlying distribution.
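To make the quantities in this summary concrete, the sketch below computes the sample mean difference as the average absolute difference over all pairs of distinct observations, and builds a normal-approximation confidence interval around it. The abstract does not give the paper's unbiased variance estimator, so a leave-one-out jackknife variance is swapped in here purely as a stand-in; the function names and sample values are likewise assumptions.

```python
import math
from statistics import NormalDist


def mean_difference(sample):
    """Sample mean difference: average |x_i - x_j| over all pairs i != j.

    Uses the sorted-sample identity
    sum_{i<j} (x_(j) - x_(i)) = sum_i (2i - n - 1) x_(i),  i = 1..n.
    """
    xs = sorted(sample)
    n = len(xs)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * (n - 1))


def jackknife_variance(sample):
    """Leave-one-out jackknife variance of the mean difference.

    Stand-in only: this is NOT the unbiased estimator studied in the paper.
    """
    n = len(sample)
    leave_one_out = [mean_difference(sample[:k] + sample[k + 1:])
                     for k in range(n)]
    center = sum(leave_one_out) / n
    return (n - 1) / n * sum((v - center) ** 2 for v in leave_one_out)


def confidence_interval(sample, level=0.95):
    """Normal-approximation interval for Gini's mean difference."""
    estimate = mean_difference(sample)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(jackknife_variance(sample))
    return estimate - half_width, estimate + half_width


sample = [1.0, 2.0, 4.0, 7.0, 11.0]  # hypothetical observations
estimate = mean_difference(sample)
lower, upper = confidence_interval(sample)
```

For small samples from skewed or heavy-tailed distributions an interval of this kind can cover the true value less often than its nominal level suggests, which is the coverage question the summary investigates.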
Statistical Analysis and Data Mining: The ASA Data Science Journal, 2017