2005
12 pages
Finding interestingness measures to evaluate association rules has become an important knowledge quality issue in KDD. Many interestingness measures may be found in the literature, and many authors have discussed and compared interestingness properties in order to help choose the best measures for a given application. As interestingness depends both on the data structure and on the decision-maker's goals, some measures may be relevant in some contexts but not in others. Therefore, it is necessary to design new contextual approaches to help the decision-maker select the best interestingness measures. In this paper, we present ARQAT, a new tool for studying the specific behavior of a set of 34 interestingness measures in the context of a specific dataset and from an exploratory data analysis perspective. The tool implements 14 complementary graphical views structured on 5 levels of analysis: ruleset analysis, correlation and clustering analysis, best rules analysis, sensitivity analysis, and comparative analysis. The tool is described and illustrated on the mushroom dataset in order to show the value of both the exploratory approach and the use of complementary views.
It is a common problem that KDD processes may generate a large number of patterns, depending on the algorithm used and its parameters. It is hence impossible for an expert to assess these patterns. This is the case with the well-known Apriori algorithm. One of the methods used to cope with such an amount of output relies on association rule interestingness measures. Arguing that selecting interesting rules also means using a suitable measure, we present a formal and an experimental study of 20 measures. The experimental studies, carried out on 10 data sets, lead to an experimental classification of the measures. This study is compared to an analysis of the formal and meaningful properties of the measures. Finally, the properties are used in a multi-criteria decision analysis in order to select, among the available measures, the one or those that best take into account the user's needs. These approaches seem to be complementary and could be useful in solving the problem of a user's choice of measure.
2014
Abstract. It is a common issue that KDD processes may generate a large number of patterns depending on the algorithm used, and its parameters. It is hence impossible for an expert to sustain these patterns. This is the case with the well-known Apriori algorithm. One of the methods used to cope with such an amount of output depends on the use of association rule interestingness measures. Stating that selecting interesting rules also means using an adapted measure, we present a formal and an experimental study of 20 measures. The experimental studies carried out on 10 datasets lead to an experimental classification of the measures. This study is compared to an analysis of the formal and meaningful properties of the measures. Last, the properties are used in a multi-criteria decision analysis in order to select amongst the available measures the one or those that best take into account the user’s needs. These approaches seem to be complementary and could be profitable for the problem...
IRACST-International Journal of Computer Science and Information Technology & Security (IJCSITS), 2016
The field of data processing is changing rapidly as data volumes grow at an ever faster pace and as more intelligent, automated ways of looking at data become necessary. This need arises in all areas of life, such as business, biology, medical research, education, governance, risk analysis, text analysis, and social relation management. These domains present the scientific community with increasingly difficult storage-management problems and computationally complex challenges. To summarize data or to make decisions, finding interesting and useful patterns in the data is essential. Association rule mining is a branch of data mining that is very helpful in this context and plays a critical part in knowledge mining. The difficult task is discovering hidden knowledge, i.e. useful rules, among the large number of rules generated when the support threshold is reduced. For pruning or grouping rules, many techniques have been suggested, such as rule structure cover methods, informative cover methods, and rule clustering. Another way of selecting association rules is based on interestingness measures such as support, confidence, and correlation. In this paper, we study various interestingness measures and their use in selecting interesting and useful rules. These association rules are of utmost importance in implementing software systems based on the methods and techniques of data science, big data, and data mining. Because the interestingness evaluation of association rules matters for the practical application of association rule mining technology, it is necessary to study and improve it.
Studies in Computational Intelligence, 2007
Finding interestingness measures to evaluate association rules has become an important knowledge quality issue in KDD. Many interestingness measures may be found in the literature, and many authors have discussed and compared interestingness properties in order to improve the choice of the most suitable measures for a given application. As interestingness depends both on the data structure and on the decision-maker's goals, some measures may be relevant in some contexts but not in others. Therefore, it is necessary to design new contextual approaches in order to help the decision-maker select the most suitable interestingness measures. In this paper, we present a new approach, implemented by a new tool, ARQAT, for making comparisons. The approach is based on the analysis of a correlation graph presenting the clustering of objective interestingness measures and reflecting the post-processing of association rules. This graph-based clustering approach is used to compare and discuss the behavior of thirty-six interestingness measures on two prototypical and opposite datasets: a highly correlated one and a weakly correlated one. We focus on the discovery of the stable clusters among these thirty-six measures obtained from the data analyzed.
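To make the idea of a correlation graph over measures concrete, the following is a minimal sketch, not ARQAT's implementation. It assumes that each measure has been evaluated on the same list of rules, that Pearson correlation is used, that two measures are linked when the absolute correlation reaches a threshold of 0.85, and that clusters are the connected components of the resulting graph. The function names, the threshold, and the demo values are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (assumes neither is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def correlation_clusters(scores, threshold=0.85):
    """Group measures whose score vectors are strongly correlated.

    scores: dict mapping a measure name to its values, one value per rule.
    Returns the connected components of the graph that links two measures
    when |correlation| >= threshold (threshold value is an assumption).
    """
    names = list(scores)
    neighbours = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if abs(pearson(scores[a], scores[b])) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    clusters, seen = [], set()
    for name in names:
        if name in seen:
            continue
        # Depth-first walk to collect one connected component.
        stack, component = [name], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical scores of three measures on the same four rules.
demo_scores = {
    "measure_a": [1.0, 2.0, 3.0, 4.0],
    "measure_b": [2.0, 4.0, 6.0, 8.5],
    "measure_c": [4.0, 1.0, 3.0, 2.0],
}
demo_clusters = correlation_clusters(demo_scores)
print(demo_clusters)
```

With these made-up values, `measure_a` and `measure_b` rank the rules almost identically and end up in one cluster, while `measure_c` stays on its own. Running the same grouping on two different datasets and keeping only the clusters that appear in both is one way to read the "stable clusters" mentioned in the abstract.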
European Journal of Operational Research, 2008
Data mining algorithms, especially those used for unsupervised learning, generate a large quantity of rules. In particular this applies to the Apriori family of algorithms for the determination of association rules. It is hence impossible for an expert in the field being mined to sustain these rules. To help carry out the task, many measures which evaluate the interestingness of rules have been developed. They make it possible to filter and sort automatically a set of rules with respect to given goals. Since these measures may produce different results, and as experts have different understandings of what a good rule is, we propose in this article a new direction to select the best rules: a two-step solution to the problem of the recommendation of one or more user-adapted interestingness measures. First, a description of interestingness measures, based on meaningful classical properties, is given. Second, a multicriteria decision aid process is applied to this analysis and illustrates the benefit that a user, who is not a data mining expert, can achieve with such methods.
Fuzzy Optimization and Decision Making, 2000
In Knowledge Discovery in Databases (KDD)/Data Mining literature, "interestingness" measures are used to rank rules according to the "interest" a particular rule is expected to evoke. In this paper, we introduce an aspect of subjective interestingness called "item-relatedness". Relatedness is a consequence of relationships that exist between items in a domain. Association rules containing unrelated or weakly related items are interesting since the co-occurrence of such items is unexpected. "Item-relatedness" helps in ranking association rules on the basis of one kind of subjective unexpectedness. We identify three types of item-relatedness, captured in the structure of a "fuzzy taxonomy" (an extension of the classical concept hierarchy tree). An "item-relatedness" measure for describing relatedness between two items is developed by combining these three types. Efficacy of this measure is illustrated with the help of a sample taxonomy. We discuss three mechanisms for extending this measure from a two-item set to an association rule consisting of a set of more than two items. These mechanisms utilize the relatedness of item-pairs and other aspects of an association rule, namely its structure, distribution of items and item-pairs. We compare our approach with another method from recent literature.
Knowledge Based Systems, 2011
Assessing rules with interestingness measures is the pillar of successful application of association rules discovery. However, association rules discovered are normally large in number, some of which are not considered as interesting or significant for the application at hand. In this paper, we present a systematic approach to ascertain the discovered rules, and provide a precise statistical approach supporting this
2002
Many techniques for association rule mining and feature selection require a suitable metric to capture the dependencies among variables in a data set. For example, metrics such as support, confidence, lift, correlation, and collective strength are often used to determine the interestingness of association patterns. However, many such measures provide conflicting information about the interestingness of a pattern, and the best metric to use for a given application domain is rarely known. In this paper, we present an overview of various measures proposed in the statistics, machine learning and data mining literature. We describe several key properties one should examine in order to select the right measure for a given application domain. A comparative study of these properties is made using twenty-one of the existing measures. We show that each measure has different properties that make it useful for some application domains, but not for others. We also present two scenarios in which most of the existing measures agree with each other, namely, support-based pruning and table standardization. Finally, we present an algorithm to select a small set of tables such that an expert can select a desirable measure by looking at just this small set of tables.
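As a small illustration of how such measures can conflict, the sketch below computes support, confidence, and lift for a rule A -> B from co-occurrence counts, using their standard definitions. The function name `rule_measures` and the two example rules are hypothetical and are not taken from any of the papers listed here.

```python
def rule_measures(n_ab, n_a, n_b, n):
    """Support, confidence and lift of a rule A -> B from raw counts.

    n_ab: transactions containing both A and B
    n_a:  transactions containing A
    n_b:  transactions containing B
    n:    total number of transactions
    """
    if not (0 < n_a <= n and 0 < n_b <= n and 0 <= n_ab <= min(n_a, n_b)):
        raise ValueError("inconsistent counts")
    support = n_ab / n                # P(A and B)
    confidence = n_ab / n_a           # P(B | A)
    lift = confidence / (n_b / n)     # P(B | A) / P(B)
    return {"support": support, "confidence": confidence, "lift": lift}


# Two made-up rules over 100 transactions.
frequent_rule = rule_measures(n_ab=60, n_a=80, n_b=90, n=100)
rare_rule = rule_measures(n_ab=20, n_a=40, n_b=25, n=100)

for name, measures in [("frequent", frequent_rule), ("rare", rare_rule)]:
    print(name, {key: round(value, 3) for key, value in measures.items()})
```

In this example, confidence ranks the frequent rule first (0.75 against 0.5), yet its lift is below 1, which indicates that A makes B slightly less likely than its baseline frequency. Lift ranks the rare rule first instead, so the two measures disagree on which rule is more interesting.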