Lecture Rough Sets
Elshimaa Elgendi
Operations Research and Decision Support
Faculty of Computers and Artificial Intelligence
Cairo University
Preamble
• Fuzzy set theory was the first theory to give a theoretical treatment
of the problem of vagueness and uncertainty, and it has had
many successful implementations. Fuzzy set theory is,
however, not the only theoretical framework that addresses
these concepts.
• In the early 1980s, Pawlak developed a new theoretical framework,
rough set theory, to reason with vague concepts and uncertainty.
• While rough set theory is somewhat related to fuzzy set
theory, there are major differences.
• Rough set theory is based on the assumption that some
information, or knowledge, about the elements of the universe
of discourse is initially available. This is contrary to fuzzy set
theory where no such prior information is assumed.
Z. Pawlak. Rough Sets. International Journal of Computer and Information Sciences, 11:341–356, 1982.
Rough Set Theory
Rough Set Illustration
• The information available about elements is used to find similar elements
and indiscernible elements. Rough set theory is then based on the
concepts of upper and lower approximations of sets.
• Rough set theory constitutes a sound basis for Knowledge Discovery in
Databases (KDD). It offers mathematical tools to discover patterns hidden in data.
• The lower approximation contains those elements that belong to the set
with full certainty.
• The upper approximation contains all elements that possibly belong to the
set: those in the lower approximation plus those for which membership is
uncertain.
• The boundary region of a set, which is the difference between the upper
and lower approximations, thus contains all examples which cannot be
classified based on the available information.
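The three regions above can be computed directly once the indiscernibility classes are known. The following is a minimal sketch, not a reference implementation: the object names x1..x6, their one-letter descriptions, and the helper names partition and approximations are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def partition(universe, description):
    """Group objects with identical descriptions (indiscernible objects)."""
    blocks = defaultdict(set)
    for obj in universe:
        blocks[description[obj]].add(obj)
    return list(blocks.values())

def approximations(universe, description, target):
    """Return (lower, upper, boundary) of the target set."""
    lower, upper = set(), set()
    for block in partition(universe, description):
        if block <= target:      # block lies entirely inside the target
            lower |= block
        if block & target:       # block overlaps the target
            upper |= block
    return lower, upper, upper - lower

# Hypothetical toy data: objects with the same letter are indiscernible.
universe = ["x1", "x2", "x3", "x4", "x5", "x6"]
description = {"x1": "a", "x2": "a", "x3": "b", "x4": "b", "x5": "c", "x6": "c"}
target = {"x1", "x2", "x3"}

lower, upper, boundary = approximations(universe, description, target)
print(sorted(lower))     # ['x1', 'x2']
print(sorted(upper))     # ['x1', 'x2', 'x3', 'x4']
print(sorted(boundary))  # ['x3', 'x4']
```

Here x3 is in the target but cannot be told apart from x4, which is not, so both end up in the boundary region.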
The main advantages of the rough set approach
• It does not need any preliminary or additional information about the data,
such as probability distributions in statistics or grades of membership in fuzzy set theory.
• It provides efficient methods, algorithms and tools for finding hidden
patterns in data.
• It allows the original data to be reduced, i.e. it finds minimal sets of data
that carry the same knowledge as the original data.
• It allows the significance of data to be evaluated.
• It allows sets of decision rules to be generated automatically from data.
• It is easy to understand.
• It offers straightforward interpretation of obtained results.
• It is suited for concurrent (parallel/distributed) processing.
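The data-reduction point in the list above can be illustrated with a brute-force search for minimal attribute subsets that still distinguish the same objects as the full attribute set. This is only a sketch under simplifying assumptions: the table, the attribute names a, b, c, and the functions partition and reducts are hypothetical, and the exhaustive search is exponential in the number of attributes, so it is suitable for toy examples only.

```python
from itertools import combinations

def partition(table, attributes):
    """Partition the objects by their values on the given attributes."""
    groups = {}
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        groups.setdefault(key, set()).add(obj)
    return {frozenset(g) for g in groups.values()}

def reducts(table, attributes):
    """Minimal attribute subsets inducing the same partition as all attributes."""
    full = partition(table, attributes)
    found = []
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            # Skip supersets of an already found reduct: they are not minimal.
            if any(set(r) <= set(subset) for r in found):
                continue
            if partition(table, subset) == full:
                found.append(subset)
    return found

# Hypothetical table in which attribute c merely duplicates attribute a.
table = {
    "o1": {"a": 0, "b": 0, "c": 0},
    "o2": {"a": 0, "b": 1, "c": 0},
    "o3": {"a": 1, "b": 0, "c": 1},
    "o4": {"a": 1, "b": 1, "c": 1},
}

result = reducts(table, ["a", "b", "c"])
print(result)  # [('a', 'b'), ('b', 'c')]
```

Either a or c can be dropped without losing the ability to tell the four objects apart, which is the sense in which the reduced data carries the same knowledge.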
Applications
• Feature selection,
• Feature extraction,
• Data reduction,
• Decision rule generation,
• Pattern extraction (templates, association rules) etc.
Observations
• An equivalence relation induces a partitioning of the universe.
• The subsets most often of interest are those whose elements share the same
value of the decision attribute.
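Both observations can be checked on a small example: grouping objects by equal attribute values yields disjoint blocks that cover the universe, and grouping by the decision attribute yields the subsets of interest. The table, the attribute names colour, size and decision, and the helper equivalence_classes below are hypothetical illustrations.

```python
from collections import defaultdict

def equivalence_classes(table, attributes):
    """Map each combination of attribute values to the objects that have it."""
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj)
    return dict(classes)

# Hypothetical data: o1 and o2 are indiscernible on colour and size.
table = {
    "o1": {"colour": "red", "size": "small", "decision": "yes"},
    "o2": {"colour": "red", "size": "small", "decision": "no"},
    "o3": {"colour": "blue", "size": "small", "decision": "yes"},
    "o4": {"colour": "blue", "size": "large", "decision": "no"},
}

cond_classes = equivalence_classes(table, ["colour", "size"])
concepts = equivalence_classes(table, ["decision"])

# The blocks are pairwise disjoint and together cover the universe.
covered = set().union(*cond_classes.values())
total = sum(len(block) for block in cond_classes.values())
print(covered == set(table), total == len(table))  # True True
print(sorted(concepts[("yes",)]))                  # ['o1', 'o3']
```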
One can define the following four basic classes of rough sets, i.e., four categories
of vagueness:
Information System
The information about the real world is given in the form of an information table
(sometimes called a decision table).
Thus, the information table represents input data, gathered from any domain, such
as medicine, finance, or the military.
Rows of a table, labeled e1, e2, e3, e4, e5, and e6 are called examples (objects,
entities).
Condition attributes = {Headache, muscle_pain, temperature}
Decision attribute = {Flu}
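An information table of this shape can be held as a mapping from example names to attribute values. The sketch below uses the example labels and attribute names given above, but the cell values are illustrative assumptions and not the lecture's data. The helper names decisions_per_class and is_consistent are also hypothetical. The check tests whether examples that agree on all condition attributes also agree on the decision.

```python
CONDITION = ["Headache", "muscle_pain", "temperature"]
DECISION = "Flu"

# Assumed, illustrative values only; the lecture's actual table is not reproduced here.
rows = {
    "e1": ("yes", "yes", "normal", "no"),
    "e2": ("yes", "yes", "high", "yes"),
    "e3": ("yes", "yes", "very_high", "yes"),
    "e4": ("no", "yes", "normal", "no"),
    "e5": ("no", "no", "high", "no"),
    "e6": ("no", "yes", "very_high", "yes"),
}
table = {name: dict(zip(CONDITION + [DECISION], values))
         for name, values in rows.items()}

def decisions_per_class(table):
    """Collect the decision values seen for each combination of condition values."""
    seen = {}
    for row in table.values():
        key = tuple(row[a] for a in CONDITION)
        seen.setdefault(key, set()).add(row[DECISION])
    return seen

def is_consistent(table):
    """True if equal condition values always lead to the same decision."""
    return all(len(d) == 1 for d in decisions_per_class(table).values())

print(is_consistent(table))  # True

# Adding a hypothetical example that matches e2 on the conditions but not on Flu
# makes the table inconsistent.
conflicting = dict(table)
conflicting["e7"] = dict(zip(CONDITION + [DECISION], ("yes", "yes", "high", "no")))
print(is_consistent(conflicting))  # False
```

Inconsistent examples such as e2 and the added e7 are exactly the ones that would fall into the boundary region of the concept Flu = yes.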