
Rough Sets

Elshimaa Elgendi
Operations Research and Decision Support
Faculty of Computers and Artificial Intelligence
Cairo University
Preamble
• Fuzzy set theory was the first to offer a theoretical treatment
of the problem of vagueness and uncertainty, and it has had
many successful implementations. Fuzzy set theory is,
however, not the only theoretical logic that addresses
these concepts.
• In the early 1980s, Pawlak developed a new theoretical
framework for reasoning with vague concepts and uncertainty.
• While rough set theory is somewhat related to fuzzy set
theory, there are major differences.
• Rough set theory is based on the assumption that some
information, or knowledge, about the elements of the universe
of discourse is initially available. This is contrary to fuzzy set
theory where no such prior information is assumed.
Z. Pawlak. Rough Sets. International Journal of Computer and Information Sciences, 11:341–356, 1982.
Rough Set Theory
• The information available about elements is used to find similar elements
and indiscernible elements. Rough set theory is then based on the
concepts of upper and lower approximations of sets.
• Rough set theory constitutes a sound basis for Knowledge Discovery in
Databases (KDD): it offers mathematical tools to discover patterns hidden in data.
• The lower approximation contains those elements that belong to the set
with full certainty.
• The upper approximation encapsulates elements whose membership is
uncertain.
• The boundary region of a set, which is the difference between the upper
and lower approximations, thus contains all examples which cannot be
classified based on the available information.
The main advantages of the rough set approach
• It does not need any preliminary or additional information about data −
like probability in statistics, grade of membership in the fuzzy set theory.
• It provides efficient methods, algorithms and tools for finding hidden
patterns in data.
• It makes it possible to reduce the original data, i.e. to find minimal sets of
data with the same knowledge as the original data.
• It makes it possible to evaluate the significance of data.
• It makes it possible to generate sets of decision rules from data automatically.
• It is easy to understand.
• It offers straightforward interpretation of obtained results.
• It is suited for concurrent (parallel/distributed) processing.
Applications
• Feature selection,
• Feature extraction,
• Data reduction,
• Decision rule generation,
• Pattern extraction (templates, association rules), etc.

What makes rough sets so desirable for real-world applications is their
robustness to noisy environments and to situations where data is incomplete.

• Rough set analysis is a supervised approach, which clarifies the set-theoretic
characteristics of classes over combinatorial patterns of the attributes. In
doing so, rough sets also perform automatic feature selection by finding the
smallest set of input parameters necessary to discern between classes.
Basic Concepts of Rough Sets
• Information/Decision Systems (Tables)
• Indiscernibility
• Set Approximation
• Reducts and Core
• Rough Membership
• Dependency of Attributes
Information Systems/Tables
• An information system (IS) is a pair (U, A).
• U is a non-empty finite set of objects.
• A is a non-empty finite set of attributes such that a : U → V_a for every a ∈ A.
• V_a is called the value set of a.

     Age     LEMS
x1   16-30   50
x2   16-30   0
x3   31-45   1-25
x4   31-45   1-25
x5   46-60   26-49
x6   16-30   26-49
x7   46-60   26-49
Decision Systems/Tables
• A decision system (DS) is a table T = (U, A ∪ {d}).
• d ∉ A is the decision attribute (instead of one, we can consider
more decision attributes).
• The elements of A are called the condition attributes.

     Age     LEMS    Walk
x1   16-30   50      yes
x2   16-30   0       no
x3   31-45   1-25    no
x4   31-45   1-25    yes
x5   46-60   26-49   no
x6   16-30   26-49   yes
x7   46-60   26-49   no
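To make the constructions that follow concrete, here is a minimal Python sketch of this decision table as a dict of dicts. The names walk_table, condition_attrs, and decision_attr are illustrative choices, not part of the theory.

# The Walk decision system above, encoded as object -> {attribute: value}.
# 'Age' and 'LEMS' are condition attributes; 'Walk' is the decision attribute.
walk_table = {
    "x1": {"Age": "16-30", "LEMS": "50",    "Walk": "yes"},
    "x2": {"Age": "16-30", "LEMS": "0",     "Walk": "no"},
    "x3": {"Age": "31-45", "LEMS": "1-25",  "Walk": "no"},
    "x4": {"Age": "31-45", "LEMS": "1-25",  "Walk": "yes"},
    "x5": {"Age": "46-60", "LEMS": "26-49", "Walk": "no"},
    "x6": {"Age": "16-30", "LEMS": "26-49", "Walk": "yes"},
    "x7": {"Age": "46-60", "LEMS": "26-49", "Walk": "no"},
}
condition_attrs = ["Age", "LEMS"]
decision_attr = "Walk"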
Concept of Indiscernibility
• The basic idea upon which rough sets rest is the discernibility
between objects.
• If two objects are indiscernible over a set of attributes, it means that
the objects have the same values for these attributes.
• Formally, for any subset of attributes B ⊆ A, the indiscernibility relation
is defined as:

  IND(B) = {(x, y) ∈ U × U : a(x) = a(y) for every a ∈ B}

• U/IND(B) denotes the set of equivalence classes of the relation IND(B).
• That is, U/IND(B) contains one class for each set of objects that are
mutually indiscernible over all attributes in B. Objects are therefore grouped
together such that objects in different groups can be discerned from one another.
An Example of Indiscernibility
Consider the Walk decision table above.
• The non-empty subsets of the condition attributes are {Age}, {LEMS}, and {Age, LEMS}.
• IND({Age}) = {{x1,x2,x6}, {x3,x4}, {x5,x7}}
• IND({LEMS}) = {{x1}, {x2}, {x3,x4}, {x5,x6,x7}}
• IND({Age, LEMS}) = {{x1}, {x2}, {x3,x4}, {x5,x7}, {x6}}
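These equivalence classes can be computed mechanically. Below is a minimal sketch, reusing walk_table from above; the helper name partition is an illustrative choice.

def partition(table, attrs):
    # Group objects by their tuple of values on attrs: each group is one
    # equivalence class of IND(attrs).
    classes = {}
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, set()).add(obj)
    return list(classes.values())

for eq_class in partition(walk_table, ["Age"]):
    print(sorted(eq_class))
# ['x1', 'x2', 'x6'], ['x3', 'x4'], ['x5', 'x7'] -- matches IND({Age}) above.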
Observations
• An equivalence relation induces a partitioning of the universe.
• The partitions can be used to build new subsets of the universe.
• Subsets that are most often of interest have the same value of the
decision attribute.
It may happen, however, that a concept such as "Walk" cannot be
defined in a crisp manner.
Discernibility Matrix
• A discernibility matrix is a two-dimensional matrix in which the equivalence
classes form the indices, and each entry is the set of attributes that can
be used to discern between the corresponding classes.
• Formally, for a set of attributes B ⊆ A in an information system (U, A),
the discernibility matrix M_D(B) is defined by

  m_D(i, j) = {a ∈ B : a(x_i) ≠ a(x_j)}   for i, j = 1, …, |U|

• Using the discernibility matrix, discernibility functions can be defined to
compute the minimal sets of attributes necessary to discern
equivalence classes from one another.
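As a sketch of the idea, the snippet below builds a discernibility structure for the Walk table; for simplicity it is indexed by pairs of objects rather than by equivalence classes, and the function name is illustrative.

def discernibility_matrix(table, attrs):
    # For each unordered pair of objects, record the attributes on which
    # they take different values (and hence can be discerned).
    objs = list(table)
    return {
        (xi, xj): {a for a in attrs if table[xi][a] != table[xj][a]}
        for i, xi in enumerate(objs)
        for xj in objs[i + 1:]
    }

md = discernibility_matrix(walk_table, ["Age", "LEMS"])
print(md[("x1", "x2")])   # {'LEMS'}: x1 and x2 share Age but differ on LEMS.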
Set Approximation

R-lower approximation
Let X ⊆ U and R ⊆ C, where R is a subset of the condition attributes. The
R-lower approximation of X is the set of all elements of U which can be
classified with certainty as elements of X:

  R_*(X) = ∪ {Y ∈ U/R : Y ⊆ X}

The R-lower approximation of X is a subset of X.

R-upper approximation
The R-upper approximation of X is the set of all elements of U which can
possibly be classified as belonging to X:

  R^*(X) = ∪ {Y ∈ U/R : Y ∩ X ≠ ∅}

X is a subset of its R-upper approximation.

The R-boundary of X is defined as:

  BN_R(X) = R^*(X) − R_*(X)

If BN_R(X) = ∅, then X is R-definable (the boundary set is empty).
If BN_R(X) ≠ ∅, then X is rough with respect to R.

The accuracy of the approximation is:

  α_R(X) = |R_*(X)| / |R^*(X)|
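Under the dict encoding used so far, the approximations follow directly from these definitions. A minimal sketch, reusing partition from above (function and variable names are illustrative):

def approximations(table, attrs, X):
    # Lower: union of equivalence classes fully contained in X.
    # Upper: union of equivalence classes that intersect X.
    lower, upper = set(), set()
    for eq_class in partition(table, attrs):
        if eq_class <= X:
            lower |= eq_class
        if eq_class & X:
            upper |= eq_class
    return lower, upper

walk_yes = {o for o, row in walk_table.items() if row["Walk"] == "yes"}
lower, upper = approximations(walk_table, ["Age", "LEMS"], walk_yes)
print(sorted(lower))            # ['x1', 'x6']
print(sorted(upper))            # ['x1', 'x3', 'x4', 'x6']
print(len(lower) / len(upper))  # 0.5 -- the accuracy of the approximation.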

One can define the following four basic classes of rough sets, i.e., four categories
of vagueness:
• X is roughly R-definable if R_*(X) ≠ ∅ and R^*(X) ≠ U;
• X is internally R-undefinable if R_*(X) = ∅ and R^*(X) ≠ U;
• X is externally R-undefinable if R_*(X) ≠ ∅ and R^*(X) = U;
• X is totally R-undefinable if R_*(X) = ∅ and R^*(X) = U.
Information System

The information about the real world is given in the form of an information table
(sometimes called a decision table). Thus, the information table represents input
data, gathered from any domain, such as medicine, finance, or the military.

Rows of the table, labeled e1, e2, e3, e4, e5, and e6, are called examples
(objects, entities):

     Headache   Muscle_pain   Temperature   Flu
e1   yes        yes           normal        no
e2   yes        yes           high          yes
e3   yes        yes           very-high     yes
e4   no         yes           normal        no
e5   no         no            high          no
e6   no         yes           very-high     yes

Condition attributes = {Headache, Muscle_pain, Temperature}
Decision attribute = {Flu}

• An information system is a pair (U, A).
• U is a non-empty finite set of objects.
• A is a non-empty finite set of attributes such that a : U → V_a for every a ∈ A.
• V_a is called the value set of a.
An Example of Indiscernibility
The non-empty subsets of the condition attributes are {Headache},
{Muscle_pain}, {Temperature}, {Headache, Muscle_pain},
{Headache, Temperature}, {Muscle_pain, Temperature}, and
{Headache, Muscle_pain, Temperature}.
• IND({Headache}) = {{e1,e2,e3}, {e4,e5,e6}}
• IND({Muscle_pain}) = {{e1,e2,e3,e4,e6}, {e5}}
• IND({Temperature}) = {{e1,e4}, {e2,e5}, {e3,e6}}
• IND({Headache, Muscle_pain}) = {{e1,e2,e3}, {e4,e6}, {e5}}
• IND({Muscle_pain, Temperature}) = {{e1,e4}, {e2}, {e3,e6}, {e5}}
• IND({Headache, Temperature}) = {{e1}, {e2}, {e3}, {e4}, {e5}, {e6}}
• IND({Headache, Temperature, Muscle_pain}) = {{e1}, {e2}, {e3}, {e4}, {e5}, {e6}}
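The same partition helper from earlier reproduces these classes. The encoding of the flu table below mirrors the table above; the name flu_table is illustrative.

flu_table = {
    "e1": {"Headache": "yes", "Muscle_pain": "yes", "Temperature": "normal",    "Flu": "no"},
    "e2": {"Headache": "yes", "Muscle_pain": "yes", "Temperature": "high",      "Flu": "yes"},
    "e3": {"Headache": "yes", "Muscle_pain": "yes", "Temperature": "very-high", "Flu": "yes"},
    "e4": {"Headache": "no",  "Muscle_pain": "yes", "Temperature": "normal",    "Flu": "no"},
    "e5": {"Headache": "no",  "Muscle_pain": "no",  "Temperature": "high",      "Flu": "no"},
    "e6": {"Headache": "no",  "Muscle_pain": "yes", "Temperature": "very-high", "Flu": "yes"},
}

for eq_class in partition(flu_table, ["Headache", "Muscle_pain"]):
    print(sorted(eq_class))
# ['e1', 'e2', 'e3'], ['e4', 'e6'], ['e5'] -- matches IND({Headache, Muscle_pain}).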
Example
To illustrate how the decision table above defines an indiscernibility relation,
we consider the following three non-empty subsets of the condition attributes:
{Age}, {LEMS}, and {Age, LEMS}.
If we take into consideration the set {LEMS}, then objects x3 and x4 belong
to the same equivalence class; they are indiscernible. For the same reason,
x5, x6, and x7 belong to another indiscernibility class. Each of the three
attribute subsets thus induces its own partition of the universe.
An Example of Set Approximation

LEMS: Lower Extremity Motor Score.

Consider the Walk decision table above, with R = {Age, LEMS} and the concept
W = {x ∈ U : Walk(x) = yes} = {x1, x4, x6}. The equivalence classes are
U/R = {{x1}, {x2}, {x3,x4}, {x5,x7}, {x6}}, so:

  R_*(W) = {x1, x6} (classes contained entirely in W)
  R^*(W) = {x1, x3, x4, x6} (classes that intersect W)
  BN_R(W) = {x3, x4}

Hence W is rough with respect to R, with accuracy α_R(W) = 2/4 = 0.5.
Rough Membership Function
• Rough sets can also be defined by using a rough membership function
instead of approximations.
• In classical set theory, an element either belongs to a set or it does not.
The corresponding membership function is the characteristic function of the
set, i.e. the function takes the values 1 and 0, respectively. In the case of
rough sets, the notion of membership is different.
• The rough membership function quantifies the degree of relative overlap
between the set X and the equivalence class R(x) to which x belongs. It is
defined as follows:

  μ_X^R(x) = |X ∩ R(x)| / |R(x)|,   μ_X^R(x) ∈ [0, 1]
The meaning of the rough membership function
• μ_X^R(x) = 1 iff x certainly belongs to X (x lies in the R-lower approximation of X);
• 0 < μ_X^R(x) < 1 iff x possibly, but not certainly, belongs to X (x lies in the boundary region);
• μ_X^R(x) = 0 iff x certainly does not belong to X (x lies outside the R-upper approximation of X).
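A sketch of the rough membership function under the same encoding (names illustrative); note how the three cases above correspond to the values 1, strictly between 0 and 1, and 0:

def rough_membership(table, attrs, X, x):
    # mu(x) = |X ∩ R(x)| / |R(x)|, where R(x) is the class containing x.
    for eq_class in partition(table, attrs):
        if x in eq_class:
            return len(eq_class & X) / len(eq_class)

print(rough_membership(walk_table, ["Age", "LEMS"], walk_yes, "x1"))  # 1.0 (certain member)
print(rough_membership(walk_table, ["Age", "LEMS"], walk_yes, "x3"))  # 0.5 (boundary)
print(rough_membership(walk_table, ["Age", "LEMS"], walk_yes, "x2"))  # 0.0 (certain non-member)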
Reduct & Core

• The set of attributes R  C is called a reduct of C, if T’ = (U, R, D) is


independent and
POSR ( D)  POSC ( D).

• The set of all the condition attributes indispensable in T is denoted by


CORE(C).
CORE(C )  RED(C )
where RED(C) is the set of all reducts of C.
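A brute-force sketch of reduct search via preservation of the positive region, i.e. the union of equivalence classes that are consistent on the decision. This is exponential in the number of attributes (practical systems use discernibility functions or heuristics), and the function names are illustrative. On the flu table it finds the two reducts and the core shown in the example below.

from itertools import combinations

def positive_region(table, attrs, decision):
    # Union of attrs-equivalence classes whose objects all share one decision.
    pos = set()
    for eq_class in partition(table, attrs):
        if len({table[o][decision] for o in eq_class}) == 1:
            pos |= eq_class
    return pos

def reducts(table, cond_attrs, decision):
    # Minimal attribute subsets with the same positive region as all of C.
    full_pos = positive_region(table, cond_attrs, decision)
    found = []
    for k in range(1, len(cond_attrs) + 1):
        for subset in combinations(cond_attrs, k):
            if any(set(r) <= set(subset) for r in found):
                continue  # a subset of it is already a reduct, so not minimal
            if positive_region(table, subset, decision) == full_pos:
                found.append(subset)
    return found

reds = reducts(flu_table, ["Headache", "Muscle_pain", "Temperature"], "Flu")
print(reds)  # [('Headache', 'Temperature'), ('Muscle_pain', 'Temperature')]
print(set.intersection(*(set(r) for r in reds)))  # {'Temperature'} = CORE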
An Example of Reducts & Core
Reduct1 = {Muscle-pain,Temp.}
U Headache Muscle Temp. Flu U Muscle Temp. Flu
pain
pain U1,U4 Yes Normal No
U1 Yes Yes Normal No U2 Yes High Yes
U2 Yes Yes High Yes U3,U6 Yes Very-high Yes
U5 No High No
U3 Yes Yes Very-high Yes
U4 No Yes Normal No Reduct2 = {Headache, Temp.}
U5 No No High No U Headache Temp. Flu
U6 No Yes Very-high Yes
U1 Yes Norlmal No
U2 Yes High Yes
U3 Yes Very-high Yes
CORE = {Headache,Temp} ∩ {MusclePain, Temp} = {Temp} U4 No Normal No
U5 No High No
U6 No Very-high Yes
The accuracy of approximation can be determined for any decision system in
exactly the same way as in the set-approximation example above.
Thank You
