Regular Expression2

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 12

CHAPTER 4

REGULAR
EXPRESSIONS

‫زازية أحمد سالم‬:‫إعداد الطالبة‬


 Regular Expressions and Languages
 A regular expression is a notation to represent
languages, i.e. a set of strings, where the set is either
finite or contains strings that are generated using simple
recursive rules. The languages represented by regular
expressions are called regular languages.
 Regular expressions:
 To simplify the notations of regular languages, and
drawing analogy to arithmetic expressions used in
algebra, we could replace the union symbol  with the
plus sign +, drop the braces “{“ and “}” for sets, and
use parentheses “(“ and “)” for grouping
Some examples of Regular expressions
are as follows
The set of strings over {a, b} that have at least two
occurrences of symbol a:
(a + b)* a(a + b)* a(a + b)*
 The set of strings over {a, b} that contain up to 2
symbols:
Λ+ a + b + aa + ab + ba + bb
 The set of strings over {a, b} that begin with a, and
have an even number of b:
a(a*ba*b)*a*.
 The set of strings over {a, b} that do not contain the
substring aa:
b*(abb*)*(Λ+ a)
Laws and rules for manipulating regular
expressions:
Let L, M, and N denote regular expressions.
(1) (Associative law)
L(MN) = (LM)N. (This is true for any sets L, M, and N of strings)

(2) (distributive laws)


L(M + N) = LM + LN and (M + N)L = ML + NL

. (3) (L*)* = L*. (Both sides contain all possible strings that are made up of
strings of L.)

(4) * = Λ* =Λ. (The Kleene star always contains the empty string Λ).

(5) (L*)L = L(L*). (Both sides equal L  L2  L3  …, which is denoted L+)

(6) (L*M*)* = (L + M)*.


DEFINTION
If S and T are of strings of letters (whether they are finite
or infinite sets), we define the product set of strings of
letters to be
ST= {all combinations of a string from S concatenated with
a string from T }
EXAMPLES
 If S = {a aa aaa} T ={bb bbb}
Then ST = {abb abbb aabb aabbb aaabb aaabbb}
 (a + bb + bab)(a + ab) = aa + aab + bba + bbab +

baba + babab
 (Λ + x + xx)(y*) = y* + xy* + xxy*
Rules define the language associated
with regular expression
 Rule 1
The language associated with Λ is just {Λ}, a one – word language.
 Rule 2
If r1 is a regular expression associated with the language L1 and r2 is
a regular expression associated with the language L2 then,
(i) The regular expression (r1) (r2) is associated with the language
L1 times L2
Language (r1 r2) = L1L2
(ii) The regular expression r1 + r2 is associated with the language
formed by the union of the sets L1 and L2
Language (r1 +r2) = L1 +L2
(iii) The language associated with the regular expression (r1)* is L1* ,the
kleene closure of the set L1 as a set of words
Language (r1*) = L1*
THEOREM
If L is a finite language (a language with only finitely
many words), then L can be defined by a regular
expression
PROOF
Let L = {w1, . . . wn}. Then each wj can be described by
a regular expression (write out the letters and show this
is a regular expression). Then take the union.
EXAMPLE
Let
L = {baa abbba bababa}
The regular expression we get from theorem is
baa + abbba + bababa
 EXAMPLE
We have L = language (bb* + abb* + aabb*) ((a
+ aa)bb*)*(b* + a + aa).
We can simplify this expression.
 The first factor can be written (Λ + a + aa) bb*

 The second factor can be written ((Λ + a +


aa)bb*)* Hence, we can write
 L = language ( ((Λ + a + aa)bb*)+ (b* + a +
aa))
PROBLEMS
 Construct a regular expression defining each of the
following languages over the alphabet Σ ={a ,b}
(i) All the words with exactly two a’s
(ii) All the words that have at least one a and at least one
b
Solution
(i) We use the expression b*ab*ab*
which describes such words as aab, baba, and
bbbabbbab. To make the word aab
(ii) we use the expression (a + b)*a(a + b)* b(a + b)*
then requiring that an a precede a b in the word. Such
words as ba and bbaaaa are not included in this set
 Let us reconsider the regular expression
(a + b)*a(a + b)*b(a + b)*
Show that this is equivalent to
(a + b)*ab(a + b)*
Solution
Language ((a +b)*a(a + b)*b(a + b)*)
= language ((a + b)*ab(a +b)*)
= all words with at least one a and at least one b
Then two regular expression are equivalent because they
describe the same language
 let r1,r2 and r3 be three regular expressions show that the
language associated with (r1+r2)r3 is the same as the
language associated with r1r3 + r2r3. show that r1(r2 + r3) is
equivalent to r1r2 + r1r3. This will be the same as” proving
a distributive law” for regular expressions
Solution
(Distributive law) (r1+r2) r3 = r1r3 + r2r3.
Proof
Let L = language ((r1 + r2) r3), and let M = language (r1r3
+ r2r3). Let w Ï L. Then w begins with either string of r1
or string of r2, and the rest of w consists of string of r3;
hence, w = r1r3 or w =r2r3 . Thus w Ï M. The converse is
similar.
 Show that the following pairs of regular expressions
define the same language over the alphabet Σ={a, b}
(i) (ab)*a and a(ba)*
(ii) (a* + b)* and (a + b)*
(iii) (a* + b*)* and (a + b)*
Solution
(i) Let S = language ((ab)*a) and let T = language
(a(ba)*). Suppose w Ï S. Then w = (ab)^na for some n
Ï N, so we can rewrite w as a(ba)^n. Hence w Ï T. The
reverse inclusion is proven in a similar manner.
(ii) Let S = language ((a* + b)*) and let T = language ((a
+ b)*). Let w Ï S. Then w = (a^n1 b)^k1 . . . Hence w Ï
T. The reverse inclusion is similar.
(iii) Let S = language ((a* + b*)*) and let T = language((a
+ b)*). Let w Ï S. The argument is the same as for (ii)

You might also like