Regular Expression2

CHAPTER 4
REGULAR
EXPRESSIONS
‫زازية أحمد سالم‬:‫إعداد الطالبة‬

 Regular Expressions and Languages
 A regular expression is a notation to represent
languages, i.e. a set of strings, where the set is either
finite or contains strings that are generated using simple
recursive rules. The languages represented by regular
expressions are called regular languages.
 Regular expressions:
 To simplify the notations of regular languages, and
drawing analogy to arithmetic expressions used in
algebra, we could replace the union symbol  with the
plus sign +, drop the braces “{“ and “}” for sets, and
use parentheses “(“ and “)” for grouping
Some examples of Regular expressions
are as follows
The set of strings over {a, b} that have at least two
occurrences of symbol a:
(a + b)* a(a + b)* a(a + b)*
 The set of strings over {a, b} that contain up to 2
symbols:
Λ+ a + b + aa + ab + ba + bb
 The set of strings over {a, b} that begin with a, and
have an even number of b:
a(a*ba*b)*a*.
 The set of strings over {a, b} that do not contain the
substring aa:
b*(abb*)*(Λ+ a)
Laws and rules for manipulating regular
expressions:
Let L, M, and N denote regular expressions.
(1) (Associative law)
L(MN) = (LM)N. (This is true for any sets L, M, and N of strings)
(2) (distributive laws)

L(M + N) = LM + LN and (M + N)L = ML + NL
. (3) (L*)* = L*. (Both sides contain all possible strings that are made up of
strings of L.)
(4) * = Λ* =Λ. (The Kleene star always contains the empty string Λ).
(5) (L*)L = L(L*). (Both sides equal L  L2  L3  …, which is denoted L+)
(6) (L*M*)* = (L + M)*.

DEFINTION
If S and T are of strings of letters (whether they are finite
or infinite sets), we define the product set of strings of
letters to be
ST= {all combinations of a string from S concatenated with
a string from T }
EXAMPLES
 If S = {a aa aaa} T ={bb bbb}
Then ST = {abb abbb aabb aabbb aaabb aaabbb}
 (a + bb + bab)(a + ab) = aa + aab + bba + bbab +
baba + babab
 (Λ + x + xx)(y*) = y* + xy* + xxy*
Rules define the language associated
with regular expression
 Rule 1
The language associated with Λ is just {Λ}, a one – word language.
 Rule 2
If r1 is a regular expression associated with the language L1 and r2 is
a regular expression associated with the language L2 then,
(i) The regular expression (r1) (r2) is associated with the language
L1 times L2
Language (r1 r2) = L1L2
(ii) The regular expression r1 + r2 is associated with the language
formed by the union of the sets L1 and L2
Language (r1 +r2) = L1 +L2
(iii) The language associated with the regular expression (r1)* is L1* ,the
kleene closure of the set L1 as a set of words
Language (r1*) = L1*
THEOREM
If L is a finite language (a language with only finitely
many words), then L can be defined by a regular
expression
PROOF
Let L = {w1, . . . wn}. Then each wj can be described by
a regular expression (write out the letters and show this
is a regular expression). Then take the union.
EXAMPLE
Let
L = {baa abbba bababa}
The regular expression we get from theorem is
baa + abbba + bababa
 EXAMPLE
We have L = language (bb* + abb* + aabb*) ((a
+ aa)bb*)*(b* + a + aa).
We can simplify this expression.
 The first factor can be written (Λ + a + aa) bb*
 The second factor can be written ((Λ + a +

aa)bb*)* Hence, we can write
 L = language ( ((Λ + a + aa)bb*)+ (b* + a +
aa))
PROBLEMS
 Construct a regular expression defining each of the
following languages over the alphabet Σ ={a ,b}
(i) All the words with exactly two a’s
(ii) All the words that have at least one a and at least one
b
Solution
(i) We use the expression b*ab*ab*
which describes such words as aab, baba, and
bbbabbbab. To make the word aab
(ii) we use the expression (a + b)*a(a + b)* b(a + b)*
then requiring that an a precede a b in the word. Such
words as ba and bbaaaa are not included in this set
 Let us reconsider the regular expression
(a + b)*a(a + b)*b(a + b)*
Show that this is equivalent to
(a + b)*ab(a + b)*
Solution
Language ((a +b)*a(a + b)*b(a + b)*)
= language ((a + b)*ab(a +b)*)
= all words with at least one a and at least one b
Then two regular expression are equivalent because they
describe the same language
 let r1,r2 and r3 be three regular expressions show that the
language associated with (r1+r2)r3 is the same as the
language associated with r1r3 + r2r3. show that r1(r2 + r3) is
equivalent to r1r2 + r1r3. This will be the same as” proving
a distributive law” for regular expressions
Solution
(Distributive law) (r1+r2) r3 = r1r3 + r2r3.
Proof
Let L = language ((r1 + r2) r3), and let M = language (r1r3
+ r2r3). Let w Ï L. Then w begins with either string of r1
or string of r2, and the rest of w consists of string of r3;
hence, w = r1r3 or w =r2r3 . Thus w Ï M. The converse is
similar.
 Show that the following pairs of regular expressions
define the same language over the alphabet Σ={a, b}
(i) (ab)*a and a(ba)*
(ii) (a* + b)* and (a + b)*
(iii) (a* + b*)* and (a + b)*
Solution
(i) Let S = language ((ab)*a) and let T = language
(a(ba)*). Suppose w Ï S. Then w = (ab)^na for some n
Ï N, so we can rewrite w as a(ba)^n. Hence w Ï T. The
reverse inclusion is proven in a similar manner.
(ii) Let S = language ((a* + b)*) and let T = language ((a
+ b)*). Let w Ï S. Then w = (a^n1 b)^k1 . . . Hence w Ï
T. The reverse inclusion is similar.
(iii) Let S = language ((a* + b*)*) and let T = language((a
+ b)*). Let w Ï S. The argument is the same as for (ii)

Regular Expression2

Uploaded by

Copyright:

Available Formats

Regular Expression2

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Regular Expression2

Uploaded by

Copyright:

Available Formats

CHAPTER 4

‫زازية أحمد سالم‬:‫إعداد الطالبة‬

(2) (distributive laws)

(5) (L)L = L(L). (Both sides equal L  L2  L3  …, which is denoted L+)

(6) (LM)* = (L + M)*.

 The second factor can be written ((Λ + a +

You might also like