Formal Language and Automata Theory Handout
Formal Language and Automata Theory Handout
Formal Language and Automata Theory Handout
Chapter One
Introduction of Automata
Also its an abstract model of a digital computer. That has a mechanism to read input, which is a
string over a given alphabet. This input is actually written on an “input file”, which can be read
by the automaton but cannot change it.
Input file is divided into cells, each of which can hold one symbol. The automaton has a
temporary “storage” device, which has unlimited number of cells, the contents of which can be
altered by the automaton. Automaton has a control unit, which is said to be in one of a finite
number of “internal states”. The automaton can change state in a defined way.
An automaton with a finite number of states is called a Finite Automaton (FA) or Finite State
Machine (FSM).
q0 is the initial state from where any input is processed (q0 ∈ Q).
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 1
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Related Terminologies
Alphabet
Definition − An alphabet is any finite set of symbols.
Example − ∑ = {a, b, c, d} is an alphabet set where ‘a’, ‘b’, ‘c’, and‘d’ are symbols.
String
Definition − A string is a finite sequence of symbols taken from ∑.
Length of a String
Definition − It is the number of symbols present in a string. (Denoted by |S|).
Examples −
If S = ‘cabcad’, |S|= 6
Kleene Star
Definition − The Kleene star, ∑*, is a unary operator on a set of symbols or strings, ∑,
that gives the infinite set of all possible strings of all possible lengths
over ∑ including λ.
Representation − ∑+ = ∑1 ∪ ∑2 ∪ ∑3 ∪…….
∑+ = ∑* − { λ }
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 2
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Language
Definition − A language is a subset of ∑* for some alphabet ∑. It can be finite or
infinite.
Example1 − If the language takes all possible strings of length 2 over ∑ = {a, b}, then
L = { ab, aa, ba, bb }
Example2 − If the language takes all possible strings of length 3 over ∑ = {a, b}, then
L = { aaa, aab, abb, aba, baa, bba, bbb,bab }
Then L={ ε, ab, aabb, aaaabbbb………} belongs to L, but sting abb is not in L. because
the number of a's and b's should be equal.
q0 is the initial state from where any input is processed (q0 ∈ Q).
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 3
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Q = {a, b, c},
∑ = {0, 1},
q0 = {a},
F = {c}, and
Present State Next State for Input 0 Next State for Input 1
A A B
B C A
C B C
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 4
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
(Here the power set of Q (2Q) has been taken because in case of NDFA, from a state,
transition can occur to any combination of Q states) exercise
q0 is the initial state from where any input is processed (q0 ∈ Q).
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 5
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Example
Q = {a, b, c}
∑ = {0, 1}
q0 = {a}
F = {c}
Present State Next State for Input 0 Next State for Input 1
A a, b b
B C a, c
C b, c c
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 6
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
DFA vs NDFA
The following table lists the differences between DFA and NDFA.
DFA NDFA
The transition from a state is to a single particular The transition from a state can be to multiple
next state for each input symbol. Hence it is next states for each input symbol. Hence it is
called deterministic. called non-deterministic.
Empty string transitions are not seen in DFA. NDFA permits empty string transitions.
A string is accepted by a DFA, if it transits to a final A string is accepted by a NDFA, if at least one
state. of all possible transitions ends in a final state.
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 7
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
δ*(q0, S) ∈ F
{S | S ∈ ∑* and δ*(q0, S) ∈ F}
δ*(q0, S′) ∉ F
{S | S ∈ ∑* and δ*(q0, S) ∉ F}
Example
Let us consider the DFA shown in Figure bellow From the DFA, the acceptable strings can be
derived.
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 8
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Strings accepted by the above DFA: {0, 00, 11, 010, 101, ...........}
Strings not accepted by the above DFA: {1, 011, 111, ........}
Algorithm
Input − An NDFA
Step 2 − Create a blank state table under possible input alphabets for the equivalent DFA.
Step 3 − Mark the start state of the DFA by q0 (Same as the NDFA).
Step 4 − Find out the combination of States {Q0, Q1,... , Qn} for each possible input alphabet.
Step 5 − Each time we generate a new DFA state under the input alphabet columns, we have to
apply step 4 again, otherwise go to step 6.
Step 6 − The states which contain any of the final states of the NDFA are the final states of the
equivalent DFA.
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 9
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Example
Let us consider the NDFA shown in the figure below.
Q δ(q,0) δ(q,1)
A {a,b,c,d,e} {d,e}
B {c} {e}
C ∅ {b}
D {e} ∅
E ∅ ∅
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 10
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Using the above algorithm, we find its equivalent DFA. The state table of the DFA is shown in
below.
Q δ(q,0) δ(q,1)
[d,e] [e] ∅
[e] ∅ ∅
[c, e] ∅ [b]
[c] ∅ [b]
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 11
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
DFA Minimization
DFA Minimized by using Myphill-Nerode theorem and Equivalence Theorem.
But we will see Myphill-Nerode Theorem in this handout.
Algorithm
Input − DFA
Step 1 − Draw a table for all pairs of states (Qi, Qj) not necessarily connected directly [All are
unmarked initially]
Step 2 − Consider every state pair (Qi, Qj) in the DFA where Qi ∈ F and Qj ∉ F or vice versa
and mark them. [Here F is the set of final states]
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 12
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
If there is an unmarked pair (Qi, Qj), mark it if the pair {δ (Qi, A), δ (Qi, A)} is marked for some
input alphabet.
Step 4 − Combine all the unmarked pair (Qi, Qj) and make them a single state in the reduced
DFA.
A B C D E F
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 13
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
A B C D E F
C ✔ ✔
D ✔ ✔
E ✔ ✔
F ✔ ✔ ✔
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 14
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Step 3 − We will try to mark the state pairs, with green colored check mark, transitively. If we
input 1 to state ‘a’ and ‘f’, it will go to state ‘c’ and ‘f’ respectively. (c, f) is already marked,
hence we will mark pair (a, f). Now, we input 1 to state ‘b’ and ‘f’; it will go to state ‘d’ and ‘f’
respectively. (d, f) is already marked, hence we will mark pair (b, f).
A B C D e f
C ✔ ✔
D ✔ ✔
E ✔ ✔
F ✔ ✔ ✔ ✔ ✔
After step 3, we have got state combinations {a, b} {c, d} {c, e} {d, e} that are unmarked.
So the final minimized DFA will contain three states {f}, {a, b} and {c, d, e}
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 15
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Chapter Two
Classification of Grammars
Introduction to Grammars
In the literary sense of the term, grammars denote syntactical rules for conversation in natural
languages. Linguistics have attempted to define grammars since the inception of natural
languages like English, Sanskrit, Mandarin, etc.
The theory of formal languages finds its applicability extensively in the fields of Computer
Science. Noam Chomsky gave a mathematical model of grammar in 1956 which is effective
for writing computer languages.
Grammar
A grammar G can be formally written as a 4-tuple (N, T, S, P) where −
P is Production rules for Terminals and Non-terminals. A production rule has the form α
→ β, where α and β are strings on VN ∪ ∑ and least one symbol of α belongs to VN.
Example
Grammar G1 −
Here,
Productions, P : S → AB, A → a, B → b
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 16
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Example
Grammar G2 −
Here,
ε is an empty string.
x α y ⇒G x β y
Example
Let us consider the grammar −
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 17
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
L(G)={W|W ∈ ∑*, S ⇒G W}
Example
If there is a grammar
Here S produces AB, and we can replace A by a, and B by b. Here, the only accepted string
is ab, i.e.,
L(G) = {ab}
Example
Suppose we have the following grammar −
= {am bn | m ≥ 1 and n ≥ 1}
Example
Problem − Suppose, L (G) = {am bn | m ≥ 0 and n > 0}. We have to find out the
grammar G which produces L(G).
Solution
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 18
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Here, the start symbol has to take at least one ‘b’ preceded by any number of ‘a’ including null.
To accept the string set {b, ab, bb, aab, abb, …….}, we have taken the productions −
S → aS , S → B, B → b and B → bB
S → B → b (Accepted)
S → B → bB → bb (Accepted)
S → aS → aB → ab (Accepted)
Thus, we can prove every single string in L(G) is accepted by the language generated by the
production set.
Example
Problem − Suppose, L (G) = {am bn | m > 0 and n ≥ 0}. We have to find out the grammar G
which produces L(G).
Solution −
Since L(G) = {am bn | m > 0 and n ≥ 0}, the set of strings accepted can be rewritten as −
Here, the start symbol has to take at least one ‘a’ followed by any number of ‘b’ including null.
To accept the string set {a, aa, ab, aaa, aab, abb, …….}, we have taken the productions −
S → aA, A → aA , A → B, B → bB ,B → λ
S → aA → aB → aλ → a (Accepted)
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 19
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Thus, we can prove every single string in L(G) is accepted by the language generated by the
production set.
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 20
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Take a look at the following illustration. It shows the scope of each type of grammar −
Type - 3 Grammar
Type-3 grammars generate regular languages. Type-3 grammars must have a single non-
terminal on the left-hand side and a right-hand side consisting of a single terminal or single
terminal followed by a single non-terminal.
and a ∈ T (Terminal)
The rule S → ε is allowed if S does not appear on the right side of any rule.
Example
X→ε
X → a | aY
Y→b
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 21
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Type - 2 Grammars
Type-2 grammars generate context-free languages.
Example
S→Xa
X→a
X → aX
X → abc
X→ε
Type - 1 Grammar
Type-1 grammars generate context-sensitive languages. The productions must be in the form
αAβ→αγβ
where A ∈ N (Non-terminal)
The rule S → ε is allowed if S does not appear on the right side of any rule. The languages
generated by these grammars are recognized by a linear bounded automaton.
Example
AB → AbBc
A → bcA
B→b
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 22
Formal Language and Automata Theory Handout for CS 3rd Year Program: Weekend 2011 E.C
Type - 0 Grammar
Type-0 grammars generate recursively enumerable languages. The productions have no
restrictions. They are any phase structure grammar including all formal grammars.
The productions can be in the form of α → β where α is a string of terminals and nonterminals
with at least one non-terminal and α cannot be null. β is a string of terminals and non-terminals.
Example
S → ACaB
Bc → acB
CB → DB
aD → Db
Prepared By: Mr. Tadese F. Department of Computer Science C.Code:[ CoSc3111] Page 23