Chapter 4 Automata
Chapter 4 Automata
Chapter 4 Automata
CFG stands for context-free grammar. It is a formal grammar which is used to generate all possible patterns of
strings in a given formal language. Context-free grammar G can be defined by four tuples as:
G = (V, T, P, S)
Where,
G is the grammar, which consists of a set of the production rule. It is used to generate the string of a language.
T is the final set of a terminal symbol. It is denoted by lower case letters.
P is a set of production rules, which is used for replacing non-terminals symbols(on the left side of the
production) in a string with other terminal or non-terminal symbols(on the right side of the production).
S is the start symbol which is used to derive the string. We can derive the string by repeatedly replacing a non-
terminal by the right-hand side of the production until all non-terminal have been replaced by terminal
symbols.
Example 1:
Construct the CFG for the language having any number of a's over the set ∑= {a}.
Solution:
As we know the regular expression for the above language is
1.r.e. = a* means zero or more occurrence of a
Production rule for the Regular expression is as follows:
S → aS rule 1
S → ε rule 2
Now if we want to derive a string "aaaaaa", we can start with start symbols.
S
aS
aaS rule 1
aaaS rule 1
aaaaS rule 1
aaaaaS rule 1
aaaaaaS rule 1
aaaaaaε rule 2
aaaaaa
The r.e. = a* can generate a set of string {ε, a, aa, aaa,.....}. We can have a null string because S is a start symbol and rule 2
gives S → ε.
Example 2:
Production rules:
E = E + E
E = E - E
E = a | b
Input
a - b + a
The leftmost derivation is:
E = E + E
E = E - E + E
E = a - E + E
E = a - b + E
E = a - b + a
2. Rightmost Derivation:
In rightmost derivation, the input is scanned and replaced with the production rule from right to left. So in rightmost
derivation, we read the input string from right to left.
Example
Production rules:
E = E + E
E = E - E
E = a | b
Input
a - b + a
The rightmost derivation is:
E = E - E
E = E - E + E
E = E - E + a
E = E - b + a
E = a - b + a
When we use the leftmost derivation or rightmost derivation, we may get the same string. This type of derivation
does not affect on getting of a string.
Examples of Derivation:
Example 1:
Derive the string "abb" for leftmost derivation and rightmost derivation using a
CFG given by,
S → AB | ε Rightmost derivation:
A → aB
B → Sb
Solution:
Leftmost derivation:
Example 2:
Derive the string "aabbabba" for leftmost derivation and rightmost derivation using a CFG given by,
S → aB | bA
S → a | aS | bAA
S → b | aS | aBB
Solution:
Leftmost derivation:
S
aB S → aB
aaBB B → aBB
aabB B → b
aabbS B → bS
aabbaB S → aB
aabbabS B → bS
aabbabbA S → bA
aabbabba A → a
Example 2:
Derive the string "aabbabba" for leftmost derivation and rightmost derivation using a CFG given by,
S → aB | bA
S → a | aS | bAA
S → b | aS | aBB
Solution:
Leftmost derivation:
S
aB S → aB
aaBB B → aBB
aabB B → b
aabbS B → bS
aabbaB S → aB
aabbabS B → bS
aabbabbA S → bA
aabbabba A → a
Example 3:
Derive the string "00101" for leftmost derivation and rightmost derivation using a CFG given by,
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Solution:
Leftmost derivation:
S
A1B
0A1B
00A1B
001B
0010B
00101B
00101
Derivation Tree
Step 4:
Production rules: Step 2: Step 5:
E = E + E
E = E * E
E = a | b | c
Input Step 3:
a * b + c
Step 1:
Note: We can draw a derivation tree step by step or directly in one step
Example 2:
Draw a derivation tree for the string "bab" from the CFG given by
S → bSb | a | b
Solution:
Now, the derivation tree for the string "bbabb" is as follows:
The above tree is a derivation tree drawn for deriving a string
bbabb. By simply reading the leaf nodes, we can obtain the
desired string. The same tree can also be denoted by,
Ambiguity in Grammar
Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is
ambiguous.
Example 2:
A symbol can be useless if it does not appear on the right-hand side of the
production rule and does not take part in the derivation of any string.
That symbol is known as a useless symbol. Similarly, a variable can be useless if it
does not take part in the derivation of any string. That variable is known as a
useless variable. For Example:
T → aaB | abA | aaT
A → aA
B → ab | b
C → ad
In the above example, the variable 'C' will never occur in the derivation of any
string, so the production C → ad is useless. So we will eliminate it, and the other
productions are written in such a way that variable C can never reach from the
starting variable 'T'.
Cont….
Production A → aA is also useless because there is no way to terminate it. If it
never terminates, then it can never produce a string. Hence this production can
never take part in any derivation.
To remove this useless production A → aA, we will first find all the variables
which will never lead to a terminal string such as variable 'A'. Then we will
remove all the productions in which the variable 'B' occurs.
Elimination of ε Production
The productions of type S → ε are called ε productions. These type of productions can only be
removed from those grammars that do not generate ε.
Step 1: First find out all nullable non-terminal variable which derives ε.
Step 2: For each production A → a, construct all production A → x, where x is obtained from a
by removing one or more non-terminal from step 1.
Step 3: Now combine the result of step 2 with the original production and remove ε productions.
Example:
Remove the production from the following CFG by preserving the meaning of it.
S → XYX
X → 0X | ε
Y → 1Y | ε
Solution:
Now, while removing ε production, we are deleting the rule X → ε and Y → ε. To preserve the
meaning of CFG we are actually placing ε at the right-hand side whenever X and Y have
appeared.
Co t…
Let us take
S → XYX
If the first X at left-hand side is ε. Then
S → YX
Similarly if the last X in R.H.S. = ε.
Then
S → XY
If Y = ε then
S → XX
If Y and X are ε then,
S → X
If both X are replaced by ε
S → Y
Now,
Removing Unit Productions
S → 0A | 1B | C
A → 0S | 00
B → 1 | A
C → 01
Solution:
S → C is a unit production. But while removing S → C we have to consider what C gives. So, we can
add a rule to S.
S → 0A | 1B | 01
Similarly, B → A is also a unit production so we can modify it as
B → 1 | 0S | 00
Thus finally we can write CFG without unit production as
S → 0A | 1B | 01
A → 0S | 00
B → 1 | 0S | 00
C → 01
Chomsky's Normal Form (CNF)
Step 1: Eliminate start symbol from the RHS. If the start symbol T is at the right-
hand side of any production, create a new production as:S1 → S
Where S1 is the new start symbol.
Step 2: In the grammar, remove the null, unit and useless productions.
Step 3: Eliminate terminals from the RHS of the production if they exist with other
non-terminals or terminals. For example, production S → aA can be decomposed
as:
S → RA
R → a
Step 4: Eliminate RHS with more than two non-terminals. For example, S → ASB
can be decomposed as:
S → RS
R → AS
Example:
Convert the given CFG to CNF. Consider the given grammar G1:
S → a | aA | B
A → aBB | ε
B → Aa | b
Solution:
Step 1: We will create a new production S1 → S, as the start symbol S appears on the
RHS. The grammar will be:
S1 → S
S → a | aA | B
A → aBB | ε
B → Aa | b
Step 2: As grammar G1 contains A → ε null production, its removal from the grammar
yields:
S1 → S
S → a | aA | B
A → aBB
B → Aa | b | a
Cont….
Now, as grammar G1 contains Unit production S → B, its removal yield:
S1 → S
S → a | aA | Aa | b
A → aBB
B → Aa | b | a
Also remove the unit production S1 → S,
its removal from the grammar yields:
S0 → a | aA | Aa | b
S → a | aA | Aa | b
A → aBB
B → Aa | b | a
Step 3: In the production rule S0 → aA | Aa,
S → aA | Aa, A → aBB and B → Aa,
terminal a exists on RHS with non-terminals.
So we will replace terminal a with X:
Greibach Normal Form (GNF)
If the given grammar is not in CNF, convert it into CNF. You can refer the following
topic to convert the CFG into CNF: Chomsky normal form
If the context free grammar contains left recursion, eliminate it. You can refer the
following topic to eliminate left recursion: Left Recursion
Step 3: In the grammar, convert the given production rule into GNF form.
If any production rule in the grammar is not in GNF form, convert it.
Example:
S → XB | AA
A → a | SA
B → b
X → a
Solution:
As the given grammar G is already in CNF and there is no left recursion, so we can skip step 1 and step 2 and directly
go to step 3.
The production rule A → SA is not in GNF, so we substitute S → XB | AA in the production rule A → SA as:
S → XB | AA
A → a | XBA | AAA
B → b
X → a
The production rule S → XB and B → XBA is not in GNF, so we substitute X → a in the production rule S → XB and
B → XBA as:
S → aB | AA
A → a | aBA | AAA
B → b
X → a
Cont…
Now we will remove left recursion (A → AAA), we get:
S → aB | AA
A → aC | aBAC
C → AAC | ε
B → b
X → a
Now we will remove null production C → ε, we get:
S → aB | AA
A → aC | aBAC | a | aBA
C → AAC | AA
B → b
X → a
The production rule S → AA is not in GNF, so we substitute A → aC | aBAC | a | aBA in
production rule S → AA as:
Cont…
S → aB | aCA | aBACA | aA | aBAA
A → aC | aBAC | a | aBA
C → AAC
C → aCA | aBACA | aA | aBAA
B → b
X → a
The production rule C → AAC is not in GNF, so we substitute A → aC | aBAC | a | aBA in production
rule C → AAC as:
S → aB | aCA | aBACA | aA | aBAA
A → aC | aBAC | a | aBA
C → aCAC | aBACAC | aAC | aBAAC
C → aCA | aBACA | aA | aBAA
B → b
X → a
Hence, this is the GNF form for the grammar G.
D ! !
EN