19 CFG

Download as pdf or txt
Download as pdf or txt
You are on page 1of 7

Context Free Grammars

KCS- 402 Theory Of Automata and Formal Languages

19. Context Free Grammars

Learning Outcomes: At the conclusion of the chapter, the student will be able to:
• Identify whether a particular grammar is context-free
• Construct context-free grammars for simple languages
• Produce leftmost and rightmost derivations of a string generated by a context-free
grammar
• Construct derivation trees for strings generated by a context-free grammar

A Regular Grammar (RG) is either left linear or right linear. Left linear grammar is one in which
ALL productions are of the following form:
A→x
A → Bx, A, B  V and x  T*
Right linear grammar is one in which ALL productions are of the following form:
A→x
A → xB, A, B  V and x  T*
The productions of Regular Grammars are restricted in two ways
1. Left side must be a single variable
2. Right side must have a special form.
By retaining the first restriction and permitting anything on the right side, we get Context Free
Grammars (CFG). Many useful languages are not regular. Context-free grammars are very useful
for the definition and processing of programming languages. A context-free grammar has no
restrictions on the right side of its productions, while the left side must be a single variable
19.1 Definition of CFG
A grammar CFG G is defined by a quadruple
G = (V, T, S, P) where

• V is a set of non-terminal symbols.


• T is a set of terminals where V ∩ T = Φ i.e. V an T are non-empty and disjoint sets.
• P is a set of production rules. All productions have the form A → β where A  V and β
 (V  T)*
• S is the start symbol.

Note that every Regular Grammar is Context Free. However, vice versa is not true.

1 Seema Shukla
Context Free Grammars
KCS- 402 Theory Of Automata and Formal Languages

19.2 Context Free Language


A language L is said to be Context Free iff there is a CFG G such that L = L(G). The major
application of CFGs is in deriving syntax of programming languages.
Example 19.1: Give CFG for L = {anbn | n ≥ 1}
S → aSb | ab
Example 19.2: Give CFG for L = {anbn+1 | n ≥ 0}
S → Ab,
A → aAb | 
Example 19.3: If G has productions S → SS, then L(G) = Φ
Example 19.4: Give CFG for L = {anbam | n,m ≥ 1} (note that this is a regular language)
S → aAB
A → aA | b
B → aB | a
The grammar is not regular. However, one can write a regular grammar for this language as
follows which is a left linear grammar
S → aS | A
A → bB
B → aB | a
The two grammars in this example are equivalent. Grammars G1 and G2 are said to be equivalent
if L(G1) = L(G2)
19.3 Leftmost and Rightmost Derivations
A leftmost derivation is one in which leftmost variable is replaced at each step. A right
derivation is one in which rightmost variable is replaced at each step. For example, consider the
grammar
S → aAB
A → aA | b
B → aB | a
Leftmost derivation of string aabaaa:
S  aAB  aaAB  aabB  aabaB  aabaaB  aabaaa
Rightmost derivation of string aabaaa:
S  aAB  aAaB  aAaaB  aAaaa  aaAaaa  aabaaa

2 Seema Shukla
Context Free Grammars
KCS- 402 Theory Of Automata and Formal Languages

Example 19.5: Grammar for arithmetic Expressions


G = (V, T, S, P)
where V = {E, I}, T = {a, b, 0, 1, +, *, (, )}, S = {E} and P are as follows
E → I | E + E | E * E | (E)
I → a | b | Ia | Ib | I0 | I1
A string (a+b) * a1 can be derived as follows
E  E*E  (E)*E  (E+E)*E  (I+E)*E  (a+E)*E  (a+I)*E  (a+b)*E
 (a+b)*I  (a+b)*I1  (a+b)*a1 (Left most derivation)
E  E*E  E*I  E*I1  E*a1  (E)*a1  (E+E)*a1 (E+I)*a1  (E+b)*a1  (I+b)*a1
 (a+b)*a1 (Right most derivation)
19.4 Parse Trees or Derivation Trees
The parse tree of a grammar G is a tree with following conditions:
i. Each interior node (non-leaf) is labeled by a variable V
ii. Each leaf is labeled either by a variable or a terminal or . However, if the leaf is labeled
, it must be the only child of its parent
iii. If an interior node is labeled A and it’s children are labeled X1, X2, …. , Xk from left, then
A → X1 X2 …. Xk must be a production in the grammar
Yield of a parse tree is a string that we get by concatenating the leaves of any parse tree from
left to right.
If the parse tree is complete then the yield is w  L(G). If the parse tree is partial, then the yield
is a sentential form of G.
Example 19. 6: The parse tree of string (a+b) * a1 of grammar of example 18.6 is as follows

3 Seema Shukla
Context Free Grammars
KCS- 402 Theory Of Automata and Formal Languages

E * E

( E ) I

E + E
1
I

I I

a b

19.5. Solved Examples


Example 19.7
Which language generates the grammar G given by the productions
S → aSa | aBa
B → bB | b
Solution
L(G) = { an bm an | n > 0, m > 0}.
The rule S → aSa recursively builds an equal number of a’s on each end of the string. The
recursion is terminated by the application of the rule S → aBa, ensuring at least one leading and
one trailing a. The recursive B rule then generates any number of b’s. To remove the variable B
from the string and obtain a sentence of the language, the rule B → b must be applied, forcing
the presence of at least one b.
Example 19.8
Find a CFG that generates the language
L(G) = { an bm cm d2n | n ≥ 0, m > 0}.
Solution
S → aSdd | A
A → bAc | bc

4 Seema Shukla
Context Free Grammars
KCS- 402 Theory Of Automata and Formal Languages

The relationship between the number of leading a’s and trailing d’s in the language indicates that
the recursive rule is needed to generate them. The same is true for b’s and c’s. Derivations in the
grammar generate strings in an outside-to-inside manner. The S rules produce the a’s and d’s
while the A rules generate b’s and c’s. The rule A → bc, whose application terminates the
recursion, ensures the presence of the substring bc in every string in the language.
Example 19.9
Find a CFG that generates the language
L(G) = { an bm | 0 ≤ n ≤ m ≤ 2n}.
Solution
S → aSb | aSbb | ε
The first recursive rule of G generates a trailing b for every a, while the second generates two
b’s for each a. Thus there is at least one b for every a and at most two, as specified in the
language.
Example 19.10
Consider the grammar
S → abScB | λ
B → bB | b
What language does it generate?
Solution
The recursive S rule generates an equal number of ab’s and cB’s. The B rules generate b+. In a
derivation, each occurrence of B may produce a different number of b’s. For example in the
derivation
S ⇒ abScB ⇒ ababScBcB ⇒ ababcBcB ⇒ ababcbcB ⇒ ababcbcbB ⇒ ababcbcbb, the first
occurrence of B generates a single b and the second occurrence produces bb. The language of the
grammar consists of the set L(G) = { (ab)n (cbm)n | n ≥ 0, m > 0}.
Example 19.11
Construct context free grammars to accept the following languages. Consider Σ = {0, 1}
a) {w | w starts and ends with the same symbol}
Solution
S → 0A0 | 1A1
A → 0A | 1A | ε
b) {w | |w| is odd}
Solution
S → 0A | 1
A → 0S | 1S | ε

5 Seema Shukla
Context Free Grammars
KCS- 402 Theory Of Automata and Formal Languages

c) {w | |w| is odd and its middle symbol is 0}


Solution
S → 0 | 0S0 | 0S1 | 1S0 | 1S1
d) Binary strings with twice as many 1s as 0s.
Solution
S →ε| 0S1S1S | 1S0S1S | 1S1S0S
Example 19.12
Construct CFG for L = {w#x | wR is a substring of x, where w, x ∈ {a, b}*}
Solution
The following grammar generates language L, where S is the start variable.
S → AT
A → aAa | bAb | #T
T → aT | bT | ε
The strings in the language have the form w#uwRv, where u and v are strings of the form (a + b)*
(any string made from symbols a and b). The variable T generates the strings u and v, while
variable A generates the string w#uwR and the variable S generates the desired string w#uwRv.
Example 19.13
Construct CFG for L = {0n1n | n>0} U {0n12n | n>0}
Solution
S → 0A1 | 0B11
A → 0A1 | ε
B → 0B11 | ε
Example 19.14
Construct CFG for L ={0i1j2k | i≠j or j≠k}
Solution
S → AC | BC | DE | DF
A → 0 | 0A | 0A1
B → 1 | B1 | 0B1
C → 2 | 2C
D → 0 | 0D
E → 1 | 1E | 1E2
F → 2 | F2 | 1F2

6 Seema Shukla
Context Free Grammars
KCS- 402 Theory Of Automata and Formal Languages

Example 19.15
Construct CFGs for the following languages
1) L = {a,b}+
S → aS | bS | a | b

2) L = {anbman | n ≥ 1, m ≥ 0}
S → aSa | aAa

A → bA | ε

3) L = { (ab)n | n ≥ 0}
S → abS | ε

4) L = { w | na(w) = nb(w) }
S → SS | aSa | bSb | ε

5) L = {anbmcn+m | n,m ≥ 0}
S → aSc | A

A → bAc | ε

6) L = { {anbm | n ≠ m }

S → AS1 | S1B

S1 → aS1b | ε

A → aA | a

B → bB | b

7 Seema Shukla

You might also like