TIC 2151 - Theory of Computation: Context-Free Grammars (CFG)

Download as pdf or txt
Download as pdf or txt
You are on page 1of 23

TIC 2151 – Theory of

Computation

Lecture 6
Context-Free Grammars (CFG)

TIC 2151 – Theory of Computation


Lecture 6 – Outline

 Introduction
 Formal description of a CFG
 Ambiguous CFGs
 Normal forms

2
Context-free grammars (CFG)

context-free grammars CFG, a more powerful method of describing


languages. Such grammars can describe certain features that have a
recursive structure, which makes them useful in a variety of applications.
 The collection of languages associated with context-free grammars are called
the context-free languages.
 A language is context free if and only if some pushdown automaton recognizes
it.
 Languages that can be accepted by some PDA or generated by some CFG are
context-free.
 Many useful languages are context-free, including most programming
languages, query languages, and markup languages.

We can establish a relationship between the regular languages and the context-
free languages. Because every regular language is recognized by a finite
automaton and every finite automaton is automatically a pushdown automaton
that simply ignores its stack, we now know that every regular language is also a
context-free language.
3
Context-free grammars (CFG)

4
Context-free grammars (CFG)

The following is an example of a context-free grammar, which we call G.


Grammar G:
A => 0A1
A => B
B => #

derivation of a word

You may also represent the same information pictorially with a parse tree.

See next slide


5
Context-free grammars (CFG)

parse trees for the word 00#11

Formal description of a CFG

Formally written we have the grammar G:


G = (V, S, R, S)
with V = {A,B}
S = {0,1,#}
R = {[A, 0A1], [A, B], [B, #]}
S=A 6
Some
Context-free grammars (CFG) exercises
Give the formal definition for the following grammar an describe the
produced language: S => aSb | SS | e

G = ({S}, {a, b},R, S). The set of rules, R, is S => aSb | SS | e

This grammar G generates strings such as abab, aaabbb, and aababb.

Give a parse trees for aaabbb and abab

7
Some
Context-free grammars (CFG) exercises
Give the formal definition for the following grammar and give parse trees for
a + a x a and (a + a) x a
Formal Definition

8
Some
Context-free grammars (CFG) exercises
Give the formal definition for the following grammar and give parse trees for
a + a x a and (a + a) x a
Formal Definition

9
Context-free grammars (CFG) Ambiguity

Ambiguity
o Sometimes a grammar can generate the same string in several different
ways. Such a string will have several different parse trees and thus
several different meanings.

o This result may be undesirable for certain applications, such as


programming languages, where a program should have a unique
interpretation.

o A grammar is called ambiguous if there is at least one word having two


different parse trees.

10
Context-free grammars (CFG) Ambiguity

Ambiguity
Consider the following rules of a grammar and give two different parse trees for
a+axa

This grammar generates the string a + a x a ambiguously.

 Sometimes when we have an ambiguous grammar we can find an


unambiguous grammar that generates the same language.
 A grammar is called inherently ambiguous if there is no equivalent
unambiguous grammar to generates it. {aibjck | i=j or j=k} is inherently
ambiguous! 11
Chomsky Normal Form (CNF)

When working with context-free grammars CFG, it is often convenient to


have them in simplified form. One of the simplest and most useful forms
is called the Chomsky normal form.

Remark:
any context-free grammar may be transformed to a CNF grammar expressing the
same language
12
Chomsky Normal Form (CNF) CFG to CNF

Converting a CFG to CNF Try to follow every step! Don’t go to the next step, if you
still have problems with the one before!
Original CF grammar
S => ASA | aB Step 1. Add a new start variable S0 and the rule S0 S,
A => B | S where S was the original start variable. This change
B => b | e guarantees that the start variable doesn’t occur on the
right-hand side of a rule
1. Insert a new start symbol
S0 => S Step 2.we take care of all e-rules. We remove an e-rule
S => ASA | aB Ce, where C is not the start variable. Then for each
A => B | S occurrence of an C on the right-hand side of a rule, we
B => b | e add a new rule with that occurrence deleted. if R uCv
is a rule in which u and v are strings of variables and
2. Remove the rule B => e terminals, we add rule R uv. The rule R uCvCw
S0 => S causes us to add R uvCw, R uCvw, and R uvw.
S => ASA | aB | a If we have the rule R C, we add R e unless we had
A => B | S | e previously removed the rule R  e .
B => b We repeat these steps until we eliminate all e -rules not
involving the start variable.

13
Chomsky Normal Form (CNF) CFG to CNF

Try to follow every step! Don’t go to the next step, if you


3. Remove the rule A => e still have problems with the one before!
S0 => S
S => ASA | aB | a | SA | AS | S
A => B | S
B => b

4. Remove the rule S => S Step 3 .Eliminate all unit rules of the form
S0 => S CD. We repeat these steps until we
S => ASA | aB | a | SA | AS eliminate all unit rules.
A => B | S
B => b

5. Remove the rule S0 => S


S0 => ASA | aB | a | SA | AS
S => ASA | aB | a | SA | AS
A => B | S
B => b

14
Chomsky Normal Form (CNF) CFG to CNF

Try to follow every step! Don’t go to the next step, if you


6. Remove the rule A => B
still have problems with the one before!
S0 => ASA | aB | a | SA | AS
S => ASA | aB | a | SA | AS
A => S | b
B => b

7. Remove the rule A => S


S0 => ASA | aB | a | SA | AS
S => ASA | aB | a | SA | AS
A => b | ASA | aB | a | SA | AS
B => b
Step 4. Take care of rules with more
8. Insert new symbols for the last incorrect rules than two terminals or variables.
S0 => AA1 | UB | a | SA | AS convert all remaining rules into the
S => AA1 | UB | a | SA | AS proper form.
A => b | AA1 | UB | a | SA | AS
B => b
A1 => SA
U => a
15
Chomsky Normal Form (CNF) CFG to CNF

Covert the following CFG into Chomsky Normal Form (CNF)

SOLUTION
1. elimination of e, we obtain: 2. elimination of the unit production we obtain

3. convert all remaining rules into the


proper form. replace a by A, b by B
and c by C, we obtain

16
Our next problem: CFG

If we have a word w, how can we determine if it can be


generated by a context-free grammars CFG ?

17
The Cocke-Younger-Kasami
Algorithm (CYK)
CYK – The Concept
S

C D

w1w2. . . . . . wk wk+1. . . wn

 For each word, there must be a rule of the form A  BC, where
B produces the left part of the word and C the right part.
 We replace the occurrence of BC with A.
 This process is repeated, and should leave us with the start
variable S.
18
CYK – In Action Example (1)

Given the following grammar a a a b b b c c


S => AB C C C D D D E,B E,B
A => ab | aAb A B
B => c | cB
and in Chomsky normal form F
S => AB A
A => CD | CF
B => c | EB F
C => a A
D => b
E => c S If we find S here,
F => AD we know
S
aaabbbcc
we build a table for aaabbbcc: is part of the language.

19
CYK – In Action Example (1)

a a a b b b c c
C C C D D D E,B E,B This (darkgray) part
A B is produced by B
F

A
This (gray) part
F is produced by this A
A

S As there is a rule S => AB,


S we can fill S here.

20
CYK – In Action
a a a b b b c c
A F
B E

C D

If there is one of the following


rules, so write the left side X in the box:
X => AD
X => BE
X => CF
If there are more than one, write all.

21
CYK – In Action Example (2)

For the grammar: S => SS | BA | BC


A => SC
B => 0
C => 1
we prove the membership of 00100111 and 01011100:
0 0 1 0 0 1 1 1 0 1 0 1 1 1 0 0
B B C B B C C C B C B C C C B B
S S S S

S S
A

S
YES NO
22
Learning Outcomes

After completion of this lecture, you should be able to:


 Draw parse trees for words generated from CFG.
 Given the definition of a CFL, design a grammar (i.e. give the
grammar rules).
 Convert a CFG into Chomsky Normal Form.
 Apply the CYK algorithm to determine if a word is generated
form a CFG.

23

You might also like