Context-Free Grammar (CFG) : Dr. Nadeem Akhtar
Context-Free Grammar (CFG) : Dr. Nadeem Akhtar
Context-Free Grammar (CFG) : Dr. Nadeem Akhtar
1
Context Free Grammars (CFG)
• Context-Free Grammar is a more powerful method of
describing languages.
CFGs have a recursive structure, which makes them useful in a
variety of applications.
2
Context Free Grammars (CFG)
• An important application of context-free grammars
occurs in the specification and compilation of
programming languages.
A grammar for a programming language is a reference
for people trying to learn the language syntax.
3
Context Free Grammars (CFG)
• Parser of a compiler or interpreter extracts the
meaning of a program prior to generating the
compiled code or performing the interpreted
execution.
4
Context Free Grammars (CFG)
The collection of languages associated with context-free
grammars are called the context-free languages.
5
Context Free Grammar (CFG)
A CFG has four components
1) A set of terminal symbols, sometimes referred to as “tokens”.
Terminals are the elementary symbols of the language defined by the
grammar.
2) A set of non-terminals sometimes called “syntactic variables”. Each
non-terminal represents a set of strings of terminals
In stmt → if (expr) stmt else stmt
stmt and expr are non-terminals.
3) A set of productions
Each production consists of a non-terminal, called head or left side of
the production, an arrow, and a sequence of terminals and/or non-
terminals, called the body or right side of the production.
4) A designation of one of the non-terminals as the start symbol (head)
6
Regular Languages
• Closed under Union, Concatenation and
Closure (∗)
• Recognizable by finite-state automata
• Denoted by Regular Expressions
• Generated by Regular Grammars
7
Context Free Grammars
• More general productions than regular
grammars
S→w
where w is any string of terminals and non-
terminals
• What languages do these grammars generate?
S →(A) S → ε | aSb
A → ε | aA | ASA
8
Context-free languages more
general than regular languages
• {anbn | n ≥ 0} is not regular
‣ but it is context-free
9
Example derivation in a Grammar
• Grammar: start symbol is A
A → aAa
A→B
B → bB
B→ε
• Sample Derivation:
A ⇒ aAa ⇒ aaAaa ⇒ aaaAaaa ⇒ aaaBaaa ⇒ aaabBaaa ⇒
aaabbBaaa ⇒ aaabbaaa
• Language?
10
Derivations in Tree Form
11
Example CFG
• An example of a context-free grammar, which we call G1 .
A → 0A1
A→B
B→#
Collection of substitution rules, called productions. Each rule appears as a line in the
grammar, comprising a symbol and a string separated by an arrow. The symbol is called a
variable.
12
Example CFG
• An example of a context-free grammar, which we call
G1 .
A → 0A1
A→B
B→#
14
Example-1 CFG
Context-free grammar, G1 .
A → 0A1
A→B
B→#
16
Example CFG
• All strings generated in this way constitute the
language of the grammar.
• We write L(G1) for the language of grammar G1
.
Grammar G1 shows that L(G1) is {0n#1n | n > 0}.
Any language that can be generated by some
context-free grammar is called a Context-Free
Language (CFL).
17
FORMAL DEFINITION OF A CONTEXT-FREE GRAMMAR
19
Example-2 CFG
• Consider grammar G2 = ({S}, { a, b }, R, S).
The set of rules, R, is S → aSb I SS I ε
You can see more easily what this language is if you think of a as a
left parenthesis" (" and b as a right parenthesis") ". Viewed in this
way, L(G2) is the language of all strings of properly nested
parentheses.
20
Example-3 CFG
Consider grammar G3 = (V, Σ , R, <EXPR>).
21
<EXPR) → <EXPR> + <TERM> | <TERM>
<TERM> → <TERM> x <FACTOR> | <FACTOR>
<FACTOR> → (<EXPR>) | a
The two strings a+axa and (a+a)xa can be generated with grammar G3 .
The parse trees are shown in the following figure.
22
Example – Designing CFGs
• Grammar for the language
{0n1n | n ≥ 0} U {1n0n | n ≥ 0}
First construct the grammar
S1 → 0 S1 1 | ε
for the language {0n1n | n ≥ 0} and the grammar
S2 → 1 S2 0 | ε
for the language {1n0n | n ≥ 0}
And then add the rule S → S1 | S2
Grammar:
S → S1 | S2
S1 → 0 S 1 1 | ε
S2 → 1 S2 0 | ε
23
Ambiguous Grammar
Sometimes a grammar can generate the same
string in several different ways. Such a string will
have several different parse trees and thus several
different meanings.
24
Ambiguous Grammar
If a grammar generates the same string in
several different ways, then that string is derived
ambiguously in that grammar.
25
For example, consider the following grammar:
<EXPR> → <EXPR>+<EXPR> | <EXPR> x <EXPR> | ( <EXPR>) | a
This grammar generates the string a+axa ambiguously.
The following figure shows the two different parse trees.
26
Ambiguous Grammar
A string w is derived ambiguously in context-
free grammar G if it has two or more different
leftmost derivations.
27
Ambiguous Grammar
• A grammar can have more than one parse tree
generating a given string of terminals. Such a
grammar is said to be ambiguous
28
Ambiguity
E E E | E E | ( E ) | id
String id * id + id has the following two parse trees
E E ' E | E '
E ' id * E ' | id | ( E ) * E ' | ( E )
Enforces precedence of * over +
30
Example
E E ' E | E '
id + id * id
E ' id * E ' | id | ( E ) * E ' | ( E )
E
id *id + id
E E’ + E
E’ E id E’
+
id E’ id E’
* id *
id id
31
Example
Another Ambiguous Grammar
33
Context Free Grammar (CFG)
Example
list → list + digit
list → list - digit
list → digit
digit → 0|1|2|3|4|5|6|7|8|9
35
Context Free Grammar (CFG)
• Example
• Operators on the same line have the same
associativity and precedence
left-associative: + -
left-associative: * /
Two non-terminals expr and term for the two
levels of precedence, non-terminal factor for
generating basic units in expressions
36
Context Free Grammar (CFG)
factor → digit | (expr)
Binary operators * and / have the highest precedence
term → term * factor
| term / factor
| factor
Similarly, expr
expr → expr + term
| expr - term
| term
The resulting grammar is therefore
expr → expr + term | expr – term | term
term → term * factor | term / factor | factor
factor → digit | (expr)
37
Context Free Grammar (CFG)
Example:
A grammar for a subset of Java statement
stmt → id = expression;
| if(expression) stmt
| if(expression) stmt else stmt
| while(expression) stmt
| do stmt while (expression);
| {stmts}
38
Context Free Grammar (CFG)
Example:
Grammar for statement blocks and conditional
statements:
41
CFG vs. Regex
Regex: (a|b)*abb Grammar:
42
CFG vs. Regex
• Language L = {anbn | n>=1} can be described by a grammar but not by a
regex
• Suppose L was defined by some regex
– We could construct a DFA with a finite number of states, say k, to accept L
Path aj-i State si: For an input beginning
with more than k a’s
Path ai
s0 --- si aibi is in the language: A path
bi from si to state f
Path ajbi is also possible
--- Path bi This DFA accepts both aibi
and ajbi
43
Context-free grammars are widely
used for programming languages
• From the definition of Algol-60:
procedure_identifier::= identifier.
actual_parameter::= string_literal | expression | array_identifier | switch_identifier | procedure_identifier.
letter_string::= letter | letter_string letter.
parameter_delimiter::= "," | ")" letter_string ":" "(".
actual_parameter_list::= actual_parameter | actual_parameter_list parameter_delimiter actual_parameter.
actual_parameter_part::= empty | "(" actual_parameter_list} ")".
function_designator::= procedure_identifier actual_parameter_part.
44
Example
adding_operator::= "+" | "−" .
multiplying_operator::= "×" | "/" | "÷" .
primary::= unsigned_number | variable | function_designator | "(" arithmetic_expression ")".
factor::= primary | factor | factor power primary.
term::= factor | term multiplying_operator factor.
simple_arithmetic_expression::= term | adding_operator term |
simple_arithmetic_expression adding_operator term.
if_clause::= if Boolean_expression then.
arithmetic_expression::= simple_arithmetic_expression |
if_clause simple_arithmetic_expression else arithmetic_expression.
if a < 0 then U+V else if a * b < 17 then U/V else if k <> y then
V/U else 0
45
NFA To CFG Conversion
• We can mechanically construct the CFG from an
NFA
• Converting the NFA for (a|b)*abb into CFG
– For each state i of the NFA, create a non-terminal Ai
– If i has a transition to j on input a, add Ai aA j
– If i has a transition to j on input ε, add Ai A j
– If i is an accepting state, add Ai
– If i is the start state, make Ai the start symbol of the
grammar.
46
BNF: Meta-Syntax for CFGs
• <postal-address> ::= <name-part> <street-address>
<zip-part>
• <name-part> ::= <personal-part> <last-name>
<opt-jr-part> <EOL>
| <personal-part> <name-part>
• <personal-part> ::= <first-name> | <initial> "."
• <street-address> ::= <house-num> <street-name>
<opt-apt-num> <EOL>
• <zip-part> ::= <town-name> "," <state-code>
<ZIP-code> <EOL>
• <opt-jr-part> ::= "Sr." | "Jr." | <roman-numeral> | ""
47
Left-Most Derivation Parse Tree
48
Right-Most Derivation Parse Tree
49
Ambiguity
• There are no general techniques for handling
ambiguity
• It is impossible to automatically convert an
ambiguous grammar into an unambiguous one
• If used sensibly, ambiguity can simplify the
grammar
• Disambiguation Rules: Instead of re-writing the
grammar, we can
– Use the ambiguous grammar
– Along with disambiguation rules. 50
Associativity of Operators
• The operator + associates to the left
An operator with + signs on both sides of it
belongs to the operator to its left.
51
Right Associative Operator
• The operator = associates to the right
right → letter = right | letter
letter → a | b |…| z
52
Precedence of Operators
• Associativity rules for + and * apply to
occurrences of the same operator
• Rule
* has the highest precedence than + if * takes its
operands before + does
• Multiplication and division have higher
precedence than addition and subtraction.
• 9 + 5 * 2 and 9 * 5 + 2 equivalent to 9 + (5 * 2)
and (9 * 2) + 2
53
Questions
• CFG representing the Regular Expression a+
A → aA | a
54
Questions
• CFG representing the Regular Expression a*b+
(i.e. start with any number of a’s followed by non-zero numbers of
b)
S → R | aS
R → b | bR
56