"Context-Free Grammar" From John Martin (3 Edition)

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 40

CONTEXT-FREE GRAMMAR

From John Martin (3rd edition)

THE IDEA OF CONTEXT FREE GRAMMAR:

Using recursive definition both for regular & non-regular

languages ,a slight reformation of recursive definition lead


us to the idea of Context-free grammar.

USING GRAMMAR RULES TO DESCRIBE A


LANGUAGE
Let L={a,b}* where ={a,b}. By the recursive definition of

L we can define;
L
For any S L, Sa L
For any S L, Sb L.

No other strings are in L


Here we think S as a variable representing arbitrary element of L whose
value is to be obtained by the combination of rules 1-3:

RULE 1

L means S i.e. variable S is replaced by empty


RULE 2:

Sa L means S Sa i.e. variable S is replaced by Sa


RULE 3:

Sb L means S Sb i.e. variable S is replaced by Sb


The symbol for each of the rules by which a variable is
replaced by a string. e.g , to mean that can be
obtained by applying one of these rules to a single variable
in the string

S => Sa => Sb => Sbba => bba => bba

these steps indicates that rules 1,2,3 are applied to obtain the
string bba.
CONTEXT-FREE GRAMMAR:

CFG is collection of these things


1. An alphabet of letters called Terminals from which we
are going to make strings that will be the words of the
language.
2. A set of symbols called Non-terminals ,one of which is
the symbol S stands for start here.
3. A finite set of productions of the form:
One non-terminal finite strings of terminals/nonterminals. / both

Where the strings of terminals and non-terminals can

consist of only terminals or only non-terminals or any


mixture of terminals and non-terminals or even the
empty string
We require that at least one production rule has the non-

terminal S as its left side.


PRODUCTION RULE:
The grammatical rules of CFL are called production rules.

DERIVITION:
The sequence of applications of the rules that produce the
finishing string of terminals from the starting symbol is
called a derivation.
DEFINITION OF CFG (cohen 251)
The language generated by a CFG is the set of strings of
terminals that can be produced from the start symbol S
using the productions as substitutions

The topic of CFL is perhaps the most important aspect of


formal language theory as it applies to programming
languages actual programming languages have many
features that can be described elegantly by means of CFLs.
DEFINITION CONTEXT-FREE GRAMMAR (Linz pg#126)
A grammar G=(V,T,S,P) is known as CONTEXT-FREE if all
productions in P are of the form:
A x ; A V and x (V u T)
Where;
V: Non-terminals.
T: Terminals say as well
S V
P: productions

A language L is said to be context-free if and only if there is

context-free grammar G such that L=L(G).


NOTE: every regular grammar is context-free , so a regular

language is also a Context Free Language (CFL) but vice


versa is not true.
The CFLs encompasses both R.Ls and non R.Ls
Therefore the family of regular languages is a proper subset

of the family of CFLs

WHY WE CALL CONTEXT-FREE (pg#127 Linz)

CFGs derive their name from the fact that the substitution of
the variables on the left of production can be made any
time such a variable appears in a sentential form, it does
not depend on the symbol in the rest of the sentential
form (the context). This feature is consequence allowing
only a single variable on the left side of the production
FOR EXAMPLE
=> G
Means that the string can be obtained from the string by
replacing some variable that appears on the left of a
production (in G) by the corresponding right side ,or that

=> 1 A 2
=> 1 2
A =>

NOTE:
The grammar in this chapter is called CFG because the
production rule can be applied to terminals/non-terminals
whenever and wherever it occurs the context places no
limitations on the applicability of production rules.

Examples of CFG (pg# 127 Linz)

Example 5.1:
The grammar G = ({S},{a,b},S,P) with productions
S aSa
S bSb
S
Is context-free a typical derivation in this grammar is
S=>aSa=>aaSa=>aabSbaa=>aab baa=>aabbaa.
This makes it clear that :
L(G)={WWR: W {a,b}*} (palindrome problem)
THIS LANGUAGE IS CONTEXT-FREE BUT NOT REGULAR

Example:

The grammar G with productions:


S abB
A aaBb
B bbAa
A
Is context free, we can show that L(G)={ab(bbaa)n bba (ba)n:
n0}

EXAMPLE:
the language L= {an b m : n m} is context-free.
To show this we need to produce a CFG for the language L.
Suppose n > m . We 1st generate CFG for an b m ,then will
add extra as on the left. The production rules will be :
For (n>m)
S AR
R aRb|
A aA|a

For (n<m)
S AR|RB
R aRb|
A aA|a
B bB|b
The resulting grammar is CF, therefore L is CFL. Although
there are many other equivalent CFGs
Example :
Consider the grammar with productions
S aSb|SS|

THE LANGUAGE GENERATED BY A CFG:


let G = (V,,S,P) be a CFG the language generated by G is

L(G)= {x * | S =>* G x}.


A language L is CFL if there is a CFG G so that L=L(G).
The derivability of x from S is denoted S*=>x . The set of

strings derived from V being constructed by finite but


unbounded numbers of rule applications , can be defined
recursively.

THE LANGUAGE OF ALGEBRIC EXPRESSION:

The most straight forward way of generating a CFG is


probably to use the productions.
S S+S|S-S|S*S|S/S|(S)|a
For e.g. the string a + ((a*a)/a)-a can be obtained from the
derivation:
S=> S-S => S+S-S => a+S-S => a+ S/S S => a + (S * S)/S- S
=> a+(a*S)/S S => a+(a*a)/ S S => a+(a*a)/a S=> a+(a*a)/a-a
There may be other solutions for this problem as well
1. Evaluate a*a call it value A
2. Evaluate A/a call it value B
3. Evaluate a+B call it value C
4. Evaluate C-a call it value D

THE SYNTAX OF PROGRAMMING LANGUAGES:


Here we define a rule based approach for generating the
strings of a language. We call a syntactically correct string a
sentence of the language. A small subset of the English
language is used to illustrate the components of the stringgeneration process the alphabets of our miniature()
language is the set.
{ a, the, John, Jill, hamburger, car, drives, eats, slowly,
frequently, big, juicy, brown }
THESE ARE CALLED THE TERMINAL SYMBOLS OF THE
LANGUAGE

Capitalization, punctuation and other important features

of written languages are ignored in this example.


The sentence generation procedure should construct the

strings
John eats a hamburger

Jill drives frequently


Additional symbols are used during the construction of

sentences to enforce syntactic restrictions of the language.


These intermediate symbols are known as variables or
non-terminals, by enclosing them in angle brackets< >.

Since the generation procedure constructs sentences ,the

initial variable is named <sentence>.


The generation of a sentence consists of replacing

variables by strings of specific form.


Syntactically correct replacements are given by a set of

transformation rules
Two possible rules for the variables <sentence> are
1. <sentence> <noun phrase> <verb phrase>
2. <sentence> <noun phrase> <verb><direct-object

phrase>

The existence of multiple transformations indicates that

syntactically correct sentences may have several different


forms.
Rules for the variable that generate noun and verb phrase

are given below:


3. <noun phrase> <proper-noun>
<determiner><common-noun>
4.<proper-noun>John
5. <proper-noun>Jill
6.<common-noun>car
7. <common-noun>hamburger
8.<determiner>a
9.<determiner>the

10. <verb-phrase> <verb> <adverb>


11. <verb>
12. <verb> drives
13.
eats
14. <adverb> slowly
15.
frequently
The application/execution of a rule transforms one string
to another , this transformation consists of replacing an
occurrence of the variable on the left hand side of the
with the string on the right hand side.
The sentence generation process consists of repeated
application of the rule to transform the variable
<sentence>into a string of terminal symbols.

Read examples on page # 66,67 (sud kamp).


DIFFERENCE B/W SYNATX & SEMANTICS

The sentence Car eats slowly is a sentence in the language


since it has the form <noun-phrase><verb phrase>outlined
by Rule1.
DEFINITION:

Let G= (V,,P,S) be a CFG and v (V u ), the set of strings


derivable from v is defined recursively as follow:
1. Basis : v is derivable from V .
2. Recursive step: if u=xAy is derivable from V and Aw
P(set of terminals),then xwy is derivable from v.

3.Closure :- a string is derivable from V only if it can be


generated from V by a finite number of application
of the recursive step.
Note:
The definition of rule uses notation.
The definition of its applicants uses => notation.
The symbol *=> denotes derivability .
The symbol +=>designates derivability utilize.
One or more rule applications employed.

Definition: (Sud Kamp pg# 70)


1. Let G= (V,,P,S) be a CFG
2. A string w (V u ) is a sentential form of G if there is a

derivative S *=>w in G.
3. A string w * is a sentence of G if there is a derivation
S*=>w in G.
4. The language of G , denoted L(G) , is the set of
{w *|S*=>w}.
Details on above definition:
A language has been defined as set of strings over an
alphabet
A grammar consists of an alphabet & method of
generating strings.

These strings may contain both variables and terminals.


The start symbol of grammar , assuming the role of

<sentence> in natural language , example , initiates the


process of generating acceptable strings.
The language of grammar G is the set of terminal strings

derivable from the start symbol.


Sentential form:

A sentential form is a string that is derivable from the start


symbol of grammar e.g. let <sentence> be the start symbol

<sentence> => <noun-phrase> <verb-phrase>


<proper-noun> <verb-phrase>
fill<verb-phrase>
(sentential form of the given grammar)
Its not yet a sentence , it still contains variables , but it has
the form of a sentence

Recursion in context-free grammar:


A variable A is called recursive if there is a
derivation A+ uAv.
A definition of the form A => w+=>uAv , where A
is not in w is called Indirect recursion.
To generate strings of arbitrary length & languages
with infinite many strings.
It is introduced into grammar through the rules .
A rule of the form
A u Av

Is called recursive since it defines variable A in it self.


AAv (left-recursive)
AvA (right-recursive)

Example : Consider the recursive rules


AaAb
AaA
(Right recursion).
AAb
(Left recursion).
AAA
(both left and right recursion).

The derivation tree (DT) structure indicates the rules

applied to each variable but does not designate the order of


the rule application (Sudkamp-PP71).

A=> aAb

A=>aA

A=>Ab

A=>AA

=>aAb

=>aA

=>Ab

A=>AA

=>aaAbb

=>aA

=>Ab

=>AA

=>aaaAbbb

=>aaaA

=>Abbb

=>AAAA

..

EXAMPLE:
S=> AA

S=>AA

S=>AA

S=>AA

aA

AAAA

Aa

aA

aAAA

aAAA

AAAa

aAAA

abAAA

abAAA

AAbAa

aAAa

abaAA

abaAA

Aabaa

ABaaA

ababAA

ababAA

AbAbaa

abAbAa

ababaA

ababaA

Ababaa

ababAa

ababaa

ababaa

ababaa

ababaa

Right

Right

LEft

Both

Left-most & right-most derivations:


A & B in the table (previous slide) are called the left-most

derivation , the non-terminals/variables are replaced by


terminals or (terminals + non-terminals)
C is the right most derivation.
These derivations demonstrate that there may be more

than one derivations of a string CFG.


The essential feature of derivation is not the order in which

the rules are applied but the manner in which each variable
is transformed to terminal string.

Derivation tree (D.T):


Let G= (V,,P,S) be a CFG and let S*=> w be a derivate in G.

The derivation tree , DT of S*=>w is an ordered tree that


can be built iteratively as follow:
I. Initialize DT with root S.
II. If Ax1,x2,xn (with xi (V U )) is the rule in the
derivation applied to the string uAv , then add
x1,x2,x3,xn as the children of A in tree.
III. If A is the rule in the derivation applied to the string
uAv , then add as the only child of A in the tree.

The tree structure indicates the rule applied to each

variable but does not designate the order of the rule


applications , DT can be ordered to yield the result of a
derivation represented by a tree.
A derivation tree can be used to produce several derivations

that generate the same string.

Example:

S=>AA
=>AAAa
=>AAbAa S=>AA
=>AAAa
=>AAbAa
=>Aabaa
=>AbAaa
=>Ababaa
=>ababaa
=>Aabaa
=>AbAaa
=>Ababaa
=>ababaa

TREE
Ordering of leaves

S=> AA

S
a,A

a,A,A,A

A
A

a,b,A,A,A
A

A
b

A
A

S
A

A
b

A
a,b,a,A,A

A
a
S

a,b,a,b,a,A

A
b

A
A
a

A
A

a,b,a,b,a,a

S
A

A
b

A
A
a

A
a
A
a

S
A

ababaa

ababaa
A

a
a

A
b

A
A
a

A
a
A

A
a

A
a

ababaa

A
a

A
a

A
a

ababaa

A
a
A

You might also like