FL&T Unit 3 - 1 - 1724732026415
FL&T Unit 3 - 1 - 1724732026415
FL&T Unit 3 - 1 - 1724732026415
What is a grammar?
A grammar consists of one or more variables that represent classes of strings (i.e., languages)
There are rules that say how the strings in each class are constructed. The construction can use:
1. symbols of the alphabet
2. Strings that are already known to be in one of the classes
3. or both
Production rules
Each production rule consists of:
1. A variable that is being (partially) defined by the production. This variable is often called the
head of the production.
2. The production symbol →.
3. A string of zero or more terminals and variables. This string, called the body of the
production, represents one way to form strings in the language of the variable of the head.
In doing so, we leave terminals unchanged and substitute for each variable of the body any string
that is known to be in the language of that variable.
What is Derivation?
A sequence of rewritings that transforms the start variable S of a grammar G to a
sentence s is called a derivation of s from G.( symbol ⇒ is used to show derivation)
Example:
Given grammar G is:
S → 0S1
S→ε
The string 0011 is in the language generated.
The derivation is:
S ⇒ 0S1
⇒ 00S11
⇒ 0011
A derivation tree or parse tree is an ordered rooted tree that graphically represents the semantic
information a string derived from a context-free grammar.
Representation Technique
If S → x1x2 …… xn is a production rule in a CFG, then the parse tree / derivation tree will be as
follows −
Top-down Approach −
Bottom-up Approach −
Example
X → X*X
→ X*a
→ X+X*a
→ X+a*a
→ a+a*a
Context-free languages (CFLs) are generated by context-free grammars. The set of all context-
free languages is identical to the set of languages accepted by pushdown automata, and the set
of regular languages is a subset of context-free languages. An inputed language is accepted by a
computational model if it runs through the model and ends in an accepting final state. All regular
languages are context-free languages, but not all context-free languages are regular. Most
arithmetic expressions are generated by context-free grammars, and are therefore, context-free
languages. Context-free languages and context-free grammars have applications in computer
science and linguistics such as natural language processing and computer language design.
According to Chomsky hierarchy, grammars are divided of 4 types:
1. Type - 3 Grammar
Type-3 grammars generate regular languages. Type-3 grammars must have a single non-
terminal on the left-hand side and a right-hand side consisting of a single terminal or single
terminal followed by a single non-terminal.
and a ∈ T (Terminal)
The rule S → ε is allowed if S does not appear on the right side of any rule.
Example
X→ε
X → a | aY
Y→b
2. Type - 2 Grammar
Example
S→Xa
X→a
X → aX
X → abc
X→ε
3. Type - 1 Grammar
Type-1 grammars generate context-sensitive languages. The productions must be in the
form
αAβ→αγβ
where A ∈ N (Non-terminal)
The rule S → ε is allowed if S does not appear on the right side of any rule. The languages
generated by these grammars are recognized by a linear bounded automaton.
Example
AB → AbBc
A → bcA
B→b
4. Type - 0 Grammar
Example
S → ACaB
Bc → acB
CB → DB
aD → Db
For any language L, we break its strings into five parts and pump second and fourth substring.
Pumping Lemma, here also, is used as a tool to prove that a language is not CFL.
Because, if any one string does not satisfy its conditions, then the language is not CFL.
Thus, if L is a CFL, there exists an integer n, such that for all x ∈ L with |x| ≥ n, there exists u,
v, w, x, y ∈ Σ∗, such that x = uvwxy, and
(1) |vwx| ≤ n
(2) |vx| ≥ 1
(3) for all i ≥ 0: uviwxiy ∈ L
For above example, 0n1n is CFL, as any string can be the result of pumping at two places, one for
0 and other for 1.
Let us prove, L012 = {0n1n2n | n ≥ 0} is not Context-free.
Let us assume that L is Context-free, then by Pumping Lemma, the above given rules follow.
Now, let x ∈ L and |x| ≥ n. So, by Pumping Lemma, there exists u, v, w, x, y such that (1) – (3)
hold.
We show that for all u, v, w, x, y (1) – (3) do not hold.
If (1) and (2) hold then x = 0n1n2n = uvwxy with |vwx| ≤ n and |vx| ≥ 1.
(1) Tells us that vwx does not contain both 0 and 2. Thus, either vwx has no 0’s, or vwx has no
2’s. Thus, we have two cases to consider.
Suppose vwx has no 0’s.
By (2), vx contains a 1 or a 2. Thus uwy has ‘n’ 0’s and uwy either has less than ‘n’ 1’s or has
less than ‘n’ 2’s.
But (3) tells us that uwy = uv0wx0y ∈ L.
So, uwy has an equal number of 0’s, 1’s and 2’s gives us a contradiction. The case where vwx
has no 2’s is similar and also gives us a contradiction. Thus L is not context-free.
Closure properties
Union
Concatenation
1. Union
Let L1 and L2 be two context free languages. Then L1 ∪ L2 is also context free.
Example
If L1 and L2 are context free languages, then L1L2 is also context free.
Example
3. Kleene closure
Example
Complement − If L1 is a context free language, then L1’ may not be context free.
Context-Sensitive Grammar:
A Context-sensitive grammar is an unrestricted grammar in which all the productions are of form
Where α and β are strings of non-terminals and terminals.
Context-sensitive grammars are more powerful than context-free grammars because there are
some languages that can be described by CSG but not by context-free grammars and CSL are
less powerful than unrestricted grammar. That’s why context-sensitive grammars are positioned
between context-free and unrestricted grammars in the Chomsky hierarchy.
The language that can be defined by context-sensitive grammar is called CSL. Properties of CSL
are:
Example:
1.
And P:
S → 1SBC | 123
1B → 12
2B → 22
2C → 23
3C → 33
CB → HB,
HB → HC,
HC → BC
2.
Solution:
S → aAbc
→ abAc
→ abBbcc
→ aBbbcc
→ aaAbbcc
→ aabAbcc
→ aabbAcc
→ aabbBbccc
→ aabBbbccc
→ aaBbbbccc
→ aaabbbccc
The language generated by this grammar is {anbncn | n≥1}.
Here,
Memory information ≤ c × Input information
The computation is restricted to the constant bounded area. The input alphabet contains two
special symbols which serve as left end markers and right end markers which mean the
transitions neither move to the left of the left end marker nor to the right of the right end marker
of the tape.
A linear bounded automaton can be defined as an 8-tuple (Q, X, ∑, q0, ML, MR, δ, F) where −
δ is a transition function which maps each pair (state, tape symbol) to (state, tape symbol,
Constant ‘c’) where c can be 0 or +1 or -1
A deterministic linear bounded automaton is always context-sensitive and the linear bounded
automaton with empty language is undecidable.