Units

MODULE-I
WhatisTOC?
In theoretical computer science, the theory of computation is the branch that deals with
whether and how efficiently problems can be solved on a model of computation, using an
algorithm.Thefieldisdividedintothreemajorbranches:automatatheory,computabilitytheory and
computational complexity theory.
In order to perform a rigorous study of computation, computer scientists work with a
mathematicalabstractionofcomputerscalledamodelofcomputation.Thereareseveralmodels in use,
but the most commonly examined is the Turing machine.
Automatatheory
In theoretical computer science, automata theory is the study of abstract machines (or more
appropriately,abstract'mathematical'machinesorsystems)andthecomputationalproblemsthat can
be solved using these machines. These abstract machines are called automata.
Thisautomatonconsistsof
 states(representedinthefigurebycircles),
 andtransitions(representedbyarrows).
As the automaton sees a symbol of input, it makes a transition (or jump) to another state,
accordingtoitstransitionfunction(whichtakesthecurrentstateandtherecentsymbolasits inputs).
UsesofAutomata:compilerdesignandparsing.
Introductiontoformalproof:
Basic Symbols used :
U–Union
∩-Conjunction
ϵ - Empty String
Φ – NULL set
7-negation
‘–compliment
=> implies
Additive inverse: a+(-a)=0
Multiplicativeinverse:a*1/a=1
Universal set U={1,2,3,4,5}
Subset A={1,3}
A’={2,4,5}
Absorptionlaw:AU(A∩B)=A,A∩(AUB)=A
DeMorgan’sLaw:
(AUB)’=A’∩B’
(A∩B)’=A’UB’
Doublecompliment
(A’)’ =A
A∩A’=Φ
Logicrelations:
a🡪b=>7aUb
7(a∩b)=7aU7b
Relations:
LetaandbbetwosetsarelationRcontainsaXb. Relations
used in TOC:
Reflexive:a=a
Symmetric:aRb=>bRa
Transition:aRb,bRc=> aRc
Ifagivenrelationisreflexive,symmentricandtransitivethentherelationiscalledequivalence relation.
Deductiveproof:Consistsofsequenceofstatementswhosetruthleadusfromsomeinitial statement
called the hypothesis or the give statement to a conclusion statement.
Additionalformsofproof:
Proofofsets
Proof by contradiction
Proofbycounterexample
Directproof(AKA)Constructiveproof:
Ifpistruethenqistrue
Eg:ifaandbareoddnumbersthenproductisalsoanoddnumber.
Oddnumbercanberepresentedas2n+1
a=2x+1, b=2y+1
productofaXb=(2x+1)X(2y+1)
=2(2xy+x+y)+1=2z+1(oddnumber)
Proofbycontrapositive:
ProofbyContradiction:
HandnotCimpliesfalsehood.
Beregardedasanobservationthanatheorem.
Foranysetsa,b,cifa∩b =Φ andcisasubsetofbtheprovethata∩c=Φ Given :

a∩b=Φ and c subset b
Assume:a∩c Φ
Then
=>a∩b Φ =>a∩c=Φ(i.e.,theassumptioniswrong)
ProofbymathematicalInduction:
Languages:
The languages we consider for our discussion is an abstraction of natural languages. That is,our
focus here is on formal languages that need precise and formal definitions. Programming
languages belong to this category.
Symbols:
Symbolsareindivisibleobjectsorentitythatcannotbedefined.Thatis,symbolsaretheatoms of the
world of languages. A symbol is any single object such as , a, 0, 1, #,
begin, or do.
Alphabets:
Analphabetisafinite,nonemptysetofsymbols.Thealphabetofalanguageisnormallydenoted by .
When more than one alphabets are considered for discussion, then
subscripts may be used (e.g. etc)orsometimesothersymbollikeGmayalsobe
introduced.
Example:
StringsorWordsoverAlphabet:
Astringorwordoveranalphabet isafinitesequenceofconcatenatedsymbols of .
Example:0110,11,001arethreestringsoverthebinaryalphabet{0,1}. aab, abcb, b,
cc arefour strings over the alphabet { a, b, c }.
It is not the case that a string over some alphabet should contain all the symbols from the alpha-
bet. For example,the string cc over the alphabet{ a, b, c } doesnot contain thesymbols a and b.
Hence, it is true that a string over an alphabet is also a string over any superset of that alphabet.
Lengthofastring:
Thenumberofsymbolsinastringwiscalleditslength,denotedby|w|.
Example:|011|=4,|11|=2,|b|=1
Convention :We will use small case letters towards the beginning of the English alphabet
to denote symbols of an alphabet and small case letters towards the end to
denotestringsoveranalphabet.Thatis,
(symbols) and
are strings.
SomeStringOperations:
Let and betwostrings.Theconcatenationofxandy
denotedbyxy,isthestring . That is, the concatenation of x and y denoted
by xy is the string that has a copy of x followed by a copy of y without anyintervening space
between them.
Example:Consider the string 011 over the binary alphabet. All the prefixes, suffixes
andsubstrings of this string are listed below.
Prefixes:ε,0,01,011.
Suffixes:ε,1,11,011.
Substrings:ε,0,1,01,11, 011.
Notethatxisaprefix(suffixorsubstring)tox,foranystringxandεisaprefix(suffixor substring) to any

string.
Astringxisaproperprefix(suffix)of stringyif xisaprefix(suffix)of yandx≠y. In the
above example, all prefixes except 011 are proper prefixes.
Powers of Strings : For any string x and integer , we use todenotethestring

formed by sequentially concatenating n copies of x. We can also give an inductive
definitionof asfollows:
=e,ifn=0;otherwise
Example:Ifx=011,then =011011011, = 011 and
PowersofAlphabets:
Wewrite (forsomeintegerk) todenotethesetofstringsof lengthkwithsymbols from . In
other words,
= { w | w is a string over and| w | = k}. Hence, for any alphabet, denotestheset
of all strings of length zero. That is, = { e }. For the binary alphabet { 0, 1 } we have
the following.
Thesetofallstringsoveranalphabet isdenotedby .Thatis,
The set contains all the strings that can be generated by iteratively concatenating sym-
bols from any number of times.
Example :If ={a,b},then ={ε,a,b,aa,ab,ba,bb,aaa,aab,aba,abb,baa,…}.
Please note that if , then that is .Itmaylookoddthatonecanproceed

from the empty set to a non-empty set by iterated concatenation. But there is a reason for this
and we accept this convention
Thesetofallnonemptystringsoveranalphabet isdenotedby . Thatis,
Notethat isinfinite.Itcontainsnoinfinitestringsbutstringsofarbitrarylengths.
Reversal :
Foranystring thereversalof thestringis .
Aninductivedefinitionofreversalcanbegivenasfollows:
Languages:
Alanguageoveranalphabetisasetofstringsover thatalphabet.Therefore,a language L
is any subset of . That is,any is a language.
Example:
1. Fistheemptylanguage.
2. isalanguageforany .
3. {e} is a language for any .Note that, .BecausethelanguageFdoesnot
contain any string but {e} contains one string of length zero.
4. Thesetofallstringsover{0,1}containingequalnumberof0'sand1's.
5. Thesetofallstringsover{a,b,c}thatstartswitha.
Convention:CapitallettersA,B,C,L,etc.withorwithoutsubscriptsarenormallyusedto denote
languages.
Setoperationsonlanguages: Since languagesare setof stringswe can applyset operationsto

languages. Here are some simple examples (though there is nothing new in it).
Union:Astring
iff or
Example:{0,11,01,011} {1,01,110}={0,11,01,011,111}
Intersection : A string, xϵ L1 ∩L2 iff x ϵ L1 and x ϵ L2 .
Example:{0,11,01,011} {1,01,110}={01}
Complement:Usually, is the universe that a complement is taken with respect to.

Thus for a language L, the complement isL(bar) = { | }.
Example:LetL={x||x|iseven}.Thenitscomplementisthelanguage{ | |x| is
odd}.
Similarlywecandefineotherusualsetoperationsonlanguageslikerelativecom- plement,
symmetric difference, etc.
Reversalofalanguage:
ThereversalofalanguageL,denotedas ,isdefinedas: .
Example:
1. LetL={0,11,01,011}.Then = { 0, 11, 10, 110 }.

2. LetL={ |nisaninteger}.Then ={ |nisaninteger}.
Languageconcatenation:Theconcatenationoflanguages and isdefinedas

={xy| and }.
Example:{a,ab}{b,ba}={ab,aba,abb,abba}.
Notethat,
1. in general.
2.
3.
Iteratedconcatenationoflanguages:Sincewecanconcatenatetwolanguages,wealsorepeat this to
concatenate any number of languages. Or we can concatenate a language with itself any number
of times. The operation denotes the concatenation of
Lwithitselfntimes.Thisisdefinedformallyasfollows:
Example:LetL={a,ab}.Thenaccordingtothedefinition,wehave
andsoon.
Kleene's Star operation :The Kleene star operation on a language L, denoted as is

defined as follows :
=(UnionninN)
={x|xistheconcatenationofzeroormorestringsfromL}
Thus isthesetofallstringsderivablebyanynumberofconcatenationsofstringsin
L.Itisalsousefulto define
= ,i.e.,allstringsderivablebyoneormoreconcatenationsofstringsinL.Thatis
=(UnionninNandn>0)
=
Example:LetL={a,ab}.Thenwehave,
={e} {a,ab} {aa,aab,aba,abab} …
={a,ab} {aa,aab,aba,abab} …
Note:εisin ,foreverylanguageL,including.
Thepreviouslyintroduceddefinitionof isaninstanceofKleenestar.
(Generates) (Recognizes)
Grammar Language Automata
Automata:Aalgorithmorprogramthatautomaticallyrecognizesifaparticularstringbelongsto the
language or not, by checking the grammar of the string.
Anautomataisanabstractcomputingdevice(ormachine).Therearedifferentvaritiesofsuch abstract
machines (also called models of computation) which can be defined mathematically.
EveryAutomatonfulfillsthethreebasicrequirements.
• Every automaton consists of some essential features as in real computers. It has a mech-
anism for reading input. The input is assumed to be a sequence of symbols over a given
alphabetandisplacedonaninputtape(orwrittenonaninputfile).Thesimplerautomata can only
read the input one symbol at a time from left to right but not change. Powerful versions
can both read (from left to right or right to left) and change the input.
 The automaton can produce output of some form. If the output in response to an input
string is binary (say, accept or reject), then it is called an accepter. If itproduces an out-
putsequenceinresponsetoaninputsequence,thenitiscalledatransducer(orautomaton with
output).
• The automaton may have a temporary storage, consisting of an unlimited number of
cells, each capable of holding a symbol from an alphabet ( whcih may be different from
the input alphabet).The automaton can both read and change the contents of the storage
cells in the temporary storage. The accusing capability of this storage varies depending
on the type of the storage.
• The most important feature of the automaton is its control unit, which can be in any
one of a finite number of interval states at any point. It can change state in some de-
fined manner determined by a transition function.
Figure1:Thefigureaboveshowsadiagrammaticrepresentationofagenericautoma- tion.
Operationoftheautomationisdefinedasfollows.
At any point of time the automaton is in some integral state and is reading a particular symbol
from the input tape by using the mechanism for reading input. In the next time step the automa-
ton then moves to some other integral (or remain in the same state) as defined by the transition
function. The transition function is based on the current state, input symbol read, and the content
of the temporary storage. At the same time the content of the storage may be changed and the
inputreadmaybemodifed.Theautomationmayalsoproduce some outputduringthistransition. The
internal state, input and the content of storage at any point defines the configuration of the
automaton at that point. The transition from one configuration to the next ( as defined by the
transition function) is called a move. Finite state machine or Finite Automation is the simplest
type of abstract machine we consider. Any system that is at any point of time in one of a finite
number of interval state and moves among these states in a defined manner in response to some
input, can be modeled by a finite automaton. It doesnot have any temporary storage and hence a
restricted model of computation.
FiniteAutomata
Automata(singular:automation)areaparticularlysimple,butuseful,modelofcompu- tation.
They were initially proposed as a simple model for the behavior of neurons.
States,TransitionsandFinite-StateTransitionSystem:
Let us first give some intuitive idea about a state of a system and state transitions before
describing finite automata.
Informally, a state of a system is an instantaneous description of that system which gives

allrelevant information necessary to determine how the system can evolve from that point on.
Transitions are changes of states that can occur spontaneously or in response to inputs to the
states. Though transitions usually take time, we assume that state transitions are instantaneous
(which is an abstraction).
Someexamplesofstatetransitionsystemsare:digitalsystems,vendingmachines,etc.Asystem
containing only a finite number of states and transitions among them is called
afinite-statetransitionsystem.
Finite-statetransitionsystemscanbemodeledabstractlybyamathematicalmodelcalled
finiteautomation
DeterministicFinite(-state)Automata
Informally, a DFA (Deterministic Finite State Automaton) is a simple machine that reads an in-
put string -- one symbol at a time -- and then, after the input has been completely read, decides
whether to accept or reject the input. As the symbols are read from the tape, the automaton can
change its state, to reflect how it reacts to what it has seen so far. A machine for which a deter-
ministiccodecanbeformulated,andifthereisonlyoneuniquewaytoformulatethecode,then the
machine is called deterministic finite automata.
Thus,aDFAconceptuallyconsistsof3parts:
1. A tape to hold the input string. The tape is divided into a finite number of cells. Each
cell holds a symbol from .
2. Atapeheadforreadingsymbolsfromthetape
3. Acontrol,whichitself consistsof3things:
o finite number of states that the machine is allowed to be in (zero or more states
are designated as accept or final states),
o acurrentstate,initiallysettoastartstate,
o astatetransitionfunctionforchangingthecurrentstate.
Anautomatonprocessesastringonthetapebyrepeatingthefollowingactionsuntilthetape head has

traversed the entire string:
1. Thetapeheadreadsthecurrenttapecellandsendsthesymbolsfoundtheretothe control.
Then the tape head moves to the next cell.
2. hecontroltakessandthecurrentstateandconsultsthestatetransitionfunctiontoget the next
state, which becomes the new current state.
Oncetheentirestringhasbeenprocessed,thestateinwhichtheautomationentersisexamined.
Ifitisanacceptstate,theinputstringisaccepted;otherwise,thestringisrejected.Summariz- ing all the
above we can formulate the following formal definition:
DeterministicFiniteStateAutomaton:ADeterministicFiniteStateAutomaton(DFA)is a 5-
tuple :
 Qisafinitesetofstates.
• isafinitesetofinputsymbolsoralphabet
 isthe“nextstate”transitionfunction(whichistotal).Intuitively, is a
functionthattellswhichstatetomovetoinresponsetoaninput,i.e.,if Misin
stateqandseesinputa,itmovestostate .
 isthestartstate.
• isthesetofacceptorfinalstates.
AcceptanceofStrings:
ADFAacceptsastring ifthereisasequenceofstates in Q
suchthat
1. isthestartstate.
2. forall .
3.
LanguageAcceptedorRecognizedbyaDFA:
ThelanguageacceptedorrecognizedbyaDFAM istheset ofall stringsacceptedbyM ,and
is denoted by i.e. The notion of

acceptancecanalsobemademoreprecisebyextendingthetransitionfunction .
Extendedtransitionfunction:
Extend (whichisfunctiononsymbols)toafunctiononstrings,i.e..
That is, isthestatetheautomationreacheswhenitstartsfromthestateqandfinish

processing the string w. Formally, we can give an inductive definition as follows:
ThelanguageoftheDFAMisthesetof stringsthatcantakethestartstatetooneofthe accepting
states i.e.
L(M)={ |Macceptsw}
={ | }
Example1:
isthestartstate
It is a formal description of a DFA. But it is hard to comprehend. For ex. The language of the
DFA is any string over { 0, 1} having at least one 1
We can describe the same DFA by transition table or state transition diagram asfollow-
ing:
TransitionTable:
0 1
Itiseasytocomprehendthetransitiondiagram.
Explanation:Wecannotreachfindstate w/0orinthei/pstring.Therecan be anyno.

of 0's at the beginning. ( The self-loop at onlabel0indicatesit).Similarlythere can
be any no. of 0's & 1's in any order at the end of the string.
Transitiontable:
Itisbasicallyatabularrepresentationofthetransitionfunctionthattakestwoarguments(astate and a
symbol) and returns a value (the “next state”).
• Rowscorrespondtostates,
• Columnscorrespondtoinputsymbols,
• Entriescorrespondtonextstates
• Thestartstateismarkedwithanarrow
• Theacceptstatesaremarkedwithastar(*).
0 1
(State)Transitiondiagram:
Astatetransitiondiagramorsimplyatransitiondiagramisadirectedgraphwhichcanbe constructed as
follows:
1. ForeachstateinQthereisanode.
2. There is a directed edge from node q to node p labeled a iff .(Ifthere
are several input symbols that cause a transition, the edge is labeled by the list of these
symbols.)
3. Thereisanarrowwithnosourceintothestartstate.
4. Acceptingstatesareindicatedbydoublecircle.
5.
6. HereisaninformaldescriptionhowaDFAoperates.AninputtoaDFAcanbeany
.string Puta pointerto thestart state q.Read theinputstringwfromleft
to right, one symbol at a time, moving the pointer according to the transition
function, .Ifthenextsymbolofwisaandthepointerisonstatep,movethe
pointer to .Whentheendoftheinputstringwisencountered,thepointerison some

state, r. The string is said to be accepted by the DFA if and
rejected if . Note that there is no formal mechanism for moving the pointer.
7. Alanguage issaidtoberegularif L=L(M)forsomeDFAM.
RegularExpressions:FormalDefinition
WeconstructREsfromprimitiveconstituents(basicelements)byrepeatedlyapplyingcertain recursive
rules as given below. (In the definition)
Definition:LetSbeanalphabet.Theregularexpressionsaredefinedrecursivelyasfollows.
Basis:
i) isaRE
ii) isaRE
iii) ,aisRE.
Thesearecalledprimitiveregularexpressioni.e.PrimitiveConstituents
RecursiveStep:
If
and areREsover,thensoare
i)
ii)
iii) i
v)
Closure:risREoveronlyif itcanbeobtainedfromthebasiselements(PrimitiveREs) by a
finite no of applications of the recursive step (given in 2).
Example : Let ={0,1,2}.Then(0+21)*(1+F)isaRE,becausewecanconstructthis

expression by applying the above rules as given in the following step.
Steps REConstructed RuleUsed
1 1 Rule1(iii)
2 Rule1(i)
3 1+ Rule2(i)&ResultsofStep1,2
4 (1+ ) Rule2(iv)&Step3
5 2 1(iii)
6 1 1(iii)
7 21 2(ii),5,6
8 0 1(iii)
9 0+21 2(i),7,8
10 (0+21) 2(iv), 9
11 (0+21)* 2(iii),10
12 (0+21)* 2(ii),4,11
LanguagedescribedbyREs:Eachdescribesalanguage(oralanguageisassociated with every
RE). We will see later that REs are used to attribute regular languages.
Notation:If risaREoversomealphabetthenL(r)isthelanguageassociatewithr.Wecan define the

language L(r) associated with (or described by) a REs as follows.
1. istheREdescribingtheemptylanguagei.e.L( )= .
2. isaREdescribingthelanguage{ }i.e.L( )={ }.
3. ,aisaREdenotingthelanguage{a}i.e.L(a)={a}.
4. If and areREsdenotinglanguageL( )andL( )respectively,then
i) isaregularexpressiondenotingthelanguageL( )=L( )L( )

ii) isaregularexpressiondenotingthelanguageL( )=L( )L( )
iii) isaregularexpressiondenotingthelanguage
iv)( )isaregularexpressiondenotingthelanguageL(( ))=L( )
Example:ConsidertheRE(0*(0+1)).ThusthelanguagedenotedbytheREis
L(0*(0+1))=L(0*)L(0+1).................by 4(ii)
=L(0)*L(0)L(1)
={ ,0,00,000,.}{0} {1}
={ ,0,00,000,.......}{0,1}
={0,00,000,0000,..........,1,01,001,0001,..........}
PrecedenceRule
Consider the RE ab + c. The language described by the RE can be thought
of either L(a)L(b+c) or L(ab) L(c) as provided by the rules (of languages
described by REs)
givenalready.Butthesetworepresentstwodifferentlanguageslendingtoambig
uity. To remove this ambiguity we can either
1) Usefullyparenthesizedexpression-(cumbersome)or
2) UseasetofprecedencerulestoevaluatetheoptionsofREsinsomeorder.Like
other algebras mod in mathematics.
ForREs,theorderofprecedencefortheoperatorsisasfollows:
i) Thestaroperatorprecedesconcatenationandconcatenationprecedesunio
n(+) operator.
ii)Itisalsoimportanttonotethatconcatenation&union(+)operatorsareassocia
tive and union operation is commutative.
Usingtheseprecedencerule,wefindthattheREab+crepresentsthelanguageL(ab)
L(c) i.e. it should be grouped as ((ab)+c).
Wecan,of coursechangetheorderof
precedencebyusingparentheses.Forexample, the language represented by
the RE a(b+c) is L(a)L(b+c).
Example:TheREab*+b isgroupedas((a(b*))+b)whichdescribesthelanguage
L(a)(L(b))* L(b)
Example:TheRE(ab)*+b representsthelanguage(L(a)L(b))* L(b).
Example:ItiseasytoseethattheRE(0+1)*(0+11)representsthelanguageofall
strings over {0,1} which are either ended with 0 or 11.
Example:Theregularexpressionr =(00)*(11)*1denotesthesetofallstringswithan
even number of 0's followed by an odd number of 1's i.e.
Note:Thenotation isusedtorepresenttheRErr*.Similarly, representsthe RE

rr, denotes r,andsoon.
Anarbitrarystringover ={0,1}isdenotedas(0+1)*.
Exercise : Give a RE r over {0,1} s.t. L(r)={ hasatleastonepairof

consecutive 1's}
Solution:EverystringinL(r)
mustcontain00somewhere,butwhatcomesbeforeand what goes before is
completely arbitrary. Considering these observations we can write the REs
as (0+1)*11(0+1)*.
Example : Considering the above example it becomes clean that the RE

(0+1)*11(0+1)*+(0+1)*00(0+1)*representsthesetofstringover{0,1}thatcont
ainsthe substring 11 or 00.
Example: ConsidertheRE0*10*10*.ItisnotdifficulttoseethatthisREdescribesthe
setofstringsover{0,1}thatcontainsexactlytwo1's.Thepresenceof
two1'sintheRE and any no of 0's before, between and after the 1's ensure it.
Example:Considerthelanguageofstringsover{0,1}containingtwoormore1's.
Solution:Theremustbeatleasttwo1'sintheREsomewhereandwhatcomesbefore,
between, and after is completely arbitrary. Hence we can write the RE as
(0+1)*1(0+1)*1(0+1)*. But following two REs also represent the same
language, each ensuring presence of least two 1's somewhere in the string
i) 0*10*1(0+1)*
ii)(0+1)*10*10*
Example:ConsideraRErover{0,1}suchthat
L(r)={ hasnopairofconsecutive1's}
Solution:Thoughitlookssimilartoex…….,itishardertoconstructtoconstruct.We
observer that, whenever a 1 occurs, it must be immediately followed by a 0.
This substring may be preceded & followed by any no of 0's. So the final RE
must be a repetition of strings of the form: 00…0100….00 i.e. 0*100*. So it
looks like the RE is (0*100*)*. But in this case the strings ending in 1 or
consisting of all 0's are not accounted for. Taking these observations into
consideration, the final RE isr = (0*100*)(1+ )+0*(1+ ).
AlternativeSolution:
Thelanguagecanbeviewedasrepetitionsofthestrings0and01.HencegettheRE as
r=(0+10)*(1+ ).Thisisashorterexpressionbutrepresentsthesamelanguage.
RegularExpressionandRegularLanguage:
Equivalence(ofREs)withFA:
Recallthat, language that isacceptedbysome FAsareknown

asRegularlanguage. The two concepts : REs and Regular language are
essentially same i.e. (for) every regular language can be developed by
(there is) a RE, and for every RE there is a Regular Langauge. This fact is
rather suprising, because RE approach to describing language is
fundamentally differnet from the FA approach. But REs and FA are
equivalentintheirdescriptivepower.Wecanputthisfactinthefocusof
thefollowing Theorem.
Theorem:AlanguageisregulariffsomeREdescribes it.
ThisTheoremhastwodirections,andarestated&provedbelowasaseparatelemma
REtoFA:
REsdenoteregularlanguages:
Lemma: IfL(r) isalanguagedescribedbytheRE r,thenitisregulari.e.thereisaFA

such that L(M) L(r).
Proof : To prove the lemma, we apply structured index on the expression r.

First, we showhowto constructFA for the basis elements: , and forany
.Then we show how to combine these Finite Automata into Complex
Automata that accept the Union,
Concatenation,KleenClosureofthelanguagesacceptedbytheoriginalsmaller
automata.
Useof NFAsishelpfulinthecasei.e.weconstructNFAsforeveryREswhichare
represented by transition diagram only.
Basis:
 Case(i): .Then . Then andthefollowingNFAN

recognizes L(r). Formally whereQ={q}and
.
 Case(ii): . ,andthefollowingNFANacceptsL(r).Formally
where .
Sincethestartstateisalsotheacceptstep,andthereisnoanytransitiondefined,itwi
ll accept the only string and nothing else.
 Case(iii):r=aforsome .ThenL(r)={a},andthefollowingNFAN
acceptsL(r).
Formally, where for or
Induction:
Assume thatthe startof the theorem istrue for REs and . Hence we can
assume thatwehaveautomata and thatacceptslanguagesdenotedbyREs
and ,
respectively i.e. and
.TheFAsarerepresented schematically
as shown below.
Eachhasaninitialstateandafinalstate.Therearefourcasestoconsider.
 Case(i):ConsidertheRE denotingthelanguage .We

constructFA ,from and toacceptthelanguagedenotedbyRE as
follows :
Create anew(initial)start state and give -transition to the initial state of

and .This is the initial state of .
 Create afinal state and give -transition from the two final state of
and . istheonlyfinalstateof andfinalstateof and willbe
ordinary states in .
 Allthestateof and arealsostateof .
 Allthemovesof and arealsomovesof .[FormalConstruction] It
is easy to prove that
Proof:Toshowthat wemustshowthat
= by following transition of
Starts at initial state and enters the start state of either or follwoing the
transitioni.e.withoutconsuminganyinput.WLOG,assumethat,itentersthestartst
ate
of . From this point onward it has to followonlythe transition of to enter
the final stateof ,becausethisistheonlywaytoenterthefinalstateof M
byfollowingthee- transition.(Which is the last transition & no input is taken at
hte transition). Hence the whole input w is considered while traversing from
the start state of to the final state
of .Therefore mustaccept .
Say, or .
WLOG,say
Therefore when process the string w , it starts at the initial state and enters
the final state when w consumed totally, by following its transition. Then
also accepts w, by
starting at state and taking -transition enters the start state of -follows
the moves of to enter the final state of consuming input w thus takes -
transition to .
Henceproved
 Case(ii): ConsidertheRE denotingthelanguage .Weconstruct

FA from & to accept as follows :
Createanewstartstate andanewfinalstate
1. Add -transitionfrom
o to the start state of
o to
o final state of to the start state of
2. Allthestatesof arealsothestatesof . has2morestatesthanthatof
namely and .
3. Allthemovesof arealsoincludedin .
Bythetransitionoftype(b), canaccept .
Bythetransitionoftype(a), canenterstheinitialstateof w/oanyinputand then
followallkindsmovesof toenterthefinalstateof andthenfollowing -transition
can enter . Hence if any is accepted by thenw is also accepted by
. By
the transition of type (b), strings accepted by can be repeated by any no of
times & thusacceptedby .Hence accepts andanystringacceptedby
repeated(i.e.
concatenated)anynoof times. Hence
Case(iv):Let =( ).ThentheFA isalsotheFAfor( ),sincetheuseof parentheses

does not change the language denoted by the expression
Non-DeterministicFiniteAutomata
Nondeterminism is an important abstraction in computer science.
Importance of nondeterminism is found in the design of algorithms. For
examples, there are many problems with efficient nondeterministic solutions
but no known efficient deterministic solutions.
(Travellingsalesman,Hamiltoneancycle,clique,etc).Behaviourofaprocess is in
a distributed system is also a good example of nondeterministic situation.
Because
thebehaviourof
aprocessmightdependonsomemessagesfromotherprocessesthat might arrive
at arbitrary times with arbitrary contents.
ItiseasytoconstructandcomprehendanNFAthanDFAforagivenregularlanguage
. The concept of NFA can alsobeusedin provingmanytheorems andresults.
Hence, it plays an important role in this subject.
InthecontextofFAnondeterminismcanbeincorporatednaturally.Thatis,anNFA
is defined in the same way as the DFA but with the following two
exceptions:
 multiplenextstate.
 - transitions.
MultipleNextState:
 IncontrasttoaDFA,thenextstateisnotnecessarilyuniquelydeterminedbyt
he current state and input symbol in case of an NFA. (Recall that, in a
DFA there is exactly one start state and exactly one transition out of
every state for each symbol in ).
 Thismeansthat-inastateq andwithinputsymbola-therecouldbeone,more
thanoneorzeronextstatetogo,i.e.thevalueof isasubsetofQ.Thus
= whichmeansthatanyoneof couldbethenext
state.
 The zero next state case is a special one giving = , which means
that
thereisnonextstateoninputsymbolwhentheautomataisinstateq.Insucha
case, we may think that the automata "hangs" and the input will be
rejected.
-transitions:
In an -transition, the tape head doesn't do anything- it doesnot read and it

doesnot move.However,thestateof theautomatacanbechanged-
thatiscangotozero,one
or more states. This is written formally as
implyingthatthenext state could by any one of w/o consuming the

next input symbol.
Acceptance:
Informally,anNFAissaidtoacceptitsinput ifitispossibletostartinsomestartstate
and process , moving according to the transition rules and making choices
along the way whenever the next state is not uniquely defined, such that
when is completely processed (i.e. end of is reached), the automata is in an
accept state. There may be several possible pathsthrough the automation in
response to an input since the start state is not determined and there are
choices along the way because of multiple next states. Some of these paths
may lead to accpet states while others may not. The
automationissaidtoaccept ifatleastonecomputationpathoninput startingfrom
at least one start stateleads to an accept state-otherwise, the automation
rejectsinput .Alternatively, wecansaythat,
isacceptediffthereexistsapathwithlabel from some start state to some
accept state. Since there is no mechanism for determining which state to
start in or which of the possible next moves to take (including the -
transitions) in response to an input symbol we can think that the automation
is having some "guessing" power to chose the correct one in case the input
is accepted
Example1:ConsiderthelanguageL={ {0,1}*|The3rdsymbolfromtherightis
1}. The following four-state automation accepts L.
Them/cisnotdeterministicsince thereare two transitionsfrom state on input1

and no transition (zero transition) from on both 0 & 1.
Foranystring whose3rdsymbolfromtherightisa1,thereexistsasequenceoflegal
transitions leading from the start state q, to the accept state . But for any
string
where 3rd symbol from the right is 0, there is no possible sequence of
legal tranisitonsleadingfrom and
.Hencem/cacceptsL.Howdoesitacceptanystring L?
FormaldefinitionofNFA:
Formally, an NFA is a quituple whereQ, , ,andF bear

the same meaning as for a DFA, but , the transition function is redefined as
follows:
whereP(Q)isthepowersetofQi.e. .
TheLangaugeofanNFA:
FromthediscussionoftheacceptancebyanNFA,wecangivetheformaldefinitionof
a language accepted by an NFA as follows :
If isanNFA,thenthelangaugeacceptedbyN iswritttenasL(N) is
given by .
That is, L(N) is the set of all strings w in such that
containsatleastone accepting state.

Removingϵ-transition:
-transitionsdonot increasethepowerof anNFA .That is, any -NFA (NFA
with transition), we can always construct an equivalent NFA without -
transitions. The equivalent NFA must keep track where the NFA goes at
every step during computation. This can be done by adding extra
transitions for removal of every - transitions from the - NFA as follows.
Ifweremovedthe -transition fromthe -NFA,thenweneedtomoves

from state p to all the state on input symbol
whicharereachablefromstateq(in the - NFA ) on same input symbol q. This

will allow the modified NFA to move from state p to all states on some input
symbols which were possible in case of -NFA on the same input symbol.
This process is stated formally in the following theories.
Theorem if L isaccepted byan -NFA N, then there issomeequivalent

without transitions accepting the same language L
Proof:
Let bethegiven with
We construct
Where, forall and and
OtherelementsofN'andN
Wecanshowthat i.e.N'andNare equivalent.
We need to prove that
i.e.
Wewillshowsomethingmore,thatis,
Wewillshowsomethingmore,thati
s, Basis : , then
But bydefinitionof .
InductionhypothesisLetthestatementholdforall with .
Bydefinitionofextensionof B
y inductions hypothesis.
Assumingthat
Bydefinitionof Sin
ce
Tocompletetheproofweconsiderthecas
e When i.e. then

andbytheconstructionof wherever constrainsastateinF.
If (and thus is not in F ), then with leadstoanacceptingstateinN'iffitlead

to an accepting state in N ( by the construction of N'and N ).
Also,if ( ,thuswisacceptedbyN'iffwisacceptedbyN(iff )
If (and,thusinMweload inF),thus isacceptedbybothN'andN.
Let . If w cannot lead to in N , then . (Since can add transitions to get an

accept state). So there is no harm in making an accept state in N'.
Ex:ConsiderthefollowingNFAwith -transition.
Transition Diagram
0 1
Transitiondiagramfor 'fortheequivalentNFAwithout -moves

0 1
Since thestartstateq0mustbefinalstateintheequivalentNFA.
Since and and we add moves

and
intheequivalentNFA.Othermovesarealsoconstructedaccordingl
y.
-closures:
Theconceptusedintheaboveconstructioncanbemademoreformalbydefiningt
he -closure for a state (or a set of states). The idea of -closure is that,
when moving
fromastateptoastateq(orfromasetofstatesSitoasetofstatesSj)aninput , we
need to take account of all -moves that could be made after the transition.
Formally,foragivenstateq,
-closures:
Similarly, for a given set
-closures:
So, intheconstructionof equivalent NFA N'without -transitionfrom anyNFA
with moves. the first rule can now be written as

EquivalenceofNFAandDFA
ItisworthnotingthataDFAisaspecialtypeof NFAandhencetheclassof languages

accepted by DFA s is a subset of the class of languages accepted by NFA s.
Surprisingly,thesetwoclassesareinfactequal.NFAsappearedtohavemorepowe
r than DFA s because of generality enjoyed in terms of -transition and
multiple next states. But they are no more powerful than DFA s in terms of
the languages they accept.
ConvertingDFAtoNFA
Theorem:EveryDFAhasasequivalentNFA
Proof:A DFA is justa special type of an NFA. In a DFA ,the transition

functions is definedfrom whereasincaseofanNFAitisdefinedfrom
and
beaDFA.WeconstructanequivalentNFA as
follows.
i.e
If and
AllotherelementsofNareasinD.
If thenthereisasequenceofstates such that
Thenitisclearfromtheaboveconstructionof Nthatthereisasequenceof
states(inN) such that and and hence
Similarlywecanshowtheconverse.
Hence,
GivenanyNFAweneedtoconstructasequivalentDFAi.e.theDFAneedtosimulate
thebehaviourof theNFA.Forthis,theDFAhavetokeeptrackofallthestateswhere
the NFA could be in at every step during processing a given input string.
There are possible subsets of states for any NFA with n states. Every subset
correspondstooneofthepossibilitiesthattheequivalentDFAmustkeeptrackof.Th
us,
theequivalentDFAwillhave states.
TheformalconstructionsofanequivalentDFAforanyNFAisgivenbelow.Wefirst
consider an NFA without transitions and then we incorporate the affects of
transitionslater.
FormalconstructionofanequivalentDFAforagivenNFAwithout transitions.
Givenan without -moves,weconstructanequivalentDFA
asfollows
i.e.
(i.e.everysubsetofQwhichasanelementinFisconsideredasafinal stat
inDFAD)
forall and
where
Thatis,
ToshowthatthisconstructionworksweneedtoshowthatL(D)=L(N)i.e.
Or,
Wewillprovethefollowingwhichisastrangerstatementthusrequired.
Proof:Wewillshowbyinductionson
Basis If =0, then w =
So, bydefinition.
Inductionshypothesis:Assume inductivelythat the statement holds of

length less than or equal to n.
Inductivestep
Let , then with
Now,
Now, given anyNFA with -transition, we canfirst constructanequivalent NFA

without -transition and then use the above construction process to
construct an equivalent DFA , thus, proving the equivalence of NFA s and
DFA s..
It isalsopossibleto construct an equivalent DFA directlyfrom anygiven NFA

with - transition by integrating the concept of -closure in the above
construction.
Recall that, for any
-closure:
IntheequivalentDFA,ateverystep,weneedtomodifythetransitionfunctions to
keep track of all the states where the NFA can go on -transitions. This is
done by
replacing by -closure , i.e. we now compute

ateverystepas follows:
Besidesthis the initialstate of the DFAD hasto be modified to keep track of all
the statesthatcanbereachedfromtheinitialstateof NFAonzeroormore-
transitions.
Thiscanbedonebychangingtheinitialstate to -closure( ).
It is clear that,atevery step in the processingof an input stringbytheDFA D ,
itenters a state that corresponds to the subset of states that the NFA N could
be in at that
particularpoint.ThishasbeenprovedintheconstructionsofanequivalentNFAfora
ny -NFA
IfthenumberofstatesintheNFAisn,thenthereare statesintheDFA.Thatis, each
state in the DFA is a subset of state of the NFA .
But, it is important to note that most of these states are inaccessible from
the start state and hence can be removed from the DFA without changing
the accepted
language.Thus,infact,thenumberofstatesintheequivalentDFAwouldbemuchles
s
than .
Example:ConsidertheNFAgivenbelow.
0 1
{ }
Sincethereare3statesintheNFA
Therewillbe states(representingallpossiblesubsetofstates)intheequivalent
DFA . The transition table of the DFA constructed by using the subset
constructions process is produced here.
0 1 ThestartstateoftheDFAis -closures
Thefinal states areallthosesubsets thatcontains

(since in the NFA).
{ } Letuscomputeoneentry,
Similarly,allothertransitionscanbecomputed
0 1
CorrespondingTransitionfig.forDFA.Notethatstates
arenotaccessibleandhencecanberemoved.This
gives us the following simplified DFA with only 3 states.
Itisinterestingtonotethatwecanavoidencounteringallthoseinaccessible
or unnecessarystatesin the equivalent DFAbyperforming the following
two steps inductively.
1. If is the start state of the NFA, then make - closure ( ) the start
state of the equivalent DFA . This is definitely the only accessible
state.
2. If we have already computed a set of states which are accessible.
Then . compute
becausethesesetofstateswillalsobeaccessible.
Followingthesestepsintheaboveexample,wegetthetransitiontablegivenbelow
MODULE-II
RegularExpressions:FormalDefinition
WeconstructREsfromprimitiveconstituents(basicelements)byrepeatedlyapplyingcertainrecursiveru
lesas given below. (In the definition)
Definition:LetSbeanalphabet.Theregularexpressionsaredefinedrecursivelyasfollows.
Basis:
i) is aRE
ii) is a RE
iii) , a is RE.
Thesearecalledprimitiveregularexpressioni.e.PrimitiveConstituents
RecursiveStep:
If and areREsover,thensoare i)
ii) iii
iv)
Closure :risREoveronlyif itcanbeobtainedfromthebasiselements(PrimitiveREs)byafinitenoof

applications of the recursive step (given in 2).
Example:Let ={0,1,2}.Then(0+21)*(1+F )isaRE,becausewecanconstructthisexpressionby

applying the above rules as given in the following step.
Steps RE Constructed RuleUsed
1 1 Rule1(iii)
2 Rule1(i)
3
1+ Rule2(i)&ResultsofStep1,2
4
(1+ ) Rule2(iv)&Step3
5 2 1(iii)
6 1 1(iii)
7 21 2(ii),5,6
8 0 1(iii)
9 0+21 2(i),7,8
10 (0+21) 2(iv),9
11 (0+21)* 2(iii),10
12 (0+21)* 2(ii),4,11
LanguagedescribedbyREs: Eachdescribesalanguage(oralanguageisassociatedwitheveryRE).We
will see later that REs are used to attribute regular languages.
Notation:If risaREoversomealphabetthenL(r)isthelanguageassociatewithr.Wecandefinethe
language L(r) associated with (or described by) a REs as follows.
1. istheREdescribingtheemptylanguagei.e.L()= .
2. isaREdescribingthelanguage{ }i.e.L( )={ }.
3. ,aisaREdenotingthelanguage{a}i.e.L(a)= {a}.
4. If and areREsdenotinglanguageL( )andL( )respectively,then
i) isaregularexpressiondenotingthelanguage L( )=L( )L( )
ii) is aregularexpressiondenotingthelanguageL( )=L( )L( )
iii) isaregularexpressiondenotingthelanguage
iv)( )isaregularexpressiondenotingthelanguage L(( ))=L( )
Example:ConsidertheRE(0*(0+1)).ThusthelanguagedenotedbytheRE is
L(0*(0+1))=L(0*)L(0+1).................by4(ii)
=L(0)*L(0)L(1)
={ ,0,00,000,........}{0} {1}
={ ,0,00,000,.......}{0,1}
={0,00,000,0000,..........,1,01,001,0001,..........}
PrecedenceRule
Consider the RE ab + c. The language described by the RE can be thought of either
L(a)L(b+c) or L(ab)
L(c)asprovidedbytherules(oflanguagesdescribedbyREs)givenalready.Butthesetwo
represents two different languages lending to ambiguity. To remove this ambiguity we
can either
1) Usefullyparenthesizedexpression-(cumbersome)or
2) UseasetofprecedencerulestoevaluatetheoptionsofREsinsomeorder.Likeotheralgebrasmodin
mathematics.
ForREs,theorderofprecedencefortheoperatorsisasfollows:
i) Thestaroperatorprecedesconcatenationandconcatenationprecedesunion(+)operator.
ii) Itisalsoimportanttonotethatconcatenation&union(+)operatorsareassociativeandunionoperati
onis commutative.
Usingtheseprecedencerule,wefindthattheREab+crepresentsthelanguage L(ab) L(c)i.e.itshouldbe

grouped as ((ab)+c).
Wecan,ofcoursechangetheorderofprecedencebyusingparentheses.Forexample,thelanguage
represented by the RE a(b+c) is L(a)L(b+c).
Example:TheREab*+bisgroupedas((a(b*))+b)whichdescribesthelanguageL(a)(L(b))* L(b)
Example:TheRE(ab)*+brepresentsthelanguage(L(a)L(b))* L(b).
Example:ItiseasytoseethattheRE(0+1)*(0+11)representsthelanguageofallstringsover{0,1}whicha
re either ended with 0 or 11.
Example:Theregularexpressionr=(00)*(11)*1denotes thesetofallstringswithanevennumberof0's
followed by an odd number of 1's i.e.
Note : The notation is used to represent the RE rr*. Similarly, represents

the RE rr, denotes r, and so
on.
Anarbitrarystringover ={0,1}isdenotedas (0+1)*.
Exercise:GiveaRErover{0,1}s.t.L(r)={ hasatleastonepairofconsecutive1's}
Solution:EverystringinL(r)mustcontain00somewhere,butwhatcomesbeforeandwhatgoesbeforeis
completely arbitrary. Considering these observations we can write the REs as (0+1)*11(0+1)*.
Example:ConsideringtheaboveexampleitbecomescleanthattheRE(0+1)*11(0+1)*+(0+1)*00(0+1)
* represents the set of string over {0,1} that contains the substring 11 or 00.
Example : Considerthe RE 0*10*10*. It is not difficult tosee that this RE describes the set of
strings over{0,1} thatcontainsexactlytwo1's.Thepresenceof two1'sintheREandanynoof
0'sbefore,betweenandafterthe 1's ensure it.
Example:Considerthelanguageofstringsover{0,1}containingtwoormore 1's.
Solution : There must be at least two 1's in the RE somewhere and what comes before, between,
and after is
completelyarbitrary.HencewecanwritetheREas(0+1)*1(0+1)*1(0+1)*.ButfollowingtwoREsalsorepr
esent the same language, each ensuring presence of least two 1's somewhere in the string
i) 0*10*1(0+1)*
ii) (0+1)*10*10*
Example:ConsideraRErover{0,1}such that
L(r)={ hasnopairofconsecutive 1's}
Solution:Thoughitlookssimilartoex…….,itishardertoconstructtoconstruct.Weobserverthat,wheneve
r a 1 occurs, it must be immediately followed by a 0. This substring may be preceded & followed
by any no of 0's. So the final RE must be a repetition of strings of the form: 00…0100….00 i.e.
0*100*. So it looks like the RE is (0*100*)*. But in this case the strings ending in 1 or consisting
of all 0's are not accounted for. Taking these observations into consideration, the final RE is r =
(0*100*)(1+ )+0*(1+ ).
AlternativeSolution:
Thelanguagecanbeviewedasrepetitionsofthestrings0and01.HencegettheREasr =(0+10)*(1+
).This is a shorter expression but represents the same language.
RegularExpression:
FAtoregularexpressions:
FAtoRE(REsforRegularLanguages):
Lemma:If alanguageisregular,thenthereisaREtodescribeit.i.e.if L=L(M)forsomeDFAM,thenthere is

a RE r such that L = L(r).
Proof : We need to construct a RE r such that . Since M is a DFA, it has a

finite noof states.Letthesetof statesofMisQ={1,2,3,..., n}forsomeintegern.[Note:if
thenstatesofMwere denoted by some other symbols, we can always rename those to indicate as
1, 2, 3,..., n ]. The required RE is constructed inductively.
Notations: isaREdenotingthelanguagewhichisthesetofallstrings wsuchthatwisthelabelofa
path from state i to state j in M, and that path has no intermediate state whose
number is greaterthenk.
(i&j(beginingandendpts)arenotconsideredtobe"intermediate"so iand/orjcanbe
greaterthank)
Wenow construct inductively,foralli,j Qstartingatk=0andfinallyreachingk=n.
Basis : k = 0,
i.e.thepathsmustnothaveanyintermediatestate(sinceallstatesarenumbered1or above). There

are only two possible paths meeting the above condition :
1. Adirecttransitionfromstateitostatej.
o =aifthenisatransitionfromstateitostatejonsymbolthesinglesymbola.
o =
iftherearemultipletransitionsfromstateitostatejonsymbols .
o =fifthereisnotransitionatallfromstateitostatej.
2. All paths consisting of only one node i.e. when i = j. This gives the path of length 0
(i.e. the RE denotingthestring
)andallselfloops.BysimplyaddingÎtovariouscasesabovewegetthe corresponding REs
i.e.
o = +aif thereisaselflooponsymbol ainstatei.
o = +
ifthereareselfloopsinstateiasmultiplesymbols .
o = ifthereisnoselflooponstatei.
Induction:
Assumethatthereexistsapathfromstateitostatejsuchthatthereisnointermediatestatewhosenumber
is greater than k. The corresponding Re for the label of the path is .

Thereareonlytwopossiblecases:
1. Thepathdosenotgothroughthestatekatalli.e.numberofalltheintermediatestatesarelessthan
k.So,thelabelofthepathfromstateitostatejisthalanguagedescribedbytheRE .
2. Thepathgoesthroughthestatekatleastonce.Thepathmaygofromitojandkmayappearmore
than once. We can break the into pieces as shown in the figure 7.
Figure7
1. Thefirstpartfromthestateitothestatekwhichisthefirstrecurence.Inthispath,allintermediate
states are less than k and it starts at iand ends at k. So the RE
denotesthelanguageofthe label of path.

2. Thelastpartfromthelastoccurenceofthestatekinthepathtostatej.Inthispathalso,no
intermediate state is numbered greater than k. Hence the RE
denotingthelanguageofthelabel of the path.

3. Inthemiddle,forthefirstoccurenceof ktothelastoccurenceofk,represents
aloopwhichmaybe
takenzerotimes,onceoranynooftimes.Andallstatesbetweentwoconsecutive k'sarenumber
ed less than k.
HencethelabelofthepathofthepartisdenotedbytheRE .Thelabelof thepathfromstateitostate

jistheconcatenationofthese3partswhichis
Sinceeithercase1orcase2mayhappenthelabelsofallpathsfromstate itojisdenotedbythefollowingRE
Wecanconstruct foralli,j {1,2,...,n}inincreasingorderofkstartingwiththebasisk=0uptok=n
since
dependsonlyonexpressionswithasmallsuperscript(andhencewillbeavailable).WLOG,assume
thatstate1isthestartstateand arethemfinalstateswhereji {1,2,...,n}, and

.Accordingtotheconventionused,thelanguageoftheautomatacanbedenotedbytheRE
Since is the set of all strings that starts at start state 1 and finishes at final state
followingthetransition of the FA with any value of the intermediate state (1, 2, ... , n)
and hence accepted by the automata.
RegularGrammar:
Agrammar isright-linearifeachproductionhasoneofthefollowingthreeforms:
 A cB,
 A c,
 A
Where A, B ( with A = B allowed) and .AgrammarGisleft-linearifeachproductionhasonceof

the following three forms.
A Bc , A c, A
Arightorleft-lineargrammariscalledaregulargrammar.
RegulargrammarandFiniteAutomataareequivalentasstatedinthefollowingtheorem.
Theorem:AlanguageLisregulariff ithasaregulargrammar.Weusethefollowingtwolemmastoprovethe
above theorem.
Lemma1:IfLisaregularlanguage,thenLisgeneratedbysomeright-lineargrammar.
Proof : Let beaDFAthatacceptsL.
Let and .
Weconstructtheright-lineargrammar byletting
N= Q, and
[Note:If ,then ]
Let .ForMtoacceptw,theremustbeasequenceofstates suchthat

and
Byconstruction,thegrammarGwillhaveoneproductionforeachoftheabovetransitions.Therefore,weh
ave the corresponding derivation.
Hencew L(g).
Conversely, if ,thenthederivationofwinGmusthavetheformasgivenabove.But,
then the construction of G from M implies that
,where ,completingtheproof.
Lemma 2 : Let bearight-
lineargrammar.ThenL(G)isaregularlanguage. Proof: To prove it, we construct a FA M
from G to accept the same language.
isconstructedasfollows:
( isaspecialsumbolnotinN)
For any and and isdefinedas
if
and ,if .
Wenowshowthatthisconstructionworks.
Let .ThenthereisaderivationofwinGoftheform
BycontradictionofM,theremustbeasequenceof transitions
implyingthat i.e.wisacceptedbyM.
Conversely, if is accepted by M, then because is the only accepting state of M,

the transitionscausingwtobeacceptedbyMwillbeof theformgivenabove.
Thesetransitionscorrespondstoa derivationof w in the grammar G. Hence , completing

the proof of the lemma.
Given any left-linear grammar G with production of the form

,wecanconstructfromitaright- linear grammar by replacing every
production of G of the form with
Itiseasytoprovethat .Since isright-linear, isregular.Butthensoare
i.e. becauseregularlanguagesareclosedunderreversal.
Puttingthetwolemmasandthediscussionsintheaboveparagraphtogetherwegettheproofofthetheore
m- A language L is regular iff it has a regular grammar

Example:Considerthegrammar
ItiseasytoseethatGgeneratesthelanguagedenotedbytheregularexpression(01)*
0. The construction of lemma 2 for this grammar produces the follwoing FA.
ThisFAacceptsexactly(01)*1.
Decisions Algorithms for CFL
In this section, we examine some questions about CFLs we can answer. A CFL may be
represented using a
CFGorPDA.Butanalgorithmthatusesonerepresentationcanbemadetoworkfortheothers,sincewecan
construct one from the other.
TestingEmptiness:
Theorem:Therearealgorithmstotestemptinessofa CFL.
Proof:GivenanyCFLL,thereisaCFGGtogenerateit.Wecandetermine,usingtheconstructiondescribed
in the context of elimination of useless symbols, whether the start symbol is useless. If so,
then
; otherwise not.
TestingMembership:
GivenaCFLLandastringx,themembership,problemistodeterminewhether ?
GivenaPDAPforL,simulatingthePDAoninputstringxdoesnotquitework,becausethePDAcangrowits
stack indefinitely on input, and the process may never terminate, even if the PDA is
deterministic.
So, we assume that a CFG isgivensuchthatL=L(G).
Let us first present a simple but inefficient algorithm.
ConvertGto inCNFgenerating .Iftheinputstring ,thenweneedto
determinewhether anditcaneasilybedoneusingthetechniquegiveninthecontextofeliminationof
-production. If , then iff . Consider a derivation under a grammar in CNF.

At everystep, a production in CNF in used, and hence it adds exactlyone terminal symbol to the
sententialform. Hence,if thelengthof
theinputstringxisn,thenittakesexactlynstepstoderivex(providedxisin ).
Letthemaximumnumberofproductionsforanynonterminalin isK.Soateverystepinderivation,ther
e are atmost k choices. We may try out all these choices, systematically., to derive the string x
in . Since
there are atmost i.e.
choices.Thisalgorithmsisofexponentialtimecomplexity.Wenowpresentan efficient (polynomial
time) membership algorithm.
Pumping Lemma:
LimitationsofFiniteAutomataandNonregularLanguages:
TheclassoflanguagesrecognizedbyFAsisstrictlytheregularset.Therearecertain
languageswhichare non regular i.e. cannot be recognized by any FA
Considerthelanguage
In order to accept is language, we find that, an automaton seems to need to remember when
passing the
centerpointbetweena'sandb'showmanya'sithasseensofar.Becauseitwouldhavetocomparethatwith
the number of b's to either accept (when the two numbers are same) or reject (when they are
not same) the input string.
Butthenumberof a'sisnotlimitedand maybe muchlargerthanthenumberof
statessincethestringmaybe arbitrarily long. So, the amount of information the automaton need to
remember is unbounded.
Afiniteautomatoncannotrememberthiswithonlyfinitememory(i.e.finitenumberof
states).ThefactthatFA s have finite memory imposes some limitations on the structure of the
languages recognized. Inductively, we can say that a language is regular only if in processing
any string in this language, the information that has to
be remembered at any point is strictly limited. The argument given above to show that
isnonregularis informal.Wenowpresentaformalmethodforshowingthatcertainlanguagessuchas
arenonregular
Properties of CFL’s
ClosurepropertiesofCFL:
WeconsidersomeimportantclosurepropertiesofCFLs.
Theorem:If and areCFLsthensois
Proof : Let and
beCFGsgenerating.Withoutlossofgenerality, we can assume that
. Let is a nonterminal not in or . We construct the grammar
from and ,where
Wenowshowthat
Thusprovingthetheorem.
Let .Then .Allproductionsappliedintheirderivationarealso in .Hence i.e.
Similarly,if ,then
Thus .
Conversely,let .Then andthefirststepinthisderivationmustbeeither or
.Consideringtheformercase,we have
Since and aredisjoint,thederivation mustusetheproductionsof only(whicharealsoin
) Since isthestartsymbolof .Hence, giving .
Usingsimilarreasoning,inthelattercase,weget .Thus .
So, ,as claimed
Theorem:If and areCFLs, thensois .
Proof:Let and betheCFGs generating and respectively.
Again, we assume that and are disjoint, and is a nonterminal not in
or .weconstructtheCFG from and , where
Weclaimthat
To prove it, we first assume that and . Then and .Wecanderivethestringxyin

as shown below.
since and .Hence .

Fortheconverse, let .Thenthederivationofwin willbeof theform
i.e. the first step in the derivation must see the rule . Again, since and are
disjointand and ,somestringxwillbegeneratedfrom usingproductionsin (whichare

also in )andsuchthat .
Thus
Hence and .
This means that w can be divided into two parts x, y such that and .
Thus .This completes the proof
Theorem:If LisaCFL,thensois .
Proof:Let betheCFGgeneratingL.LetusconstructtheCFG fromG
where .
Wenowprovethat ,whichprovethetheorem.
cangenerate inonestepbyusingtheproduction since , cangenerateanystringinL.
Let for any n >1 we can write where for .wcanbegeneratedby

using following steps.
First(n-1)-stepsusestheproductionS SSproducingthesententialformofnnumbersofS's.The
nonterminal S in the i-th position then generates using production in P ( which are also in )
It is also easy to see that G can generate the empty string, any string in L and any string
forn>1and
none other.
Hence
Theorem:CFLsarenotclosedunderintersection
Proof : We prove it by giving a counter example. Consider the language

.Thefollowin
g CFG generates L1and hence a CFL
The nonterminal X generates strings of the form and C generates strings of the form ,
. These are the only types of strings generated by X and C. Hence, S generates .
Using similar reasoning, it can be shown that the following grammar andhenceitis
also a CFL.
But, andisalreadyshowntobenotcontext-
free. Hence proof.
Theorem:ACFL'sarenotclosedundercomplementations
Proof:Assume,forcontradiction,thatCFL'sareclosedundercomplementation.SInce,CFL'sarealsoclos
ed under union, the language , where and are CFL's must be CFL. But by DeMorgan's law
ThiscontradictsthealreadyprovedfactthatCFL'sarenotclosedunderintersection
. But it can be shown that the CFL's are closed under intersection with a
regular set.
Theorem:If LisaCFLandRisaregularlanguage,then is aCFL.
Proof:Let beaPDAforLandlet beaDFAforR.
WeconstructaPDAMfromPandDasfollows
where isdefinedas
contains iff
and contains
TheideaisthatMsimulatesthemovesofPandDparallelyoninputw,andacceptswiffbothPandD
accepts.Thatmeans,wewanttoshow that
Weapplyinductiononn,thenumberofmoves,toshowthat
iff
and
BasicCaseisn=0. Hence , and .Forthiscaseitistrivially true
Inductivehypothesis:Assumethatthestatementistrueforn-1.
InductiveStep:Letw=xaand
Let
Byinductivehypothesis, and
From the definition of andconsideringthen-thmoveofthePDAMabove,wehave
and
Hence and
If and ,then andwegotthatifMacceptsw,thenbothPandDacceptsit.

We can show that converse, in a similar way. Hence isaCFL(sinceitisacceptedbyaPDA M )
This property is useful in showing that certain languages are not context-free.
Example:Considerthelanguage
IntersectingL withtheregular set ,weget

Whichisalreadyknowntobenotcontext-free.HenceLisnotcontext-free
Theorem:CFL'sareclosedunderreversal.ThatisifLisaCFL,thensois
Proof:LettheCFG generatesL.WeconstructaCFG where
.Wenow showthat ,thusprovingthetheorem.

Weneedtoprovethat
iff .
Theproof isbyinductiononn,thenumberof
stepstakenbythederivation.Weassume,forsimplicity(andof course without loss of generality), that
G and hence are in CNF.
The basis is n=1 in which case it is trivial. Because must be either or BC with
. Hence iff
Assume that it is true for (n-1)-steps. Let .Thenthefirststepmustapplyaruleoftheform

and it gives
where and
ByconstructingofG', H
ence
Theconversecaseisexactlysimilar
Substitution:
, let bealanguage(overanyalphabet).Thisdefinesafunction S,calledsubstitution,on whichis
denotedas -forall
Thisdefinitionofsubstitutioncanbeextendedfurthertoapplystringsandlangaugeas
well. If , where , is a string in , then
.
Similarly,foranylanguageL,
ThefollowingtheoremshowsthatCFLsareclosedundersubstitution.
Thereom:Let isaCFL,andsisasubstitutionon suchthat isaCFLforall ,thus

s(L)isaCFL
Proof:LetL=L(G)foraCFG andforevery , forsome
.Withoutlossofgenerality,assumethatthesetsofnonterminals Nand 'sare

disjoint.
Now,weconstructagrammar ,generatings(L),fromGand 'sasfollows:

 consists of
1. and
2. TheproductionofPbutwitheachterminalaintherighthandsideofaproductionreplacedb
y everywhere.
Wenowwanttoprovethatthisconstructionworksi.e. iff .
If Part : Let then according to the definition there is some string
and for
such that
Wewillshowthat .
From the construction of , we find that, there is a derivation
correspondingtothestring (since contains all productions of G but every ai
replaced with in the RHS of any

production).
Every is the start symbol of and all productions of arealsoincludedin .

Hence
Therefore,
(Only-ifPart)Let .Thentheremustbeaderivativeasfollows:
(usingtheproductionofGincludein asmodifiedby(step2)oftheconstructionof .)
Each ( ) can only generate a string , since each

'sandNaredisjoin.Therefore, we get
since
since
Thestring isformedbysubstitutingstrings foreach andhence .
Theorem:CFL'sareclosedunderhomomorphism
Proof : Let be a CFL, and h is a homomorphism on i.e forsomealphabets

.consider the following substitution S:Replace each symbol by the language consisting
of the only string h(a), i.e.
for all .Then,itisclearthat,h(L)=s(L).Hence,CFL'sbeingclosedundersubstitution

must also be closed under homomorphism.
Grammar
Agrammarisamechanismusedfordescribinglanguages.Thisisoneofthemostsimplebutyetpowerful
mechanism. There are other notions to do the same, of course.
Ineverydaylanguage,likeEnglish,wehaveasetof symbols(alphabet),asetof
wordsconstructedfromthese symbols, and a set of rules using which we can group the words to
construct meaningful sentences. The grammar for English tells us what are the words in it and
the rules to construct sentences. It also tells us whether a particular sentence is well-formed (as
per the grammar) or not. But even if one follows the rules of
theenglishgrammaritmayleadtosomesentences whicharenotmeaningful atall,becauseof
impreciseness and ambiguities involved in the language. In english grammar we use many other
higher level constructs like noun-phrase, verb-phrase, article, noun, predicate, verb etc. A typical
rule can be defined as
<sentence> <noun-phrase><predicate>
meaningthat"asentencecanbeconstructedusinga'noun-
phrase'followedbyapredicate". Some more rules are as follows:
<noun-phrase> <article><noun>
<predicate> <verb
> with similar kind of interpretation given above.
If we take {a, an, the} to be <article>; cow, bird, boy, Ram, pen to be examples of <noun>;
and eats, runs, swims,walks,areassociatedwith<verb>,then wecanconstructthesentence-
acowruns,theboyeats,an penwalks-usingtheaboverules.Eventhoughallsentencesarewell-
formed,thelastoneisnotmeaningful. We observe that we start with the higher level construct
<sentence> and then reduce it to <noun-phrase>,
<article>,<noun>,<verb>successively,eventuallyleadingtoagroupofwordsassociatedwith
these constructs.
Theseconceptsaregeneralizedinformallanguageleadingtoformalgrammars.The word'formal'here
refers tothefactthatthespecifiedrulesforthelanguageareexplicitlystatedintermsof
whatstringsorsymbolscan occur. There can be no ambiguity in it.
FormaldefinitionsofaGrammar
AgrammarGisdefinedasaquadruple.
N is a non-emptyfinite set of non-terminals
orvariables, isanon-
emptyfinitesetofterminalsymbolssuchthat
, is a special non-terminal (or variable) called the start symbol, and isa
finite set of production rules.
Thebinaryrelationdefinedbythesetofproductionrulesisdenotedby ,i.e. iff .
Inotherwords,Pisafinitesetofproductionrulesoftheform ,where and
Productionrules:
The production rules specify how the grammar transforms one string to another. Given a string
, we say that the production rule is applicable to this string, since it
is possible to use the rule torewrite the (in ) to obtaining a new string
. We say that derives and is denoted as
Successivestringsarederviedbyapplyingtheproductionsrulesofthegrammarinanyarbitraryorder.A
particular rule can be used if it is applicable, and it can be applied as many times as described.
We write if the string can be derived from the string in zero or more steps; if
canbe derived from in one or more steps.
Byapplyingtheproductionrules inarbitraryorder,anygivengrammarcangenerate manystrings of

terminal
symbolsstartingwiththespecialstartsymbol,S,ofthegrammar.Thesetofallsuchterminalstringsiscall
ed the language generated (or defined) by the grammar.
Formaly,foragivengrammar thelanguagegeneratedbyGis
That is iff .
If ,wemusthaveforsome , ,denotedasa
derivation sequence of w, The strings
aredenotedassententialformsofthe derivation.
Example : Consider the grammar ,whereN={S}, ={a,b}andPisthesetofthefollowing

production rules
{S ab, S aSb}
Someterminalstringsgeneratedbythisgrammartogetherwiththeirderivationisgivenbelow.
S ab
S aSb aabb
S aSb aaSbb aaabbb
Itiseasytoprovethatthelanguagegeneratedbythisgrammar is
Byusingthefirstproduction,itgeneratesthestringab(for i=1).
To generate any other string, it needs to start with the production S aSb and then the non-
terminal S in the RHScanbereplacedeitherbyab(inwhichwegetthestringaabb)orthesameproductionS
aSbcanbe
usedoneormoretimes.Everytimeitaddsan'a'totheleftanda'b'totherightofS,thusgivingthe sentential
form .Whenthenon-
terminalisreplacedbyab(whichisthenonlypossibilityforgeneratinga terminal string) we get a

terminal string of the form .
Thereisnogeneralruleforfindingagrammarforagivenlanguage.Formanylanguageswecandevise
grammars and there are many languages for which we cannot find any grammar.
Example:Findagrammarforthelanguage .
It is possible tofind a grammar for L bymodifying the previous grammar since we need to
generate an extra b attheendofthestring .WecandothisbyaddingaproductionS
Bbwherethenon-terminalBgenerates as given in the previous example.
Usingtheaboveconceptwedevisethefollwoinggrammarfor L.
where,N= {S,B}, P={S Bb,B ab,B aBb}
ParseTrees:
ConstructionofaParsetree:
YieldofaParsetree:
Ambiguityinlanguagesandgrammars:
MODULE-III
Pushdownautomata:
Regular language can be charaterized as the language accepted by finite automata. Similarly,
we can characterizethecontext-
freelanguageasthelangaugeacceptedbyaclassofmachinescalled"PushdownAutomata" (PDA). A
pushdown automation is an extension of the NFA.
It is observed that FA have limited capability. (in the sense that the class of languages accepted
or characterizedbythemissmall).Thisisduetothe"finitememory"(numberof
states)and"noexternalmemory" involved with them. A PDA is simply an NFA augmented with an
"external stack memory". The addition of a stack provides the PDA with a last-in, first-out
memory management cpapability. This "Stack" or "pushdownstore" can be used to record a
potentially unbounded information. It is due to this memory management capability with the help
of the stack that a PDA can overcome the memory limitations that prevents a FA to
accept many interesting languages like . Although, a PDA can store an unbounded
amount of information on the stack, its access to the information on the stack is limited. It can
push an element onto the topof thestackandpopoff anelementfromthetopof
thestack.Toreaddownintothestackthetopelements must be popped off and are lost. Due to this
limited access to the information on the stack, a PDA still has some limitations and cannot
accept some other interesting languages.
Asshowninfigure,aPDAhasthreecomponents:aninputtapewithreadonlyhead, afinitecontrolanda
pushdown store.
The input head is read-onlyand mayonly move fromleftto right, one symbol (orcell)at a time. In
each step,
thePDApopsthetopsymboloffthestack;basedonthissymbol,theinputsymbolitiscurrentlyreading,and
itspresentstate,itcanpushasequenceof symbolsontothestack,moveitsread-onlyheadonecell(or
symbol) to the right, and enter a new state, as defined by the transition rules of the PDA.
PDA are nondeterministic, by default. That is, - transitions are also allowed in which the PDA can
pop and push,andchangestatewithoutreadingthenextinputsymbolormovingitsread-
onlyhead.Besidesthis,there may be multiple options for possible next moves.
FormalDefinitions:Formally,aPDAMisa7-tupleM=
where,
 isafinitesetof states,
 isafinitesetofinputsymbols(inputalphabets),
 isafinitesetofstacksymbols(stackalphabets),
 isatransitionfunctionfrom tosubsetof
 isthestart state
 ,istheinitialstacksymbol,and
 ,isthefinaloraccept states.
Explanationofthetransitionfunction, :
If, for any ,

.Thismeansintitutivelythatwheneverthe PDA is in state q reading input symbol a
and z on top of the stack, it can nondeterministically for any i,
 go to state
 popzoffthe stack
 push onto the stack (where ) (The usual convention is that if
,then will be at the top andat the bottom.)

 movereadheadrightonecellpastthecurrentsymbola.
If a= , then means intitutively that whenver the PDA

is in stateqwithzonthetopof thestackregardlessof
thecurrentinputsymbol,itcannondeterministicallyforany i, ,
 go to state
 popzoff the stack
 push ontothestack,and
 leaveitsread-onlyheadwhereitis.
Statetransitiondiagram:APDAcanalsobedepictedbyastatetransitiondiagram.Thelabelsonthearcs
indicate both the input and the stack operation. The transition
for and isdepictedby
Finalstatesareindicatedbydoublecirclesandthestartstateisindicatedbyanarrowtoitfromnowhere.
ConfigurationorInstantaneousDescription(ID):
Aconfigurationoraninstantaneousdescription(ID)ofPDAatanymomentduringitscomputationisan element
of describing the current state, the portion of the input remaining to be read (i.e. under and to
the right of the read head), and the current stack contents. Only these three elements can affect the
computation from that point on and, hence, are parts of the ID.
The start or inital configuartion (or ID) on input is . That is, the PDA always starts in its
startstate,withitsreadheadpointingtotheleftmostinput symbol andthestackcontainingonlythe start/initial
stack symbol,.
The"nextmoverelation"onefiguredescribeshowthePDAcanmovefromoneconfigurationtoanother in one
step.
Formally,
iff
'a'maybe oraninput symbol.
LetI,J,KbeIDsof aPDA.WedefinewewriteI K,if IDIcanbecomeKafterexactlyi moves.The
relations and define as follows
I K
I Jif suchthatI K and K J
I Jif suchthat I J.
That is, is the reflexive, transitive closure of . We say that I
JiftheIDJfollowsfromtheIDIin zero or more moves.
(Note:subscriptMcanbedroppedwhentheparticularPDAMisunderstood.)
LanguageacceptedbyaPDAM
Therearetwoalternativedefinitonofacceptanceasgivenbelow.
1. Acceptancebyfinalstate:
Consider the PDA .Informally,thePDAMissaidtoacceptitsinput

byfinalstate if it enters any final state in zero or more moves after reading its entire input,
starting in the start
configurationoninput .
Formally,wedefineL(M),thelanguageacceptedbyfinalstatetobe
{ | forsome and }
2. Acceptancebyemptystack(orNullstack):ThePDAMacceptsitsinput byemptystackifstartinginthe
startconfigurationoninput ,iteveremptiesthestackw/
opushinganythingbackonafterreadingtheentire input. Formally, we define N(M), the language
accepted by empty stack, to be
{ | forsome }
Notethatthesetoffinalstates,Fisirrelevantinthiscaseandweusuallyletthe Ftobetheemptyseti.e.F=
Q.
Example1:HereisaPDAthatacceptsthelanguage .
,and consistsofthefollowingtransitions
ThePDAcanalsobedescribedbytheadjacenttransitiondiagram.
Informally, whenever the PDA M sees an input a in thestart state with the start symbol z on the
top of the stackitpushes aontothestack andchangesstateto .(torememberthatithas
seenthefirst'a').Onstate
ifitseesanymorea,itsimplypushesitontothestack.NotethatwhenMisonstate ,thesymbolonthe
top of the stack can only be a. On state if it sees the first b with a on the top of the stack, then it
needs to start comparison of numbers of a's and b's, since all the a's at the begining of the input
have already been pushedontothestack.Itstartthisprocessbypoppingofftheafromthetopof
thestackandentersinstateq3
(to remember that the comparison process has begun). On state , it expects only b's in the
input (if it sees anymoreaintheinputthustheinputwillnotbeintheproperformof
anbn).Hencethereisnomoreoninputa
when it is in state . On state it pops off an a from the top of the stack for every b in the
input.When it
seesthelastbonstateq3(i.e.whentheinputisexaushted),thenthelastafromthestackwillbepoppedoff
andthestartsymbolzisexposed.Thisistheonlypossiblecasewhentheinput(i.e.on -input)thePDAM
willmovetostate whichisanaccept state.
wecanshowthecomputationofthePDAonagiveninputusingtheIDsandnextmoverelations.Forexampl
e, following are the computation on two input strings.
Lettheinputbeaabb.westartwiththestartconfigurationandproceedtothesubsequentIDsusingthe
transition function defined
(usingtransition1)
( using transition 2 )
( using transition 3 )
( using transition 4 ), (usingtransition5), isfinalstate.Hence,accept.Sothe
string aabb is rightly accepted by M
i) Lettheinputbeaabab.
Nofurthermoveisdefinedatthis point.
HencethePDAgetsstuckandthestringaababisnotaccepted.
Example2:Wegiveanexampleof aPDAMthatacceptsthesetof balancedstringsof parentheses[]by

empty stack.
ThePDAMisgivenbelow.
where isdefinedas
Informally, whenever it sees a [, it will push the ] onto thestack. (first two transitions),and
whenever it sees a ] and the top of the stack symbol is [, it will pop the symbol [ off the stack.
(The third transition). The fourth transition is used when theinput is exhausted in order to pop z
off the stack ( to empty the stack) and accept.
Notethatthereisonlyonestateandnofinalstate.Thefollowingisasequenceofconfigurationsleadingtoth
e acceptance of the string [ [ ] [ ] ] [ ].
Equivalenceofacceptancebyfinalstateandemptystack.
Itturnsoutthatthetwodefinitionsof acceptanceof alanguagebyaPDA-

accpetancebyfinalstateandempty stack-are equivalent in the sense that if a language can be
accepted byemptystack bysome PDA, it can also be accepted byfinal state bysome otherPDA and
vice versa. Hence it doesn't matter which one we use, since
each kind of machine cansimulate the other.Given anyarbitraryPDA M that accpets the language
Lbyfinal state or empty stack, we can always construct an equivalent PDA M with a single final
state that accpets exactlythesamelanguageL.TheconstructionprocessofM'fromMandtheproof
ofequivalenceofM&M'are given below.
Therearetwocasestobeconsidered.
CASE I : PDA M accepts by final state, Let LetqfbeanewstatenotinQ.
Consider the PDA where as well as the following
transition.
contains and .ItiseasytoshowthatMandM'areequivalenti.e.

L(M)=L( )
Let L(M).Then forsome and Then
Thus accepts
Conversely, let accepts i.e. L( ), then for
inheritsallothermovesexceptthelastonefromM.Hence forsome
.
ThusMaccepts .Informally,onanyinput simulateallthemovesofMandentersinitsownfinalstate
whenever M enters in any one of its final status in F. Thus accepts a string iff M accepts it.
CASEII:PDAMacceptsbyemptystack.
We will construct from M in such a way that simulates M and detects when M empties its
stack. entersitsfinalstate whenandonlywhenMemptiesitsstack.Thus willacceptastring
iffM
accepts.
Let where andX and containsallthe

transition of , as well as the following two transitions.
and
Transitions1causes toentertheinitialconfigurationofMexceptthat willhaveitsownbottom-of-
stack marker X which is below the symbols of M's stack. From this point onward will simulate
every move of M since all the transitions of M are also in
IfMeveremptiesitsstack,then whensimulatingMwillemptyitsstackexceptthesymbolXatthebottom
. At this point, will enter its final state by using transition rule 2, thereby (correctly)
accepting the input.

WewillprovethatMand areequivalent.
LetMaccepts .Then
forsome .Butthen
(bytransitionrule1)
(Since includesallthemovesofM)
(bytransitionrule2)
Hence, alsoaccepts .Conversely,let accepts .
Then for some
Everymoveinthesequence, weretakenfromM.
Hence,Mstartingwithitsinitialconfigurationwilleventuallyemptyitsstackandaccepttheinput i.e.
EquivalenceofPDA’sandCFG’s:
We will now show that pushdown automata and context-free grammars are equivalent in
expressive power, thatis,thelanguageacceptedbyPDAsareexactlythecontext-
freelanguages.Toshowthis,wehavetoprove each of the following:
i) GivenanyarbitraryCFGGthereexistssomePDAMthatacceptsexactlythesamelanguag
e generated by G.
ii) GivenanyarbitraryPDAMthereexistsaCFGGthatgeneratesexactlythesamelanguag
e accpeted by M.
(i) CFAtoPDA
Wewillfirstprovethatthefirstparti.e.wewanttoshowtoconvertagivenCFGtoanequivalent PDA.
Let the given CFG is .WithoutlossofgeneralitywecanassumethatGisinGreibach
Normal Form i.e. all productions of G are of the form .
where and .
FromthegivenCFGGwenowconstructanequivalentPDAMthatacceptsbyemptystack.Notethatthereis
only one state in M. Let
,where
 qis theonlystate
 istheinputalphabet,
 Nisthestackalphabet,
 qisthestartstate.
 Sisthestart/initialstacksymbol,and ,thetransitionrelationisdefinedasfollows
For each production , .Wenowwanttoshow
that M and G are equivalent i.e. L(G)=N(M). i.e. for any . iff
If ,thenbydefinitionofL(G),theremustbealeftmostderivationstartingwithSandderivingw.
i.e.
Againif ,thenonesysmbol.Thereforeweneedtoshowthatfor any .
iff .
Butwewillproveamoregeneralresultasgiveninthefollowinglemma.
ReplacingAbyS(thestartsymbol) and by gives the required proof.
LemmaForany , and , viaaleftmostderivative iff
Proof:Theproofisbyinductiononn.
Basis:n=0
iff i.e. and
iff
iff
InductionStep:
First, assume that
viaaleftmostderivation.Letthelastproductionappliedintheirderivationis for some
and .
Then,for some ,
where and
Nowbytheindirectionhypothesis,we get,
...................................................................(1)
AgainbytheconstructionofM,weget
so,from(1),weget
since and , we get
Thatis,if ,then .Conversely,assumethat and

let
bethetransitionusedinthelastmove.Thenforsome , and
where and .
Now,bytheinductionhypothesis,weget
viaaleftmostderivation.
Again,bytheconstructionofM, mustbeaproductionofG.[ Since ].

Applyingtheproductiontothesententialform weget
i.e.
viaaleftmostderivatio
n.
Hencetheproof.
Example:ConsidertheCFGGinGNF
S aAB
A a /aA
B a/bB
TheonestatePDAMequivalenttoGisshownbelow.Forconvenience,aproductionofGandthe
corresponding transition in M are marked by the same encircled number.
(1) S aAB
(2) A a
(3) A aA
(4) B a
(5) B bB
.Wehaveusedthesameconstructiondiscussedearlier
SomeUsefulExplanations:
ConsiderthemovesofMoninputaaabaleadingtoacceptanceofthestring
. Steps
1. (q,aaaba,s) (q,aaba,AB)
2. ( q, aba,AB)
3. (q, ba, B)
4. (q, a, B)
5. (q, , ) Acceptbyemptystack.
Note:encirclednumbershereshowsthetransitionsruleappliedateverystep.
NowconsiderthederivationofthesamestringundergrammarG.Onceagain,theproductionusedatevery
step is shown with encircled number.
S aAB aaAB aaaB aaabB aaaba

Steps 1 2 3 4 5
Observations:
 Thereisanone-to-onecorrespondenceof thesequenceof movesof
thePDAMandthederivation
sequenceundertheCFGGforthesameinputstringinthesensethat-numberof
stepsinboththe cases are same and transition rule corresponding to the same
production is used at every step (as shown by encircled number).
 consideringthemovesofthePDAandderivationunder Gtogether,itisalsoobservedthatateve
ry step the input read so far and the stack content together is exactly identical to the
corresponding sentential form i.e.
<whatisRead><stack>=<sententialform>
Say,atstep2,Readsofar=a
stack=AB
Sententialform=aABFromthispropertyweclaimthat iff .If theclaimis
true, then apply with and we get iff or iff (by

definition )
ThusN(M)=L(G)asdesired.Notethatwehavealreadyprovedamoregeneralversionoftheclaim
PDA and CFG:
WenowwanttoshowthatforeveryPDAMthataccpetsbyemptystack,thereisaCFGGsuchthatL(G)=
N(M)
wefirstseewhetherthe"reverseof
theconstruction"thatwasusedinpart(i)canbeusedheretoconstructan equivalent CFG from any
PDAM.
ItcanbeshowthatthisreverseconstructionworksonlyforsinglestatePDAs.
 Thatis,foreveryone-statePDAMthereisCFGGsuchthatL(G)=N(M).Foreverymoveof the
PDAM we introduce a production in the
grammar where N = T and .
wecannowapplytheproofinpart(i)inthereversedirectiontoshowthat L(G)=N(M).
ButthereverseconstructiondoesnotworkforPDAswithmorethanonestate.Forexample,considerthePDA
Mproducedheretoacceptthelangauge
NowletusconstructCFG usingthe"reverse"construction.
(Note ).
TransitionsinM CorrespondingProductioninG
Wecandrivestringslikeaabaawhichisinthelanguage.
Butunderthisgrammarwecanalsoderivesomestringswhicharenotinthelanguage.e.g
and .But
Therefore,tocompletetheproofofpart(ii)weneedtoprovethefollowingclaimalso.
Claim:ForeveryPDAMthereissomeone-statePDA suchthat .
Itisquitepossibletoprovetheaboveclaim.Butherewe willadoptadifferentapproach.Westartwithany
arbitrary PDA Mthat accepts by empty stack and directly construct an equivalent CFG G.
PDAtoCFG
WewanttoconstructaCFGGtosimulateanyarbitraryPDAMwithoneormorestates.Withoutlossof
generality we can assume that the PDA M accepts by empty stack.
Theideaistousenonterminaloftheform<PAq>wheneverPDAMinstatePwithAontopofthestackgoes
tostate .Thatis,forexample,foragiventransitionofthePDAcorrespondingproductioninthegrammara
s shown below,
And, we would like to show, in general, that iffthePDAM,whenstartedfromstatePwithAon

the top of the stack will finish processing , arrive at state q and remove A from the stack.
wearenowreadytogivetheconstructionofanequivalentCFG GfromagivenPDAM.weneedtointroduce
two kinds of producitons in the grammar as given below. The reason for introduction of the first
kind of productionwillbejustifiedatalaterpoint.Introductionof thesecondtypeof
productionhasbeenjustifiedinthe above discussion.
Let beaPDA.WeconstructfromMaequivalentCFG
Where
 N is the set of nonterminals of the form <PAq>for and andPcontainsthefollwoing

two kind of production
1.
2. If ,thenforeverychoiceofthesequence ,
, .
Includethefollwoingproduction
Ifn=0,thentheproductionis .Forthewholeexercisetobemeaningfulwe want
means there is a sequence of transitions ( for PDA M ), starting in state q, ending

in , duringwhichthePDAMconsumestheinputstring
andremovesAfromthestack(and,ofcourse,allother symbols pushed onto stack in A's place, and so
on.)
Thatiswewanttoclaim that
iff
If this claim is true, then let to get iff

forsome . But for all we have
as production in G. Therefore,
iff i.e. iffPDAMacceptswbyemptystackorL(G)=N(M)
Now,toshowthattheabove constructionof CFGGfromanyPDAMworks,weneedtoprovetheproposed

claim.
Note: At this point, the justification for introduction of the first type of production (of the form )in
the CFG G, is quite clear. This helps use deriving a string from the start symbol of the grammar.
Proof:Oftheclaim iff for some , and

Theproof isbyinductiononthenumberof stepsinaderivationof
G(whichofcourseisequaltothenumberof moves taken by M). Let the number of steps taken is n.
Theproofconsistsof twoparts:'if'partand'onlyif'part.First,considerthe'if' part
If then .
Basisis n=1
Then .Inthiscase,itisclearthat .Hence,byconstruction is a

production of G.
Then
InductiveHypothesis:
Inductive Step :
For n >1, let w = ax for some and considerthefirstmoveofthePDAMwhichusesthe
general transition =
.NowMmust remove fromstackwhile

consumingxintheremainingn-1moves.
Let , where is the prefix of x that M has consumed when
firstappearsattopof the stack. Then there must exist a sequence of states in M (as per
construction) (with ), such that
[Thisstep implies ]
... [Thisstep implies ]
[Note:Eachsteptakeslessthanorequalton-1moves
becausethetotalnumberofmovesrequiredassumed to be n-1.]
Thatis,in general
, .
So,applyinginductivehypothesiswe get
, .Butcorrespondingtotheoriginalmove
inMwehaveaddedthefollowingproductioninG.
Wecanshowthecomputationof
thePDAonagiveninputusingtheIDsandnextmoverelations.Forexample, following are the
computation on two input strings.
i)Lettheinputbeaabb.westartwiththestartconfigurationandproceedtothesubsequentIDsusingthe
transition function defined
(usingtransition1), (usingtransition2)
( using transition 3 ), ( using transition 4 )
(usingtransition5), isfinalstate.Hence,accept.
So the string aabb is rightly accepted by M.
i)Lettheinputbeaabab.
Nofurthermoveisdefinedatthis point.
HencethePDAgetsstuckandthestringaababisnotaccepted.
Thefollowingisasequenceofconfigurationsleadingtotheacceptanceofthestring[[][]][].
Equivalenceofacceptancebyfinalstateandemptystack.
Itturnsoutthatthetwodefinitionsof acceptanceof alanguagebyaPDA-

accpetancebyfinalstateandempty stack-are equivalent in the sense that if a language can be
accepted byemptystack bysome PDA, it can also be accepted byfinal state bysome otherPDA and
vice versa. Hence it doesn't matter which one we use, since each kind of machine can simulate
the other.Given any arbitrary PDA M that accpets the language L by final state or empty stack,
we can always construct an equivalent PDA M with a single final state that accpets exactly the
same language L. The construction process of M'from M and the proof of equivalence of M
&M'are given below
Therearetwocasestobeconsidered.
CASE 1 : PDA M accepts by final state, Let . Let beanewstatenotinQ.
Consider the PDA where as well as the following transition.
contains and .ItiseasytoshowthatMand areequivalenti.e.
Let . Then forsome and
Then .
Thus accepts .
Conversely, let accepts i.e. , then
forsome . inherits all other moves except the last onefromM. Hence
for some
.
ThusMaccepts .Informally,onanyinput simulateallthemovesofMandentersinitsownfinalstate
whenever M enters in any one of its final status in F. Thus accepts a string iff M accepts it.
CASE2:PDAMacceptsbyemptystack.
we will construct from M in such a way that simulates M and detects when M empties its
stack. enters its final state whenandonlywhenMemptiesitsstack.Thus willacceptastring
iffM
accepts.
Let where and and containsall

thetransitionof ,aswellasthefollowingtwotransitions.
and
Transitions1causes toentertheinitialconfigurationofMexceptthat willhaveitsownbottom-of-

stack marker X which is below the symbols of M's stack. From this point onward M' will simulate
every move of M since all the transitions of M are also in .
IfMeveremptiesitsstack,then whensimulatingMwillemptyitsstackexceptthesymbolXatthebottom.
At this point , will enter its final state
byusingtransitionrule2,thereby(correctly)acceptingtheinput. we will prove that M and are

equivalent.
LetMaccepts .
Then
forsome .But then,
(bytransitionrule1)
(since includeallthemoves ofM)

(bytransitionrule2)
Hence, alsoaccepts .Conversely,let accepts .
Then forsomeQ.
Everymoveinthesequence
weretakenfromM.
Hence,Mstartingwithitsinitialconfigurationwilleventuallyemptyitsstackandaccepttheinput i.e.
DeterministicPDA:
RegularLanguagesandDPDA’sTheDPDA’sacceptsaclassoflanguagesthatisinbetweentheregular
languages and CFL’s.
DeterministicPushdownAutomata(DPDA)andDeterministicContext-freeLanguages(DCFLs)
Pushdown automata that we have already defined and discussed are nondeterministic by default, that is , there
may be two or more movesinvolvingthesamecombinationsof state,inputsymbol,andtopof
thestock,andagain,forsomestateand top of the stock the machinemayeither read and input symbol or
make an - transition(without consuming anyinput).
IndeterministicPDA,thereisneverachoiceof moveinanysituation.Thisishandledbypreventingtheabove
mentionedtwo cases as described in the definition below.
Defnition : Let beaPDA.ThenMisdeterministicifandonlyifboththefollowingconditionsare

satisfied.
1. has at most one element for any and

(thisconditionpreventsmultiplechoicef any combination of )
2. If and forevery
(Thisconditionpreventsthepossibilityofachoicebetweenamovewithorwithoutaninputsymbol).
EmptyProductionRemoval
The productions of context-free grammars can be coerced into a variety of forms
withoutaffectingtheexpressivepowerofthegrammars.Iftheemptystringdoesnotbelongtoalanguage,
then there is a way to eliminate the productions of the form A from the grammar.
Iftheemptystringbelongstoalanguage,thenwecaneliminatefromallproductions
save forthesingle productionS.InthiscasewecanalsoeliminateanyoccurrencesofSfrom the right-
hand side of productions.
ProceduretofindCFGwithoutemptyProductions
Unitproductionremoval
LeftRecursionRemoval
NORMALFORMS
Twokindsofnormalforms viz.,ChomskyNormalFormandGreibachNormalForm(GNF)are considered
here.
ChomskyNormalForm(CNF)
Anycontext-free language L without any-production is generated bya grammar is
whichproductionsareoftheformABCorAa,whereA,BVN,andaV.
ProceduretofindEquivalentGrammarinCNF
(i) Eliminatetheunitproductions,and-productionsifany,
(ii) Eliminatetheterminalsontherighthandsideoflengthtwoormore.
(iii) Restrictthenumberofvariablesontherighthandsideofproductionstotwo. Proof:
ForStep(i):Applythefollowingtheorem:“Everycontextfreelanguagecanbegeneratedbya grammar with
no useless symbols and no unit productions”.
AttheendofthissteptheRHSofanyproductionhasa single terminalortwoormore symbols. Let us
assume the equivalent resulting grammar as G  (VN ,VT ,P ,S ).
ForStep(ii):Consideranyproductionofthe form
Example
ObtainagrammarinChomskyNormalForm(CNF)equivalenttothegrammarGwith productions P
given
Solution
PumpingLemmafor CFG
A“PumpingLemma”isatheoremusedtoshowthat,ifcertainstringsbelongtoa
language, then certain other strings must also belong to the language. Let us discuss a Pumping
Lemma for CFL. We will show that , if L is a context-free language, then strings of L that are at
least ‘m’ symbols long can be “pumped” to produce additional strings in L. The value of ‘m’
dependsontheparticularlanguage. LetLbeaninfinitecontext-freelanguage.Thenthereissome positive
integer ‘m’ such that, if S is a string of L of Length at least ‘m’, then
(i) S=uvwxy(forsomeu,v,w,x,y)
(ii) |vwx|m
(iii) |vx|1
(iv) uviwxiyL.
forallnon-negativevaluesofi.
Itshouldbeunderstoodthat
(i) IfSissufficientlylongstring,thentherearetwosubstrings,vandx,somewhereinS. There is
stuff (u) before v, stuff (w) between v and x, and stuff (y), after x.
(ii) Thestuffbetweenvandxwon’tbetoolong,because|vwx|can’tbelargerthanm.
(iii) Substringsvandxwon’tbothbeempty,thougheitheronecouldbe.
(iv) Ifweduplicatesubstringv,somenumber(i)oftimes,andduplicatexthesamenumber of times,
the resultant string will also be in L.
Definitions
Avariableisusefulifitoccursinthederivationofsomestring.Thisrequires that
(a) thevariableoccursinsomesententialform(youcangettothevariableifyoustartfromS),and
(b) astringofterminalscanbederivedfromthesententialform(thevariableisnota“deadend”). A
variable is “recursive” if it can generate a string containing itself. For example, variable A is
recursive if
ProofofPumpingLemma
(a) SupposewehaveaCFLgivenbyL.Thenthereissomecontext-freeGrammarGthat generates
L.Suppose
(i) Lisinfinite,hencethereisnoproperupperboundonthelengthofstringsbelongingtoL.
(ii) Ldoesnotcontainl.
(iii) Ghasnoproductionsorl-productions.
Thereareonlyafinitenumberofvariablesinagrammarandtheproductionsforeach
variable have finitelengths. The onlywaythat a grammarcangenerate arbitrarilylong strings is if
oneormorevariablesisbothusefulandrecursive.Supposenovariableisrecursive.Sincethestart symbol is
non recursive, it must be defined only in terms of terminals and other variables. Then
sincethosevariablesarenonrecursive,theyhavetobedefinedintermsofterminalsandstillother variables
and so on.
After a while we run out of“other variables” whilethe generated stringisstillfinite. Therefore
thereisanupperbondonthelengthofthestringwhichcanbegeneratedfromthestartsymbol. This
contradicts our statement that the language is finite.
Hence,ourassumptionthatnovariableisrecursivemustbeincorrect.
(b) Let usconsidera stringX belongingtoL.IfX is sufficientlylong,thenthederivationofX must have
involved recursive use of some variable A. Since A was used in the derivation, the derivation
should have started as
UsageofPumpingLemma
Henceouroriginalassumption,thatLiscontextfreeshouldbefalse.HencethelanguageLisnot con text-
free.
Example
CheckwhetherthelanguagegivenbyL{ambmcn:mn2m}isaCFLor not. Solution
ClosurepropertiesofCFL–Substitution
Applicationsofsubstitutiontheorem
Reversal
InverseHomomorphism:
MODULE-IV
Turingmachine:
InformalDefinition:
WeconsiderhereabasicmodelofTMwhichisdeterministicandhaveone-
tape.Therearemanyvariations,all are equally powerfull.
Thebasicmodelof TMhasafinitesetofstates,asemi-infinitetapethathasaleftmostcellbutisinfinitetothe
right and a tape head that can move left and right over the tape, reading and writing symbols.
Foranyinput w with |w|=n, initiallyit is written on the n leftmost (continguous)tape cells. The
infinitelymany cells to the right of the inputall contain a blanksymbol, B whcih is aspecial tape
symbol that is not an input symbol. The machine starts in its start state with its head scanning
the leftmost symbol of the input w. De- pendinguponthesymbol
scannedbythetapeheadandthecurrentstatethemachinemakesamovewhich consists of the
following:
 writesanewsymbolonthattapecell,
movesitsheadonecelleithertotheleftortotherightand
 (possibly)entersanewstate.
Theactionittakesineachstepisdeterminedbyatransitionfunctions.Themachinecontinuescomputing(i
.e. making moves) until
 itdecidesto"accept"itsinputbyenteringaspecialstatecalledacceptorfinalstateor
 haltswithoutacceptingi.e.rejectingtheinputwhenthereisnomovedefined.
OnsomeinputstheTMmanykeeponcomputingforeverwithouteveracceptingorrejectingtheinput,inwh
ich case it is said to "loop" on that input
FormalDefinition:
Formally,adeterministicturingmachine(DTM)isa7-tuple ,where
 Qisafinitenonemptysetofstates.
 isafinitenon-emptysetoftapesymbols,callledthetapealphabetof M.
 isafinitenon-emptysetofinputsymbols,calledtheinputalphabetof M.
 isthetransitionfunctionofM,
 istheinitialorstart state.
 is theblanksymbol
 isthesetoffinalstate.
So,giventhecurrentstateandtapesymbolbeingread,thetransitionfunctiondescribesthenextstate,sy
mbol to be written on the tape, and the direction in which to move the tape head ( L and R
denote left and right, respectively ).
Transitionfunction:
 TheheartoftheTMisthetransitionfunction, becauseittellsushowthemachinegetsonestepto
the next.
 whenthemachineisinacertainstateq Qandtheheadiscurrentlyscanningthetapesymbo
l , and if , then the machine
1. replacesthesymbolXbyYonthetape
2. goestostatep,and
3. thetapeheadmovesonecell(i.e.onetapesymbol)totheleft(orright)if DisL(orR).
TheID(instantaneousdescription)of aTMcapturewhatisgoingoutatanymomenti.e.itcontainsallthe
information to exactly capture the "current state of the computations".
Itcontainsthefollowing:
 Thecurrentstate,q
 Thepositionofthetapehead,
 Theconstantsofthetapeuptotherightmostnonblanksymbolorthesymboltotheleftofthehea
d, whichever is rightmost.
Notethat,althoughthereisnolimitonhowfarrighttheheadmaymoveandwritenonblanksymbolsonthe
tape, at any finite
time,theTMhasvisitedonlyafiniteprefixoftheinfinitetape.
AnID(orconfiguration)ofaTMMisdenotedby where and
 is the tape contents to theleft of the head

 qisthecurrent state.
 isthetapecontentsatortotherightofthetapehead
That is, the tape head is currently scanning the leftmost tape symbol of . ( Note that
if ,thenthetape head is scanning
a blank symbol)
If isthestartstateandwistheinputtoaTMMthenthestartingorinitialconfigurationofMisonviously
denoted by
MovesofTuringMachines
Toindicateonemoveweusethesymbol .Similarly,zero,one,ormore moves will berepresentedby

.A move of a TM
Misdefinedasfollows.
Let be an ID of M where , and
. Letthereexistsa
transition ofM.
Then we write meaningthatID yields
 Alternatively , if isatransitionofM,thenwewrite which

means that the ID yields
 Inotherwords,whentwoIDs arerelatedbythe relation ,wesaythatthefirstoneyields

thesecond ( or the second is the result of the first) by one move.
 If IDjresultsfromIDi byzero, oneormore(finite)moves thenwewrite (If theTMMis
understand, then the subscript M can be dropped from or )
SpecialBoundaryCases
 Let be an ID and beantransitionofM.Then .Thatis,theheadisnot

allowed to fall off the left end of the tape.
 Let beanIDand thenfigure(Note that isequivalent to )
 Let beanIDand thenfigure
 Let beanIDand thenfigure
ThelanguageacceptedbyaTM ,denotedasL(M)is
L(M)={ w| andfigureforsomep Fand }
In other words the TM M accepts a string
thatcauseMtoenterafinaloracceptingstatewhenstarted in its initial ID (i.e. ). That is a TM M

accepts the string if a sequence of IDs,
existssuchthat
 istheinitialorstartingIDofM
 ;
 TherepresentationofIDkcontainsanacceptingstate.
ThesetofstringsthatMacceptsisthelanguageofM,denotedL(M),asdefinedabove
More about configuration and acceptance
 AnID ofMiscalledanaccepting(orfinal)IDif
 An ID is called a blocking (or halting) ID if isundefinedi.e.theTMhasnomoveatthis

point.
 is called reactable from if
 is the initial (or starting) ID if is the input to the TM and

istheinitial(orstart)state of M.
Onanyinputstring
either
 M halts on wif there exists a blocking (configuration) ID, such that
Therearetwocasestobeconsidered
 M accepts w if I is an accepting ID. The set of all acceptedbyMisdenotedasL(M)as

already defined
 Mrejectswif isablockingconfiguration.Denotebyreject(M),thesetofall rejectedbyM.
or
 Mloopsonwifitdoesnothaltonw.
Let loop(M) be the set of all onwhichMloopsfor.
It is quite clear that
Thatis,weassumethataTMMhalts
 Whenitentersanaccepting or
 Whenitentersablocking i.e.whenthereisnonextmove.
However,onsomeinputstring,, ,itispossiblethattheTMMloopsforeveri.e.itneverhalts
TheHaltingProblem
The input to a Turing machine is a string. Turing machines themselves can be written as
strings. Since these strings can be used asinputto other Turingmachines. A “Universal Turing
machine”isonewhoseinputconsistsofadescriptionMofsomearbitraryTuringmachine,and
someinputwtowhichmachineMistobeapplied,wewritethiscombinedinputasM+w.This produces the
same output that would be produced by M. This is written as
UniversalTuringMachine(M+w)=M(w).
AsaTuringmachinecanberepresentedasastring,itisfullypossibletosupplya Turing
machineasinput toitself,forexampleM(M).Thisisnotevena particularlybizarre thingtodofor example,
suppose you have written a C pretty printer in C, then used the Pretty printer on itself.
Another common usage is Bootstrapping—where some convenient languages used to write a
minimal compiler for some new language L, then used thisminimal compiler for Lto write a new,
improved compiler for language L. Each time a new feature is added to language L, you can
recompileandusethisnewfeatureinthenextversionofthecompiler.Turingmachinessometimes halt, and
sometimes they enter an infinite loop.
A Turing machine might halt for one input string, but go into an infinite loop when given
someotherstring.Thehaltingproblemasks:“Itispossibletotell,ingeneral,whetheragiven
machine will halt for some given input?” Ifit is possible, then there is aneffective procedure tolook at
a Turing machine and its input and determine whether the machine will halt with that input. If there
is an effective procedure, then we can build a Turing machine to implement it. Suppose we have a
Turingmachine “WillHalt” which,givenaninput stringM+w,will haltandacceptthe string
ifTuringmachineMhaltsoninputwandwill haltandrejectthe stringifTuringmachineMdoesnot
haltoninputw.WhenviewedasaBooleanfunction,“WillHalt(M,w)”haltsandreturns“TRUE”in the first
case, and (halts and) returns “FALSE” in the second.
Theorem
TuringMachine“WillHalt(M,w)”doesnotexist.
Proof: This theorem is proved by contradiction. Suppose we could build a machine “WillHalt”.
Thenwecancertainlybuildasecondmachine,“LoopIfHalts”,thatwillgointoaninfiniteloopif and only if
“WillHalt” accepts its input:
FunctionLoopIfHalts(M,w):
ifWillHalt(M,w)then
while true do { }
else
returnfalse;
Wewillalsodefineamachine“LoopIfHaltOnItSelf”that,foranygiveninputM,representinga
Turingmachine,will determine what will happenifMisappliedtoitself,andloopsifMwill haltin this case.
FunctionLoopIfHaltsOnItself(M):
returnLoopIfHalts(M,M):
Finally,weaskwhathappensifwetry:
FunctionImpossible:
returnLoopIfHaltsOnItself(LoopIfHaltsOnItself):
Thismachine,whenappliedtoitself,goesintoaninfinite loopifandonlyifithaltswhen applied to
itself. This is impossible. Hence the theorem is proved.
ImplicationsofHaltingProblem
Programming
TheTheoremof“HaltingProblem”doesnotsaythatwecanneverdeterminewhetherornot
agivenprogramhaltsonagiveninput.Mostofthetimes,forpracticalreasons,wecouldeliminate infinite
loops from programs. Sometimes a “meta-program” is used to check another program for potential
infinite loops, and get this meta-program to work most of the time.
Thetheoremsaysthatwecannoteverwritesuchameta-programandhaveitworkallofthe
time.Thisresultisalsousedtodemonstrate thatcertainotherprogramsarealsoimpossible. The
basic outline is as follows:
(i) IfwecouldsolveaproblemX,wecouldsolvetheHaltingproblem
(ii) WecannotsolvetheHaltingProblem
(iii) Therefore,wecannotsolveproblemX
ATuringmachinecanbe"programmed,"inmuchthesamemannerasacomputeris
programmed.WhenonespecifiesthefunctionwhichweusuallycallforaTm,heisreallywriting a program
for the Tm.
1. StorageinfiniteControl
The finite control can be used to hold a finite amount of information. To do so, the state is
writtenasapairofelements,oneexercisingcontrolandtheotherstoringasymbol.Itshouldbe
emphasizedthatthisarrangementisforconceptualpurposesonly.Nomodificationinthedefinition of the
Turing machine has been made.
Example
ConsidertheTuringmachine
Solution
2. MultipleTracks
We can imaginethat the tape oftheTuringmachine is divided into k tracks, for anyfinite k. This
arrangementisshowninFig.,withk=3.Whatisactuallydoneisthatthesymbolsonthetapeare considered as
k-tuples. One component for each track.
Example
ThetapeinFig.canbeimaginedtobethatofaTuringmachinewhichtakesabinaryinput
greaterthan2,writtenonthefirsttrack,anddeterminesifitisaprime.Theinputissurroundedby¢ and $ on the
first track.
Thus,theallowableinputsymbolsare[¢,B,B],[0,B,B],[1,B,B],and[$,B,B].These
symbolscanbeidentifiedwith¢,0,1,and$,respectively,whenviewedasinputsymbols.Theblank
symbolcanberepresentedby[B,B,B]
Totestifitsinputisaprime,theTmfirstwritesthenumbertwoinbinaryonthesecondtrack
andcopiesthe firsttrackontothethirdtrack.Then,thesecondtrackissubtracted,asmanytimesas possible,
from the third track, effectively dividing the third track by the second and leaving the remainder. If
the remainder is zero, the number on the first track is not a prime. If the remainder is nonzero,
increase the number on the second track by one.
If now the second track equals the first, the number on the first track is a prime, because it cannotbe
divided by any number between one and itself. If the second is less than the first, the whole
operationisrepeatedforthenewnumberonthesecondtrack.InFig.,theTmistestingtodetermine if 47 is a
prime. The Tm is dividing by5; already5 has been subtracted twice, so 37 appears on the third
track.
3. Subroutines
UNDECIDABILITY
DesignaTuringmachinetoaddtwogivenintegers. Solution:
SomeunsolvableProblemsareasfollows:
(i) DoesagivenTuringmachineMhaltsonall input?
(ii) DoesTuringmachineMhaltforanyinput?
(iii) IsthelanguageL(M)finite?
(iv) DoesL(M)containastringoflengthk,forsomegivenk?
(v) DotwoTuringmachinesM1andM2acceptthesamelanguage?
Itisveryobviousthatifthereisnoalgorithmthatdecides,foranarbitrarygivenTuringmachineM and input
string w, whether or not M accepts w. These problems for which no algorithms exist are called
“UNDECIDABLE” or “UNSOLVABLE”.
CodeforTuringMachine:
Diagonalizationlanguage:
ThistablerepresentslanguageacceptablebyTuringmachine
ProofthatLdisnotrecursivelyenumerable:
RecursiveLanguages:
Universal
Language:
UndecidabilityofUniversalLanguage:
Problem-Reduction :
IfP1reducedtoP2,
Then P2 is at least as hard as P1.
Theorem:IfP1reducestoP2then,
 IfP1isundecidablethesoisP2.
 IfP1isNon-REthensoisP2.
Post'sCorrespondenceProblem(PCP)
A post correspondence system consists of a finite set of ordered pairs
where for some alphabet .
Anysequenceof numbers
iscalledasolutiontoaPostCorrespondenceSystem.
ThePost'sCorrespondenceProblemistheproblemofdeterminingwhether
a Post Correspondence system has a solutions.
Example1:Considerthepostcorrespondencesystem
Thelist1,2,1,3isasolutiontoit.
Because
i xi yi
1
2
3
(ApostcorrespondencesystemisalsodenotedasaninstanceofthePCP)
Example 2 : The following PCP instance has no solution
i xi yi
1
2
This can be proved as follows. cannot be chosen at thestart, since than the LHS andRHS
would differ in the first symbol ( in LHS and in RHS). So, we must start with
.Thenextpairmustbe so that the 3 rd symbol in the RHS
becomes identical to that of the LHS, which is a . After this
step,LHSandRHSarenotmatching.If isselectednext,thenwouldbemismatchedinthe7thsymbol
( in LHS and in RHS). If
isselected,instead,therewillnotbeanychoicetomatchthebothsidein the next step.
Example3:Thelist1,3,2,3isasolutiontothefollowingPCPinstance.
i xi yi
1 1 101
2 10 00
3 011 11
Thefollowingpropertiescaneasilybeproved.
PropositionThePostCorrespondenceSystem
hassolutionsifandonlyif
Corollary:PCPoverone-letteralphabetisdecidable.
Proposition Any PCP instance over an alphabet with isequivalenttoaPCPinstanceoveran

alphabet with
Proof:Let
Consider We can now encode every as anyPCPinstanceover willnow

have only two symbols, 0 and 1 and, hence, is equivalent to a PCP instance over
Theorem:PCPisundecidable.Thatis,thereisnoalgorithmthatdetermineswhetheranarbitraryPost
Correspondence System has a solution.
Proof:ThehaltingproblemofturningmachinecanbereducedtoPCPtoshowtheundecidabilityof
PCP.Since
haltingproblemofTMisundecidable(alreadyproved),ThisreductionshowsthatPCPisalsoundecidable.T
he proof is little bit lengthy and left as an exercise.
Someundecidableproblemincontext-freelanguages
WecanusetheundecidabilityofPCPtoshowthatmanyproblemconcerningthecontext-
freelanguagesare undecidable. To prove this we reduce the PCP to each of these problem. The
following discussion makes it clear how PCP can be used to serve this purpose.
Let beaPostCorrespondenceSystemoverthealphabet
.Weconstruct two CFG's Gxand Gyfrom the ordered pairs x,y respectively as follows.
and
where
and
it is clear that the grammar
generatesthestringsthatcanappearintheLHSofasequencewhilesolving the PCP followed by a

sequence of numbers. The sequence of number at the end records the sequence of
strings from the PCP instance (in reverse order) that generates the string. Similarly,
generatesthestrings that can be obtained from the RHS of a sequence and the corresponding
sequence of numbers (in reverse order).
Now,ifthePostCorrespondenceSystemhasasolution,thentheremustbeasequence
Accordingtotheconstructionof and
Inthiscase
Hence, and implying
Conversely,let
Hence, w must be in the form w1w2 where and w2 in a sequence
(since,onlythatkindof strings can be generated by each of
and ).
Now,thestring isasolutiontothePostCorrespondenceSystem.
ItisinterestingtonotethatwehaveherereducedPCPtothelanguageofpairsofCFG,swhoseintersectionis
nonempty. The following result is a direct conclusion of the above.
Theorem:GivenanytwoCFG'sG1andG2thequestion"Is "is undecidable.
Proof:AssumeforcontradictionthatthereexistsanalgorithmAtodecidethisquestion.Thiswouldimplyth
at PCP is decidable as shown below.
For any Post Correspondence System, P construct grammars and by
using the constructions elaboratedalready.WecannowusethealgorithmAtodecidewhether
and
Thus,PCPisdecidable,acontradiction.So,suchanalgorithmdoesnotexist.
If and areCFG'sconstructedfromanyarbitraryPostCorrespondenceSystem,thanitisnotdifficultto
show that and arealsocontext-free,eventhoughtheclassofcontext-

freelanguagesarenot closed under complementation.
andtheircomplementscanbeusedinvariouswaystoshowthatmanyotherquestions
related to CFL's are undecidable. We prove here some of those.
Theorem:FoeanytwoarbitraryCFG's thefollowingquestionsareundecidable
i. Is
ii. Is
iii. Is
Proof:
i. If then,
Hence,itsufficetoshowthatthequestion“Is "is undecidable.
Since, and are CFl's and CFL's are closed under union,
isalsocontext- free. By DeMorgan's theorem,
Ifthereisanalgorithmtodecidewhether wecanuseittodecide whether
ornot.Butthisproblemhasalreadybeenprovedtobeundecidable.
Hencethereisnosuchalgorithmtodecideornot.
ii.
Let P be any arbitrary Post correspondence system and and

areCFg'sconstructedfromthepairsof
strings.
mustbeaCFLandletG1generatesL1.Thatis,
by De Morgan's theorem, as shown already, any string,
representsasolutiontothe PCP. Hence, contains all but those
strings representing the solution to the PCP.
Let forsameCFGG2.
It is now obvious that ifandonlyifthePCPhasnosolutions,whichisalreadyprovedtobe
undecidable. Hence, the question “Is ?" is undecidable.
iii.
Let beaCFGgeneratingthelanguage andG2beaCFG generating
where and areCFG.sconstructedfromsamearbitraryinstanceof PCP.
iff
i.e.iffthePCPinstancehasnosolutionsasdiscussedinpart(ii).
Hencetheproof.
Theorem:ItisundecidablewhetheranarbitraryCFGisambiguous.
Proof : Consider an arbitrary instance of PCP and construct the CFG's and
fromtheorderedpairsof
strings.
WeconstructanewgrammarGfrom and asfollows.
where
issameasthatof and .
ThisconstructionsgivesareductionofPCPto the-------ofwhetheraCFGisambiguous,thusleadingtothe
undecidabilityof thegivenproblem.Thatis,wewillnowshowthatthePCPhasasolutionif andonlyif Gis
ambiguous. (where G is constructed from an arbitrary instance of PCP).
OnlyifAssume that isasolutionsequencetothisinstanceofPCP.
Considerthefollowingtwoderivation in .
But,
is a solution to the PCP. Hence the same string of terminals
hastwoderivations.Boththese derivations are, clearly, leftmost. Hence G is ambiguous.
IfItisimportanttonotethatanystringofterminalscannothavemorethanonederivation in and
Because,everyterminalstringwhicharederivableunderthesegrammarsendswithasequenceofinte
gers
Thissequenceuniquelydetermineswhichproductionsmustbeusedateverystepofthederivat
ion.
Hence, if a terminal string,

,hastwoleftmostderivations,thenoneofthemmustbeginwitht
he step.
thencontinueswithderivationsunder
In both derivations the resulting string must end with a sequence for same
Thereverseof this
sequence must be a solution to the PCP, because the string that precede in one case is
and intheothercase.Sincethestringderivedinbothcasesareidentical,the
sequence
mustbeasolutiontothePCP.
Hence the proof

Classp-problemsolvableinpolynomialtime:
Nondeterministicpolynomialtime:
Anondeterministic TMthatnevermakesmorethanp(n)movesinanysequence ofchoicesforsome
polynomial p is said to be non polynomial time NTM.
 NPisthesetoflanguagsthatareacceptedbypolynomialtimeNTM’s
 ManyproblemsareinNPbutappearnottobeinp.
 Oneofthegreatmathematicalquestionsofourage:isthereanythinginNPthatisnotinp?
NP-completeproblems:
IfWecannotresolvethe“p=npquestion,wecanatleastdemonstratethatcertainproblemsinNPare the
hardest , in the sense that if any one of them were in P , then P=NP.
 ThesearecalledNP-complete.
 Intellectualleverage:EachNP-completeproblem’sapparentdifficultyreinforcesthebelief that
they are all hard.
MethodsforprovingNP-Completeproblems:
 Polynomialtimereduction(PTR):Taketimethatissomepolynomialintheinputsizeto convert
instances of one problem to instances of another.
 IfP1PTRtoP2andP2isinP1thesoisP1.
 StartbyshowingeveryprobleminNPhasaPTRtoSatisfiabilityofBooleanformula.
 Then,moreproblemscanbeprovenNPcompletebyshowingthatSATPTRstothem directly
or indirectly.

Units

Uploaded by

Copyright:

Available Formats

Units

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Units

Uploaded by

Copyright:

Available Formats

MODULE-I

Foranysetsa,b,cifa∩b =Φ andcisasubsetofbtheprovethata∩c=Φ Given :

cc arefour strings over the alphabet { a, b, c }.

Notethatxisaprefix(suffixorsubstring)tox,foranystringxandεisaprefix(suffixor substring) to any

Astringxisaproperprefix(suffix)of stringyif xisaprefix(suffix)of yandx≠y. In the

above example, all prefixes except 011 are proper prefixes.

Powers of Strings : For any string x and integer , we use todenotethestring

Thesetofallstringsoveranalphabet isdenotedby .Thatis,

Example :If ={a,b},then ={ε,a,b,aa,ab,ba,bb,aaa,aab,aba,abb,baa,…}.

Please note that if , then that is .Itmaylookoddthatonecanproceed

Thesetofallnonemptystringsoveranalphabet isdenotedby . Thatis,

Setoperationsonlanguages: Since languagesare setof stringswe can applyset operationsto

Intersection : A string, xϵ L1 ∩L2 iff x ϵ L1 and x ϵ L2 .

Complement:Usually, is the universe that a complement is taken with respect to.

1. LetL={0,11,01,011}.Then = { 0, 11, 10, 110 }.

Languageconcatenation:Theconcatenationoflanguages and isdefinedas

Kleene's Star operation :The Kleene star operation on a language L, denoted as is

={e} {a,ab} {aa,aab,aba,abab} …

Informally, a state of a system is an instantaneous description of that system which gives

Anautomatonprocessesastringonthetapebyrepeatingthefollowingactionsuntilthetape head has

Ifitisanacceptstate,theinputstringisaccepted;otherwise,thestringisrejected.Summariz- ing all the

above we can formulate the following formal definition:

ThelanguageacceptedorrecognizedbyaDFAM istheset ofall stringsacceptedbyM ,and

is denoted by i.e. The notion of

That is, isthestatetheautomationreacheswhenitstartsfromthestateqandfinish

Explanation:Wecannotreachfindstate w/0orinthei/pstring.Therecan be anyno.

pointer to .Whentheendoftheinputstringwisencountered,thepointerison some

Example : Let ={0,1,2}.Then(0+21)*(1+F)isaRE,becausewecanconstructthis

Notation:If risaREoversomealphabetthenL(r)isthelanguageassociatewithr.Wecan define the

2. isaREdescribingthelanguage{ }i.e.L( )={ }.

4. If and areREsdenotinglanguageL( )andL( )respectively,then

i) isaregularexpressiondenotingthelanguageL( )=L( )L( )

iv)( )isaregularexpressiondenotingthelanguageL(( ))=L( )

Example:TheRE(ab)*+b representsthelanguage(L(a)L(b))* L(b).

Note:Thenotation isusedtorepresenttheRErr*.Similarly, representsthe RE

Exercise : Give a RE r over {0,1} s.t. L(r)={ hasatleastonepairof

Example : Considering the above example it becomes clean that the RE

Recallthat, language that isacceptedbysome FAsareknown

Lemma: IfL(r) isalanguagedescribedbytheRE r,thenitisregulari.e.thereisaFA

Proof : To prove the lemma, we apply structured index on the expression r.

 Case(i): .Then . Then andthefollowingNFAN

Formally, where for or

 Case(i):ConsidertheRE denotingthelanguage .We

Create anew(initial)start state and give -transition to the initial state of

is easy to prove that

 Case(ii): ConsidertheRE denotingthelanguage .Weconstruct

concatenated)anynoof times. Hence

Case(iv):Let =( ).ThentheFA isalsotheFAfor( ),sincetheuseof parentheses

In an -transition, the tape head doesn't do anything- it doesnot read and it

implyingthatthenext state could by any one of w/o consuming the

Them/cisnotdeterministicsince thereare two transitionsfrom state on input1

Formally, an NFA is a quituple whereQ, , ,andF bear

That is, L(N) is the set of all strings w in such that

containsatleastone accepting state.

Ifweremovedthe -transition fromthe -NFA,thenweneedtomoves

whicharereachablefromstateq(in the - NFA ) on same input symbol q. This

Theorem if L isaccepted byan -NFA N, then there issomeequivalent

Let bethegiven with

Where, forall and and

Wecanshowthat i.e.N'andNare equivalent.

Example:TheRE(ab)+b representsthelanguage(L(a)L(b)) L(b).