DFA p1+Exercise-ToC Papadimiriou

Download as pdf or txt
Download as pdf or txt
You are on page 1of 9

Finite Automata

2.1 DETERMINISTIC FINITE AUTOMATA


This book is about mathematical models of computers and algorithms. In this
and the next two chapters we shall define increasingly powerful models of com-
putation, more and more sophisticated devices for accepting and generating lan-
guages. Seen in the light of the whole development of the book, this chapter will
seem a rather humble beginning: Here we take up a severely restricted model
of an actual computer called a finite automaton,t or finite-state machine.
The finite automaton shares with a real computer the fact that it has a "central
processing unit" of fixed, finite capacity. It receives its input as a string, deliv-
ered to it on an input tape. It delivers no output at all, except an indication
of whether or not the input is considered acceptable. It is, in other words, a
language recognition device, as described at the end of Chapter 1. What makes
the finite automaton such a restricted model of real computers is the complete
absence of memory outside its fixed q:mtral processor.
Such a simple computational model might at first be considered too trivial
to merit serious study: of what use is a computer with no memory? But a finite
automaton is not really without memory; it simply has a memory capacity that
is fixed "at the factory" and cannot thereafter be expanded. It can be argued
that the memory capacity of any computer is limited -by budget constraints,
by physical limits, ultimately by the size of the universe. Whether the best
mathematical model for a computer is one that has finite memory or one that
has unbounded memory is an interesting philosophical question; we shall study
both kinds of models, starting with the finite one and later dwelling much more

t An automaton (pronounced: o-to-ma-ton, plural: automata) is a machine de-


signed to respond to encoded instructions; a robot.

55
56 Chapter 2: FINITE AUTOMATA

on the unbounded one.


Even if one thinks, as we do, that the correct way to model computers and
algorithms is in terms of an unbounded memory capacity, we should first be
sure that the theory of finite automata is well understood. It turns out that
their theory is rich and elegant, and when we understand it we shall be in a
better position to appreciate exactly what the addition of auxiliary memory
accomplishes in the way of added computational power.
A further reason for studying finite automata is their applicability to the
design of several common types of computer algorithms and programs. For
example, the lexical analysis phase of a compiler (in which program units such
as 'begin' and '+' are identified) is often based on the simulation of a finite
autotnaton. Also, the problem of finding an occurrence of a string within another
--for example, whether any of the strings air', water, earth, and fire occur in the
text of Elements of the Theory of Computation t - can also be solved efficiently
by methods originating from the theory of finite automata.

Input
tape

Finite
control

Figure 2-1

Let us now describe the operation of a finite automaton in more detail.


Strings are fed into the device by means of an input tape, which is divided into
squares, with one symbol inscribed in each tape square (see Figure 2-1). The
main part of the machine itself is a "black box" with innards that can be, at
any specified moment, in one of a finite number of distinct internal states. This
black box -called the finite control- can sense what symbol is written at any
position on the input tape by means of a movable reading head. Initially, the
reading head is placed at the leftmost square of the tape and the finite control is
set in a designated initial state. At regular intervals the automaton reads one
symbol from the input tape and then enters a new state that depends only on the

t All four of them do, three of them outside this page.


2.1: Deterministic Finite Automata 57

current state and the symbol just read -this is why we shall call this device a
deterministic finite automaton, to be contrasted to the nondeterministic version
introduced in the next section. After reading an input symbol, the reading head
moves one square to the right on the input tape so that on the next move it will
read the symbol in the next tape square. This process is repeated again and
again; a symbol is read, the reading head moves to the right, and the state of
the finite control changes. Eventually the reading head reaches the end of the
input string. The automaton then indicates its approval or disapproval of what
it has read by the state it is in at the end: if it winds up in one of a set of final
states the input string is considered to be accepted. The language accepted
by the machine is the set of strings it accepts.
When this informal account is boiled down to its mathematical essentials,
the following formal definition results.

Definition 2.1.1: A deterministic finite automaton is a quintuple M


(K,~, J, s, F) where

K is a finite set of states,


~ is an alphabet,
s E K is the initial state,
F ~ K is the set of final states, and
J, the transition function, is a function from K x ~ to K.

The rules according to which the automaton M picks its next state are
encoded into the transition function. Thus if M is in state q E K and the
symbol read from the input tape is a E ~, then J(q, a) E K is the uniquely
determined state to which K passes.
Having formalized the basic structure of a deterministic finite automaton,
we must next render into mathematical terms the notion of a computation by
an automaton on an input string. This will be, roughly speaking, a sequence
of configurations that represent the status of the machine (the finite control,
reading head, and input tape) at successive moments. But since a deterministic
finite automaton is not allowed to move its reading head back into the part of
the input string that has already been read, the portion of the string to the left
of the reading head cannot influence the future operation of the machine. Thus
a configuration is determined by the current state and the unread part of the
string being processed. In other words, a configuration of a deterministic
finite automaton (K,~, J, s, F) is any element of K x ~*. For example, the
configur ation illustrated in Figure 2-1 is (q2, ababab).
The binary relation I- M holds between two configurations of 111 if and only
if the machine can pass from one to the other as a result of a single move. Thus
if (q,w) and (q',w') are two configurations of M, then (q,w) I-M (q',w') if and
only if w = aw' for some symbol a E ~, and J(q, a) = q'. In this case we say
58 Chapter 2: FINITE AUTOMATA

that (q,w) yields (q',w') in one step. Note that in fact I-M is a function from
K x ~+ to K x ~*, that is, for every configuration except those of the form (q, e)
there is a uniquely determined next configuration. A configuration of the form
(q, e) signifies that M has consumed all its input, and hence its operation ceases
at this point.
We denote the reflexive, transitive closure of I-M by I-~; (q,w) I-~ (q',w')
is read, (q,w) yields (q',w') (after some number, possibly zero, of steps). A
string w E ~* is said to be accepted by M if and only if there is a state q E F
such that (s, w) I-~ (q, e). Finally, the language accepted by M, L(M), is
the set of all strings accepted by M.
Example 2.1.1: Let M be the deterministic finite automaton (K,~, J, s, F),
where
K={qo,qd,
~ = {a,b},
s = qo,
F = {qo},
and J is the function tabulated below.

q (J J(q, (J)
qo a qo
qo b ql
ql a ql
ql b qo

It is then easy to see that L( M) is the set of all strings in {a, b} * that have
an even number of b's. For M passes from state qo to ql or from ql back to qo
when a b is read, but 111 essentially ignores a's, always remaining in its current
state when an a is read. Thus M counts b's modulo 2, and since qo (the initial
state) is also the sole final state, M accepts a string if and only if the number
of b's is even.
If M is given the input aabba, its initial configuration is (qo, aabba). Then

(qo, aabba) I-M (qo, abba)


I-M (qo,bba)
I-M (ql,ba)
I-M (qO, a)
I- M (qo, e)

Therefore (qo, aabba) I-j,{ (qo, e), and so aabba is accepted by M. <:;
2.1: Deterministic Finite Automata 59

Figure 2-2

The tabular representation of the transition function used in this example


is not the clearest description of a machine. We generally use a more convenient
graphical representation called the state diagram (Figure 2-2). The state
diagram is a directed graph, with certain additional information incorporated
into the picture. States are represented by nodes, and there is an arrow labeled
with a from node q to q' whenever J(q, a) = q'. Final states are indicated by
double circles, and the initial state is shown by a >. Figure 2-2 shows the state
diagram of the deterministic finite automaton of Example 2.1.1.
Example 2.1.2: Let us design a deterministic finite automaton M that accepts
the language L( M) = {w E {a, b} * : w does not contain three consecutive b's}.
We let M = (K,"L"J,s,F), where

K = {qo, ql , q2, q3 } ,
"L,={a,b},
s = qo,
F = {qO,ql,q2},

and J is given by the following table.

q (J J(q, (J)
qo a qo
qo b ql
ql a qo
ql b q2
q2 a qo
q2 b q3
q3 a q3
q3 b q3

The state diagram is shown in Figure 2-3. To see that M does indeed accept
the specified language, note that as long as three consecutive b's have not been
read, M is in state qi (where i is 0, 1, or 2) immediately after reading a run of
i consecutive b's that either began the input string or was preceded by an a. In
particular, whenever an a is read and M is in state qo, ql, or q2, M returns to
60 Chapter 2: FINITE AUTOMATA

its initial state qo. States qo, ql, and q2 are all final, so any input string not
containing three consecutive b's will be accepted. However, a run of three b's
will drive AI to state q3, which is not final, and AI will then remain in this state
regardless of the symbols in the rest of the input string. State q3 is said to be
a dead state, and if AI reaches state q3 it is said to be tmpped, since no further
input can enable it to escape from this state.O

Figure 2-3

Problems for Section 2.1

2.1.1. Let M be a deterministic finite automaton. Under exactly what circum-


stances is e E L(AI)? Prove your answer.
2.1.2. Describe informally the languages accepted by the deterministic finite au-
tomata shown in the next page.
2.1.3. Construct deterministic finite automata accepting each of the following lan-
guages.
(a) {w E {a, b}' : each a in w is immediately preceded by a b}.
(b) {w E {a,b}' : w has abab as a substring}.
(c) {w E {a, b}' : w has neither aa nor bb as a substring}.
(d) {w E {a,b}' : w has an odd number of a's and an even number of b's}.
(e) {w E {a,b}' : w has both ab and ba as substrings}.
2.1.4. A deterministic tinite-state transducer is a device much like a deter-
ministic finite automaton, except that its purpose is not to accept strings or
languages but to transform input strings into output strings. Informally, it
starts in a designated initial state and moves from state to state, depending
on the input, just as a deterministic finite automaton does. On each step,
however, it emits (or writes onto an output tape) a string of zero or one or
more symbols, depending on the current state and the input symbol. The
state diagram for a deterministic finite-state transducer looks like that for a
deterministic finite automaton, except that the label on an arrow looks like
2.1: Deterministic Finite Automata 61
a

(1....--------~O
b

a,b

a,b
a, b a,b
a a b a

a,b a,b

b
b

b
a
62 Chapter 2: FINITE AUTOMATA

bib
I
1Faa~
bib ale

alw, which means "if the input symbol is a, follow this arrow and emit out-
put w". For example, the deterministic finite-state transducer over {a, b}
shown above transmits all b's in the input string but omits every other a.
(a) Draw state diagrams for deterministic finite-state transducers over {a, b}
that do the following.
(i) On input w, produce output an, where n is the number of occurrences
of the substring ab in w.
(ii) On input w, produce output an, where n is the number of occurrences
of the substring aba in w.
(iii) On input w, produce a string of length w whose ith symbol is an a if
i = 1, or if i > 1 and the ith and (i - l)st symbols of ware different;
otherwise, the ith symbol of the output is a b. For example, on input
aabba the transducer should output ababa, and on input aaaab it should
output abbba.

(b) Formally define


(i) a deterministic finite-state transducer;
(ii) the notion of a configuration for such an automaton;
(iii) the yields in one step relation f- between configurations;
(iv) the notion that such an automaton produces output u on input w;
(v) the notion that such an automaton computes a function.

2.1.5. A deterministic 2-tape finite automaton is a device like a deterministic finite


automaton for accepting pairs of strings. Each state is in one of two sets;
depending on which set the state is in, the transition function refers to the
first or second tape. For example, the automaton shown below accepts all
pairs of strings (Wl,W2) E {a,b}* x {a,b}* such that IW21 = 2lwll·

......-..---------i r-----------------------..
,, ,,
! a,b! a,b
~~
i ! a,b
,,, ,,
,L _.. _.. _____________ .. __ .. ___ _
___ .. __________ !

States for States for


first tape second tape
2.2: Nondeterministic Finite Automata 63

(a) Draw state diagrams for deterministic 2-tape finite automata that accept
each of the following.
(i) All pairs of strings (WI,W2) in {a,b}* x {a,b}* such that IWII = IW21,
and wI(i) i- w2(i) for all i.
(ii) All pairs of strings (WI, W2) in {a, b}' x {a, b} * such that the length of
W2 is twice the number of a's in Wi plus three times the number of b's
in Wi.
(iii) {(anb,anb m ) : n,m;:::: O}.
(iv) {(anb, amb n ) : n, m ;:::: O}.

(b) Formally define


(i) a deterministic 2-tape finite automaton;
(ii) the notion of a configuration for such an automaton;
(iii) the yields in one step relation f- between configurations;
(iv) the notion that such an automaton accepts an ordered pair of strings;
(v) the notion that such an automaton accepts a set of ordered pairs of
strings.

2.1.6. This problem refers to Problems 2.1.5 and 2.1.6. Show that if f : ~* f-t ~* is
a function that can be computed by a deterministic finite-state transducer,
then {(w, f (w)) : W E ~*} is a set of pairs of strings accepted by some
deterministic two-tape finite automaton.

2.1. 7. We say that state q of a deterministic finite automaton M = (K, ~,J, qo, F)
is reachable if there exists W E ~* such that (qo, w) f-M (q, e). Show that
if we delete from NI any nonreachable state, an automaton results that
accepts the same language. Give an efficient algorithm for computing the
set of all reachable states of a deterministic finite automaton.

liiJ NONDETERMINISTIC FINITE AUTOMATA

In this section we add a powerful and intriguing feature to finite automata.


This feature is called nondeterminism, and is essentially the ability to change
states in a way that is only partially determined by the current state and input
symbol. That is, we shall now permit several possible "next states" for a given
combination of current state and input symbol. The automaton, as it reads the
input string, may choose at each step to go into anyone of these legal next states;
the choice is not determined by anything in our model, and is therefore said to
be nondeterministic. On the other hand, the choice is not wholly unlimited
either; only those next states that are legal from a given state with a given input
symbol can be chosen.

You might also like