Mfcs Notes For Mca 2nd Sem
Mfcs Notes For Mca 2nd Sem
Mfcs Notes For Mca 2nd Sem
Introduction 5
Chapter 1. Logic, Proofs 6
1.1. Propositions 6
1.2. Predicates, Quantiers 11
1.3. Proofs 13
Chapter 2. Sets, Functions, Relations 19
2.1. Set Theory 19
2.2. Functions 27
2.3. Relations 32
Chapter 3. Algorithms, Integers 38
3.1. Algorithms 38
3.2. The Euclidean Algorithm 48
3.3. Modular Arithmetic, RSA Algorithm 52
Chapter 4. Induction, Recurences 59
4.1. Sequences and Strings 59
4.2. Mathematical Induction 62
4.3. Recurrence Relations 65
Chapter 5. Counting 69
5.1. Basic Principles 69
5.2. Combinatorics 71
5.3. Generalized Permutations and Combinations 73
5.4. Binomial Coecients 75
5.5. The Pigeonhole Principle 77
Chapter 6. Probability 78
6.1. Probability 78
Chapter 7. Graph Theory 82
7.1. Graphs 82
7.2. Representations of Graphs 88
7.3. Paths and Circuits 91
3
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
CONTENTS 4
7.4. Planar Graphs 97
Chapter 8. Trees 100
8.1. Trees 100
8.2. Binary Trees 102
8.3. Decision Trees, Tree Isomorphisms 104
8.4. Tree Transversal 113
8.5. Spanning Trees 116
Chapter 9. Boolean Algebras 122
9.1. Combinatorial Circuits 122
9.2. Boolean Functions, Applications 127
Chapter 10. Automata, Grammars and Languages 133
10.1. Finite State Machines 133
10.2. Languages and Grammars 137
10.3. Language Recognition 144
Appendix A. 150
A.1. Ecient Computation of Powers Modulo m 150
A.2. Machines and Languages 152
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
Introduction
These notes are intended to be a summary of the main ideas in
course CS 310: Mathematical Foundations of Computer Science. I
may keep working on this document as the course goes on, so these
notes will not be completely nished until the end of the quarter.
The textbook for this course is Keneth H. Rosen: Discrete Mathe-
matics and Its Applications, Fifth Edition, 2003, McGraw-Hill. With
few exceptions I will follow the notation in the book.
These notes contain some questions and exercises intended to
stimulate the reader who wants to play a somehow active role while
studying the subject. They are not homework nor need to be addressed
at all if the reader does not wish to. I will recommend exercises and
give homework assignments separately.
Finally, if you nd any typos or errors, or you have any suggestions,
please, do not hesitate to let me know.
Miguel A. Lerma
mlerma@math.northwestern.edu
Northwestern University
Spring 2005
http://www.math.northwestern.edu/~mlerma/courses/cs310-05s/
5
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
CHAPTER 1
Logic, Proofs
1.1. Propositions
A proposition is a declarative sentence that is either true or false
(but not both). For instance, the following are propositions: Paris
is in France (true), London is in Denmark (false), 2 < 4 (true),
4 = 7 (false). However the following are not propositions: what
is your name? (this is a question), do your homework (this is a
command), this sentence is false (neither true nor false), x is an
even number (it depends on what x represents), Socrates (it is not
even a sentence). The truth or falsehood of a proposition is called its
truth value.
1.1.1. Connectives, Truth Tables. Connectives are used for
making compound propositions. The main ones are the following (p
and q represent given propositions):
Name Represented Meaning
Negation p not p
Conjunction p q p and q
Disjunction p q p or q (or both)
Exclusive Or p q either p or q, but not both
Implication p q if p then q
Biconditional p q p if and only if q
The truth value of a compound proposition depends only on the
value of its components. Writing F for false and T for true, we
can summarize the meaning of the connectives in the following way:
6
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.1. PROPOSITIONS 7
p q p p q p q p q p q p q
T T F T T F T T
T F F F T T F F
F T T F T T T F
F F T F F F T T
Note that represents a non-exclusive or, i.e., p q is true when
any of p, q is true and also when both are true. On the other hand
represents an exclusive or, i.e., p q is true only when exactly one of
p and q is true.
1.1.2. Tautology, Contradiction, Contingency.
1. A proposition is said to be a tautology if its truth value is T
for any assignment of truth values to its components. Example:
The proposition p p is a tautology.
2. A proposition is said to be a contradiction if its truth value is F
for any assignment of truth values to its components. Example:
The proposition p p is a contradiction.
3. A proposition that is neither a tautology nor a contradiction is
called a contingency.
p p p p p p
T F T F
T F T F
F T T F
F T T F
6 6
tautology contradiction
1.1.3. Conditional Propositions. A proposition of the form if
p then q or p implies q, represented p q is called a conditional
proposition. For instance: if John is from Chicago then John is from
Illinois. The proposition p is called hypothesis or antecedent, and the
proposition q is the conclusion or consequent.
Note that p q is true always except when p is true and q is false.
So, the following sentences are true: if 2 < 4 then Paris is in France
(true true), if London is in Denmark then 2 < 4 (false true),
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.1. PROPOSITIONS 8
if 4 = 7 then London is in Denmark (false false). However the
following one is false: if 2 < 4 then London is in Denmark (true
false).
In might seem strange that p q is considered true when p is
false, regardless of the truth value of q. This will become clearer when
we study predicates such as if x is a multiple of 4 then x is a multiple
of 2. That implication is obviously true, although for the particular
case x = 3 it becomes if 3 is a multiple of 4 then 3 is a multiple of 2.
The proposition p q, read p if and only if q, is called bicon-
ditional. It is true precisely when p and q have the same truth value,
i.e., they are both true or both false.
1.1.4. Logical Equivalence. Note that the compound proposi-
tions p q and p q have the same truth values:
p q p p q p q
T T F T T
T F F F F
F T T T T
F F T T T
When two compound propositions have the same truth values no
matter what truth value their constituent propositions have, they are
called logically equivalent. For instance p q and p q are logically
equivalent, and we write it:
p q p q
Note that that two propositions A and B are logically equivalent
precisely when A B is a tautology.
Example: De Morgans Laws for Logic. The following propositions
are logically equivalent:
(p q) p q
(p q) p q
We can check it by examining their truth tables:
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.1. PROPOSITIONS 9
p q p q p q (p q) p q p q (p q) p q
T T F F T F F T F F
T F F T T F F F T T
F T T F T F F F T T
F F T T F T T F T T
Example: The following propositions are logically equivalent:
p q (p q) (q p)
Again, this can be checked with the truth tables:
p q p q q p (p q) (q p) p q
T T T T T T
T F F T F F
F T T F F F
F F T T T T
Exercise: Check the following logical equivalences:
(p q) p q
p q q p
(p q) p q
1.1.5. Converse, Contrapositive. The converse of a conditional
proposition p q is the proposition q p. As we have seen, the bi-
conditional proposition is equivalent to the conjunction of a conditional
proposition an its converse.
p q (p q) (q p)
So, for instance, saying that John is married if and only if he has a
spouse is the same as saying if John is married then he has a spouse
and if he has a spouse then he is married.
Note that the converse is not equivalent to the given conditional
proposition, for instance if John is from Chicago then John is from
Illinois is true, but the converse if John is from Illinois then John is
from Chicago may be false.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.1. PROPOSITIONS 10
The contrapositive of a conditional proposition p q is the propo-
sition q p. They are logically equivalent. For instance the con-
trapositive of if John is from Chicago then John is from Illinois is if
John is not from Illinois then John is not from Chicago.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.2. PREDICATES, QUANTIFIERS 11
1.2. Predicates, Quantiers
1.2.1. Predicates. A predicate or propositional function is a state-
ment containing variables. For instance x +2 = 7, X is American,
x < y, p is a prime number are predicates. The truth value of the
predicate depends on the value assigned to its variables. For instance if
we replace x with 1 in the predicate x+2 = 7 we obtain 1+2 = 7,
which is false, but if we replace it with 5 we get 5 + 2 = 7, which
is true. We represent a predicate by a letter followed by the variables
enclosed between parenthesis: P(x), Q(x, y), etc. An example for P(x)
is a value of x for which P(x) is true. A counterexample is a value of
x for which P(x) is false. So, 5 is an example for x + 2 = 7, while 1
is a counterexample.
Each variable in a predicate is assumed to belong to a universe (or
domain) of discourse, for instance in the predicate n is an odd integer
n represents an integer, so the universe of discourse of n is the set of
all integers. In X is American we may assume that X is a human
being, so in this case the universe of discourse is the set of all human
beings.
1
1.2.2. Quantiers. Given a predicate P(x), the statement for
some x, P(x) (or there is some x such that p(x)), represented
x P(x), has a denite truth value, so it is a proposition in the
usual sense. For instance if P(x) is x + 2 = 7 with the integers as
universe of discourse, then x P(x) is true, since there is indeed an
integer, namely 5, such that P(5) is a true statement. However, if
Q(x) is 2x = 7 and the universe of discourse is still the integers,
then x Q(x) is false. On the other hand, x Q(x) would be true if we
extend the universe of discourse to the rational numbers. The symbol
is called the existential quantier.
Analogously, the sentence for all x, P(x)also for any x, P(x),
for every x, P(x), for each x, P(x), represented x P(x), has
a denite truth value. For instance, if P(x) is x + 2 = 7 and the
1
Usually all variables occurring in predicates along a reasoning are supposed to
belong to the same universe of discourse, but in some situations (as in the so called
many-sorted logics) it is possible to use dierent kinds of variables to represent
dierent types of objects belonging to dierent universes of discourse. For instance
in the predicate is a string of length n the variable represents a string, while
n represents a natural number, so the universe of discourse of is the set of all
strings, while the universe of discourse of n is the set of natural numbers.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.2. PREDICATES, QUANTIFIERS 12
universe of discourse is the integers, then x P(x) is false. However if
Q(x) represents (x + 1)
2
= x
2
+ 2x + 1 then x Q(x) is true. The
symbol is called the universal quantier.
In predicates with more than one variable it is possible to use several
quantiers at the same time, for instance xyz P(x, y, z), meaning
for all x and all y there is some z such that P(x, y, z).
Note that in general the existential and universal quantiers cannot
be swapped, i.e., in general xy P(x, y) means something dierent
from yx P(x, y). For instance if x and y represent human beings and
P(x, y) represents x is a friend of y, then xy P(x, y) means that
everybody is a friend of someone, but yx P(x, y) means that there
is someone such that everybody is his or her friend.
A predicate can be partially quantied, e.g. xy P(x, y, z, t). The
variables quantied (x and y in the example) are called bound variables,
and the rest (z and t in the example) are called free variables. A
partially quantied predicate is still a predicate, but depending on
fewer variables.
1.2.3. Generalized De Morgan Laws for Logic. If x P(x) is
false then there is no value of x for which P(x) is true, or in other
words, P(x) is always false. Hence
x P(x) x P(x) .
On the other hand, if x P(x) is false then it is not true that for
every x, P(x) holds, hence for some x, P(x) must be false. Thus:
x P(x) x P(x) .
This two rules can be applied in successive steps to nd the negation
of a more complex quantied statement, for instance:
xy p(x, y) xy P(x, y) xy P(x, y) .
Exercise: Write formally the statement for every real number there
is a greater real number. Write the negation of that statement.
Answer: The statement is: xy (x < y) (the universe of discourse
is the real numbers). Its negation is: xy (x < y), i.e., xy (x ,< y).
(Note that among real numbers x ,< y is equivalent to x y, but
formally they are dierent predicates.)
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.3. PROOFS 13
1.3. Proofs
1.3.1. Mathematical Systems, Proofs. A Mathematical Sys-
tem consists of:
1. Axioms: propositions that are assumed true.
2. Denitions: used to create new concepts from old ones.
3. Undened terms: corresponding to the primitive concepts of the
system (for instance in set theory the term set is undened).
A theorem is a proposition that can be proved to be true. An
argument that establishes the truth of a proposition is called a proof.
Example: Prove that if x > 2 and y > 3 then x +y > 5.
Answer: Assuming x > 2 and y > 3 and adding the inequalities
term by term we get: x +y > 2 + 3 = 5.
That is an example of direct proof. In a direct proof we assume the
hypothesis together with axioms and other theorems previously proved
and we derive the conclusion from them.
An indirect proof or proof by contrapositive consists of proving the
contrapositive of the desired implication, i.e., instead of proving p q
we prove q p.
Example: Prove that if x +y > 5 then x > 2 or y > 3.
Answer: We must prove that x + y > 5 (x > 2) (y > 3). An
indirect proof consists of proving ((x > 2) (y > 3)) (x +y > 5).
In fact: ((x > 2) (y > 3)) is the same as (x 2)(y 3), so adding
both inequalities we get x +y 5, which is the same as (x +y > 5).
Proof by Contradiction. In a proof by contradiction or (Reductio ad
Absurdum) we assume the hypotheses and the negation of the conclu-
sion, and try to derive a contradiction, i.e., a proposition of the form
r r.
Example: Prove by contradiction that if x+y > 5 then either x > 2
or y > 3.
Answer: We assume the hypothesis x +y > 5. From here we must
conclude that x > 2 or y > 3. Assume to the contrary that x > 2 or
y > 3 is false, so x 2 and y 3. Adding those inequalities we get
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.3. PROOFS 14
x 2 + 3 = 5, which contradicts the hypothesis x +y > 5. From here
we conclude that the assumption x 2 and y 3 cannot be right,
so x > 2 or y > 3 must be true.
Remark: Sometimes it is dicult to distinguish between an indirect
proof and a proof by contradiction. In an indirect proof we prove an
implication of the form p q by proving the contrapositive q
p. In an proof by contradiction we prove an statement s (which
may or may not be an implication) by assuming s and deriving a
contradiction. In fact proofs by contradiction are more general than
indirect proofs.
Exercise: Prove by contradiction that
2 = a/b.
Answer: Assume that
2 is rational, i.e.,
.
Hence: 2 b
2
= 4 a
2
, and simplifying: b
2
= 2 a
2
. This implies that b
2
is even, so b is even: b = 2b
. Consequently a/b = 2a
/2b
= a
/b
,
contradicting the hypothesis that a/b was in least terms.
1.3.2. Arguments, Rules of Inference. An argument is a se-
quence of propositions p
1
, p
2
, . . . , p
n
called hypotheses (or premises)
followed by a proposition q called conclusion. An argument is usually
written:
p
1
p
2
.
.
.
p
n
q
or
p
1
, p
2
, . . . , p
n
/ q
The argument is called valid if q is true whenever p
1
, p
2
, . . . , p
n
are
true; otherwise it is called invalid.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.3. PROOFS 15
Rules of inference are certain simple arguments known to be valid
and used to make a proof step by step. For instance the following
argument is called modus ponens or rule of detachment:
p q
p
q
In order to check whether it is valid we must examine the following
truth table:
p q p q p q
T T T T T
T F F T F
F T T F T
F F T F F
If we look now at the rows in which both p q and p are true (just
the rst row) we see that also q is true, so the argument is valid.
Other rules of inference are the following:
1. Modus Ponens or Rule of Detachment:
p q
p
q
2. Modus Tollens:
p q
q
p
3. Addition:
p
p q
4. Simplication:
p q
p
5. Conjunction:
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.3. PROOFS 16
p
q
p q
6. Hypothetical Syllogism:
p q
q r
p r
7. Disjunctive Syllogism:
p q
p
q
8. Resolution:
p q
p r
q r
Arguments are usually written using three columns. Each row con-
tains a label, a statement and the reason that justies the introduction
of that statement in the argument. That justication can be one of the
following:
1. The statement is a premise.
2. The statement can be derived from statements occurring earlier
in the argument by using a rule of inference.
Example: Consider the following statements: I take the bus or
I walk. If I walk I get tired. I do not get tired. Therefore I take the
bus. We can formalize this by calling B = I take the bus, W =
I walk and T = I get tired. The premises are B W, W T and
T, and the conclusion is B. The argument can be described in the
following steps:
step statement reason
1) W T Premise
2) T Premise
3) W 1,2, Modus Tollens
4) B W Premise
5) B 4,3, Disjunctive Syllogism
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.3. PROOFS 17
1.3.3. Rules of Inference for Quantied Statements. We
state the rules for predicates with one variable, but they can be gener-
alized to predicates with two or more variables.
1. Universal Instantiation. If x p(x) is true, then p(a) is true for
each specic element a in the universe of discourse; i.e.:
x p(x)
p(a)
For instance, from x(x+1 = 1+x) we can derive 7+1 = 1+7.
2. Existential Instantiation. If x p(x) is true, then p(a) is true for
some specic element a in the universe of discourse; i.e.:
x p(x)
p(a)
The dierence respect to the previous rule is the restriction in
the meaning of a, which now represents some (not any) element
of the universe of discourse. So, for instance, from x (x
2
= 2)
(the universe of discourse is the real numbers) we derive the
existence of some element, which we may represent
2, such
that (
2)
2
= 2.
3. Universal Generalization. If p(x) is proved to be true for a
generic element in the universe of discourse, then x p(x) is
true; i.e.:
p(x)
x p(x)
By generic we mean an element for which we do not make any
assumption other than its belonging to the universe of discourse.
So, for instance, we can prove x[(x + 1)
2
= x
2
+ 2x + 1] (say,
for real numbers) by assuming that x is a generic real number
and using algebra to prove (x + 1)
2
= x
2
+ 2x + 1.
4. Existential Generalization. If p(a) is true for some specic ele-
ment a in the universe of discourse, then x p(x) is true; i.e.:
p(a)
x p(x)
For instance: from 7 + 1 = 8 we can derive x (x + 1 = 8).
Example: Show that a counterexample can be used to disprove a
universal statement, i.e., if a is an element in the universe of discourse,
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
1.3. PROOFS 18
then from p(a) we can derive x p(x). Answer: The argument is as
follows:
step statement reason
1) p(a) Premise
2) xp(x) Existential Generalization
3) x p(x) Negation of Universal Statement
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
CHAPTER 2
Sets, Functions, Relations
2.1. Set Theory
2.1.1. Sets. A set is a collection of objects, called elements of the
set. A set can be represented by listing its elements between braces:
A = 1, 2, 3, 4, 5. The symbol is used to express that an element is
(or belongs to) a set, for instance 3 A. Its negation is represented by
,, e.g. 7 , A. If the set is nite, its number of elements is represented
[A[, e.g. if A = 1, 2, 3, 4, 5 then [A[ = 5.
Some important sets are the following:
1. N = 0, 1, 2, 3, = the set of natural numbers.
1
2. Z = , 3, 2, 1, 0, 1, 2, 3, = the set of integers.
3. Q = the set of rational numbers.
4. R = the set of real numbers.
5. C = the set of complex numbers.
Is S is one of those sets then we also use the following notations:
2
1. S
+
= set of positive elements in S, for instance
Z
+
= 1, 2, 3, = the set of positive integers.
2. S
= the
set of non zero real numbers.
Set-builder notation. An alternative way to dene a set, called set-
builder notation, is by stating a property (predicate) P(x) veried by
exactly its elements, for instance A = x Z [ 1 x 5 = set of
1
Note that N includes zerofor some authors N = 1, 2, 3, , without zero.
2
When working with strings we will use a similar notation with a dierent
meaningbe careful not to confuse it.
19
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.1. SET THEORY 20
integers x such that 1 x 5i.e.: A = 1, 2, 3, 4, 5. In general:
A = x U [ p(x), where U is the universe of discourse in which the
predicate P(x) must be interpreted, or A = x [ P(x) if the universe
of discourse for P(x) is implicitly understood. In set theory the term
universal set is often used in place of universe of discourse for a given
predicate.
3
Principle of Extension. Two sets are equal if and only if they have
the same elements, i.e.:
A = B x (x A x B) .
Subset. We say that A is a subset of set B, or A is contained in
B, and we represent it A B, if all elements of A are in B, e.g., if
A = a, b, c and B = a, b, c, d, e then A B.
A is a proper subset of B, represented A B, if A B but
A ,= B, i.e., there is some element in B which is not in A.
Empty Set. A set with no elements is called empty set (or null set,
or void set), and is represented by or .
Note that nothing prevents a set from possibly being an element of
another set (which is not the same as being a subset!). For instance
if A = 1, a, 3, t, 1, 2, 3 and B = 3, t, then obviously B is an
element of A, i.e., B A.
Power Set. The collection of all subsets of a set A is called the
power set of A, and is represented P(A). For instance, if A = 1, 2, 3,
then
P(A) = , 1, 2, 3, 1, 2, 1, 3, 2, 3, A .
Exercise: Prove by induction that if [A[ = n then [P(A)[ = 2
n
.
Multisets. Two ordinary sets are identical if they have the same
elements, so for instance, a, a, b and a, b are the same set because
they have exactly the same elements, namely a and b. However, in
some applications it might be useful to allow repeated elements in a
set. In that case we use multisets, which are mathematical entities
similar to sets, but with possibly repeated elements. So, as multisets,
a, a, b and a, b would be considered dierent, since in the rst one
the element a occurs twice and in the second one it occurs only once.
3
Properly speaking, the universe of discourse of set theory is the collection of
all sets (which is not a set).
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.1. SET THEORY 21
2.1.2. Venn Diagrams. Venn diagrams are graphic representa-
tions of sets as enclosed areas in the plane. For instance, in gure 2.1,
the rectangle represents the universal set (the set of all elements con-
sidered in a given problem) and the shaded region represents a set A.
The other gures represent various set operations.
A
Figure 2.1. Venn Diagram.
B A
Figure 2.2. Intersection A B.
B A
Figure 2.3. Union A B.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.1. SET THEORY 22
A
Figure 2.4. Complement A.
B
A
Figure 2.5. Dierence A B.
B
A
Figure 2.6. Symmetric Dierence AB.
2.1.3. Set Operations.
1. Intersection: The common elements of two sets:
A B = x [ (x A) (x B) .
If A B = , the sets are said to be disjoint.
2. Union: The set of elements that belong to either of two sets:
A B = x [ (x A) (x B) .
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.1. SET THEORY 23
3. Complement: The set of elements (in the universal set) that do
not belong to a given set:
A = x U [ x , A .
4. Dierence or Relative Complement: The set of elements that
belong to a set but not to another:
A B = x [ (x A) (x , B) = A B.
5. Symmetric Dierence: Given two sets, their symmetric dier-
ence is the set of elements that belong to either one or the other
set but not both.
AB = x [ (x A) (x B) .
It can be expressed also in the following way:
AB = A B A B = (A B) (B A) .
2.1.4. Counting with Venn Diagrams. A Venn diagram with
n sets intersecting in the most general way divides the plane into 2
n
regions. If we have information about the number of elements of some
portions of the diagram, then we can nd the number of elements in
each of the regions and use that information for obtaining the number
of elements in other portions of the plane.
Example: Let M, P and C be the sets of students taking Mathe-
matics courses, Physics courses and Computer Science courses respec-
tively in a university. Assume [M[ = 300, [P[ = 350, [C[ = 450,
[M P[ = 100, [M C[ = 150, [P C[ = 75, [M P C[ = 10. How
many students are taking exactly one of those courses? (g. 2.7)
10
185
235
65
140
C
60
90
M P
Figure 2.7. Counting with Venn diagrams.
We see that [(MP)(MPC)[ = 10010 = 90, [(MC)(M
P C)[ = 15010 = 140 and [(P C) (MP C)[ = 7510 = 65.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.1. SET THEORY 24
Then the region corresponding to students taking Mathematics courses
only has cardinality 300(90+10+140) = 60. Analogously we compute
the number of students taking Physics courses only (185) and taking
Computer Science courses only (235). The sum 60 + 185 + 235 = 480
is the number of students taking exactly one of those courses.
2.1.5. Properties of Sets. The set operations verify the follow-
ing properties:
1. Associative Laws:
A (B C) = (A B) C
A (B C) = (A B) C
2. Commutative Laws:
A B = B A
A B = B A
3. Distributive Laws:
A (B C) = (A B) (A C)
A (B C) = (A B) (A C)
4. Identity Laws:
A = A
A U = A
5. Complement Laws:
A A = U
A A =
6. Idempotent Laws:
A A = A
A A = A
7. Bound Laws:
A U = U
A =
8. Absorption Laws:
A (A B) = A
A (A B) = A
9. Involution Law:
A = A
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.1. SET THEORY 25
10. 0/1 Laws:
= U
U =
11. DeMorgans Laws:
A B = A B
A B = A B
2.1.6. Generalized Union and Intersection. Given a collec-
tion of sets A
1
, A
2
, . . . , A
N
, their union is dened as the set of elements
that belong to at least one of the sets (here n represents an integer in
the range from 1 to N):
N
_
n=1
A
n
= A
1
A
2
A
N
= x [ n(x A
n
) .
Analogously, their intersection is the set of elements that belong to all
the sets simultaneously:
N
n=1
A
n
= A
1
A
2
A
N
= x [ n(x A
n
) .
These denitions can be applied to innite collections of sets as well.
For instance assume that S
n
= kn [ k = 2, 3, 4, . . . = set of multiples
of n greater than n. Then
_
n=2
S
n
= S
2
S
3
S
4
= 4, 6, 8, 9, 10, 12, 14, 15, . . .
= set of composite positive integers .
2.1.7. Partitions. A partition of a set X is a collection S of non
overlapping non empty subsets of X whose union is the whole X. For
instance a partition of X = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 could be
S = 1, 2, 4, 8, 3, 6, 5, 7, 9, 10 .
Given a partition S of a set X, every element of X belongs to exactly
one member of S.
Example: The division of the integers Z into even and odd numbers
is a partition: S = E, O, where E = 2n [ n Z, O = 2n +1 [ n
Z.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.1. SET THEORY 26
Example: The divisions of Z in negative integers, positive integers
and zero is a partition: S = Z
+
, Z
, 0.
2.1.8. Ordered Pairs, Cartesian Product. An ordinary pair
a, b is a set with two elements. In a set the order of the elements is
irrelevant, so a, b = b, a. If the order of the elements is relevant,
then we use a dierent object called ordered pair, represented (a, b).
Now (a, b) ,= (b, a) (unless a = b). In general (a, b) = (a
, b
) i a = a
and b = b
.
Given two sets A, B, their Cartesian product AB is the set of all
ordered pairs (a, b) such that a A and b B:
A B = (a, b) [ (a A) (b B) .
Analogously we can dene triples or 3-tuples (a, b, c), 4-tuples (a, b, c, d),
. . . , n-tuples (a
1
, a
2
, . . . , a
n
), and the corresponding 3-fold, 4-fold,. . . ,
n-fold Cartesian products:
A
1
A
2
A
n
=
(a
1
, a
2
, . . . , a
n
) [ (a
1
A
1
) (a
2
A
2
) (a
n
A
n
) .
If all the sets in a Cartesian product are the same, then we can use
an exponent: A
2
= A A, A
3
= A A A, etc. In general:
A
n
= A A
(n times)
A.
An example of Cartesian product is the real plane R
2
, where R is
the set of real numbers (R is sometimes called real line).
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.2. FUNCTIONS 27
2.2. Functions
2.2.1. Correspondences. Suppose that to each element of a set
A we assign some elements of another set B. For instance, A = N,
B = Z, and to each element x N we assign all elements y Z such
that y
2
= x (g. 2.8).
1
8
9
10
0
2
3 4
5
6
1
1
2
2
3
0
3
Z N
7
Figure 2.8. Correspondence x
x.
This operation is called a correspondence.
2.2.2. Functions. A function or mapping f from a set A to a set
B, denoted f : A B, is a correspondence in which to each element
x of A corresponds exactly one element y = f(x) of B (g. 2.9).
A B
Figure 2.9. Function.
Sometimes we represent the function with a diagram like this:
f : A B
x y
or
A
f
B
x y
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.2. FUNCTIONS 28
For instance, the following represents the function from Z to Z
dened by f(x) = 2x + 1:
f : Z Z
x 2x + 1
The element y = f(x) is called the image of x, and x is a preimage
of y. For instance, if f(x) = 2x + 1 then f(7) = 2 7 + 1 = 15. The
set A is the domain of f, and B is its codomain. If A
A, the image
of A
by f is f(A
) = f(x) [ x A
A, f(x) = f(x
) x = x
.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.2. FUNCTIONS 29
0
1
2
3
4
y
2 1 1 2
x
Figure 2.10. Graph of f(x) = x
2
.
For instance, f(x) = 2x from Z to Z is injective.
A B
Figure 2.11. One-to-one function.
2. Onto or Surjective: A function f : A B is called onto or
surjective if every element of B is the image of some element of
A (g. 2.12):
y B, x A such that y = f(x) .
For instance, f(x) = x
2
from R to R
+
0 is onto.
A B
Figure 2.12. Onto function.
3. One-To-One Correspondence or Bijective: A function f : A
B is said to be a one-to-one correspondence, or bijective, or a
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.2. FUNCTIONS 30
bijection, if it is one-to-one and onto (g. 2.13). For instance,
f(x) = x + 3 from Z to Z is a bijection.
A B
Figure 2.13. Bijection.
2.2.4. Identity Function. Given a set A, the function 1
A
: A
A dened by 1
A
(x) = x for every x in A is called the identity function
for A.
2.2.5. Function Composition. Given two functions f : A B
and g : B C, the composite function of f and g is the function
g f : A C dened by (g f)(x) = g(f(x)) for every x in A:
A
x
f
gf
B
y=f(x)
g
C
z=g(y)=g(f(x))
For instance, if A = B = C = Z, f(x) = x + 1, g(x) = x
2
, then
(g f)(x) = f(x)
2
= (x +1)
2
. Also (f g)(x) = g(x) +1 = x
2
+1 (the
composition of functions is not commutative in general).
Some properties of function composition are the following:
1. If f : A B is a function from A to B, we have that f 1
A
=
1
B
f = f.
2. Function composition is associative, i.e., given three functions
A
f
B
g
C
h
D,
we have that h (g f) = (h g) f.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.2. FUNCTIONS 31
Function iteration. If f : A A is a function from A to A, then
it makes sense to compose it with itself: f
2
= f f. For instance, if
f : Z Z is f(x) = 2x + 1, then f
2
(x) = 2(2x + 1) + 1 = 4x + 3.
Analogously we can dene f
3
= f f f, and so on, f
n
= f
(n times)
. . . f.
2.2.6. Inverse Function. If f : A B is a bijective function, its
inverse is the function f
1
: B A such that f
1
(y) = x if and only
if f(x) = y.
For instance, if f : Z Z is dened by f(x) = x + 3, then its
inverse is f
1
(x) = x 3.
The arrow diagram of f
1
is the same as the arrow diagram of f
but with all arrows reversed.
A characteristic property of the inverse function is that f
1
f = 1
A
and f f
1
= 1
B
.
2.2.7. Operators. A function from AA to A is called a binary
operator on A. For instance the addition of integers is a binary oper-
ator + : Z Z Z. In the usual notation for functions the sum of
two integers x and y would be represented +(x, y). This is called prex
notation. The inx notation consists of writing the symbol of the bi-
nary operator between its arguments: x+y (this is the most common).
There is also a postx notation consisting of writing the symbol after
the arguments: x y +.
Another example of binary operator on Z is (x, y) x y.
A monary or unary operator on A is a function from A to A. For
instance the change of sign x x on Z is a unary operator on Z. An
example of unary operator on R
for
some t
. Hence a = att
, which implies tt
= 1 t
= t
1
. The
only invertible positive integer is 1, so t = t
= 1 a = b.
3. Transitive: a[b and b[c implies b = at for some t and c = bt
for
some t
, hence c = att
, i.e., a[c.
Question: is the strict inequality (<) a partial order on Z?
Two elements a, b A are said to be comparable if either x y
or y x, otherwise they are said to be non comparable. The order
is called total or linear when every pair of elements x, y A are com-
parable. For instance (Z, ) is totally ordered, but (Z
+
, [), where [
represents integer divisibility, is not. A totally ordered subset of a par-
tially ordered set is called a chain; for instance the set 1, 2, 4, 8, 16, . . .
is a chain in (Z
+
, [).
2.3.7. Hasse diagrams. A Hasse diagram is a graphical represen-
tation of a partially ordered set in which each element is represented
by a dot (node or vertex of the diagram). Its immediate successors are
placed above the node and connected to it by straight line segments. As
an example, gure 2.16 represents the Hasse diagram for the relation
of divisibility on 1, 2, 3, 4, 5, 6, 7, 8, 9.
Question: How does the Hasse diagram look for a totally ordered
set?
2.3.8. Equivalence Relations. An equivalence relation on a set
A is a binary relation on A with the following properties:
1. Reexive: for all x A, x x.
2. Symmetric: x y y x.
3. Transitive: (x y) (y z) x z.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.3. RELATIONS 36
1
4
8
6
2
7
9
3
5
Figure 2.16. Hasse diagram for divisibility.
For instance, on Z, the equality (=) is an equivalence relation.
Another example, also on Z, is the following: x y (mod 2) (x is
congruent to y modulo 2) i xy is even. For instance, 6 2 (mod 2)
because 6 2 = 4 is even, but 7 , 4 (mod 2), because 7 4 = 3 is not
even. Congruence modulo 2 is in fact an equivalence relation:
1. Reexive: for every integer x, xx = 0 is indeed even, so x x
(mod 2).
2. Symmetric: if x y (mod 2) then x y = t is even, but
y x = t is also even, hence y x (mod 2).
3. Transitive: assume x y (mod 2) and y z (mod 2). Then
x y = t and y z = u are even. From here, x z = (x y) +
(y z) = t +u is also even, hence x z (mod 2).
2.3.9. Equivalence Classes, Quotient Set, Partitions. Given
an equivalence relation on a set A, and an element x A, the
set of elements of A related to x are called the equivalence class of
x, represented [x] = y A [ y x. Element x is said to be a
representative of class
[x]. The collection of equivalence classes, represented A/ = [x] [
x A, is called quotient set of A by .
Exercise: Find the equivalence classes on Z with the relation of
congruence modulo 2.
One of the main properties of an equivalence relation on a set A
is that the quotient set, i.e. the collection of equivalence classes, is
a partition of A. Recall that a partition of a set A is a collection of
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
2.3. RELATIONS 37
non-empty subsets A
1
, A
2
, A
3
, . . . of A which are pairwise disjoint and
whose union equals A:
1. A
i
A
j
= for i ,= j,
2.
n
A
n
= A.
Example: in Z with the relation of congruence modulo 2 (call it
2
), there are two equivalence classes: the set E of even integers and
the set O of odd integers. The quotient set of Z by the relation
2
of congruence modulo 2 is Z/
2
= E, O. We see that it is in fact a
partition of Z, because E O = , and Z = E O.
Exercise: Let m be an integer greater than or equal to 2. On Z
we dene the relation x y (mod m) m[(y x) (i.e., m divides
exactly y x). Prove that it is an equivalence relation. What are the
equivalence classes? How many are there?
Exercise: On the Cartesian product Z Z
and a remainder r
. Next we divide r by r
, which
gives a quotient q
. We continue dividing
each remainder by the next one until obtaining a zero remainder, and
which point we stop. The last non-zero remainder is the gcd.
Example: Assume that we wish to compute gcd(500, 222). Then we
arrange the computations in the following way:
500 = 2 222 + 56 r = 56
222 = 3 56 + 54 r
= 54
56 = 1 54 + 2 r
= 2
54 = 27 2 + 0 r
= 0
The last nonzero remainder is r
m
. For instance:
Z
2
= 1
Z
3
= 1, 2
Z
4
= 1, 3
Z
5
= 1, 2, 3, 4
Z
6
= 1, 5
Z
7
= 1, 2, 3, 4, 5, 6
Z
8
= 1, 3, 5, 7
Z
9
= 1, 2, 4, 5, 7, 8
Given an element [a] in Z
m
, its inverse can be computed by using
the Euclidean algorithm to nd gcd(a, m), since that algorithm also
provides a solution to the equation ax +my = gcd(a, m) = 1, which is
equivalent to ax 1 (mod m).
Example: Find the multiplicative inverse of 17 in Z
64
. Answer: We
use the Euclidean algorithm:
64 = 3 17 + 13 r = 13
17 = 1 13 + 4 r = 4
13 = 3 4 + 1 r = 1
4 = 4 1 + 0 r = 0
Now we compute backward:
1 = 13 3 4 = 13 3 (17 1 13) = 4 13 3 17
= 4 (64 3 17) 3 17 = 4 64 15 17 .
Hence (15) 17 1 (mod 64), but 15 49 (mod 64), so the in-
verse of 17 in (Z
64
, ) is 49. We will denote this by writing 17
1
= 49
(mod 64), or 17
1
mod 64 = 49.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
3.3. MODULAR ARITHMETIC, RSA ALGORITHM 55
3.3.2. Eulers Phi Function. The number of units in Z
m
is equal
to the number of positive integers not greater than and relatively
prime to m, i.e., the number of integers a such that 1 a m and
gcd(a, m) = 1. That number is given by the so called Eulers phi
function:
(m) = number of positive integers not greater than m
and relatively prime to m.
For instance, the positive integers not greater than and relatively prime
to 15 are: 1, 2, 4, 7, 8, 11, 13, 14, hence (15) = 8.
We have the following results:
1. If p is a prime number and s 1, then (p
s
) = p
s
p
s1
=
p
s
(1 1/p). In particular (p) = p 1.
2. If m
1
, m
2
are two relatively prime positive integers, then (m
1
m
2
) =
(m
1
) (m
2
).
1
3. If m = p
s
1
1
p
s
2
2
. . . p
s
k
k
, where the p
k
are prime and the s
k
are
positive, then
(m) = m(1 1/p
1
) (1 1/p
2
) . . . (1 1/p
k
) .
For instance
(15) = (3 5) = (3) (5) = (3 1) (5 1) = 2 4 = 8 .
3.3.3. Eulers Theorem. If a and m are two relatively prime
positive integers, m 2, then
a
(m)
1 (mod m) .
The particular case in which m is a prime number p, Eulers theorem
is called Fermats Little Theorem:
a
p1
1 (mod p) .
For instance, if a = 2 and p = 7, then we have, in fact, 2
71
= 2
6
=
64 = 1 + 9 7 1 (mod 7).
A consequence of Eulers Theorem is the following. If gcd(a, m) = 1
then
x y (mod (m)) a
x
a
y
(mod m) .
1
A function f(x) of positive integers such that gcd(a, b) = 1 f(ab) =
f(a)f(b) is called multiplicative.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
3.3. MODULAR ARITHMETIC, RSA ALGORITHM 56
Consequently, the following function is well dened:
Z
m
Z
(m)
Z
m
([a]
m
, [x]
(m)
) [a
x
]
m
Hence, we can compute powers modulo m in the following way:
a
n
= a
n mod (m)
(mod m) ,
if gcd(a, m) = 1. For instance:
3
9734888
mod 100 = 3
9734888 mod (100)
mod 100
= 3
9734888 mod 40
mod 100 = 3
8
mod 100 = 6561 mod 100 = 61 .
An even more ecient way to compute powers modulo m is given
in Appendix A, paragraph A.1.
3.3.4. Application to Cryptography: RSA Algorithm. The
RSA algorithm is an encryption scheme designed in 1977 by Ronald
Rivest, Adi Shamir and Leonard Adleman. It allows encrypting a mes-
sage with a key (the encryption key) and decrypting it with a dierent
key (the decryption key). The encryption key is public and can be
given to everybody. The decryption key is private and is known only
by the recipient of the encrypted message.
The RSA algorithm is based on the following facts. Given two
prime numbers p and q, and a positive number m relatively prime to p
and q, Eulers theorem tells us that:
m
(pq)
= m
(p1)(q1)
= 1 (mod pq) .
Assume now that we have two integers e and d such that e d = 1
(mod (pq)). Then we have that
(m
e
)
d
= m
ed
= m (mod pq) .
So, given m
e
we can recover m modulo pq by raising to the dth power.
The RSA algorithm consists of the following:
1. Generate two large primes p and q. Find their product n = pq.
2. Find two numbers e and d (in the range from 2 to (n)) such
that e d = 1 (mod (n)). This requires some trial and error.
First e is chosen at random, and the Euclidean algorithm is
used to nd gcd(e, m), solving at the same time the equation
ex +my = gcd(e, m). If gcd(e, m) = 1 then the value obtained
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
3.3. MODULAR ARITHMETIC, RSA ALGORITHM 57
for x is d. Otherwise, e is no relatively prime to (n) and we
must try a dierent value for e.
3. The public encryption key will be the pair (n, e). The private
decryption key will be the pair (n, d). The encryption key is
given to everybody, while the decryption key is kept secret by
the future recipient of the message.
4. The message to be encrypted is divided into small pieces, and
each piece is encoded numerically as a positive integer m smaller
than n.
5. The number m
e
is reduced modulo n; m
= m
e
mod n.
6. The recipient computes m
= m
d
mod n, with 0 m
< n.
It remains to prove that m
_
x r
1
(mod m
1
)
x r
2
(mod m
2
)
. . .
x r
k
(mod m
k
)
has a unique solution modulo M = m
1
m
2
. . . m
k
.
We can nd a solution to that system in the following way. Let
M
i
= M/m
i
, and s
i
= the inverse of M
i
in Z
m
i
. Then
x = M
1
s
1
r
1
+M
2
s
2
r
2
+ +M
k
s
k
r
k
is a solution to the system.
Example: A group of objects can be arranged in 3 rows leaving 2
left, in 5 rows leaving 4 left, and in 7 rows leaving 6 left. How many
objects are there? Answer: We must solve the following system of
congruences:
_
_
_
x 2 (mod 3)
x 4 (mod 5)
x 6 (mod 7)
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
3.3. MODULAR ARITHMETIC, RSA ALGORITHM 58
We have: M = 3 5 7 = 105, M
1
= 105/3 = 35 2 (mod 3),
M
2
= 105/5 = 21 1 (mod 5), M
3
= 105/7 = 15 1 (mod 7); s
1
=
inverse of 2 in Z
3
= 2, s
2
= inverse of 1 in Z
5
= 1, s
3
= inverse
of 1 in Z
7
= 1. Hence the solution is
x = 35 2 2 + 21 1 4 + 15 1 6 = 314 104 (mod 105) .
Hence, any group of 104 + 105 k objects is a possible solution to the
problem.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
CHAPTER 4
Induction, Recurences
4.1. Sequences and Strings
4.1.1. Sequences. A sequence is an (usually innite) ordered list
of elements. Examples:
1. The sequence of positive integers:
1, 2, 3, 4, . . . , n, . . .
2. The sequence of positive even integers:
2, 4, 6, 8, . . . , 2n, . . .
3. The sequence of powers of 2:
1, 2, 4, 8, 16, . . . , n
2
, . . .
4. The sequence of Fibonacci numbers (each one is the sum of the
two previous ones):
0, 1, 1, 2, 3, 5, 8, 13, . . .
5. The reciprocals of the positive integers:
1,
1
2
,
1
3
,
1
4
, ,
1
n
,
In general the elements of a sequence are represented with an in-
dexed letter, say s
1
, s
2
, s
3
, . . . , s
n
, . . . . The sequence itself can be de-
ned by giving a rule, e.g.: s
n
= 2n + 1 is the sequence:
3, 5, 7, 9, . . .
Here we are assuming that the rst element is s
1
, but we can start at
any value of the index that we want, for instance if we declare s
0
to be
the rst term, the previous sequence would become:
1, 3, 5, 7, 9, . . .
The sequence is symbolically represented s
n
or s
n
n=1
.
59
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
4.1. SEQUENCES AND STRINGS 60
If s
n
s
n+1
for every n the sequence is called increasing. If s
n
s
n+1
then it is called decreasing. For instance s
n
= 2n+1 is increasing:
3, 5, 7, 9, . . . , while s
n
= 1/n is decreasing: 1,
1
2
,
1
3
,
1
4
, .
If we remove elements from a sequence we obtain a subsequence.
E.g., if we remove all odd numbers from the sequence of positive inte-
gers:
1, 2, 3, 4, 5 . . . ,
we get the subsequence consisting of the even positive integers:
2, 4, 6, 8, . . .
4.1.2. Sum (Sigma) and Product Notation. In order to ab-
breviate sums and products the following notations are used:
1. Sum (or sigma) notation:
n
i=m
a
i
= a
m
+a
m+1
+a
m+2
+ +a
n
2. Product notation:
n
i=m
a
i
= a
m
a
m+1
a
m+2
a
n
For instance: assume a
n
= 2n + 1, then
6
n=3
a
n
= a
3
+a
4
+a
5
+a
6
= 7 + 9 + 11 + 13 = 40 .
6
n=3
a
n
= a
3
a
4
a
5
a
6
= 7 9 11 13 = 9009 .
4.1.3. Strings. Given a set X, a string over X is a nite ordered
list of elements of X.
Example: If X is the set X = a, b, c, then the following are ex-
amples of strings over X: aba, aaaa, bba, etc.
Repetitions can be specied with a superscripts, for instance: a
2
b
3
ac
2
a
3
=
aabbbaccaaa, (ab)
3
= ababab, etc.
The length of a string is its number of elements, e.g., [abaccbab[ = 8,
[a
2
b
7
a
3
c
6
[ = 18.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
4.1. SEQUENCES AND STRINGS 61
The string with no elements is called null string, represented . Its
length is, of course, zero: [[ = 0.
The set of all strings over X is represented X
. The set of no
null strings over X (i.e., all strings over X except the null string) is
represented X
+
.
Given two strings and over X, the string consisting of followed
by is called the concatenation of and . For instance if = abac
and = baaab then = abacbaaab.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
4.2. MATHEMATICAL INDUCTION 62
4.2. Mathematical Induction
Many properties of positive integers can be proved by mathematical
induction.
4.2.1. Principle of Mathematical Induction. Let P be a prop-
erty of positive integers such that:
1. Basis Step: P(1) is true, and
2. Inductive Step: if P(n) is true, then P(n + 1) is true.
Then P(n) is true for all positive integers.
Remark: The premise P(n) in the inductive step is called Induction
Hypothesis.
The validity of the Principle of Mathematical Induction is obvious.
The basis step states that P(1) is true. Then the inductive step implies
that P(2) is also true. By the inductive step again we see that P(3)
is true, and so on. Consequently the property is true for all positive
integers.
Remark: In the basis step we may replace 1 with some other integer
m. Then the conclusion is that the property is true for every integer n
greater than or equal to m.
Example: Prove that the sum of the n rst odd positive integers is
n
2
, i.e., 1 + 3 + 5 + + (2n 1) = n
2
.
Answer: Let S(n) = 1 + 3 + 5 + + (2n 1). We want to prove
by induction that for every positive integer n, S(n) = n
2
.
1. Basis Step: If n = 1 we have S(1) = 1 = 1
2
, so the property is
true for 1.
2. Inductive Step: Assume (Induction Hypothesis) that the prop-
erty is true for some positive integer n, i.e.: S(n) = n
2
. We must
prove that it is also true for n +1, i.e., S(n +1) = (n +1)
2
. In
fact:
S(n + 1) = 1 + 3 + 5 + + (2n + 1) = S(n) + 2n + 1 .
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
4.2. MATHEMATICAL INDUCTION 63
But by induction hypothesis, S(n) = n
2
, hence:
S(n + 1) = n
2
+ 2n + 1 = (n + 1)
2
.
This completes the induction, and shows that the property is true for
all positive integers.
Example: Prove that 2n + 1 2
n
for n 3.
Answer: This is an example in which the property is not true for
all positive integers but only for integers greater than or equal to 3.
1. Basis Step: If n = 3 we have 2n + 1 = 2 3 + 1 = 7 and
2
n
= 2
3
= 8, so the property is true in this case.
2. Inductive Step: Assume (Induction Hypothesis) that the prop-
erty is true for some positive integer n, i.e.: 2n + 1 2
n
. We
must prove that it is also true for n+1, i.e., 2(n+1)+1 2
n+1
.
By the induction hypothesis we know that 2n 2
n
, and we also
have that 3 2
n
if n 3, hence
2(n + 1) + 1 = 2n + 3 2
n
+ 2
n
= 2
n+1
.
This completes the induction, and shows that the property is true for
all n 3.
Exercise: Prove the following identities by induction:
1 + 2 + 3 + +n =
n(n + 1)
2
.
1
2
+ 2
2
+ 3
2
+ +n
2
=
n(n + 1) (2n + 1)
6
.
1
3
+ 2
3
+ 3
3
+ +n
3
= (1 + 2 + 3 + +n)
2
.
4.2.2. Strong Form of Mathematical Induction. Let P be a
property of positive integers such that:
1. Basis Step: P(1) is true, and
2. Inductive Step: if P(k) is true for all 1 k n then P(n + 1)
is true.
Then P(n) is true for all positive integers.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
4.2. MATHEMATICAL INDUCTION 64
Example: Prove that every integer n 2 is prime or a product of
primes. Answer:
1. Basis Step: 2 is a prime number, so the property holds for
n = 2.
2. Inductive Step: Assume that if 2 k n, then k is a prime
number or a product of primes. Now, either n + 1 is a prime
number or it is not. If it is a prime number then it veries the
property. If it is not a prime number, then it can be written as
the product of two positive integers, n + 1 = k
1
k
2
, such that
1 < k
1
, k
2
< n + 1. By induction hypothesis each of k
1
and
k
2
must be a prime or a product of primes, hence n + 1 is a
product of primes.
This completes the proof.
4.2.3. The Well-Ordering Principle. Every nonempty set of
positive integers has a smallest element.
Example: Prove that
2 is irrational (i.e.,
2 cannot be written as
a quotient of two positive integers) using the well-ordering principle.
Answer: Assume that
2 is rational, i.e.,
. Hence:
2 b
2
= 4 a
2
, and simplifying: b
2
= 2 a
2
. From here we see that
2 =
b/a
2 = a/b
we end up with another fractional representation
2 = b/a
with a
smaller numerator b < a. Repeating the same argument with the
fraction b/a
2 is rational has to be
false.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
4.3. RECURRENCE RELATIONS 65
4.3. Recurrence Relations
Here we look at recursive denitions under a dierent point of view.
Rather than denitions they will be considered as equations that we
must solve. The point is that a recursive denition is actually a def-
inition when there is one and only one object satisfying it, i.e., when
the equations involved in that denition have a unique solution. Also,
the solution to those equations may provide a closed-form (explicit)
formula for the object dened.
The recursive step in a recursive denition is also called a recurrence
relation. We will focus on kth-order linear recurrence relations, which
are of the form
C
0
x
n
+C
1
x
n1
+C
2
x
n2
+ +C
k
x
nk
= b
n
,
where C
0
,= 0. If b
n
= 0 the recurrence relation is called homogeneous.
Otherwise it is called non-homogeneous.
The basis of the recursive denition is also called initial conditions
of the recurrence. So, for instance, in the recursive denition of the
Fibonacci sequence, the recurrence is
F
n
= F
n1
+F
n2
or
F
n
F
n1
F
n2
= 0 ,
and the initial conditions are
F
0
= 0, F
1
= 1 .
One way to solve some recurrence relations is by iteration, i.e., by
using the recurrence repeatedly until obtaining a explicit close-form
formula. For instance consider the following recurrence relation:
x
n
= r x
n1
(n > 0) ; x
0
= A.
By using the recurrence repeatedly we get:
x
n
= r x
n1
= r
2
x
n2
= r
3
x
n3
= = r
n
x
0
= Ar
n
,
hence the solution is x
n
= Ar
n
.
In the following we assume that the coecients C
0
, C
1
, . . . , C
k
are
constant.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
4.3. RECURRENCE RELATIONS 66
4.3.1. First Order Recurrence Relations. The homogeneous
case can be written in the following way:
x
n
= r x
n1
(n > 0) ; x
0
= A.
Its general solution is
x
n
= Ar
n
,
which is a geometric sequence with ratio r.
The non-homogeneous case can be written in the following way:
x
n
= r x
n1
+c
n
(n > 0) ; x
0
= A.
Using the summation notation, its solution can be expressed like this:
x
n
= Ar
n
+
n
k=1
c
k
r
nk
.
We examine two particular cases. The rst one is
x
n
= r x
n1
+c (n > 0); x
0
= A.
where c is a constant. The solution is
x
n
= Ar
n
+c
n
k=1
r
nk
= Ar
n
+c
r
n
1
r 1
if r ,= 1 ,
and
x
n
= A +c n if r = 1 .
Example: Assume that a country with currently 100 million people
has a population growth rate (birth rate minus death rate) of 1% per
year, and it also receives 100 thousand immigrants per year (which
are quickly assimilated and reproduce at the same rate as the native
population). Find its population in 10 years from now. (Assume that
all the immigrants arrive in a single batch at the end of the year.)
Answer: If we call x
n
= population in year n from now, we have:
x
n
= 1.01 x
n1
+ 100, 000 (n > 0); x
0
= 100, 000, 000 .
This is the equation above with r = 1.01, c = 100, 000 and A =
100, 000, 00, hence:
x
n
= 100, 000, 000 1.01
n
+ 100, 000
1.01
n
1
1.01 1
= 100, 000, 000 1.01
n
+ 1000 (1.01
n
1) .
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
4.3. RECURRENCE RELATIONS 67
So:
x
10
= 110, 462, 317 .
The second particular case is for r = 1 and c
n
= c + d n, where c
and d are constant (so c
n
is an arithmetic sequence):
x
n
= x
n1
+c +d n (n > 0) ; x
0
= A.
The solution is now
x
n
= A +
n
k=1
(c +d k) = A +c n +
d n(n + 1)
2
.
4.3.2. Second Order Recurrence Relations. Now we look at
the recurrence relation
C
0
x
n
+C
1
x
n1
+C
2
x
n2
= 0 .
First we will look for solutions of the form x
n
= c r
n
. By plugging in
the equation we get:
C
0
c r
n
+C
1
c r
n1
+C
2
c r
n2
= 0 ,
hence r must be a solution of the following equation, called the char-
acteristic equation of the recurrence:
C
0
r
2
+C
1
r +C
2
= 0 .
Let r
1
, r
2
be the two (in general complex) roots of the above equation.
They are called characteristic roots. We distinguish three cases:
1. Distinct Real Roots. In this case the general solution of the
recurrence relation is
x
n
= c
1
r
n
1
+c
2
r
n
2
,
where c
1
, c
2
are arbitrary constants.
2. Double Real Root. If r
1
= r
2
= r, the general solution of the
recurrence relation is
x
n
= c
1
r
n
+c
2
nr
n
,
where c
1
, c
2
are arbitrary constants.
3. Complex Roots. In this case the solution could be expressed
in the same way as in the case of distinct real roots, but in
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
4.3. RECURRENCE RELATIONS 68
order to avoid the use of complex numbers we write r
1
= r e
i
,
r
2
= r e
i
, k
1
= c
1
+c
2
, k
2
= (c
1
c
2
) i, which yields:
1
x
n
= k
1
r
n
cos n +k
2
r
n
sin n.
Example: Find a closed-form formula for the Fibonacci sequence
dened by:
F
n+1
= F
n
+F
n1
(n > 0) ; F
0
= 0, F
1
= 1 .
Answer: The recurrence relation can be written
F
n
F
n1
F
n2
= 0 .
The characteristic equation is
r
2
r 1 = 0 .
Its roots are:
2
r
1
= =
1 +
5
2
; r
2
=
1
=
1
5
2
.
They are distinct real roots, so the general solution for the recurrence
is:
F
n
= c
1
n
+c
2
(
1
)
n
.
Using the initial conditions we get the value of the constants:
_
(n = 0) c
1
+c
2
= 0
(n = 1) c
1
+c
2
(
1
) = 1
_
c
1
= 1/
5
c
2
= 1/
5
Hence:
F
n
=
1
5
_
n
()
n
_
.
1
Remainder: e
i
= cos + i sin.
2
=
1+
5
2
is the Golden Ratio.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
CHAPTER 5
Counting
5.1. Basic Principles
5.1.1. The Rule of Sum. If a task can be performed in m ways,
while another task can be performed in n ways, and the two tasks
cannot be performed simultaneously, then performing either task can
be accomplished in m+n ways.
Set theoretical version of the rule of sum: If A and B are disjoint
sets (A B = ) then
[A B[ = [A[ +[B[ .
More generally, if the sets A
1
, A
2
, . . . , A
n
are pairwise disjoint, then:
[A
1
A
2
A
n
[ = [A
1
[ +[A
2
[ + +[A
n
[ .
For instance, if a class has 30 male students and 25 female students,
then the class has 30 + 25 = 45 students.
5.1.2. The Rule of Product. If a task can be performed in m
ways and another independent task can be performed in n ways, then
the combination of both tasks can be performed in mn ways.
Set theoretical version of the rule of product: Let A B be the
Cartesian product of sets A and B. Then:
[A B[ = [A[ [B[ .
More generally:
[A
1
A
2
A
n
[ = [A
1
[ [A
2
[ [A
n
[ .
For instance, assume that a license plate contains two letters fol-
lowed by three digits. How many dierent license plates can be printed?
Answer: each letter can be printed in 26 ways, and each digit can be
printed in 10 ways, so 26 26 10 10 10 = 676000 dierent plates can
be printed.
69
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
5.1. BASIC PRINCIPLES 70
Exercise: Given a set A with m elements and a set B with n ele-
ments, nd the number of functions from A to B.
5.1.3. The Inclusion-Exclusion Principle. The inclusion-exclusion
principle generalizes the rule of sum to non-disjoint sets.
In general, for arbitrary (but nite) sets A, B:
[A B[ = [A[ +[B[ [A B[ .
Example: Assume that in a university with 1000 students, 200 stu-
dents are taking a course in mathematics, 300 are taking a course in
physics, and 50 students are taking both. How many students are
taking at least one of those courses?
Answer: If U = total set of students in the university, M = set
of students taking Mathematics, P = set of students taking Physics,
then:
[M P[ = [M[ +[P[ [M P[ = 300 + 200 50 = 450
students are taking Mathematics or Physics.
For three sets the following formula applies:
[A B C[ =
[A[ +[B[ +[C[ [A B[ [A C[ [B C[ +[A B C[ ,
and for an arbitrary union of sets:
[A
1
A
2
A
n
[ = s
1
s
2
+s
3
s
4
+ s
n
,
where s
k
= sum of the cardinalities of all possible k-fold intersections
of the given sets.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
5.2. COMBINATORICS 71
5.2. Combinatorics
5.2.1. Permutations. Assume that we have n objects. Any ar-
rangement of any k of these objects in a given order is called a per-
mutation of size k. If k = n then we call it just a permutation of the
n objects. For instance, the permutations of the letters a, b, c are the
following: abc, acb, bac, bca, cab, cba. The permutations of size 2 of
the letters a, b, c, d are: ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc.
Note that the order is important. Given two permutations, they
are considered equal if they have the same elements arranged in the
same order.
We nd the number P(n, k) of permutations of size k of n given
objects in the following way: The rst object in an arrangement can
be chosen in n ways, the second one in n 1 ways, the third one in
n 2 ways, and so on, hence:
P(n, k) = n (n 1)
(k factors)
(n k + 1) =
n!
(n k)!
,
where n! = 1 2 3
(n factors)
n is called n factorial .
The number P(n, k) of permutations of n objects is
P(n, n) = n! .
By convention 0! = 1.
For instance, there are 3! = 6 permutations of the 3 letters a, b, c.
The number of permutations of size 2 of the 4 letters a, b, c, d is P(4, 2) =
4 3 = 12.
Exercise: Given a set A with m elements and a set B with n ele-
ments, nd the number of one-to-one functions from A to B.
5.2.2. Combinations. Assume that we have a set A with n ob-
jects. Any subset of A of size r is called a combination of n ele-
ments taken r at a time. For instance, the combinations of the letters
a, b, c, d, e taken 3 at a time are: abc, abd, abe, acd, ace, ade, bcd, bce,
bde, cde, where two combinations are considered identical if they have
the same elements regardless of their order.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
5.2. COMBINATORICS 72
The number of subsets of size r in a set A with n elements is:
C(n, r) =
n!
r! (n r)!
.
The symbol
_
n
r
_
(read n choose r) is often used instead of C(n, r).
One way to derive the formula for C(n, r) is the following. Let A
be a set with n objects. In order to generate all possible permutations
of size r of the elements of A we 1) take all possible subsets of size
r in the set A, and 2) permute the k elements in each subset in all
possible ways. Task 1) can be performed in C(n, r) ways, and task
2) can be performed in P(r, r) ways. By the product rule we have
P(n, r) = C(n, r) P(r, r), hence
C(n, r) =
P(n, r)
P(r, r)
=
n!
r! (n r)!
.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
5.3. GENERALIZED PERMUTATIONS AND COMBINATIONS 73
5.3. Generalized Permutations and Combinations
5.3.1. Permutations with Repeated Elements. Assume that
we have an alphabet with k letters and we want to write all possible
words containing n
1
times the rst letter of the alphabet, n
2
times the
second letter,. . . , n
k
times the kth letter. How many words can we
write? We call this number P(n; n
1
, n
2
, . . . , n
k
), where n = n
1
+ n
2
+
+n
k
.
Example: With 3 as and 2 bs we can write the following 5-letter
words: aaabb, aabab, abaab, baaab, aabba, ababa, baaba, abbaa, babaa,
bbaaa.
We may solve this problem in the following way, as illustrated with
the example above. Let us distinguish the dierent copies of a letter
with subscripts: a
1
a
2
a
3
b
1
b
2
. Next, generate each permutation of this
ve elements by choosing 1) the position of each kind of letter, then 2)
the subscripts to place on the 3 as, then 3) these subscripts to place on
the 2 bs. Task 1) can be performed in P(5; 3, 2) ways, task 2) can be
performed in 3! ways, task 3) can be performed in 2!. By the product
rule we have 5! = P(5; 3, 2) 3! 2!, hence P(5; 3, 2) = 5!/3! 2!.
In general the formula is:
P(n; n
1
, n
2
, . . . , n
k
) =
n!
n
1
! n
2
! . . . n
k
!
.
5.3.2. Combinations with Repetition. Assume that we have a
set A with n elements. Any selection of r objects from A, where each
object can be selected more than once, is called a combination of n
objects taken r at a time with repetition. For instance, the combinations
of the letters a, b, c, d taken 3 at a time with repetition are: aaa, aab,
aac, aad, abb, abc, abd, acc, acd, add, bbb, bbc, bbd, bcc, bcd, bdd, ccc, ccd,
cdd, ddd. Two combinations with repetition are considered identical
if they have the same elements repeated the same number of times,
regardless of their order.
Note that the following are equivalent:
1. The number of combinations of n objects taken r at a time with
repetition.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
5.3. GENERALIZED PERMUTATIONS AND COMBINATIONS 74
2. The number of ways r identical objects can be distributed among
n distinct containers.
3. The number of nonnegative integer solutions of the equation:
x
1
+x
2
+ +x
n
= r .
Example: Assume that we have 3 dierent (empty) milk containers
and 7 quarts of milk that we can measure with a one quart measuring
cup. In how many ways can we distribute the milk among the three
containers? We solve the problem in the following way. Let x
1
, x
2
, x
3
be
the quarts of milk to put in containers number 1, 2 and 3 respectively.
The number of possible distributions of milk equals the number of non
negative integer solutions for the equation x
1
+ x
2
+ x
3
= 7. Instead
of using numbers for writing the solutions, we will use strokes, so for
instance we represent the solution x
1
= 2, x
2
= 1, x
3
= 4, or 2 + 1 + 4,
like this: [[ +[ +[[[[. Now, each possible solution is an arrangement of 7
strokes and 2 plus signs, so the number of arrangements is P(9; 7, 2) =
9!/7! 2! =
_
9
7
_
.
The general solution is:
P(n +r 1; r, n 1) =
(n +r 1)!
r! (n 1)!
=
_
n +r 1
r
_
.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
5.4. BINOMIAL COEFFICIENTS 75
5.4. Binomial Coecients
5.4.1. Binomial Theorem. The following identities can be easily
checked:
(x +y)
0
= 1
(x +y)
1
= x +y
(x +y)
2
= x
2
+ 2 xy +y
2
(x +y)
3
= x
3
+ 3 x
2
y + 3 xy
2
+y
3
They can be generalized by the following formula, called the Binomial
Theorem:
(x +y)
n
=
n
k=0
_
n
k
_
x
nk
y
k
=
_
n
0
_
x
n
+
_
n
1
_
x
n1
y +
_
n
2
_
x
n2
y
2
+
+
_
n
n 1
_
xy
n1
+
_
n
n
_
y
n
.
We can nd this formula by writing
(x +y)
n
= (x +y) (x +y)
(n factors)
(x +y) ,
expanding, and grouping terms of the form x
a
y
b
. Since there are n
factors of the form (x + y), we have a + b = n, hence the terms must
be of the form x
nk
y
k
. The coecient of x
nk
y
k
will be equal to the
number of ways in which we can select the y from any k of the factors
(and the x from the remaining n k factors), which is C(n, k) =
_
n
k
_
.
The expression
_
n
k
_
is often called binomial coecient.
Exercise: Prove
n
k=0
_
n
k
_
= 2
n
and
n
k=0
(1)
k
_
n
k
_
= 0 .
Hint: Apply the binomial theorem to (1 + 1)
2
and (1 1)
2
.
5.4.2. Properties of Binomial Coecients. The binomial co-
ecients have the following properties:
1.
_
n
k
_
=
_
n
n k
_
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
5.4. BINOMIAL COEFFICIENTS 76
2.
_
n + 1
k + 1
_
=
_
n
k
_
+
_
n
k + 1
_
The rst property follows easily from
_
n
k
_
=
n!
k!(n k)!
.
The second property can be proved by choosing a distinguished
element a in a set A of n + 1 elements. The set A has
_
n+1
k+1
_
subsets
of size k + 1. Those subsets can be partitioned into two classes: that
of the subsets containing a, and that of the subsets not containing a.
The number of subsets containing a equals the number of subsets of
A a of size k, i.e.,
_
n
k
_
. The number of subsets not containing a is
the number of subsets of A a of size k + 1, i.e.,
_
n
k+1
_
. Using the
sum principle we nd that in fact
_
n+1
k+1
_
=
_
n
k
_
+
_
n
k+1
_
.
5.4.3. Pascals Triangle. The properties shown in the previous
section allow us to compute binomial coecients in a simple way. Look
at the following triangular arrangement of binomial coecients:
_
0
0
_
_
1
0
_ _
1
1
_
_
2
0
_ _
2
1
_ _
2
2
_
_
3
0
_ _
3
1
_ _
3
2
_ _
3
3
_
_
4
0
_ _
4
1
_ _
4
2
_ _
4
3
_ _
4
4
_
We notice that each binomial coecient on this arrangement must
be the sum of the two closest binomial coecients on the line above it.
This together with
_
n
0
_
=
_
n
n
_
= 1, allows us to compute very quickly
the values of the binomial coecients on the arrangement:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
This arrangement of binomial coecients is called Pascals Trian-
gle.
1
1
Although it was already known by the Chinese in the XIV century.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
5.5. THE PIGEONHOLE PRINCIPLE 77
5.5. The Pigeonhole Principle
5.5.1. The Pigeonhole Principle. The pigeonhole principle is
used for proving that a certain situation must actually occur. It says
the following: If n pigeonholes are occupied by m pigeons and m > n,
then at least one pigeonhole is occupied by more than one pigeon.
1
Example: In any given set of 13 people at least two of them have
their birthday during the same month.
Example: Let S be a set of eleven 2-digit numbers. Prove that
S must have two elements whose digits have the same dierence (for
instance in S = 10, 14, 19, 22, 26, 28, 49, 53, 70, 90, 93, the digits of
the numbers 28 and 93 have the same dierence: 8 2 = 6, 9 3 =
6.) Answer: The digits of a two-digit number can have 10 possible
dierences (from 0 to 9). So, in a list of 11 numbers there must be two
with the same dierence.
Example: Assume that we choose three dierent digits from 1 to
9 and write all permutations of those digits. Prove that among the
3-digit numbers written that way there are two whose dierence is a
multiple of 500. Answer: There are 9 8 7 = 504 permutations of
three digits. On the other hand if we divide the 504 numbers by 500
we can get only 500 possible remainders, so at least two numbers give
the same remainder, and their dierence must be a multiple of 500.
Exercise: Prove that if we select n + 1 numbers from the set S =
1, 2, 3, . . . , 2n, among the numbers selected there are two such that
one is a multiple of the other one.
1
The Pigeonhole Principle (Schubfachprinzip) was rst used by Dirichlet in
Number Theory. The term pigeonhole actually refers to one of those old-fashioned
writing desks with thin vertical wooden partitions in which to le letters.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
CHAPTER 6
Probability
6.1. Probability
6.1.1. Introduction. Assume that we perform an experiment such
as tossing a coin or rolling a die. The set of possible outcomes is called
the sample space of the experiment. An event is a subset of the sample
space. For instance, if we toss a coin three times, the sample space is
S = HHH, HHT, HTH, HTT, THH, THT, TTH, TTT .
The event at least two heads in a row would be the subset
E = HHH, HHT, THH .
If all possible outcomes of an experiment have the same likelihood of
occurrence, then the probability of an event A S is given by Laplaces
rule:
P(E) =
[E[
[S[
.
For instance, the probability of getting at least two heads in a row in
the above experiment is 3/8.
6.1.2. Probability Function. In general the likelihood of dier-
ent outcomes of an experiment may not be the same. In that case
the probability of each possible outcome x is a function P(x). This
function veries:
0 P(x) 1 for all x S
and
xS
P(x) = 1 .
The probability of an event E S will be
P(E) =
xE
P(x)
78
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
6.1. PROBABILITY 79
Example: Assume that a die is loaded so that the probability of
obtaining n point is proportional to n. Find the probability of getting
an odd number when rolling that die.
Answer: First we must nd the probability function P(n) (n =
1, 2, . . . , 6). We are told that P(n) is proportional to n, hence P(n) =
kn. Since P(S) = 1 we have P(1)+P(2)+ P(6) = 1, i.e., k1+k2+
+k 6 = 21k = 1, so k = 1/21 and P(n) = n/21. Next we want to
nd the probability of E = 2, 4, 6, i.e. P(E) = P(2) +P(4) +P(6) =
2
21
+
4
21
+
6
21
=
12
21
.
6.1.3. Properties of probability. Let P be a probability func-
tion on a sample space S. Then:
1. For every event E S,
0 P(E) 1 .
2. P() = 0, P(S) = 1.
3. For every event E S, if E = is the complement of E (not
E) then
P(E) = 1 P(E) .
4. If E
1
, E
2
S are two events, then
P(E
1
E
2
) = P(E
1
) +P(E
2
) P(E
1
E
2
) .
In particular, if E
1
E
2
= (E
1
and E
2
are mutually exclusive,
i.e., they cannot happen at the same time) then
P(E
1
E
2
) = P(E
1
) +P(E
2
) .
Example: Find the probability of getting a sum dierent from 10 or
12 after rolling two dice. Answer: We can get 10 in 3 dierent ways:
4+6, 5+5, 6+4, so P(10) = 3/36. Similarly we get that P(12) = 1/36.
Since they are mutually exclusive events, the probability of getting 10
or 12 is P(10) +P(12) = 3/36+1/36 = 4/36 = 1/9. So the probability
of not getting 10 or 12 is 1 1/9 = 8/9.
6.1.4. Conditional Probability. The conditional probability of
an event E given F, represented P(E [ F), is the probability of E
assuming that F has occurred. It is like restricting the sample space
to F. Its value is
P(E [ F) =
P(E F)
P(F)
.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
6.1. PROBABILITY 80
Example: Find the probability of obtaining a sum of 10 after rolling
two fair dice. Find the probability of that event if we know that at least
one of the dice shows 5 points.
Answer: We call E = obtaining sum 10 and F = at least one of
the dice shows 5 points. The number of possible outcomes is 6 6 =
36. The event obtaining a sum 10 is E = (4, 6), (5, 5), (6, 4), so
[E[ = 3. Hence the probability is P(E) = [E[/[S[ = 3/36 = 1/12.
Now, if we know that at least one of the dice shows 5 points then the
sample space shrinks to
F = (1, 5), (2, 5), (3, 5), (4, 5), (5, 5), (6, 5), (5, 1), (5, 2), (5, 3), (5, 4), (5, 6) ,
so [F[ = 11, and the ways to obtain a sum 10 are E F = (5, 5),
[E F[ = 1, so the probability is P(E [ F) = P(E F)/P(F) = 1/11.
6.1.5. Independent Events. Two events E and F are said to be
independent if the probability of one of them does not depend on the
other, e.g.:
P(E [ F) = P(E) .
In this circumstances:
P(E F) = P(E) P(F) .
Note that if E is independent of F then also F is independent of E,
e.g., P(F [ E) = P(F).
Example: Assume that the probability that a shooter hits a target
is p = 0.7, and that hitting the target in dierent shots are independent
events. Find:
1. The probability that the shooter does not hit the target in one
shot.
2. The probability that the shooter does not hit the target three
times in a row.
3. The probability that the shooter hits the target at least once
after shooting three times.
Answer:
1. P(not hitting the target in one shot) = 1 0.7 = 0.3.
2. P(not hitting the target three times in a row) = 0.3
3
= 0.027.
3. P(hitting the target at least once in three shots) = 10.027 =
0.973.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
6.1. PROBABILITY 81
6.1.6. Bayes Theorem. Suppose that a sample space S is parti-
tioned into n classes C
1
, C
2
, . . . , C
n
which are pairwise mutually exclu-
sive and whose union lls the whole sample space. Then for any event
F we have
P(F) =
n
i=1
P(F [ C
i
) P(C
i
)
and
P(C
j
[ F) =
P(F [ C
j
) P(C
j
)
P(F)
.
Example: In a country with 100 million people 100 thousand of
them have disease X. A test designed to detect the disease has a 99%
probability of detecting it when administered to a person who has it,
but it also has a 5% probability of giving a false positive when given to
a person who does not have it. A person is given the test and it comes
out positive. What is the probability that that person has the disease?
Answer: The classes are C
1
= has the disease and C
2
= does
not have the disease, and the event is F = the test gives a positive.
We have: [S[ = 100, 000, 000, [C
1
[ = 100, 000, [C
2
[ = 99, 900, 000,
hence P(C
1
) = [C
1
[/[S[ = 0.001, P(C
2
) = [C
2
[/[S[ = 0.999. Also
P(F [ C
1
) = 0.99, P(F [ C
2
) = 0.05. Hence:
P(F) = P(F [ C
1
) P(C
1
) +P(F [ C
2
) P(C
2
)
= 0.99 0.001 + 0.05 0.999 = 0.05094 ,
and by Bayes theorem:
P(C
1
[ F) =
P(F [ C
1
) P(C
1
)
P(F)
=
0.99 0.001
0.05094
= 0.019434628 2%.
(So the test is really of little use when given to a random person
however it might be useful in combination with other tests or other
evidence that the person might have the disease.)
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
CHAPTER 7
Graph Theory
7.1. Graphs
7.1.1. Graphs. Consider the following examples:
1. A road map, consisting of a number of towns connected with
roads.
2. The representation of a binary relation dened on a given set.
The relation of a given element x to another element y is rep-
resented with an arrow connecting x to y.
The former is an example of (undirected) graph. The latter is an
example of a directed graph or digraph.
a
b
c
d
Figure 7.1. Undirected Graph.
In general a graph G consists of two things:
1. The vertex set V , whose elements are called vertices, nodes or
points.
2. The edge set E or set of edges connecting pairs of vertices. If
the edges are directed then they are also called directed edges
or arcs. Each edge e E is associated with a pair of vertices.
82
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
7.1. GRAPHS 83
a
b
c
d
Figure 7.2. Directed Graph.
A graph is sometimes represented by the pair (V, E) (we assume V
and E nite).
If the graph is undirected and there is a unique edge e connecting x
and y we may write e = x, y, so E can be regarded as set of unordered
pairs. In this context we may also write e = (x, y), understanding that
here (x, y) is not an ordered pair, but the name of an edge.
If the graph is directed and there is a unique edge e pointing from
x to y, then we may write e = (x, y), so E may be regarded as a set
of ordered pairs. If e = (x, y), the vertex x is called origin, source or
initial point of the edge e, and y is called the terminus, terminating
vertex or terminal point.
b
c
d
a
Figure 7.3. Graph with parallel edges.
Two vertices connected by an edge are called adjacent. They are
also the endpoints of the edge, and the edge is said to be incident to
each of its endpoints. If the graph is directed, an edge pointing from
vertex x to vertex y is said to be incident from x and incident to y. An
edge connecting a vertex to itself is called a loop. Two edges connecting
the same pair of points (and pointing in the same direction if the graph
is directed) are called parallel or multiple.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
7.1. GRAPHS 84
A graph with neither loops nor multiple edges is called a simple
graph. If a graph has multiple edges but no loops then it is called a
multigraph. If it has loops (and possible also multiple edges) then it is
called a pseudograph.
The following table summarizes the graph terminology
Table 7.1.1. Graph Terminology
Type Edges Multiple Edges Allowed? Loops Allowed?
Simple graph indirected no no
Multigraph indirected yes no
Pseudograph indirected yes yes
Directed graph directed no yes
Directed multigraph directed yes yes
The degree of a vertex v, represented deg(v), is the number of edges
that contain it (loops are counted twice). A vertex of degree zero (not
connected to any other vertex) is called isolated. A vertex of degree 1
is called pendant.
The Handshaking Theorem. Let G = (V, E) be an undirected graph
with e edges. Then
2e =
vV
deg (v) .
(This applies even if multiple edges and loops are present.)
In a graph with directed edges, the in-degree of a vertex v, denoted
deg
vV
deg (v) =
vVe
deg (v) +
vVo
deg (v) .
Since deg(v) is even for v V
e
, the rst sum in the right hand side of
the equality is even. The total sum must be 2e, which is even, so the
second sum must be even too. But its terms are all odd, so there must
be an even number of them.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
7.1. GRAPHS 85
Sum of degrees in an directed graph. Let G = (V, E) be a directed
graph. Then
vV
deg
(v) =
vV
deg
+
(v) = [E[ .
A weighted graph is a graph whose edges have been labeled with
numbers. The length of a path in a weighted graph is the sum of the
weights of the edges in the path.
a
b
d
c
6
3
4
6
7
Figure 7.4. Weighted Graph.
7.1.2. Special Graphs. Here we examine a few special graphs.
The n-cube: A graph with with 2
n
vertices labeled 0, 1, . . . , 2
n
1
so that two of them are connected with an edge if their binary repre-
sentation diers in exactly one bit.
000
001
010 011
100
101
110 111
Figure 7.5. 3-cube.
Complete Graph: a simple undirected graph G such that every pair
of distinct vertices in G are connected by an edge. The complete graph
of n vertices is represented K
n
(g. 7.6). A complete directed graph is
a simple directed graph G = (V, E) such that every pair of distinct
vertices in G are connected by exactly one edgeso, for each pair of
distinct vertices, either (x, y) or (y, x) (but not both) is in E.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
7.1. GRAPHS 86
a
b
c d
e
Figure 7.6. Complete graph K
5
.
Bipartite Graph: a graph G = (V, E) in which V can be partitioned
into two subsets V
1
and V
2
so that each edge in G connects some vertex
in V
1
to some vertex in V
2
. A bipartite simple graph is called complete
if each vertex in V
1
is connected to each vertex in V
2
. If [V
1
[ = m and
[V
2
[ = n, the corresponding complete bipartite graph is represented
K
m,n
(g. 7.7).
A graph is bipartite i its vertices can be colored with two colors
so that every edge connects vertices of dierent color.
Question: Is the n-cube bipartite. Hint: color in red all vertices
whose binary representation has an even number of 1s, color in blue
the ones with an odd number of 1s.
b
p
q
r
s
a
c
Figure 7.7. Complete bipartite graph K
3,4
.
Regular Graph: a simple graph whose vertices have all the same
degree. For instance, the n-cube is regular.
7.1.3. Subgraph. Given a graph G = (V, E), a subgraph G
=
(V
, E
V and E
E. If
V
= V then G
= (V
, E
k=1
f
k
l(a
k
) ,
where l(a
k
) = length of a
k
and F =
n
k=1
f
k
. The problem now is,
given an alphabet and the frequencies of its characters, nd an optimal
encoding that provides minimum average length for words.
Fix length and Human codes can be represented by trees like in
gure 8.8. The code of each symbol consists of the sequence of labels of
the edges in the path from the root to the leaf with the desired symbol.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.3. DECISION TREES, TREE ISOMORPHISMS 108
a
b d
0
1 0
1 0
1
a
b
c d
1 0
1 0
1 0
Huffman code Fix length code
c
Figure 8.8. Fix length code and Human code.
8.3.5. Constructing an Optimal Human Code. An optimal
Human code is a Human code in which the average length of the
symbols is minimum. In general an optimal Human code can be made
as follows. First we list the frequencies of all the codes and represent
the symbols as vertices (which at the end will be leaves of a tree).
Then we replace the two smallest frequencies f
1
and f
2
with their sum
f
1
+ f
2
, and join the corresponding two symbols to a common vertex
above them by two edges, one labeled 0 and the other one labeled 1.
Than common vertex plays the role of a new symbol with a frequency
equal to f
1
+f
2
. Then we repeat the same operation with the resulting
shorter list of frequencies until the list is reduced to one element and
the graph obtained becomes a tree.
Example: Find the optimal Human code for the following table of
symbols:
character frequency
a 2
b 3
c 7
d 8
e 12
Answer: : The successive reductions of the list of frequencies are as
follows:
2, 3
..
5
, 7, 8, 12 5, 7
..
12
, 8, 12 12, 8, 12
Here we have a choice, we can choose to add the rst 12 and 8, or
8 and the second 12. Lets choose the former:
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.3. DECISION TREES, TREE ISOMORPHISMS 109
12, 8
..
20
, 12 20, 12
. .
32
32
The tree obtained is the following:
a b
d
e
1
1
1
1 0
0
0
0
2
3
7
8
12
5
12
20
32
c
Figure 8.9. Optimal Human code 1.
The resulting code is as follows:
character code
a 1111
b 1110
c 110
d 10
e 0
The other choice yields the following:
12, 8, 12
..
20
20, 12
. .
32
32
a
b
c d
e
1 0
1
1
0
0
1
32
12 20
5
7 8
12
2
3
0
Figure 8.10. Optimal Human code 2.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.3. DECISION TREES, TREE ISOMORPHISMS 110
character code
a 111
b 110
c 10
d 01
e 00
8.3.6. Game Trees. Trees are used in the analysis of some games.
As an example we study the following game using a tree: Initially
there are two piles with 3 coins and 1 coin respectively. Taking turns
two players remove any number of coins from one of the piles. The
player that removes the last coin loses. The following tree represents
all possible sequences of choices. Each node shows the number of coins
in each pile, and each edge represents a possible move (choice) from
one of the players. The rst player is represented with a box and the
second player is represented with an circle.
1
0
0
1
0
0
0
0
0
0
1
1
0
1
0
1
2
0
0
0
1
0
0
0
0
0
1
0
0
1
0
0
0
0
1
1
2
1
1
0
0
0
0
0
1
0
0
0
0
0
2
0
3
0
3
1
1 1 1
0 0 0
1
1
0
1
0 0
1
1
1
1
0
1
0 0
0 1
0
1
0
0
1
1
Figure 8.11. Tree of a game.
The analysis of the game starts by labeling each terminal vertex
with 1 if it represents a victory for the rst player and 0 if it
represents a victory for the second player. This numbers represent
the value of each position of the game, so that the rst player is
interested in making it maximum and the second player wants to
make it minimum. Then we continue labeling the rest of the vertices
in the following way. After all the children of a given vertex have
been labeled, we label the vertex depending on whether it is a rst
player position (box) or a second player position (circle). First
player positions are labeled with the maximum value of the labels of
its children, second player positions are labeled with the minimum
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.3. DECISION TREES, TREE ISOMORPHISMS 111
value of the labels of its children. This process is called the minimax
procedure. Every vertex labeled 1 will represent a position in which
the rst player has advantage and can win if he/she works without
making mistakes; on the other hand, vertices labeled 0 represent
positions for which the second player has advantage. Now the strategy
is for the rst player to select at each position a children with maximum
value, while the second player will be interested in selecting children
with minimum value. If the starting position has been labeled 1 that
means that the rst player has a winning strategy, otherwise the second
player has advantage. For instance in the present game the rst player
has advantage at the initial position, and only one favorable movement
at that point:
_
3
1
_
_
0
1
_
, i.e., he/she must remove all 3 coins from
the rst pile. If for any reason the rst player makes a mistake and
removes say one coin from the rst pile, going to position
_
2
1
_
, then the
second player has one favorable move to vertex
_
0
1
_
, which is the one
with minimum value.
Alpha-beta pruning. In some games the game tree is so complicated
that it cannot be fully analyzed, so it is built up to a given depth only.
The vertices reached at that depth are not terminal, but they can
be evaluated using heuristic methods (for instance in chess usually
losing a knight is a better choice than losing the queen, so a position
with one queen and no knights will have a higher value than one with
no queen and one knight). Even so the evaluation and labeling of the
vertices can be time consuming, but we can bypass the evaluation of
many vertices using the technique of alpha-beta pruning. The idea is
to skip a vertex as soon as it becomes obvious that its value will not
aect the value of its parent. In order to do that with a rst player
(boxed) vertex v, we assign it an alpha value equal to the maximum
value of its children evaluated so far. Assume that we are evaluating
one of its children w, which will be a second player (circled) position. If
at any point a children of w gets a value less than or equal to the alpha
value of v then it will become obvious that the value of w is going to
be less than the current alpha value of v, so it will not aect the value
of v and we can stop the process of evaluation of w (prone the subtree
at w). That is called an alpha cuto. Similarly, at a second player
(circled) vertex v, we assign a beta value equal to the minimum value
of its children evaluated so far, and practice a beta cuto when one of
its grandchildren gets a value greater than or equal to the current beta
value of v, i.e., we prone the subtree at w, where w is the parent of
that grandchildren.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.3. DECISION TREES, TREE ISOMORPHISMS 112
4
4 3
5 6 3 4 5 1 ?
v
w
(4)
1
7
Figure 8.12. Alpha cuto.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.4. TREE TRANSVERSAL 113
8.4. Tree Transversal
8.4.1. Transversal Algorithms. In order to motivate this sub-
ject, we introduce the concept of Polish notation. Given a (not nec-
essarily commutative) binary operation , it is customary to represent
the result of applying the operation to two elements a, b by placing the
operation symbol in the middle:
a b .
This is called inx notation. The Polish notation consists of placing
the symbol to the left:
a b .
The reverse Polish notation consists of placing the symbol to the right:
a b .
The advantage of Polish notation is that it allows us to write ex-
pressions without need for parenthesis. For instance, the expression
a(b+c) in Polish notation would be a+b c, while ab+c is +a b c.
Also, Polish notation is easier to evaluate in a computer.
In order to evaluate an expression in Polish notation, we scan the
expression from right to left, placing the elements in a stack.
1
Each
time we nd an operator, we replace the two top symbols of the stack
by the result of applying the operator to those elements. For instance,
the expression +2 3 4 (which in inx notation is (2 +3) 4) would
be evaluated like this:
expression stack
+ 2 3 4
+ 2 3 4
+ 2 3 4
+ 2 3 4
5 4
20
An algebraic expression can be represented by a binary rooted tree
obtained recursively in the following way. The tree for a constant or
variable a has a as its only vertex. If the algebraic expression S is of
1
A stack or last-in rst-out (LIFO) system, is a linear list of elements in which
insertions and deletions take place only at one end, called top of the list. A queue
or rst-in rst-out (FIFO) system, is a linear list of elements in which deletions
take place only at one end, called front of the list, and insertions take place only
at the other end, called rear of the list.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.4. TREE TRANSVERSAL 114
the form S
L
S
R
, where S
L
and S
R
are subexpressions with trees T
L
and T
R
respectively, and is an operator, then the tree T for S consists
of as root, and the subtrees T
L
and T
R
(g. 8.13).
o
T
L
T
R
Figure 8.13. Tree of S
1
S
2
.
For instance, consider the following algebraic expression:
a +b c +d e (f +h) ,
where + denotes addition, denotes multiplication and denotes ex-
ponentiation. The binary tree for this expression is given in gure 8.14.
+
*
+
d e
f h
*
b c
+
a
Figure 8.14. Tree for a +b c +d e (f +h).
Given the binary tree of an algebraic expression, its Polish, reverse
Polish and inx representation are dierent ways of ordering the ver-
tices of the tree, namely in preorder, postorder and inorder respectively.
The following are recursive denitions of several orderings of the
vertices of a rooted tree T = (V, E) with root r. If T has only one
vertex r, then r by itself constitutes the preorder, postorder and inorder
transversal of T. Otherwise, let T
1
, . . . , T
k
the subtrees of T from left
to right (g. 8.15). Then:
1. Preorder Transversal : Pre(T) = r, Pre(T
1
), . . . , Pre(T
k
).
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.4. TREE TRANSVERSAL 115
r
T1 T2
...
Tk
Figure 8.15. Ordering of trees.
2. Postorder Transversal : Post(T) = Post(T
1
), . . . , Post(T
k
), r.
3. Inorder Transversal. If T is a binary tree with root r, left sub-
tree T
L
and right subtree T
R
, then: In(T) = In(T
L
), r, In(T
R
).
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.5. SPANNING TREES 116
8.5. Spanning Trees
8.5.1. Spanning Trees. A tree T is a spanning tree of a graph G
if T is a subgraph of G that contains all the vertices of G. For instance
the graph of gure 8.16 has a spanning tree represented by the thicker
edges.
b
c
d
e
f
g
h
a
i
Figure 8.16. Spanning tree.
Every connected graph has a spanning tree which can be obtained
by removing edges until the resulting graph becomes acyclic. In prac-
tice, however, removing edges is not ecient because nding cycles is
time consuming.
Next, we give two algorithms to nd the spanning tree T of a loop-
free connected undirected graph G = (V, E). We assume that the
vertices of G are given in a certain order v
1
, v
2
, . . . , v
n
. The resulting
spanning tree will be T = (V
, E
).
8.5.2. Breadth-First Search Algorithm. The idea is to start
with vertex v
1
as root, add the vertices that are adjacent to v
1
, then the
ones that are adjacent to the latter and have not been visited yet, and
so on. This algorithm uses a queue (initially empty) to store vertices
of the graph. In consists of the following:
1. Add v
1
to T, insert it in the queue and mark it as visited.
2. If the queue is empty, then we are done. Otherwise let v be the
vertex in the front of the queue.
3. For each vertex v
) to T, insert v
in the
queue and mark it as visited.
4. Delete v from the queue.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.5. SPANNING TREES 117
5. Go to step 2.
A pseudocode version of the algorithm is as follows:
1: procedure bfs(V,E)
2: S := (v1) ordered list of vertices of a fix level
3: V := v1 v1 is the root of the spanning tree
4: E := no edges in the spanning tree yet
5: while true
6: begin
7: for each x in S, in order,
8: for each y in V - V
9: if (x,y) is an edge then
10: add edge (x,y) to E and vertex y to V
11: if no edges were added then
12: return T
13: S := children of S
14: end
15: end bfs
Figure 8.17 shows the spanning tree obtained using the breadth-rst
search algorithm on the graph with its vertices ordered lexicographi-
cally: a, b, c, d, e, f, g, h, i.
a
b
c
d
e
f
g
h
i
Figure 8.17. Breadth-First Search.
8.5.3. Depth-First Search Algorithm. The idea of this algo-
rithm is to make a path as long as possible, and then go back (back-
track) to add branches also as long as possible.
This algorithm uses a stack (initially empty) to store vertices of the
graph. In consists of the following:
1. Add v
1
to T, insert it in the stack and mark it as visited.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.5. SPANNING TREES 118
2. If the stack is empty, then we are done. Otherwise let v be the
vertex on the top of the stack.
3. If there is no vertex v
) to T, insert v
) to T.
4. Apply P to v
.
5. Go to step 2 (backtrack).
The Depth-First Search Algorithm consists of applying the process just
dened to v
1
.
A pseudocode version of the algorithm is as follows:
1: procedure dfs(V,E)
2: V := v1 v1 is the root of the spanning tree
3: E := no edges in the spanning tree yet
4: w := v1
5: while true
6: begin
7: while there is an edge (w,v) that when added
8: to T does not create a cycle in T
9: begin
10: Choose first v such that (w,v)
11: does not create a cycle in T
12: add (w,v) to E
13: add v to V
14: w := v
15: end
16: if w = v1 then
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.5. SPANNING TREES 119
17: return T
18: w := parent of w in T backtrack
19: end
20: end
Figure 8.18 shows the spanning tree obtained using the breadth-rst
search algorithm on the graph with its vertices ordered lexicographi-
cally: a, b, c, d, e, f, g, h, i.
a
b
c
d
e
f
g
h
i
Figure 8.18. Depth-First Search.
8.5.4. Minimal Spanning Trees. Given a connected weighted
tree G, its minimal spanning tree is a spanning tree of G such that the
sum of the weights of its edges is minimum. For instance for the graph
of gure 8.19, the spanning tree shown is the one of minimum weight.
a b
c d
e f
4
1
5
2
6
3
2
6
Figure 8.19. Minimum Spanning Tree.
Prims Algorithm. An algorithm to nd a minimal spanning tree is
Prims Algorithm. It starts with a single vertex and at each iteration
adds to the current tree a minimum weight edge that does not complete
a cycle.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.5. SPANNING TREES 120
The following is a pseudocode version of Prims algorithm. If (x, y)
is an edge in G = (V, E) then w(x, y) is its weight, otherwise w(x, y) =
. The starting vertex is s.
1: procedure prim(V,w,s)
2: V := s vertex set starts with s
3: E = edge set initially empty
4: for i := 1 to n-1 put n edges in spanning tree
5: begin
6: find x in V and y in V - V with minimum w(x,y)
7: add y to V
8: add (x,y) to E
9: end
10: return E
11: end prim
Prims algorithm is an example of a greedy algorithm. A greedy
algorithm is an algorithm that optimized the choice at each iteration
without regard to previous choices (doing the best locally). Prims
algorithm makes a minimum spanning tree, but in general a greedy
algorithm does not always nds an optimal solution to a given problem.
For instance in gure 8.20 a greedy algorithm to nd the shortest path
from a to z, working by adding the shortest available edge to the most
recently added vertex, would return acz, which is not the shortest path.
a
b
c
z
1
2
4
100
101
Figure 8.20
Kruskals Algorithm. Another algorithm to nd a minimal span-
ning tree in a connected weighted tree G = (V, E) is Kruskals Algo-
rithm. It starts with all n vertices of G and no edges. At each iteration
we add an edge having minimum weight that does not complete a cycle.
We stop after adding n 1 edges.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
8.5. SPANNING TREES 121
1: procedure kruskal(E,w,n)
2: V := V
3: E :=
4: while |E| < n-1
5: begin
6: among all edges not completing a cycle in T
7: choose e of minimum weight and add it to E
8: end
9: T = (V,E)
10: return T
11: end kruskal
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
CHAPTER 9
Boolean Algebras
9.1. Combinatorial Circuits
9.1.1. Introduction. At their lowest level digital computers han-
dle only binary signals, represented with the symbols 0 and 1. The
most elementary circuits that combine those signals are called gates.
Figure 9.1 shows three gates: OR, AND and NOT.
x1
x2
x1
x2
x x
x1
x2
x2
OR GATE
NOT GATE
AND GATE
x1 +
Figure 9.1. Gates.
Their outputs can be expressed as a function of their inputs by the
following logic tables:
x
1
x
2
x
1
+x
2
1 1 1
1 0 1
0 1 1
0 0 0
OR GATE
122
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
9.1. COMBINATORIAL CIRCUITS 123
x
1
x
2
x
1
x
2
1 1 1
1 0 0
0 1 0
0 0 0
AND GATE
x x
1 0
0 1
NOT GATE
These are examples of combinatorial circuits. A combinatorial cir-
cuit is a circuit whose output is uniquely dened by its inputs. They
do not have memory, previous inputs do not aect their outputs. Some
combinations of gates can be used to make more complicated combi-
natorial circuits. For instance gure 9.2 is combinatorial circuit with
the logic table shown below, representing the values of the Boolean
expression y = (x
1
+x
2
) x
3
.
x1
x2
y
x3
Figure 9.2. A combinatorial circuit.
x
1
x
2
x
3
y = (x
1
+x
2
) x
3
1 1 1 0
1 1 0 1
1 0 1 0
1 0 0 1
0 1 1 0
0 1 0 1
0 0 1 1
0 0 0 1
However the circuit in gure 9.3 is not a combinatorial circuit. If
x
1
= 1 and x
2
= 0 then y can be 0 or 1. Assume that at a given time
y = 0. If we input a signal x
2
= 1, the output becomes y = 1, and
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
9.1. COMBINATORIAL CIRCUITS 124
stays so even after x
2
goes back to its original value 0. That way we
can store a bit. We can delete it by switching input x
1
to 0.
x2
x1
y
Figure 9.3. Not a combinatorial circuit.
9.1.2. Properties of Combinatorial Circuits. Here Z
2
= 0, 1
represents the set of signals handled by combinatorial circuits, and the
operations performed on those signals by AND, OR and NOT gates are
represented by the symbols , + and respectively. Then their prop-
erties are the following (a, b, c are elements of Z
2
, i.e., each represents
either 0 or 1):
1. Associative
(a +b) +c = a + (b +c)
(a b) c = a (b c)
2. Commutative
a +b = b +a
a b = b a
3. Distributive
a (b +c) = (a b) + (a c)
a + (b c) = (a +b) (a +c)
4. Identity
a + 0 = a
a 1 = a
5. Complement
a +a = 1
a a = 0
A system satisfying those properties is called a Boolean algebra.
Two Boolean expressions are dened to be equal is they have the
same values for all possible assignments of values to their literals. Ex-
ample: x +y = x y, as shown in the following table:
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
9.1. COMBINATORIAL CIRCUITS 125
x y x +y x y
1 1 0 0
1 0 0 0
0 1 0 0
0 0 1 1
9.1.3. Abstract Boolean Algebras. Here we deal with general
Boolean algebras; combinatorial circuits are an example, but there are
others.
A Boolean algebra B = (S, , , , 0, 1) is a set S containing two
distinguished elements 0 and 1, two binary operators and on S,
and a unary operator on S, satisfying the following properties (x, y,
z are elements of S):
1. Associative
(x y) z = x (y z)
(x y) z = x (y z)
2. Commutative
x y = y x
x y = y x
3. Distributive
x (y z) = (x y) (x z)
x (y z) = (x y) (x z)
4. Identity
x 0 = x
x 1 = x
5. Complement
x x = 1
x x = 0
Example: (Z
2
, +, , , 0, 1) is a Boolean algebra.
Example: If U is a universal set and P(U)= the power set of S (col-
lection of subsets of S) then (P(U), , , , , U). is a Boolean algebra.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
9.1. COMBINATORIAL CIRCUITS 126
9.1.4. Other Properties of Boolean Algebras. The properties
mentioned above dene a Boolean algebra, but Boolean algebras also
have other properties:
1. Idempotent
x x = x
x x = x
2. Bound
x 1 = 1
x 0 = 0
3. Absorption
x xy = x
x (x y) = x
4. Involution
x = x
5. 0 and 1
0 = 1
1 = 0
6. De Morgans
x y = x y
x y = x y
For instance the rst idempotent law can be proved like this: x =
x 0 = x x x = (x x) (x x) = (x x) 1 = x x.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
9.2. BOOLEAN FUNCTIONS, APPLICATIONS 127
9.2. Boolean Functions, Applications
9.2.1. Introduction. A Boolean function is a function from Z
n
2
to Z
2
. For instance, consider the exclusive-or function, dened by the
following table:
x
1
x
2
x
1
x
2
1 1 0
1 0 1
0 1 1
0 0 0
The exclusive-or function can interpreted as a function Z
2
2
Z
2
that assigns (1, 1) 0, (1, 0) 1, (0, 1) 1, (0, 0) 0. It can also
be written as a Boolean expression in the following way:
x
1
x
2
= (x
1
x
2
) + (x
1
x
2
)
Every Boolean function can be written as a Boolean expression as
we are going to see next.
9.2.2. Disjunctive Normal Form. We start with a denition.
A minterm in the symbols x
1
, x
2
, . . . , x
n
is a Boolean expression of the
form y
1
y
2
y
n
, where each y
i
is either x
i
or x
i
.
Given any Boolean function f : Z
n
2
Z
2
that is not identically
zero, it can be represented
f(x
1
, . . . , x
n
) = m
1
+m
2
+ +m
k
,
where m
1
, m
2
, . . . , m
k
are all the minterms m
i
= y
1
y
2
y
n
such that
f(a
1
, a
2
, . . . , a
n
) = 1, where y
j
= x
j
if a
j
= 1 and y
j
= x
j
if a
j
= 0.
That representation is called disjunctive normal form of the Boolean
function f.
Example: We have seen that the exclusive-or can be represented
x
1
x
2
= (x
1
x
2
) + (x
1
x
2
). This provides a way to implement the
exclusive-or with a combinatorial circuit as shown in gure 9.4.
9.2.3. Conjunctive Normal Form. A maxterm in the symbols
x
1
, x
2
, . . . , x
n
is a Boolean expression of the form y
1
+ y
2
+ + y
n
,
where each y
i
is either x
i
or x
i
.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
9.2. BOOLEAN FUNCTIONS, APPLICATIONS 128
x1
x2
x2 x1
Figure 9.4. Exclusive-Or.
Given any Boolean function f : Z
n
2
Z
2
that is not identically
one, it can be represented
f(x
1
, . . . , x
n
) = M
1
M
2
M
k
,
where M
1
, M
2
, . . . , M
k
are all the maxterms M
i
= y
1
+ y
2
+ + y
n
such that f(a
1
, a
2
, . . . , a
n
) = 0, where y
j
= x
j
if a
j
= 0 and y
j
= x
j
if
a
j
= 1. That representation is called conjunctive normal form of the
Boolean function f.
Example: The conjunctive normal form of the exclusive-or is
x
1
x
2
= (x
1
+x
2
) (x
1
+x
2
) .
9.2.4. Functionally Complete Sets of Gates. We have seen
how to design combinatorial circuits using AND, OR and NOT gates.
Here we will see how to do the same with other kinds of gates. In the
following gates will be considered as functions from Z
n
2
into Z
2
intended
to serve as building blocks of arbitrary boolean functions.
A set of gates g
1
, g
2
, . . . , g
k
is said to be functionally complete
if for any integer n and any function f : Z
n
2
Z
2
it is possible to
construct a combinatorial circuit that computes f using only the gates
g
1
, g
2
, . . . , g
k
. Example: The result about the existence of a disjunctive
normal form for any Boolean function proves that the set of gates
AND, OR, NOT is functionally complete. Next we show other sets
of gates that are also functionally complete.
1. The set of gates AND, NOT is functionally complete. Proof:
Since we already know that AND, OR, NOT is functionally
complete, all we need to do is to show that we can compute
x +y using only AND and NOT gates. In fact:
x +y = x y ,
hence the combinatorial circuit of gure 9.5 computes x +y.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
9.2. BOOLEAN FUNCTIONS, APPLICATIONS 129
x1 x2
x2
x1
+
Figure 9.5. OR with AND and NOT.
2. The set of gates OR, NOT is functionally complete. The
proof is similar:
x y = x +y ,
hence the combinatorial circuit of gure 9.6 computes x+y.
x1
x1 x2
x2
Figure 9.6. AND with OR and NOT.
3. The gate NAND, denoted and dened as
x
1
x
2
=
_
0 if x
1
= 1 and x
2
= 1
1 otherwise,
is functionally complete.
x1
x1 x2
x2
Figure 9.7. NAND gate.
Proof: Note that x y = x y. Hence x = x x = x x, so
the NOT gate can be implemented with a NAND gate. Also the
OR gate can be implemented with NAND gates: x+y = x y =
(x x) (y y). Since the set OR, NOT is functionally
complete and each of its elements can be implemented with
NAND gates, the NAND gate is functionally complete.
9.2.5. Minimization of Combinatorial Circuits. Here we ad-
dress the problems of nding a combinatorial circuit that computes a
given Boolean function with the minimum number of gates. The idea
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
9.2. BOOLEAN FUNCTIONS, APPLICATIONS 130
x y
x
x
x
y
Figure 9.8. NOT and OR functions implemented with
NAND gates.
is to simplify the corresponding Boolean expression by using algebraic
properties such as (E a) + (E a) = E and E + (E a) = E, where
E is any Boolean expression. For simplicity in the following we will
represent a b as ab, so for instance the expressions above will look like
this: Ea +Ea = E and E +Ea = E.
Example: Let F(x, y, z) the Boolean function dened by the follow-
ing table:
x y z f(x,y,z)
1 1 1 1
1 1 0 1
1 0 1 0
1 0 0 1
0 1 1 0
0 1 0 0
0 0 1 0
0 0 0 0
Its disjunctive normal form is f(x, y, z) = xyz + xyz + xyz. This
function can be implemented with the combinatorial circuit of gure
9.9.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
9.2. BOOLEAN FUNCTIONS, APPLICATIONS 131
z
y
x
.
f(x,y,z)
Figure 9.9. A circuit that computes f(x, y, z) = xyz +
xyz +xyz.
But we can do better if we simplify the expression in the following
way:
f(x, y, z) =
xy
..
xyz +xyz +xyz
= xy +xyz
= x(y +yz)
= x(y +y)(y +z)
= x(y +z) ,
which corresponds to the circuit of gure 9.10.
f(x,y,z)
. y
x
z
Figure 9.10. A simpler circuit that computes
f(x, y, z) = xyz +xyz +xyz.
9.2.6. Multi-Output Combinatorial Circuits. Example: Half-
Adder. A half-adder is a combinatorial circuit with two inputs x and
y and two outputs s and c, where s represents the sum of x and y and
c is the carry bit. Its table is as follows:
x y s c
1 1 0 1
1 0 1 0
0 1 1 0
0 0 0 0
So the sum is s = x y (exclusive-or) and the carry bit is c = x y.
Figure 9.11 shows a half-adder circuit.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
9.2. BOOLEAN FUNCTIONS, APPLICATIONS 132
y
s
c
x
Figure 9.11. Half-adder circuit.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
CHAPTER 10
Automata, Grammars and Languages
10.1. Finite State Machines
10.1.1. Finite-State Machines. Combinatorial circuits have no
memory or internal states, their output depends only on the current
values of their inputs. Finite state machines on the other hand have
internal states, so their output may depend not only on its current
inputs but also on the past history of those inputs.
A nite-state machine consists of the following:
1. A nite set of states S.
2. A nite set of input symbols I.
3. A nite set of output symbols O.
4. A next-state or transition function f : S I S.
5. An output function g : S I O.
6. An initial state S.
We represent the machine M = (S, I, O, f, g, )
Example: We describe a nite state machine with two input symbols
I = a, b and two output symbols O = 0, 1 that accepts any string
from I
0
0
1
1 0
1
1
1
0 0
133
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
10.1. FINITE STATE MACHINES 134
This nite-state machine also can be represented with the following
transition diagram:
start
GFED @ABC
0
a/1
b/0
GFED @ABC
1
a/0
b/0
then we draw an
arrow from to
labeled i/o or i, o.
Example: The following example is similar to the previous one but
the machine outputs 1 only after a change of input symbol, otherwise
it outputs 0:
start
GFED @ABC
0
a/0
. |
|
|
|
|
|
|
|
|
b/0
B
B
B
B
B
B
B
B
B
GFED @ABC
1
a/0
b/1
GFED @ABC
2
b/0
a/1
01/1
10/1
11/0
GFED @ABC
C
01/0
10/0
.
11/1
00/1
0
b/0
a/1
GFED @ABC
1
b/0
a/1
The second kind of diagram omits the outputs and represents the
accepting states with double circles:
start
GFED @ABC
0
b
1
b
Two nite-state automata that accept exactly the same set of strings
are said to be equivalent. For instance the following automaton also
accepts precisely strings of as abd bs that end with an a, so it is
equivalent to the automaton shown above:
start
GFED @ABC
0
b
1
b
a
GFED @ABC ?>=< 89:;
2
b
GFED @ABC
O
a
0
a
1
a
GFED @ABC
2
a
0
a
GFED @ABC
1
b
b
GFED @ABC
2
a
b
.
GFED @ABC ?>=< 89:;
3
b
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
10.2. LANGUAGES AND GRAMMARS 137
10.2. Languages and Grammars
10.2.1. Formal Languages. Consider algebraic expressions writ-
ten with the symbols A = x, y, z, +, , (, ). The following are some
of them: x + y y, y + (x y + y) x, (x + y) x + z, etc.
There are however some strings of symbols that are not legitimate al-
gebraic expressions, because they have some sort of syntax error, e.g.:
(x + y, z + +y x, x(y) + z, etc. So syntactically correct al-
gebraic expressions are a subset of the whole set A
of possible strings
over A.
In general, given a nite set A (the alphabet), a (formal) language
over A is a subset of A
)V
n
(by convention also
1
1
.)
Given a grammar G, the language L(G) associated to this grammar
is the subset of T
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
10.2. LANGUAGES AND GRAMMARS 140
where = string concatenation of and , starts with the
production
1
2
.
3. Closure Rule: the language closure of L
1
L
1
= L
0
1
L
1
1
L
2
1
. . .
were L
0
1
= and L
n
1
=
1
2
. . .
n
[
k
L
1
, k = 1, 2, . . . , n
(n = 1, 2, . . . ), starts with the two productions
1
, .
10.2.5. Types of Grammars (Chomskys Classication). Let
G be a grammar and let denote the null string.
0. G is a phrase-structure (or type 0) grammar if every production
is of the form:
,
where V
, V
.
1. G is a context-sensitive (or type 1) grammar if every production
is of the form:
A
(i.e.: we may replace A with in the context of and ), where
, V
, A N, V
.
2. G is a context-free (or type 2) grammar if every production is
of the form:
A ,
where A N, V
.
3. G is a regular (or type 3) grammar if every production is of the
form:
A a or A aB or A ,
where A, B N, a T.
A language L is context-sensitive (respectively context-free, regu-
lar) if there is a context-sensitive (respectively context-free, regular)
grammar G such that L = L(G).
The following examples show that these grammars dene dierent
kinds of languages.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
10.2. LANGUAGES AND GRAMMARS 141
Example: The following language is type 3 (regular):
L = a
n
b
m
[ n = 1, 2, 3 . . . ; m = 1, 2, 3, . . . .
A type 3 grammar for that language is the following: T = a, b,
N = , S, with start symbol , and productions:
a , aS , S bS , S b .
Example: The following language is type 2 (context-free) but not
type 3:
L = a
n
b
n
[ n = 1, 2, 3, . . . .
A type 2 grammar for that language is the following:
T = a, b, N = , with start symbol , and productions
ab , ab .
Example: The following language is type 1 (context-sensitive) but
not type 2:
L = a
n
b
n
c
n
[ n = 1, 2, 3, . . . .
A type 1 grammar for that language is the following:
T = a, b, c, N = , A, C, with start symbol , and productions
abc , aAbc ,
A abC , A aAbC ,
Cb bC , Cc cc .
There are also type 0 languages that are not type 1, but they are
harder to describe.
10.2.6. Equivalent Grammars. Two grammars G and G
are
equivalent if L(G) = L(G
).
Example: The grammar of algebraic expressions dened at the be-
ginning of the section is equivalent to the following one:
Terminal symbols = x, y, z, +, , (, ), nonterminal symbols = E, T, F, L,
with start symbol E, and productions
E T, E E +T,
T F, T T F
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
10.2. LANGUAGES AND GRAMMARS 142
F (E), F L,
L x, L y, L z.
10.2.7. Context-Free Interactive Lindenmayer Grammar.
A context-free interactive Lindenmayer grammar is similar to a usual
context-free grammar with the dierence that it allows productions of
the form A B where A N T (in a context free grammar A must
be nonterminal). Its rules for deriving strings also are dierent. In a
context-free interactive Lindenmayer grammar, to derive string from
string , all symbols in must be replaced simultaneously.
Example: The von Koch Snowake. The von Koch Snowake is a
fractal curve obtained by start with a line segment and then at each
stage transforming all segments of the gure into a four segment polyg-
onal line, as shown below. The von Koch Snowake fractal is the limit
of the sequence of curves dened by that process.
Figure 10.1. Von Koch Snowake, stages 13.
Figure 10.2. Von Koch Snowake, stages 45
A way to represent an intermediate stage of the making of the
fractal is by representing it as a sequence of movements of three kinds:
d= draw a straight line (of a x length) in the current direction, r=
turn right by 60
, l= turn left by 60
L
1
L
2
= [ L
1
, L
2
1
=
1
. . .
n
[
k
L
1
, n N ,
T
L
1
= T
[ / L
1
,
L
1
L
2
= [ L
1
and L
2
.
We justify the above claims about L
1
L
2
, L
1
L
2
and L
1
as follows.
We already know how to combine two grammars (see 10.2.4) L
1
and
L
2
to obtain L
1
L
2
, L
1
L
2
and L
1
, the only problem is that the rules
given in section 10.2.4 do no have the form of a regular grammar, so we
need to modify them slightly (we use the same notation as in section
10.2.4):
1. Union Rule: Instead of adding
1
and
2
, add all
productions of the form RHS, where RHS is the right
hand side of some production (
1
RHS) P
1
or (
2
RHS) P
2
.
2. Product Rule: Instead of adding
1
2
, use
1
as start
symbol and replace each production (A a) P
1
with A
a
2
and (A ) P
1
with A
2
.
3. Closure Rule: Instead of adding
1
and , use
1
as start symbol, add
1
, and replace each production
(A a) P
1
with A a
1
and (A ) P
1
with A
1
.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
10.3. LANGUAGE RECOGNITION 145
10.3.2. Regular Expressions. Regular languages can be charac-
terized as languages dened by regular expressions. Given an alphabet
T, a regular expression over T is dened recursively as follows:
1. , , and a are regular expressions for all a T.
2. If R and S are regular expressions over T the following expres-
sions are also regular: (R), R +S, R S, R
.
In order to use fewer parentheses we assign those operations the fol-
lowing hierarchy (from do rst to do last): , , +. We may omit the
dot: = .
Next we dene recursively the language associated to a given regular
expression:
L() = ,
L() = ,
L(a) = a for each a T,
L(R +S) = L(R) L(S) ,
L(R S) = L(R)L(S) (language product),
L(R
) = L(R)
(language closure).
So, for instance, the expression a
bb
b is the set of
strings over a, b than start with a and end with b, etc.
Another way of characterizing regular languages is as sets of strings
recognized by nite-state automata, as we will see next. But rst we
need a generalization of the concept of nite-state automaton.
10.3.3. Nondeterministic Finite-State Automata. A nonde-
terministic nite-state automaton is a generalization of a nite-state
automaton so that at each state there might be several possible choices
for the next state instead of just one. Formally a nondeterministic
nite-state automaton consists of
1. A nite set of states S.
2. A nite set of input symbols I.
3. A next-state or transition function f : S I P(S).
4. An initial state S.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
10.3. LANGUAGE RECOGNITION 146
5. A subset F of S of accepting or nal states.
We represent the automaton A = (S, I, f, , F). We say that a nonde-
terministic nite-state automaton accepts or recognizes a given string
of input symbols if in its transition diagram there is a path from the
starting state to a nal state with its edges labeled by the symbols of
the given string. A path (which we can express as a sequence of states)
whose edges are labeled with the symbols of a string is said to represent
the given string.
Example: Consider the nondeterministic nite-state automaton de-
ned by the following transition diagram:
start
GFED @ABC
a
GFED @ABC
C
b
b
GFED @ABC ?>=< 89:;
F
This automaton recognizes precisely the strings of the form b
n
ab
m
,
n 0, m > 0. For instance the string bbabb is represented by the path
(, , , C, C, F). Since that path ends in a nal state, the string is
recognized by the automaton.
Next we will see that there is a precise relation between regular
grammars and nondeterministic nite-state automata.
Regular grammar associated to a nondeterministic nite-state au-
tomaton. Let A be a non-deterministic nite-state automaton given as
a transition diagram. Let be the initial state. Let T be the set of
inputs symbols, let N be the set of states, and V = N T. Let P be
the set of productions
S xS
and
S
if S is a nal state. Let G be the regular grammar
G = (V, T, , P) .
Then the set of strings recognized by A is precisely L(G).
Example: For the nondeterministic automaton dened above the
corresponding grammar will be:
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
10.3. LANGUAGE RECOGNITION 147
T = a, b, N = , C, F, with productions
b , aC , C bC , C bF , F .
The string bbabb can be produced like this:
b bb bbaC bbabC bbabbF bbabb .
Nondeterministic nite-state automaton associated to a given regu-
lar grammar. Let G = (V, T, , P) be a regular grammar. Let
I = T.
S = N F, where N = V T, and F / V .
f(S, x) = S
[ S xS
P F [ S x P.
F = F S [ S P .
Then the nondeterministic nite-state automaton A = (S, I, f, , F)
recognizes precisely the strings in L(G).
10.3.4. Relationships Between Regular Languages and Au-
tomata. In the previous section we saw that regular languages co-
incide with the languages recognized by nondeterministic nite-state
automata. Here we will see that the term nondeterministic can be
dropped, so that regular languages are precisely those recognized by
(deterministic) nite-state automata. The idea is to show that given
any nondeterministic nite-state automata it is possible to construct
an equivalent deterministic nite-state automata recognizing exactly
the same set of strings. The main result is the following:
Let A = (S, I, f, , F) be a nondeterministic nite-state automaton.
Then Ais equivalent to the nite-state automaton A
= (S
, I
, f
, F
),
where
1. S
= P(S).
2. I
= I.
3.
= .
4. F
= X S [ X F ,= .
5. f
(X, x) =
_
SX
f(S, x) , f
(, x) = .
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
10.3. LANGUAGE RECOGNITION 148
Example: Find a (deterministic) nite-state automaton A
equiva-
lent to the following nondeterministic nite-state automaton A:
start
GFED @ABC
a
GFED @ABC
C
b
b
GFED @ABC ?>=< 89:;
F
Answer: The set of input symbols is the same as that of the given
automaton: I
= , , C, F, , C, , F, C, F, , C, F .
The starting state is . The nal states of A
= F, , F, C, F, , C, F .
Then for each element X of S
GF ED
@A BC C
a
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
GF ED
@A BC
?> =<
89 :; , C, F
a
GF ED
@A BC
?> =<
89 :; , F
a
x
x
x
x
x
x
x
x
b
GFED
@ABC
b
.
GF ED
@A BC , C
a
I
I
I
I
I
I
I
I
I
b
GF ED
@A BC
?> =<
89 :; F
a
GF ED
@A BC
?> =<
89 :; C, F
a
I
I
I
I
I
I
I
I
I
I
I
b
We notice that some states are unreachable from the starting state.
After removing the unreachable states we get the following simplied
version of the nite-state automaton:
start
GF ED
@A BC
a
GF ED
@A BC C
a
GF ED
@A BC
?> =<
89 :; C, F
a
GFED
@ABC
b
_
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
10.3. LANGUAGE RECOGNITION 149
So, once proved that every nondeterministic nite-state automaton
is equivalent to some deterministic nite-state automaton, we obtain
the main result of this section: A language L is regular if and only
if there exists a nite-state automaton that recognizes precisely the
strings in L.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
APPENDIX A
A.1. Ecient Computation of Powers Modulo m
We illustrate an ecient method of computing powers modulo m
with an example. Assume that we want to compute 3
547
mod 10.
First write 547 in base 2: 1000100011, hence 547 = 2
9
+ 2
5
+ 2 + 1 =
((2
4
+1) 2
4
+1) 2+1, so: 3
547
= ((3
2
4
3)
2
4
3)
2
3. Next we compute the
expression beginning with the inner parenthesis, and reducing modulo
10 at each step: 3
2
= 9 (mod 10), 3
2
2
= 9
2
= 81 = 1 (mod 10),
3
2
3
= 1
2
= 1 (mod 10), 3
2
4
= 1
2
= 1 (mod 10), 3
2
4
3 = 1 3 = 3
(mod 10), etc. At the end we nd 3
547
= 7 (mod 10).
The algorithm in pseudocode would be like this:
1: procedure pow mod(a,x,m) computes a^x mod m
2: p := 1
3: bx := binary array(x) x as a binary array
4: t := a mod m
5: for k := 1 to length(bx)
6: begin
7: p := (p * p) mod m
8: if bx[k] = 1 then
if k-th binary digit of x is 1
9: p := (p * t) mod m
10: end
11: return p
12: end pow mod
150
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
A.1. EFFICIENT COMPUTATION OF POWERS MODULO M 151
The following is a program in C implementing the algorithm:
int pow(int a, int x, int m) {
int p = 1;
int y = (1 << (8 * size of(int) - 2));
a %= m;
while (!(y & x)) y >>= 1;
while (y) {
p *= p;
p %= m;
if (x & y) {
p *= a;
p %= m;
}
y >>= 1;
}
return p;
}
The following is an alternative algorithm equivalent to running
through the binary representation of the exponent from right to left
instead of left to right:
1: procedure pow mod(a,x,m) computes a^x mod m
2: p := 1
3: t := a mod m
4: while x > 0
5: begin
6: if x is odd then
7: p := (p * t) mod m
8: t := (t * t) mod m
9: x := floor(x/2)
10: end
11: return p
12: end pow mod
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
A.2. MACHINES AND LANGUAGES 152
A.2. Machines and Languages
A.2.1. Turing Machines. A Turing machine is a theoretical de-
vice intended to dene rigorously the concept of algorithm. It consists
of
1. An innite tape made of a sequence of cells. Each cell may be
empty or may contain a symbol from a given alphabet.
2. A control unit containing a nite set of instructions.
3. A tape head able to read and write (or delete) symbols from the
tape.
Tape head
Tape
control
unit
Figure A.1. Turing Machine.
Each machine instruction contains the following ve parts:
1. The current machine state.
2. A tape symbol read from the current tape cell.
3. A tape symbol to write into the current tape cell.
4. A direction for the tape head to move: L = move one cell to
the left, R = move one cell to the right, S = stay in the
current cell.
5. The next machine state.
Turing machines are generalizations of nite-state automata. A
nite-state automaton is just a Turing machine whose tape head moves
always from left to right and never writes to the tape. The input of
the nite-state automaton is presented as symbols written in the tape.
In general we make the following assumptions:
1. An input is represented on the tape by placing the letters of
the strings in contiguous tape cells. All other cells contain the
blank symbol, which we may denote .
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
A.2. MACHINES AND LANGUAGES 153
2. The tape is initially positioned at the leftmost cell of the input
string unless specied otherwise.
3. There is one starting state.
4. There is one halt state, which we denote by Halt.
The execution of a Turing machine stops when it enters the Halt state
or when it enters a state for which there is no valid move. The output
of the Turing machine is the contents of the tape when the machine
stops.
We say that an input string is accepted by a Turing machine if
the machine enters the Halt state. Otherwise the string is rejected.
This can happen in two ways: by entering a state other than the Halt
state from which there is no move, or by running forever (for instance
executing an innite loop).
If a Turing machine has at least two instructions with the same state
and input letter, then the machine is nondeterministic. Otherwise it is
deterministic.
Finite-State Automata. A nite-state automata can be interpreted
as a Turing machine whose tape head moves only from left to right and
never writes to the tape.
Pushdown Automata. A pushdown automaton is nite-state au-
tomaton with a stack, i.e., a storage structure in which symbols can be
put and extracted from it by two operations: push (place on the top of
the stack) and pop (take from the top of the stack)consequently the
last symbol put into the stack is the rst symbol taken out. Addition-
ally there is a third operation, nop, that leaves the stack intact. The
next state function takes into account not only the current state and
the symbol read from the input, but also the symbol at the top of the
stack. After reading the next input symbol and the symbol at the top
of the stack, the automaton executes a stack operation and goes to the
next state. Initially there is a single symbol in the stack.
Linearly Bounded Automata. A linearly bounded automaton is a
Turing machine whose tape is limited to the size of its input string
plus two boundary cells that may not be changed.
Computable Functions. Consider a Turing machine T working on
symbols from an alphabet of only one symbol A = [ (stroke). Let
f : N N the function dened so that f(n) = m means that if the
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m
A.2. MACHINES AND LANGUAGES 154
initial input of T consists of a string of n + 1 strokes, the output of T
is a string of m+ 1 strokes. We say that f is computed by the Turing
machine T. A computable function is a function computed by some
Turing machine. A computable function f(n) halts for a given value
of its argument n if T with input n + 1 strokes halts. A computable
function f is total if f(n) halts for every n.
An eective enumeration of a set is a listing of its elements by an
algorithm.
A.2.2. Hierarchy of Languages. Here we mention a hierarchy
of languages that includes (and extends) Chomskys classication, in
increasing order of inclusion.
1. Regular languages. They are recognized by nite-state automata.
Example: a
m
b
n
[ m, n = 1, 2, 3 . . . .
2. Deterministic context-free languages, recognized by determinis-
tic pushdown automata. Example: a
n
b
n
[ n = 1, 2, 3 . . . .
3. Context-free languages, recognized by nondeterministic push-
down automata. Example: palindromes over a, b.
4. Context-sensitive languages, languages without recognized by
linearly bounded automata. Example: a
n
b
n
c
n
[ n = 1, 2, 3 . . .
5. Unrestricted or phrase-structure grammars, recognized by Tur-
ing machines.
6. Recursively enumerable languages. A language is recursively
enumerable if there is a Turing machine that outputs all the
strings of the language. Example: a
n
[ f
n
(n) halts, where
f
0
, f
1
, f
2
, . . . is an eective enumeration of all computable func-
tions.
7. Nongramatical languages, languages that are not denable by
any grammar and cannot be recognized by Turing machines.
Example: a
n
[ f
n
is total.
www.jntuworld.com
w
w
w
.
k
i
n
i
n
d
i
a
.
c
o
m