Data Structure - SEM III - Aeraxia - in
Data Structure - SEM III - Aeraxia - in
Data Structure - SEM III - Aeraxia - in
Unit I
Arrays and sequential representations – ordered lists – Stacks and Queues –
Evaluation ofExpressions – Multiple Stacks and Queues – Singly Linked List
– Linked Stacks and queues – Polynomial addition.
The primitive data structures are primitive data types. The int, char, float, double, and
pointer are the primitive data structures that can hold a single value.
The arrangement of data in a sequential manner is known as a linear data structure. The
data structures used for this purpose are Arrays, Linked list, Stacks, and Queues. In
these data structures, one element is connected to only one another element in a linear
form.
When one element is connected to the 'n' number of elements known as a non-
linear data structure. The best example is trees and graphs. In this case, the elements
are arranged in a random manner.
• Static data structure: It is a type of data structure where the size is allocated
at the compile time. Therefore, the maximum size is fixed.
• Dynamic data structure: It is a type of data structure where the size is
allocated at the run time. Therefore, the maximum size is flexible.
Major Operations
The major or the common operations that can be performed on the data structures are:
Definition
o Arrays are defined as the collection of similar type of data items stored at contiguous
memory locations.
o Arrays are the derived data type in C programming language which can store the
primitive type of data such as int, char, double, float, etc.
o Array is the simplest data structure where each data element can be randomly accessed
by using its index number.
o For example, if we want to store the marks of a student in 6 subjects, then we don't need
to define different variable for the marks in different subject. instead of that, we can
define an array which can store the marks in each subject at a the contiguous memory
locations.
The array marks[10] defines the marks of the student in 10 different subjects where
each subject marks are located at a particular subscript in the array
i.e. marks[0] denotes the marks in first subject, marks[1] denotes the marks in 2nd
subject and so on.
Advantages of Array
o Array provides the single name for the group of variables of the same type therefore,it is
easy to remember the name of all the elements of an array.
o Traversing an array is a very simple process, we just need to increment the base
address of the array in order to visit each element one by one.
o Any element in the array can be directly accessed by using the index.
Memory Allocation of the array
As we have mentioned, all the data elements of an array are stored at contiguous
locations in the main memory. The name of the array represents the base
address or the address of first element in the main memory. Each element of the
array is represented by a proper indexing.
1. 0 (zero - based indexing) : The first element of the array will be arr[0].
2. 1 (one - based indexing) : The first element of the array will be arr[1].
3. n (n - based indexing) : The first element of the array can reside at any random index
number.
In the following image, we have shown the memory allocation of an array arr of size
5. The array follows 0-based indexing approach. The base address of the array is
100th byte. This will be the address of arr[0]. Here, the size of int is 4 bytes therefore
each element will take 4 bytes in the memory.
In 0 based indexing, If the size of an array is n then the maximum index number, an
element can have is n-1. However, it will be n if we use 1 based indexing.
Address of any element of a 1D array can be calculated by using the following formula:
Byte address of element A[i] = base address + size * ( i - first index )
2D Array
2D array can be defined as an array of arrays. The 2D array is organized as matrices
which can be represented as the collection of rows and columns.
However, 2D arrays are created to implement a relational database look alike data
structure. It provides ease of holding bulk of data at once which can be passed to any
number of functions wherever required.
Above image shows the two dimensional array, the elements are organized in the form of rows
and columns. First element of the first row is represented by a[0][0] where the number shown
in the first index is the number of that row while the number shown in the second index is the
number of the column.
How do we access data in a 2D array
Similar to one dimensional arrays, we can access the individual cells in
a 2D array by using the indices of the cells. There are two indices
attached to a particular cell, one is its row number while the other is its
column number.
However, we can store the value stored in any particular cell of a 2D array to some
variable x by using the following syntax.
1. int x = a[i][j];
where i and j is the row and column number of the cell respectively. We
Initializing 2D Arrays
We know that, when we declare and initialize one dimensional array in
C programming simultaneously, we don't need to specify the size of the
array. However this will not work with 2D arrays. We will have to
define at least the second dimension of the array.
The number of elements that can be present in a 2D array will always be equal to
(number of rows * number of columns).
A 3 X 3 two dimensional array is shown in the following image. However, this array
needs to be mapped to a one dimensional array in order to store it into the memory.
There are two main techniques of storing 2D array elements into memory
first, the 1st row of the array is stored into the memory completely, then the 2nd row of
the array is stored into the memory completely and so on till the last row.
first, the 1st column of the array is stored into the memory completely, then the2nd
row of the array is stored into the memory completely and so on till the last column of
the array.
1. Address(a[i][j]) = B. A. + (i * n + j) * size
where, B. A. is the base address or the address of the first element of the array a[0][0] .
Example :
1. a[10...30, 55...75], base address of the array (BA) = 0, size of an element = 4 bytes .
2. Find the location of a[15][68].
3.
4. Address(a[15][68]) = 0 +
5. ((15 - 10) x (68 - 55 + 1) + (68 - 55)) x 4
6.
7. = (5 x 14 + 13) x 4
8. = 83 x 4
9. = 332 answer
If array is declared by a[m][n] where m is the number of rows while n is the numberof
columns, then address of an element a[i][j] of the array stored in row major order is
calculated as,
1. Address(a[i][j]) = ((j*m)+i)*Size + BA
Example:
A [-
5 ... +20][20 ... 70], BA = 1020, Size of element = 8 bytes. Find the location of a[0][30].
Ordered list
An ordered list is a list in which the order of the items is significant.
However, the items in an ordered list are not necessarily sorted.Consequently,
it is possible to change the order o items and still have a valid ordered list.
Consider a list of the titles of the chapters in this book. The order of the items
in the list corresponds to the order in which they appear in the book.
However, since the chapter titles are not sorted alphabetically, we cannot
consider the list to be sorted. Since it is possible to change the order of the
chapters in book, we must be able to do the same with the items of the list.As
a result, we may insert an item into an ordered list at any position.
Insert
-used to put objects into a the container;
withdraw
-used to remove objects from the container;
find
-used to locate objects in the container;
isMember
-used to test whether a given object instance is in the container.
stack
A real-world stack allows operations at one end only. For example, we can place or
remove a card or plate from the top of the stack only. At any given time, we can only
access the top element of a stack.
This feature makes it LIFO data structure. LIFO stands for Last-in-first-out. Here, the
element which is placed (inserted or added) last, is accessed first. In stack terminology,
insertion operation is called PUSH operation and removal operation is called POP
operation.
Stack Representation
The following diagram depicts a stack and its operations −
A stack can be implemented by means of Array, Structure, Pointer, and Linked List.
Stack can either be a fixed size one or it may have a sense of dynamic resizing. Here,
we are going to implement stack using arrays, which makes it a fixed size stack
implementation.
Basic Operations
Stack operations may involve initializing the stack, using it and then de-initializing it.
Apart from these basic stuffs, a stack is used for the following two primary operations
−
push() − Pushing (storing) an element on the stack.
pop() − Removing (accessing) an element from the stack.
When data is PUSHed onto stack.
To use a stack efficiently, we need to check the status of stack as well. For the same
purpose, the following functionality is added to stacks −
peek() − get the top data element of the stack, without removing it.
isFull() − check if stack is full.
isEmpty() − check if stack is empty.
At all times, we maintain a pointer to the last PUSHed data on the stack.
The top pointer provides top value of the stack without actually removing it.
First we should learn about procedures to support stack functions −
peek()
Algorithm of peek() function −
begin procedure peekreturn
stack[top]
end procedure
isfull()
Algorithm of isfull() function −
begin procedure isfull
end procedure
isempty()
Algorithm of isempty() function −
begin procedure isempty
end procedure
Implementation of isempty() function in C programming language is slightly different.
We initialize top at -1, as the index in array starts from 0. So we check if the top is
below zero or -1 to determine if the stack is empty. Here's the code −
Push Operation
The process of putting a new data element onto stack is known as a Push Operation.
Push operation involves a series of steps −
Step 1 − Checks if the stack is full.
Step 2 − If the stack is full, produces an error and exit.
Step 3 − If the stack is not full, increments top to point next empty space.
Step 4 − Adds data element to the stack location, where top is pointing.
Step 5 − Returns success.
If the linked list is used to implement the stack, then in step 3, we need to allocatespace
dynamically.
if stack is fullreturn
null
endif
top ← top + 1
stack[top] ← data
end procedure
Pop Operation
Accessing the content while removing it from the stack, is known as a Pop Operation.
In an array implementation of pop() operation, the data element is not actually
removed, instead top is decremented to a lower position in the stack to point to the
next value. But in linked-list implementation, pop() actually removes data element
and deallocates memory space.
A Pop operation may involve the following steps −
Step 1 − Checks if the stack is empty.
Step 2 − If the stack is empty, produces an error and exit.
Step 3 − If the stack is not empty, accesses the data element at which top is
pointing.
Step 4 − Decreases the value of top by 1.
Step 5 − Returns success.
if stack is emptyreturn
null
endif
data ← stack[top]top ←
top - 1 return data
end procedure
Queue
Queue is an abstract data structure, somewhat similar to Stacks. Unlike stacks, a queue is open
at both its ends. One end is always used to insert data (enqueue) and the other is usedto remove
data (dequeue). Queue follows First-In-First-Out methodology, i.e., the data item stored first will
be accessed first.
Queue Representation
As we now understand that in queue, we access both ends for different reasons. The
following diagram given below tries to explain queue representation as data structure −
Basic Operations
Queue operations may involve initializing or defining the queue, utilizing it, and then
completely erasing it from the memory. Here we shall try to understand the basic
operations associated with queues −
enqueue() − add (store) an item to the queue.
dequeue() − remove (access) an item from the queue.
Few more functions are required to make the above-mentioned queue operation
efficient. These are −
peek() − Gets the element at the front of the queue without removing it.
isfull() − Checks if the queue is full.
isempty() − Checks if the queue is empty.
In queue, we always dequeue (or access) data, pointed by front pointer and while
enqueing (or storing) data in the queue we take help of rear pointer.
peek()
This function helps to see the data at the front of the queue. The algorithm of peek()
function is as follows −
Algorithm
begin procedure peek return
queue[front]
end procedure
isfull()
As we are using single dimension array to implement queue, we just check for the rear
pointer to reach at MAXSIZE to determine that the queue is full. In case we maintain
the queue in a circular linked-list, the algorithm will differ. Algorithm of isfull() function
–
Algorithm
begin procedure isfull
end procedure
isempty()
Algorithm of isempty() function −
Algorithm
begin procedure isempty
end procedure
If the value of front is less than MIN or 0, it tells that the queue is not yet initialized,
hence empty.
Enqueue Operation
Queues maintain two data pointers, front and rear. Therefore, its operations are
comparatively difficult to implement than that of stacks.
The following steps should be taken to enqueue (insert) data into a queue −
Step 1 − Check if the queue is full.
Step 2 − If the queue is full, produce overflow error and exit.
Step 3 − If the queue is not full, increment rear pointer to point the nextempty
space.
Step 4 − Add data element to the queue location, where the rear is pointing.
Step 5 − return success.
Sometimes, we also check to see if a queue is initialized or not, to
handle any unforeseen situations.
Dequeue Operation
Accessing data from the queue is a process of two tasks − access the data where
front is pointing and remove the data after access. The following steps are taken to
perform dequeue operation −
Step 1 − Check if the queue is empty.
Step 2 − If the queue is empty, produce underflow error and exit.
Step 3 − If the queue is not empty, access the data where front is pointing.
Step 4 − Increment front pointer to point to the next available data element.
Step 5 − Return success.
Algorithm for dequeue operation
procedure dequeue
data = queue[front]front
← front + 1 return true
end procedure
Evaluation of Expression
The way to write arithmetic expression is known as a notation. An arithmetic
expression can be written in three different but equivalent notations, i.e., without
changing the essence or output of an expression. These notations are –
Infix Notation
Prefix (Polish) Notation
Postfix (Reverse-Polish) Notation
These notations are named as how they use operator in expression. We shall learnthe
same here in this chapter.
Infix Notation
We write expression in infix notation, e.g. a - b + c, where operators are used in-
between operands. It is easy for us humans to read, write, and speak in infix notation
but the same does not go well with computing devices. An algorithm to
process infix notation could be difficult and costly in terms of time and space
consumption.
Prefix Notation
In this notation, operator is prefixed to operands, i.e. operator is written ahead of
operands. For example, +ab. This is equivalent to its infix notation a + b. Prefix
notation is also known as Polish Notation.
Postfix Notation
This notation style is known as Reversed Polish Notation. In this notation style,
the operator is postfixed to the operands i.e., the operator is written after the
operands. For example, ab+. This is equivalent to its infix notation a + b.
The following table briefly tries to show the difference in all three notations −
2 (a + b) ∗ c ∗+abc ab+c∗
3 a ∗ (b + c) ∗a+bc abc+∗
5 (a + b) ∗ (c + d) ∗+ab+cd ab+cd+∗
Parsing Expressions
As we have discussed, it is not a very efficient way to design an algorithm or program
to parse infix notations. Instead, these infix notations are first converted into either
postfix or prefix notations and then computed.
To parse any arithmetic expression, we need to take care of operator precedence and
associativity also.
Precedence
When an operand is in between two different operators, which operator will take the
operand first, is decided by the precedence of an operator over others. For example
−
a+b*c - a+(b*c)
Associativity
Associativity describes the rule where operators with the same precedence appear in
an expression. For example, in expression a + b − c, both + and – have the same
precedence, then which part of the expression will be evaluated first, is determined by
associativity of those operators. Here, both + and − are left associative, so the
expression will be evaluated as (a + b) − c.
Precedence and associativity determines the order of evaluation of an expression.
Following is an operator precedence and associativity table (highest to lowest) −
The above table shows the default behavior of operators. At any point of time in
expression evaluation, the order can be altered by using parenthesis. For example
−
In a + b*c, the expression part b*c will be evaluated first, with multiplication as
precedence over addition. We here use parenthesis for a + b to be evaluated first, like
(a + b)*c.
Example
❖ Queue 1 expands from the 0th element to the right and circular back to the 0th element
❖ Queue 2 expands from the 8th element to the left and circular back to the 8th element
❖ Temporary boundary between the Queue 1 and the Queue 2; as long as there has
free elements in the array and boundary would be shift
❖ Free elements could be anywhere in the Queue such as before the front, after the rear,
and between front and rear in the Queue
❖ Queue 1’s and Queue 2 ‘s size could be change if it is necessary. When the Queue 1
is full and the Queue 2 has free space; the Queue 1 can increase the size to use that
free space from the Queue 2. Same way for the Queue 2
❖ Elements –1, –2, and –3 are using to store the size of the Queue 1, the front of the
Queue 1, and the data count for the Queue 1 needed to manipulate the Queue 1
❖ Elements –4, –5, and –6 are using to store the size of the Queue 2, the front of the
Queue 2, and the data count for the Queue 2 needed to manipulate the Queue 2
❖ Inserts data to the Queue 1, Q1Rear = (Q1Front + Q1count) % Q1Size
❖ Inserts data to the Queue 2, Q2Rear = (Q2Front + Q2count) % Q2Size + Q1Size
❖ Deletes data from the Queue 1, Q1Front = (Q1Front + 1) % Q1Size
❖ Deletes data from the Queue 2, Q2Front = (Q2Front + 1) % Q2Size + Q1Size
A linked list is a sequence of data structures, which are connected together via links.
Linked List is a sequence of links which contains items. Each link contains a
connection to another link. Linked list is the second most-used data structure after
array. Following are the important terms to understand the concept of Linked List.
Link − Each link of a linked list can store a data called an element.
Next − Each link of a linked list contains a link to the next link called Next.
LinkedList − A Linked List contains the connection link to the first link called
First.
As per the above illustration, following are the important points to be considered.
Linked List contains a link element called first.
Each link carries a data field(s) and a link field called next.
Each link is linked with its next link using its next link.
Last link carries a link as null to mark the end of the list.
Basic Operations
Following are the basic operations supported by a list.
Insertion − Adds an element at the beginning of the list.
Deletion − Deletes an element at the beginning of the list.
Display − Displays the complete list.
Search − Searches an element using the given key.
Delete − Deletes an element using the given key.
Insertion Operation
Adding a new node in linked list is a more than one step activity. We shall learn this
with diagrams here. First, create a node using the same structure and find the location
where it has to be inserted.
Imagine that we are inserting a node B (NewNode), between A (LeftNode)
and C (RightNode). Then point B.next to C −
NewNode.next −> RightNode;
It should look like this −
Now, the next node at the left should point to the new node.
LeftNode.next −> NewNode;
Single linked list is a sequence of elements in which every element has link to its next
element in the sequence.
In any single linked list, the individual element is called as "Node". Every "Node" contains two fields,
data field, and the next field. The data field is used to store actual value of the node and next field
is used to store the address of next node in the sequence. The graphical
representation of a node in a single linked list is as follows...
Example
Insertion
Deletion
Display
Before we implement actual operations, first we need to set up an empty list. First, perform the
following steps before implementing actual operations.
Step 1 - Include all the header files which are used in the program.
Step 2 - Declare all the user defined functions.
Step 3 - Define a Node structure with two members data and next
Step 4 - Define a Node pointer 'head' and set it to NULL.
Step 5 - Implement the main method by displaying operations menu and make suitable
function calls in the main method to perform user selected operation.
Insertion
In a single linked list, the insertion operation can be performed in three ways. They are as
follows...
Step 1 - Create a newNode with given value and newNode → next as NULL.
Step 2 - Check whether list is Empty (head == NULL).
Step 3 - If it is Empty then, set head = newNode.
Step 4 - If it is Not Empty then, define a node pointer temp and initialize with head.
Step 5 - Keep moving the temp to its next node until it reaches to the last node in the list
(until temp → next is equal to NULL).
Step 6 - Set temp → next = newNode.
Deletion
In a single linked list, the deletion operation can be performed in three ways. They are as
follows...
Linked stack
Instead of using array, we can also use linked list to implement stack. Linked list
allocates the memory dynamically. However, time complexity in both the scenario is
same for all the operations i.e. push, pop and peek.
In linked list implementation of stack, the nodes are maintained non-contiguously inthe
memory. Each node contains a pointer to its immediate successor node in the stack.
Stack is said to be overflow if the space left in the memory heap is not enoughto create a
node.
The top most node in the stack always contains null in its address field. Lets discuss
the way in which, each operation is performed in linked list implementation of stack.
1. Check for the underflow condition: The underflow condition occurs when
we try to pop from an already empty stack. The stack will be empty if the head
pointer of the list points to null.
2. Adjust the head pointer accordingly: In stack, the elements are poppedonly
from one end, therefore, the value stored in the head pointer must be deleted
and the node must be freed. The next node of the head node now becomes the
head node.
Display the nodes (Traversing)
Displaying all the nodes of a stack needs traversing all the nodes of the linked list
organized in the form of stack. For this purpose, we need to follow the following
steps.
Linked Queue
In a linked queue, each node of the queue consists of two parts i.e. data part and the link
part. Each element of the queue points to its immediate next element in the memory.
In the linked queue, there are two pointers maintained in the memory i.e. front pointer
and rear pointer. The front pointer contains the address of the starting element of the
queue while the rear pointer contains the address of the last element of the queue.
Insertion and deletions are performed at rear and front end respectively. If front and
rear both are NULL, it indicates that the queue is empty.
The linked representation of queue is shown in the following figure.
Firstly, allocate the memory for the new node ptr by using the following statement.There
can be the two scenario of inserting this new node ptr into the linked queue.
we insert element into an empty queue. In this case, the condition front = NULL
becomes true. Now, the new element will be added as the only element of the queue
and the next pointer of front and rear pointer both, will point to NULL.
Deletion operation removes the element that is first inserted among all the queue
elements. Firstly, we need to check either the list is empty or not. The condition front
== NULL becomes true if the list is empty, in this case , we simply write underflow on
the console and make exit. Otherwise, we will delete the element that is pointed by the
pointer front. For this purpose, copy the node pointed by the front pointer into the
pointer ptr. Now, shift the front pointer, point to its next node and free the node pointed
by the node ptr.
Unit II
Trees – Binary tree representations – Tree Traversal – Threaded Binary Trees
– Binary Tree Representation of Trees – Graphs and Representations –
Traversals, Connected Components and Spanning Trees – Shortest Paths and
Transitive closure – Activity Networks – Topological Sort and Critical Paths.
Tree is a popular data structure used in wide range of applications. A tree data structure
can be defined as follows...
Tree is a non-linear data structure which organizes data in hierarchical structure and
this is a recursive definition.
A tree data structure can also be defined as
follows... A tree is a finite set of one or more nodes
such that:
There is a specially designated node called the root. The remaining nodes are
partitioned into n>=0 disjoint sets T1, ..., Tn, where each of these sets is a tree. We call T1,
..., Tn are the subtrees of the root.
A tree is hierarchical collection of nodes. One of the nodes, known as the root, is at the top
of the hierarchy. Each node can have at most one link coming into it. The node where the
link originatesis called the parent node. The root node has no parent. The links leaving a
node (any number of links are allowed) point to child nodes. Trees are recursive structures.
Each child node is itself the root of a subtree. At the bottom of the tree are leaf nodes, which
have no children.
Advantages of trees
Trees are so useful and frequently used, because they have some very serious
advantages:Trees reflect structural relationships in the data
Trees are used to represent hierarchies
Trees provide an efficient insertion and searching
Trees are very flexible data, allowing to move sub trees around with minimum effort
Introduction Terminology
In a Tree, Every individual element is called as Node. Node in a tree data structure,
storesthe actual data of that particular element and link to next element in hierarchical
structure.
Example
1. Root
In a tree data structure, the first node is called as Root Node. Every tree must have root
node. Wecan say that root node is the origin of tree data structure. In any tree, there must
be only one root node. We never have multiple root nodes in a tree. In above tree, A is a
Root node
2. Edge
In a tree data structure, the connecting link between any two nodes is called as EDGE. In a
treewith 'N' number of nodes there will be a maximum of 'N-1' number of edges.
3. Parent
In a tree data structure, the node which is predecessor of any node is called as PARENT
NODE. In simple words, the node which has branch from it to any other node is called as
parent node. Parent node can also be defined as "The node which has child / children". e.g.,
Parent (A,B,C,D).
4. Child
In a tree data structure, the node which is descendant of any node is called as CHILD Node.
In simple words, the node which has a link from its parent node is called as child node. In a
tree, anyparent node can have any number of child nodes. In a tree, all the nodes except
root are child nodes. e.g., Children of D are (H, I,J).
5. Siblings
In a tree data structure, nodes which belong to same Parent are called as SIBLINGS. In
simplewords, the nodes with same parent are called as Sibling nodes. Ex: Siblings (B,C, D)
6. Leaf
In a tree data structure, the node which does not have a child (or) node with degree zero is
called as LEAF Node. In simple words, a leaf is a node with no child. In a tree data
structure, the leaf
nodes are also called as External Nodes. External node is also a node with no child. In a
tree, leafnode is also called as 'Terminal' node. Ex: (K,L,F,G,M,I,J)
7. Internal Nodes
In a tree data structure, the node which has atleast one child is called as INTERNAL Node.
In simple words, an internal node is a node with atleast one child. In a tree data structure,
nodes other than leaf nodes are called as Internal Nodes. The root node is also said to be
Internal Nodeif the tree has more than one node. Internal nodes are also called as 'Non-
Terminal' nodes.
Ex:B,C,D,E,H
8. Degree
In a tree data structure, the total number of children of a node (or)number of subtrees of a
node iscalled as DEGREE of that Node. In simple words, the Degree of a node is total
number of children it has. The highest degree of a node among all the nodes in a tree is
called as 'Degree of Tree'
9. Level
In a tree data structure, the root node is said to be at Level 0 and the children of root node
are at Level 1 and the children of the nodes which are at Level 1 will be at Level 2 and so
on... In simple words, in a tree each step from top to bottom is called as a Level and the
Level count startswith '0' and incremented by one at each level (Step). Some authors start
root level with 1.
10. Height
In a tree data structure, the total number of edges from leaf node to a particular node
in the longest path is called as HEIGHT of that Node. In a tree, height of the root node
is said to beheight of the tree. In a tree, height of all leaf nodes is '0'.
11. Depth
In a tree data structure, the total number of edges from root node to a particular node is
called as DEPTH of that Node. In a tree, the total number of edges from root node to a leaf
node in the longest path is said to be Depth of the tree. In simple words, the highest depth
of any leaf node ina tree is said to be depth of that tree. In a tree, depth of the root node is
'0'.
12. Path
In a tree data structure, the sequence of Nodes and Edges from one node to another node
is called as PATH between that two Nodes. Length of a Path is total number of nodes in that
path. In below example the path A - B - E - J has length 4.
Tree Representations
A tree data structure can be represented in two methods. Those methods are as
follows...1.List Representation
2. Left Child - Right Sibling
Representation Consider the following
tree...
1. List Representation
In this representation, we use two types of nodes one for representing the node with data
and another for representing only references. We start with a node with data from root
node in the tree. Then it is linked to an internal node through a reference node and is linked
to any other node directly. This process repeats for all the nodes in the tree.
The above tree example can be represented using List representation as follows...
In this representation, every node's data field stores the actual value of that node. If that
node hasleft child, then left reference field stores the address of that left child node
otherwise that field stores NULL. If that node has right sibling then right reference field
stores the address of right sibling node otherwise that field stores NULL. The above tree
example can be represented using Left Child - Right Sibling representation as follows...
Binary Trees
In a normal tree, every node can have any number of children. Binary tree is a
special type of tree data structure in which every node can have a maximum of 2 children.
One is knownas left child and the other is known as right child.
~A tree in which every node can have a maximum of two children is called as Binary Tree.
~In a binary tree, every node can have either 0 children or 1 child or 2 children but not
more than2 children. Example
There are different types of binary trees and they are...
1. Strictly Binary Tree
In a binary tree, every node can have a maximum of two children. But in strictly binary
tree, every node should have exactly two children or none. That means every internal node
must haveexactly two children. A strictly Binary Tree can be defined as follows...
A binary tree in which every node has either two or zero number of children is called
Strictly Binary Tree. Strictly binary tree is also called as Full Binary Tree or Proper
Binary Tree or 2-Tree
2. Complete Binary Tree
In a binary tree, every node can have a maximum of two children. But in strictly binary
tree, every node should have exactly two children or none and in complete binary tree all
the nodes must have exactly two children and at every level of complete binary tree there
must be 2 level number of nodes. For example at level 2 there must be 2^2 = 4 nodes and
at level 3 there must be2^3 = 8 nodes.
A binary tree in which every internal node has exactly two children and all leaf nodes are at
samelevel is called Complete Binary Tree.
Complete binary tree is also called as Perfect Binary Tree
Above two trees are different when viewed as binary trees. But same when viewed as trees.
Properties of Binary Trees
1. Maximum Number of Nodes in BT
maximum number of nodes on level i of a binary tree is 2i-1, i>=1.
k-1, k>=1.
Proof By Induction:
Induction Base: The root is the only node on level i=1.Hence ,the maximum number of
nodes onlevel i=1 is 2i-1=20=1.
Induction Hypothesis: Let I be an arbitrary positive integer greater than 1.Assume that
maximumnumber of nodes on level i-1 is 2i-2.
Induction Step: The maximum number of nodes on level i-1 is 2i-2 by the induction
hypothesis. Since each node in a binary tree has a maximum degree of 2,the maximum
number of nodes on level i is two times the maximum number of nodes on level i-1,or 2i-1.
The maximum number of nodes in a binary tree of depth k is
2. Relation between number of leaf nodes and degree-2 nodes: For any nonempty
binary tree,T, if n0 is the number of leaf nodes and n2 the number of nodes of degree 2,
then n0=n2+1.
PROOF: Let n and B denote the total number of nodes and branches in T. Let n0, n1, n2
representthe nodes with zero children, single child, and two children respectively.
B+1=n B=n1+2n2 ==> n1+2n2+1=
n,n1+2n2+1= n0+n1+n2 ==> n0=n2+1
3. A full binary tree of depth k is a binary tree of depth k having 2 -1 nodes, k>=0.
A binary tree with n nodes and depth k is complete iff its nodes correspond to the nodes
numbered from 1 to n in the full binary tree of depth k.
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left
subtree.Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.
void inorder(tree_pointer ptr) /* inorder tree traversal */ Recursive
{
if (ptr) {
inorder(ptr->left_child);
printf(―%d‖, ptr->data);
indorder(ptr-
>right_child);
}
}
2. Pre - Order Traversal ( root - leftChild - rightChild )
In Pre-Order traversal, the root node is visited before left child and right child nodes. In this
traversal, the root node is visited first, then its left child and later its right child. This pre-
order traversal is applicable for every root node of all subtrees in the tree. In the above
example of binary tree, first we visit root node 'A' then visit its left child 'B' which is a root
for D and F. So we visit B's left child 'D' and again D is a root for I and J. So we visit D's left
child'I' which is the left most child. So next we go for visiting D's right child 'J'. With this we
have completed root, left and right parts of node D and root, left parts of node B. Next visit
B's right child'F'. With this we have completed root and left parts of node A. So we go for
A's right child 'C' which is a root node for G and H. After visiting C, we go for its left child
'G' which is a root for node K. So next we visit left of G, but it does not have left child so we
go for G's right child 'K'. With this we havecompleted node C's root and left parts. Next visit
C's right child 'H' which is the right most child in the tree. So we stop the process. That
means here we have visited in the order of A-B-D-I-J-F- C-G-K-H using Pre-Order
Traversal.
Algorithm
Until all nodes are traversed
−Step 1 − Visit root node.
Step 2 − Recursively traverse left
subtree. Step 3 − Recursively traverse
right subtree.
void preorder(tree_pointer ptr) /* preorder tree traversal */ Recursive
{
if (ptr) {
printf(―%d‖, ptr->data);
preorder(ptr->left_child);
preorder(ptr-
>right_child);
}
}
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left
subtree. Step 2 − Recursively traverse
right subtree. Step 3 − Visit root node.
void postorder(tree_pointer ptr) /* postorder tree traversal */ Recursive
{
if (ptr) {
postorder(ptr->left_child);
postorder(ptr-
>right_child);
printf(―%d‖, ptr->data);
}
}
Binary Search Trees
Binary Search Tree Representation
Binary Search tree exhibits a special behavior. A node's left child must have value less than
itsparent's value and node's right child must have value greater than it's parent value.
We're going to implement tree using node object and connecting them through references.
Definition: A binary search tree (BST) is a binary tree. It may be empty. If it is not
empty,thenall nodes follows the below mentioned properties −
keys in a nonempty left subtree (right subtree) are smaller (larger) than the key in the
rootof subtree.
Graph
A graph G = (V,E) is composed of:
V: set of vertices
E: set of edges connecting the vertices in V
• An edge e = (u,v) is a pair of vertices
Example:
V= {a,b,c,d,e}
E={(a,b),(a,c),(a,d),(b,e),(c,d),(c,e),(d,e)}
Graph Terminology
Undirected Graph:
An undirected graph is one in which the pair of vertices in a edge is unordered,
(v0, v1) = (v1,v0)
Directed Graph:
A directed graph is one in which each edge is a directed pair of vertices, <v0,
v1> != <v1,v0>
Complete Graph:
A complete graph is a graph that has the maximum number of edges for undirected graph
with n vertices, the maximum number of edges is n(n-1)/2 for directed graph with n
vertices, the maximum number of edges is n(n-1)
Adjacent and Incident:
If (v0, v1) is an edge in an undirected graph,
– v0 and v1 are adjacent
– The edge (v0, v1) is incident on vertices v0 and v1
Multigraph:
In a multigraph, there can be more than one edge from vertex P to
vertex Q. In a simple graph there is at most one.
Subgraph:
A subgraph of G is a graph G‟ such that V(G‟) is a subset of V(G) and E(G‟) is a
subset of E(G)
Path:
A path from vertex vp to vertex vq in a graph G, is a sequence of vertices,
vp, vi1, vi2, ..., vin, vq, such that (vp, vi1), (vi1, vi2), ..., (vin, vq) are edges in
an undirected graph
The length of a path is the number of edges on it.
Simple Path and Style:
A simple path is a path in which all vertices, except possibly the first and the
last, are distinct.
A cycle is a simple path in which the first and the last vertices are the same
In an undirected graph G, two vertices, v0 and v1, are connected if there is
a path in G from v0 to v1.
An undirected graph is connected if, for every pair of distinct vertices vi,
vj, there is a path from vi to vj
Degree
The degree of a vertex is the number of edges incident to that vertex
For directed graph,
– the in-degree of a vertex v is the number of edges that have v as the head
– the out-degree of a vertex v is the number of edges that have v as the tail
– if di is the degree of a vertex i in a graph G with n vertices and e edges,
the number of edges is
Example:
ADT for Graph
Graph ADT is
Data structures: a nonempty set of vertices and a set of undirected
edges, where each edge is a pair of vertices
Functions: for all graph Graph, v, v1 and v2 Vertices
• Graph Create()::=return an empty graph
• Graph InsertVertex(graph, v)::= return a graph with v inserted. V
has no incident edge.
• Graph InsertEdge(graph, v1,v2)::= return a graph with new edge between
v1 and v2
• Graph DeleteVertex(graph, v)::= return a graph in which v and all edges
incident to it are removed
• Graph DeleteEdge(graph, v1, v2)::=return a graph in which the edge
(v1, v2) is removed
• Boolean IsEmpty(graph)::= if (graph==empty graph) return TRUE
else return FALSE
• List Adjacent(graph,v)::= return a list of all vertices that are adjacent
to v
Graph Representations
Graph can be represented in the following ways:
a) Adjacency Matrix
b) Adjacency Lists
c) Adjacency Multilists
a) Adjacency Matrix
Let G=(V,E) be a graph with n vertices.
The adjacency matrix of G is a two-dimensional by array, say
adj_mat. If the edge (vi, vj) is in E(G), adj_mat[i][j]=1
If there is no such edge in E(G), adj_mat[i][j]=0
The adjacency matrix for an undirected graph is symmetric; the adjacency matrix for a
digraph need not be symmetric
Examples for Adjacency Matrix:
Merits of Adjacency Matrix
From the adjacency matrix, to determine the connection of vertices is easy
The degree of a vertex is
For a digraph, the row sum is the out_degree, while the column sum is the
in_degree
b) Adjacency Lists
Each row in adjacency matrix is represented as an adjacency list.
Interesting Operations
• degree of a vertex in an undirected graph
# of nodes in adjacency list
• # of edges in a graph
determined in O(n+e)
• out-degree of a vertex in a directed graph
# of nodes in its adjacency list
• in-degree of a vertex in a directed graph
traverse the whole data structure
Orthogonal representation for graph G3
c) Adjacency Multilists
An edge in an undirected graph is represented by two nodes in adjacency
list representation.
Adjacency Multilists
– lists in which nodes may be shared among several lists. (an edge is shared by
two different paths)
depth first search: v0, v1, v3, v7, v4, v5, v2,
v6 breadth first search: v0, v1, v2, v3, v4, v5,
v6, v7
St Travers Description
ep al
1 Initialize
the
stack.
5 We choose B, mark it as
visited and put onto the
stack. Here B does not have
any unvisited adjacent
node. So, we pop B from the
stack.
As C does not have any unvisited adjacent node so we keep popping the
stack until we find a node that has an unvisited adjacent node. In this case,
there's none and we keep popping until the stack is empty.
Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.
7 From A we have D as
unvisited adjacent node. We
mark it as visited and
enqueue it.
At this stage, we are left with no unmarked (unvisited) nodes. But as per the
algorithm we keep on dequeuing in order to get all unvisited nodes. When the
queue gets emptied, the program is over.
Spanning Trees
When graph G is connected, a depth first or breadth first search starting at any
vertex will visit all vertices in G
A spanning tree is any tree that consists solely of edges in G and that includes
all the vertices
E(G): T (tree edges) + N (nontree edges) where T: set of edges used during search
N: set of remaining edges
Examples of Spanning Tree
While adding a nontree edge into any spanning tree, this will create a cycle
DFS VS BFS Spanning Tree
A spanning tree is a minimal subgraph, G‟, of G such that V(G‟)=V(G) and G‟ is
connected.
Any connected graph with n vertices must have at least n-1 edges.
A biconnected graph is a connected graph that hasno articulation points.
biconnected component: a maximal connected subgraph H (no subgraph that
is both biconnected and properly contains H).
– Kruskal
– Prim
– Sollin
Kruskal’s Algorithm
Build a minimum cost spanning tree T by adding edges to T one at a time
Select the edges for inclusion in T in nondecreasing order of the cost
An edge is added to T if it does not form a cycle
Since G is connected and has n > 0 vertices, exactly n-1 edges will be selected
Kruskal’s algorithm
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanningtree formed so far. If
cycle is not formed, include this edge. Else, discard it.
3. Repeat step#2 until there are (V-1) edges in the spanning tree.
Prim’s Algorithm
Prim's algorithm to find minimum cost spanning tree (as Kruskal's algorithm)
uses the greedy approach. Prim's algorithm shares a similarity with the
shortest path first algorithms.
Prim's algorithm, in contrast with Kruskal's algorithm, treats the nodes as a
single tree and keeps on adding new nodes to the spanning tree from the given
graph.
To contrast with Kruskal's algorithm and to understand Prim's algorithm better,
we shall use the same example −
Steps of Prim's Algorithm: The following are the main 3 steps of the Prim's
Algorithm:
1. Begin with any vertex which you think would be suitable and add it to the
tree.
2. Find an edge that connects any vertex in the tree to any vertex that is not
in the tree. Note that, we don't have to form cycles.
3. Stop when n - 1 edges have been added to the tree.
Solution 1
3)
Solution 2
iently large
number
Algorithm
• An algorithm is a step-by-step procedure to solve a problem in a finite
number of steps.
• Branching and repetition are included in the steps of an algorithm.
• This branching and repetition depend on the problem for which Algorithm is
developed.
• All the steps of Algorithm during the definition should be written in a
human-understandable language which does not depend on any
programming language.
• we can choose any programming language to implement the Algorithm.
• Pseudocode and flow chart are popular ways to represent an algorithm.
Binary Search
Problem definition: Let ai, 1 ≤ i ≤ n be a list of elements that are sorted in non-
decreasing order. The problem is to find whether a given element x is present in
the listor not. If x is present we have to determine a value j (element‘s position)
such that aj=x.If x is not in the list, then j is set to zero.
Solution: Let P = (n, ai…al , x) denote an arbitrary instance of search problem
where n is the number of elements in the list, ai…al is the list of elements and x is
the key elementto be searched for in the given list. Binary search on the list is
done as follows:
Step1: Pick an index q in the middle range [i, l] i.e. q= [(n + 1)/2] and compare x
with aq. Step 2: if x = aq i.e key element is equal to mid element, the problem is
immediatelysolved.
Step 3: if x <aqin this case x has to be searched for only in the sub-list ai, ai+1,
……, aq-Therefore, problem reduces to (q-i, ai…aq-1, x).
Step 4: if x >aq,x has to be searched for only in the sub-list aq+1, ...,., al . Therefore
problemreduces to (l-i, aq+1…al, x).
For the above solution procedure, the Algorithm can be implemented as recursive
or non-recursive algorithm.
Recursive binary search algorithm
Iterative binary search:
Space Complexity
Compared to the straight forward method, the MaxMin method requires extra
stack space for i, j, max, min, max1 and min1. Given n elements there will be
[log2 n] + 1 levels of recursion and we need to save seven values for each
recursive call. (6 + 1 for return address).
Merge Sort
Merge sort is a perfect example of a successful application of the divide-and
conquer technique. It sorts a given array A [O ... n - 1] by dividing it into two
halves A [0 .. \n/2]-1] and A [ 𝗁n/2] .. n-1], sorting each of them recursively,
and then merging the two smaller sorted arrays into a single sorted one.
▪ After that, the index of the smaller element is incremented to point to its
immediate successor in the array it was copied from. This operation is
repeated until one of the two given arrays is exhausted, and then the
remaining elements of the other array are copied to the end of the new
array.
Example:
The operation of the algorithm on the list 8, 3, 2, 9, 7, 1, 5, 4 is illustrated in
the figure.
Analysis
Here the basic operation is key comparison. As merge sort execution does not
depend on the order of the data, best case and average case runtime are the
same as worst case runtime.
Worst case: During key comparison, neither of the two arrays becomes empty
before the
other one contains just one element leads to the worst case of merge sort.
Assuming for
simplicity that total number of elements n is a power of 2, the recurrence
relation for the
number of key comparisons C(n) is
where, Cmerge(n) is the number of key comparisons made during the merging
stage.
Let us analyze C merge(n), the number of key comparisons performed during the
merging stage. At each step, exactly one comparison is made, after which the
total number of elements in the two arrays still needing to be processed is
reduced by 1. In the worst case, neither of the two arrays becomes empty before
the other one contains just one element (e.g., smaller elements may come from
the alternating arrays).Therefore, for the worst case, Cmerge(n) = n –1.
Now,
Obviously, after a partition is achieved, A[s] will be in its final position in the sorted array,
and we can continue sorting the two subarrays to the left and the right of A[s] independently
(e.g.,by the same method).
In quick sort, the entire work happens in the division stage, with no work required to
combine the solutions to the sub problems.
Partitioning
We start by selecting a pivot—an element with respect to whose value we are going to
dividethe subarray. There are several different strategies for selecting a pivot. We use
the sophisticated method suggested by C.A.R. Hoare, the prominent British computer
scientistwho invented quicksort.
Select the subarray‘s first element: p = A[l].Now scan the subarray from both ends,
comparingthe subarray‘s elements to the pivot.
▪ The left-to-right scan, denoted below by index pointer i, starts with the second element.
Since we want elements smaller than the pivot to be in the left part of the subarray, this
scan skips over elements that are smaller than the pivot and stops upon encountering the
first element greater than or equal to the pivot.
▪ The right-to-left scan, denoted below by index pointer j, starts with the last element of
the subarray. Since we want elements larger than the pivot to be in the right part of the
subarray, this scan skips over elements that are larger than the pivot and stops on
encountering the first element smaller than or equal to the pivot.
After both scans stop, three situations may arise, depending on whether or not
the scanning indices have crossed.
▪ If scanning indices i and j have not crossed, i.e., i< j, we simply
exchange A[i] and A[j ] and resume the scans by incrementing I and
decrementing j, respectively:
If the scanning indices stop while pointing to the same element, i.e., i = j,
the value they are pointing to must be equal to p. Thus, we have the
subarray partitioned, with the split position s = i = j :
Analysis
Best Case -Here the basic operation is key comparison. Number of key
comparisons made before a partition is achieved is n + 1 if the scanning indices
cross over and n if they coincide. If all the splits happen in the middle of
corresponding subarrays, we will have the best case. The number of key
comparisons in the best case satisfies the recurrence,
According to the Master Theorem, Cbest(n) ∈Θ(n log2 n); solving it exactly for n
= 2k yields
Cbest(n) = n log2 n.
Worst Case – In the worst case, all the splits will be skewed to the extreme:
one of the two subarrays will be empty, and the size of the other will be just 1
less than the size of the subarray being partitioned. This unfortunate situation
will happen, in particular, for increasing arrays. Indeed, if A[0..n − 1] is a strictly
increasing array and we use A[0] as the pivot, the left-to-right scan will stop on
A[1] while the right-to-left scan will go all the way to reach A[0], indicating the
split at position 0:So, after making n + 1 comparisons to get to this partition
and exchanging the pivot A[0] with itself, the algorithm will be left with the
strictly increasing array A[1..n − 1] to sort. This sorting of strictly increasing
arrays of diminishing sizes will continue until the last one A[n−2.. n−1] has
been processed. The total number of key comparisons made will be equal to
Average Case - Let Cavg(n) be the average number of key comparisons made
by quicksort on a randomly ordered array of size n. A partition can happen in
any position s (0 ≤ s ≤ n−1) after n+1comparisons are made to achieve the
partition. After the partition, the left and right subarrays will have s and n − 1−
s elements, respectively. Assuming that the partition split can happen in each
position s with the same probability 1/n, we get the following recurrence
relation:
Its solution, which is much trickier than the worst- and best-case analyses,
turns out to be
Thus, on the average, quicksort makes only 39% more comparisons than in the
best case. Moreover, its innermost loop is so efficient that it usually runs faster
than mergesort on randomly ordered arrays of nontrivial sizes. This certainly
justifies the name given to the algorithm by its inventor.
Unit - 4
Greedy Method : The General Method – Optimal Storage on Tapes – Knapsack
Problem – Job Sequencing with Deadlines – Optimal Merge Patterns.
GENERAL METHOD
Greedy Method
Greedy is the most straight forward design technique. Most of the problems have n inputs and require us
to obtain a subset that satisfies some constraints. Any subset that satisfies these constraints is called a
feasible solution. We need to find a feasible solutionthat either maximizes or minimizes the objective
function. A feasible solution that does this is called an optimal solution.
The greedy method is a simple strategy of progressively building up a solution, one element at a time, by
choosing the best possible element at each stage. At each stage, adecision is made regarding whether or
not a particular input is in an optimal solution.
This is done by considering the inputs in an order determined by some selection procedure. If the
inclusion of the next input, into the partially constructed optimal solution will result in an infeasible
solution then this input is not added to the partial solution. The selection procedure itself is based on
some optimization measure. Several optimization measures are plausible for a given problem. Most of
them, however, will result in algorithms that generate sub-optimal solutions. This version of greedy
technique is called subset paradigm. Some problems like Knapsack, Job sequencing withdeadlines and
minimum cost spanning trees are based on subset paradigm.
For the problems that make decisions by considering the inputs in some order, each decision is made
using an optimization criterion that can be computed using decisions already made. This version of
greedy method is ordering paradigm. Some problems likeoptimal storage on tapes, optimal merge
patterns and single source shortest path are based on ordering paradigm.
CONTROL ABSTRACTION
Algorithm Greedy (a, n)
// a(1 : n) contains the „n‟ inputs
{
solution := ; // initialize the solution to empty for i:=1 to n do
{
x := select (a);
if feasible (solution, x) then solution
:= Union (Solution, x);
}
return solution;
}
Procedure Greedy describes the essential way that a greedy based algorithm will look,once a particular
problem is chosen and the functions select, feasible and union are properly implemented.
The function select selects an input from „a‟, removes it and assigns its value to „x‟. Feasible is aBoolean valued
function, which determines if „x‟ can be included into the solution vector. The function Union combines „x‟ with
solution and updates the objective function.
KNAPSACK PROBLEM
Let us apply the greedy method to solve the knapsack problem. We are given „n‟ objectsand a knapsack.
The object „i‟ has a weight wi and the knapsack has a capacity „m‟. If afraction xi, 0 < xi < 1 of object i is
placed into the knapsack then a profit of pi xi is earned. The objective is to fill the knapsack that
maximizes the total profit earned.
Since the knapsack capacity is „m‟, we require the total weight of all chosen objects to be atmost „m‟. The
problem is stated as:
Algorithm:
The algorithm for assigning programs to tapes is as follows:
Algorithm Store (n, m)
// n is the number of programs and m the number of tapes
{
j := 0; // next tape to store on for i :=1 to n do
{
Print („append program‟, i, „to permutation for tape‟, j); j := (j + 1) mod m;
}
}
On any given tape, the programs are stored in non-decreasing order of their lengths.
Example:
Let n = 4, (P1, P2, P3, P4,) = (100, 10, 15, 27) and (d1 d2 d3 d4) = (2, 1, 2, 1). The
feasible solutions and their values are:
OPTIMAL MERGE PATERNS
Given „n‟ sorted files, there are many ways to pair wise merge them into a single sortedfile. As, different
pairings require different amounts of computing time, we want to determine an optimal (i.e., one
requiring the fewest comparisons) way to pair wise merge „n‟ sorted files together. This type of merging
is called as 2-way merge patterns.To merge an n-record file and an m-record file requires possibly n + m
record moves, the obvious choice choice is, at each step merge the two smallest files together. The two-
way merge patterns can be represented by binary merge trees.
struct treenode
{
treenode * lchild;
treenode * rchild;
};
Example 1:
Suppose we are having three sorted files X 1, X2 and X3 of length 30, 20, and 10 records each.Merging of the
files can be carried out as follows:
Example 2:
Given five files (X1, X2, X3, X4, X5) with sizes (20, 30, 10, 5, 30). Apply greedy rule tofind optimal way of
pair wise merging to give an optimal solution using binary merge tree representation.
Solution:
Backtracking
Some problems can be solved, by exhaustive search. The exhaustive-search
technique suggests generating all candidate solutions and then identifying the one (or
the ones) with a desired property.
Backtracking is a more intelligent variation of this approach. The principal idea is to
construct solutions one component at a time and evaluate such partially constructed
candidates as follows. If a partially constructed solution can be developed further
without violating the problem‘s constraints, it is done by taking the first remaining
legitimate optionfor the next component. If there is no legitimate option for the next
component, no alternatives for any remaining component need to be considered. In
this case, the algorithm
backtracks to replace the last component of the partially constructed solution with
its nextoption.
It is convenient to implement this kind of processing by constructing a tree of choices
being made, called the state-space tree. Its root represents an initial state before the
search for a solution begins. The nodes of the first level in the tree represent the choices
made for the first component of a solution; the nodes of the second level represent the
choices for the second component, and soon. A node in a state-space tree is said to be
promising if it corresponds to a partially constructed solution that may still lead to a
complete solution; otherwise, it is called non-promising. Leaves represent either non-
promising dead ends or complete solutions found by the algorithm.
In the majority of cases, a state space tree for a backtracking algorithm is constructed
in the manner of depth-first search. If the current node is promising, its child is
generated by addingthe first remaining legitimate option for the next component of a
solution, and the processing moves to this child. If the current node turns out to be non-
promising, the algorithm backtracks to the node‘s parent to consider the next possible
option for its last component; if there is no such option, it backtracks one more level up
the tree, and so on. Finally, if the algorithm reaches a complete solution to the problem,
it either stops (if just one solution is required) or continues searching for other possible
solutions.
General method
General Algorithm (Recursive)
N-Queens problem
The problem is to place n queens on an n × n chessboard so that no two queens attack
each other by being in the same row or in the same column or on the same diagonal.
So let us consider the four-queens problem and solve it by the backtracking technique. Since
each of the four queens has to be placed in its own row, all we need to do is to assign a column for
each queen on the board presented in figure.
We start with the empty board and then place queen 1 in the first possible position of its row,
which is in column 1 of row 1. Then we place queen 2, after trying unsuccessfully columns 1 and 2,
in the first acceptable position for it, which is square (2, 3), the square in row 2 and column 3. This
proves to be a dead end because there is no acceptable position for queen 3. So, the algorithm
backtracks and puts queen 2 in the next possible position at (2, 4). Then queen 3 is placed at (3, 2),
which proves to be another dead end. The algorithm then backtracks all the way to queen 1 and
moves it to (1, 2). Queen 2 then goes to (2, 4), queen 3 to(3, 1), and queen 4 to (4, 3), which is a
solution to the problem. The state- space tree of this search is shown in figure.
Figure: State-space tree of solving the four-queens problem by backtracking. × denotes an
unsuccessful attempt to place a queen in the indicated column. The numbers above the nodes
indicate the order in which the nodes are generated.
If other solutions need to be found, the algorithm can simply resume its operations at the
leaf atwhich it stopped. Alternatively, we can use the board‘s symmetry for this purpose.
Finally, it should be pointed out that a single solution to the n-queens problem for any n ≥
4 canbe found in linear time.
Note: The algorithm NQueens() is not in the syllabus. It is given here for interested
learners. Thealgorithm is referred from textbook T2.
Sum of subsets problem
Problem definition: Find a subset of a given set A = {a 1, . . . , an} of n positive integers
whosesum is equal to a given positive integer d.
For example, for A = {1, 2, 5, 6, 8} and d = 9, there are two solutions: {1, 2, 6} and {1, 8}.
Ofcourse, some instances of this problem may have no solutions.
It is convenient to sort the set‘s elements in increasing order. So, we will
assume thata1< a2< . . . < an.
The state-space tree can be constructed as a binary tree like that in Figure shown below for
theinstance A = {3, 5, 6, 7} and d = 15.
The number inside a node is the sum of the elements already included in the subsets represented by
thenode. The inequality below a leaf indicates the reason for its termination.
The root of the tree represents the starting point, with no decisions about the given
elements madeas yet. Its left and right children represent, respectively, inclusion and
exclusion of a1 in a set being sought.
Similarly, going to the left from a node of the first level corresponds to inclusion of a 2 while
going to the right corresponds to its exclusion, and so on. Thus, a path from the root to a
node onthe ith level of the tree indicates which of the first in numbers have been included in
the subsets represented by that node.
We record the value of s, the sum of these numbers, in the node. If s is equal to d, we have a
solution to the problem. We can either report this result and stop or, if all the solutions need to be
found, continue bybacktracking to the node‘s parent. If s is not equal to d, we can terminate the
node as non-promising if either of the following two inequalities holds:
Example: Apply backtracking to solve the following instance of the subset sum problem: A
= {1, 3, 4, 5} and d = 11.
Graph coloring
Analysis