Data Structure - SEM III - Aeraxia - in

Download as pdf or txt
Download as pdf or txt
You are on page 1of 98

DATA STRUCTURE

Unit I
Arrays and sequential representations – ordered lists – Stacks and Queues –
Evaluation ofExpressions – Multiple Stacks and Queues – Singly Linked List
– Linked Stacks and queues – Polynomial addition.

Data Structure Introduction:


The data structure name indicates itself that organizing the data in memory.
There are many ways of organizing the data in the memory as we have already seen
one of the data structures, i.e., array in C language. Array is a collection of memory
elements in which data is stored sequentially, i.e., one after another. In other words, we
can say that array stores the elements in a continuous manner. This organization of data
is done with the help of an array of data structures. There are also other ways to
organize the data in memory. To structure the data in memory, 'n' number of algorithms
were proposed, and all these algorithms are known as Abstract data types. These
abstract data types are the set of rules.

Types of Data Structures


There are two types of data structures:

o Primitive data structure


o Non-primitive data structure

Primitive Data structure

The primitive data structures are primitive data types. The int, char, float, double, and
pointer are the primitive data structures that can hold a single value.

Non-Primitive Data structure


The non-primitive data structure is divided into two types:

• Linear data structure


• Non-linear data structure

Linear Data Structure

The arrangement of data in a sequential manner is known as a linear data structure. The
data structures used for this purpose are Arrays, Linked list, Stacks, and Queues. In
these data structures, one element is connected to only one another element in a linear
form.

Non- Linear Structure:

When one element is connected to the 'n' number of elements known as a non-
linear data structure. The best example is trees and graphs. In this case, the elements
are arranged in a random manner.

Data structures can also be classified as:

• Static data structure: It is a type of data structure where the size is allocated
at the compile time. Therefore, the maximum size is fixed.
• Dynamic data structure: It is a type of data structure where the size is
allocated at the run time. Therefore, the maximum size is flexible.

Major Operations
The major or the common operations that can be performed on the data structures are:

o Searching: We can search for any element in a data structure.


o Sorting: We can sort the elements of a data structure either in an ascendingor
descending order.
o Insertion: We can also insert the new element in a data structure.
o Updation: We can also update the element, i.e., we can replace the element
with another element.
o Deletion: We can also perform the delete operation to remove the element
from the data structure.

Advantages of Data structures


The following are the advantages of a data structure:
o Efficiency: If the choice of a data structure for implementing a particular ADT is
proper, it makes the program very efficient in terms of time and space.
o Reusability: The data structure provides reusability means that multiple client
programs can use the data structure.
o Abstraction: The data structure specified by an ADT also provides the level of
abstraction. The client cannot see the internal working of the data structure, so
it does not have to worry about the implementation part. The client can only see
the interface.

Arrays and Sequential Representation

Definition
o Arrays are defined as the collection of similar type of data items stored at contiguous
memory locations.
o Arrays are the derived data type in C programming language which can store the
primitive type of data such as int, char, double, float, etc.
o Array is the simplest data structure where each data element can be randomly accessed
by using its index number.
o For example, if we want to store the marks of a student in 6 subjects, then we don't need
to define different variable for the marks in different subject. instead of that, we can
define an array which can store the marks in each subject at a the contiguous memory
locations.

The array marks[10] defines the marks of the student in 10 different subjects where
each subject marks are located at a particular subscript in the array
i.e. marks[0] denotes the marks in first subject, marks[1] denotes the marks in 2nd
subject and so on.

Properties of the Array


1. Each element is of same data type and carries a same size i.e. int = 4 bytes.
2. Elements of the array are stored at contiguous memory locations where the first
element is stored at the smallest memory location.
3. Elements of the array can be randomly accessed since we can calculate the address of
each element of the array with the given base address and the size of data element.

Advantages of Array
o Array provides the single name for the group of variables of the same type therefore,it is
easy to remember the name of all the elements of an array.
o Traversing an array is a very simple process, we just need to increment the base
address of the array in order to visit each element one by one.
o Any element in the array can be directly accessed by using the index.
Memory Allocation of the array
As we have mentioned, all the data elements of an array are stored at contiguous
locations in the main memory. The name of the array represents the base
address or the address of first element in the main memory. Each element of the
array is represented by a proper indexing.

The indexing of the array can be defined in three ways.

1. 0 (zero - based indexing) : The first element of the array will be arr[0].
2. 1 (one - based indexing) : The first element of the array will be arr[1].
3. n (n - based indexing) : The first element of the array can reside at any random index
number.

In the following image, we have shown the memory allocation of an array arr of size
5. The array follows 0-based indexing approach. The base address of the array is
100th byte. This will be the address of arr[0]. Here, the size of int is 4 bytes therefore
each element will take 4 bytes in the memory.

In 0 based indexing, If the size of an array is n then the maximum index number, an
element can have is n-1. However, it will be n if we use 1 based indexing.

Accessing Elements of an array


To access any random element of an array we need the following information:

1. Base Address of the array.


2. Size of an element in bytes.
3. Which type of indexing, array follows.

Address of any element of a 1D array can be calculated by using the following formula:
Byte address of element A[i] = base address + size * ( i - first index )

2D Array
2D array can be defined as an array of arrays. The 2D array is organized as matrices
which can be represented as the collection of rows and columns.

However, 2D arrays are created to implement a relational database look alike data
structure. It provides ease of holding bulk of data at once which can be passed to any
number of functions wherever required.

How to declare 2D Array


The syntax of declaring two dimensional array is very much similar to
that of a one dimensional array, given as follows.
1. int arr[max_rows][max_columns];

however, It produces the data structure which looks like following.

Above image shows the two dimensional array, the elements are organized in the form of rows
and columns. First element of the first row is represented by a[0][0] where the number shown
in the first index is the number of that row while the number shown in the second index is the
number of the column.
How do we access data in a 2D array
Similar to one dimensional arrays, we can access the individual cells in
a 2D array by using the indices of the cells. There are two indices
attached to a particular cell, one is its row number while the other is its
column number.

However, we can store the value stored in any particular cell of a 2D array to some
variable x by using the following syntax.

1. int x = a[i][j];

where i and j is the row and column number of the cell respectively. We

can assign each cell of a 2D array to 0 by using the following code:

1. for ( int i=0; i<n ;i++)2.


{
3. for (int j=0; j<n; j++)
4. {
5. a[i][j] = 0;
6. }
7. }

Initializing 2D Arrays
We know that, when we declare and initialize one dimensional array in
C programming simultaneously, we don't need to specify the size of the
array. However this will not work with 2D arrays. We will have to
define at least the second dimension of the array.

The syntax to declare and initialize the 2D array is given as follows.

1. int arr[2][2] = {0,1,2,3};

The number of elements that can be present in a 2D array will always be equal to
(number of rows * number of columns).

Mapping 2D array to 1D array


When it comes to map a 2 dimensional array, most of us might think that why this
mapping is required. However, 2 D arrays exists from the user point of view. 2D arrays
are created to implement a relational database table lookalike data structure, in
computer memory, the storage technique for 2D array is similar to that of an one
dimensional array.
The size of a two dimensional array is equal to the multiplication of number of rows and
the number of columns present in the array. We do need to map two dimensional array
to the onedimensional array in order to store them in the memory.

A 3 X 3 two dimensional array is shown in the following image. However, this array
needs to be mapped to a one dimensional array in order to store it into the memory.

There are two main techniques of storing 2D array elements into memory

1. Row Major ordering


In row major ordering, all the rows of the 2D array are stored into the memory
contiguously. Considering the array shown in the above image, its memory allocation
according to row major order is shown as follows.

first, the 1st row of the array is stored into the memory completely, then the 2nd row of
the array is stored into the memory completely and so on till the last row.

2. Column Major ordering


According to the column major ordering, all the columns of the
2D array are stored into the memory contiguously. The memory
allocation of the array which is shown in in the above image is
given as follows.

first, the 1st column of the array is stored into the memory completely, then the2nd
row of the array is stored into the memory completely and so on till the last column of
the array.

Calculating the Address of the random element of a 2D array


Due to the fact that, there are two different techniques of storing the two dimensional
array into the memory, there are two different formulas to calculate the address of a
random element of the 2D array.

By Row Major Order


If array is declared by a[m][n] where m is the number of rows while n is the numberof
columns, then address of an element a[i][j] of the array stored in row major order is
calculated as,

1. Address(a[i][j]) = B. A. + (i * n + j) * size
where, B. A. is the base address or the address of the first element of the array a[0][0] .

Example :

1. a[10...30, 55...75], base address of the array (BA) = 0, size of an element = 4 bytes .
2. Find the location of a[15][68].
3.
4. Address(a[15][68]) = 0 +
5. ((15 - 10) x (68 - 55 + 1) + (68 - 55)) x 4
6.
7. = (5 x 14 + 13) x 4
8. = 83 x 4
9. = 332 answer

By Column major order

If array is declared by a[m][n] where m is the number of rows while n is the numberof
columns, then address of an element a[i][j] of the array stored in row major order is
calculated as,

1. Address(a[i][j]) = ((j*m)+i)*Size + BA

where BA is the base address of the array.

Example:

A [-
5 ... +20][20 ... 70], BA = 1020, Size of element = 8 bytes. Find the location of a[0][30].

Address [A[0][30]) = ((30-20) x 24 + 5) x 8 + 1020 = 245 x 8 + 1020 = 2980 bytes

Ordered list
An ordered list is a list in which the order of the items is significant.
However, the items in an ordered list are not necessarily sorted.Consequently,
it is possible to change the order o items and still have a valid ordered list.

Consider a list of the titles of the chapters in this book. The order of the items
in the list corresponds to the order in which they appear in the book.
However, since the chapter titles are not sorted alphabetically, we cannot
consider the list to be sorted. Since it is possible to change the order of the
chapters in book, we must be able to do the same with the items of the list.As
a result, we may insert an item into an ordered list at any position.

A searchable container is a container that supports the following additional


operations:

Insert
-used to put objects into a the container;
withdraw
-used to remove objects from the container;
find
-used to locate objects in the container;
isMember
-used to test whether a given object instance is in the container.
stack

A real-world stack allows operations at one end only. For example, we can place or
remove a card or plate from the top of the stack only. At any given time, we can only
access the top element of a stack.
This feature makes it LIFO data structure. LIFO stands for Last-in-first-out. Here, the
element which is placed (inserted or added) last, is accessed first. In stack terminology,
insertion operation is called PUSH operation and removal operation is called POP
operation.

Stack Representation
The following diagram depicts a stack and its operations −

A stack can be implemented by means of Array, Structure, Pointer, and Linked List.
Stack can either be a fixed size one or it may have a sense of dynamic resizing. Here,
we are going to implement stack using arrays, which makes it a fixed size stack
implementation.

Basic Operations
Stack operations may involve initializing the stack, using it and then de-initializing it.
Apart from these basic stuffs, a stack is used for the following two primary operations

push() − Pushing (storing) an element on the stack.
pop() − Removing (accessing) an element from the stack.
When data is PUSHed onto stack.
To use a stack efficiently, we need to check the status of stack as well. For the same
purpose, the following functionality is added to stacks −
peek() − get the top data element of the stack, without removing it.
isFull() − check if stack is full.
isEmpty() − check if stack is empty.
At all times, we maintain a pointer to the last PUSHed data on the stack.
The top pointer provides top value of the stack without actually removing it.
First we should learn about procedures to support stack functions −

peek()
Algorithm of peek() function −
begin procedure peekreturn
stack[top]
end procedure

isfull()
Algorithm of isfull() function −
begin procedure isfull

if top equals to MAXSIZEreturn


true
else
return falseendif

end procedure

isempty()
Algorithm of isempty() function −
begin procedure isempty

if top less than 1return


true
else
return falseendif

end procedure
Implementation of isempty() function in C programming language is slightly different.
We initialize top at -1, as the index in array starts from 0. So we check if the top is
below zero or -1 to determine if the stack is empty. Here's the code −

Push Operation
The process of putting a new data element onto stack is known as a Push Operation.
Push operation involves a series of steps −
Step 1 − Checks if the stack is full.
Step 2 − If the stack is full, produces an error and exit.
Step 3 − If the stack is not full, increments top to point next empty space.
Step 4 − Adds data element to the stack location, where top is pointing.
Step 5 − Returns success.

If the linked list is used to implement the stack, then in step 3, we need to allocatespace
dynamically.

Algorithm for PUSH Operation


A simple algorithm for Push operation can be derived as follows −
begin procedure push: stack, data

if stack is fullreturn
null
endif

top ← top + 1
stack[top] ← data

end procedure

Pop Operation
Accessing the content while removing it from the stack, is known as a Pop Operation.
In an array implementation of pop() operation, the data element is not actually
removed, instead top is decremented to a lower position in the stack to point to the
next value. But in linked-list implementation, pop() actually removes data element
and deallocates memory space.
A Pop operation may involve the following steps −
Step 1 − Checks if the stack is empty.
Step 2 − If the stack is empty, produces an error and exit.
Step 3 − If the stack is not empty, accesses the data element at which top is
pointing.
Step 4 − Decreases the value of top by 1.
Step 5 − Returns success.

Algorithm for Pop Operation


A simple algorithm for Pop operation can be derived as follows −
begin procedure pop: stack

if stack is emptyreturn
null
endif
data ← stack[top]top ←
top - 1 return data

end procedure

Queue
Queue is an abstract data structure, somewhat similar to Stacks. Unlike stacks, a queue is open
at both its ends. One end is always used to insert data (enqueue) and the other is usedto remove
data (dequeue). Queue follows First-In-First-Out methodology, i.e., the data item stored first will
be accessed first.

Queue Representation
As we now understand that in queue, we access both ends for different reasons. The
following diagram given below tries to explain queue representation as data structure −

Basic Operations
Queue operations may involve initializing or defining the queue, utilizing it, and then
completely erasing it from the memory. Here we shall try to understand the basic
operations associated with queues −
enqueue() − add (store) an item to the queue.
dequeue() − remove (access) an item from the queue.
Few more functions are required to make the above-mentioned queue operation
efficient. These are −
peek() − Gets the element at the front of the queue without removing it.
isfull() − Checks if the queue is full.
isempty() − Checks if the queue is empty.
In queue, we always dequeue (or access) data, pointed by front pointer and while
enqueing (or storing) data in the queue we take help of rear pointer.

peek()
This function helps to see the data at the front of the queue. The algorithm of peek()
function is as follows −
Algorithm
begin procedure peek return
queue[front]
end procedure

isfull()
As we are using single dimension array to implement queue, we just check for the rear
pointer to reach at MAXSIZE to determine that the queue is full. In case we maintain
the queue in a circular linked-list, the algorithm will differ. Algorithm of isfull() function

Algorithm
begin procedure isfull

if rear equals to MAXSIZEreturn


true
else
return falseendif

end procedure
isempty()
Algorithm of isempty() function −
Algorithm
begin procedure isempty

if front is less than MIN OR front is greater than rearreturn true


else
return falseendif

end procedure
If the value of front is less than MIN or 0, it tells that the queue is not yet initialized,
hence empty.

Enqueue Operation
Queues maintain two data pointers, front and rear. Therefore, its operations are
comparatively difficult to implement than that of stacks.
The following steps should be taken to enqueue (insert) data into a queue −
Step 1 − Check if the queue is full.
Step 2 − If the queue is full, produce overflow error and exit.
Step 3 − If the queue is not full, increment rear pointer to point the nextempty
space.
Step 4 − Add data element to the queue location, where the rear is pointing.
Step 5 − return success.
Sometimes, we also check to see if a queue is initialized or not, to
handle any unforeseen situations.

Algorithm for enqueue operation


procedure enqueue(data)if
queue is full return
overflow
endif
rear ← rear + 1
queue[rear] ← data
return true
end procedure

Dequeue Operation
Accessing data from the queue is a process of two tasks − access the data where
front is pointing and remove the data after access. The following steps are taken to
perform dequeue operation −
Step 1 − Check if the queue is empty.
Step 2 − If the queue is empty, produce underflow error and exit.
Step 3 − If the queue is not empty, access the data where front is pointing.
Step 4 − Increment front pointer to point to the next available data element.
Step 5 − Return success.
Algorithm for dequeue operation
procedure dequeue

if queue is empty return


underflow
end if

data = queue[front]front
← front + 1 return true

end procedure

Evaluation of Expression
The way to write arithmetic expression is known as a notation. An arithmetic
expression can be written in three different but equivalent notations, i.e., without
changing the essence or output of an expression. These notations are –
Infix Notation
Prefix (Polish) Notation
Postfix (Reverse-Polish) Notation
These notations are named as how they use operator in expression. We shall learnthe
same here in this chapter.

Infix Notation
We write expression in infix notation, e.g. a - b + c, where operators are used in-
between operands. It is easy for us humans to read, write, and speak in infix notation
but the same does not go well with computing devices. An algorithm to
process infix notation could be difficult and costly in terms of time and space
consumption.

Prefix Notation
In this notation, operator is prefixed to operands, i.e. operator is written ahead of
operands. For example, +ab. This is equivalent to its infix notation a + b. Prefix
notation is also known as Polish Notation.

Postfix Notation
This notation style is known as Reversed Polish Notation. In this notation style,
the operator is postfixed to the operands i.e., the operator is written after the
operands. For example, ab+. This is equivalent to its infix notation a + b.
The following table briefly tries to show the difference in all three notations −

Sr.No. Infix Notation Prefix Notation Postfix Notation

1 a+b +ab ab+

2 (a + b) ∗ c ∗+abc ab+c∗

3 a ∗ (b + c) ∗a+bc abc+∗

4 a/b+c/d +/ab/cd ab/cd/+

5 (a + b) ∗ (c + d) ∗+ab+cd ab+cd+∗

6 ((a + b) ∗ c) - d -∗+abcd ab+c∗d-

Parsing Expressions
As we have discussed, it is not a very efficient way to design an algorithm or program
to parse infix notations. Instead, these infix notations are first converted into either
postfix or prefix notations and then computed.
To parse any arithmetic expression, we need to take care of operator precedence and
associativity also.

Precedence
When an operand is in between two different operators, which operator will take the
operand first, is decided by the precedence of an operator over others. For example

a+b*c - a+(b*c)

As multiplication operation has precedence over addition, b * c will be evaluated first.


A table of operator precedence is provided later.

Associativity
Associativity describes the rule where operators with the same precedence appear in
an expression. For example, in expression a + b − c, both + and – have the same
precedence, then which part of the expression will be evaluated first, is determined by
associativity of those operators. Here, both + and − are left associative, so the
expression will be evaluated as (a + b) − c.
Precedence and associativity determines the order of evaluation of an expression.
Following is an operator precedence and associativity table (highest to lowest) −

Sr.No. Operator Precedence Associativity

1 Exponentiation ^ Highest Right Associative

2 Multiplication ( ∗ ) & Division ( / Second Highest Left Associative


)

3 Addition ( + ) & Subtraction ( − ) Lowest Left Associative

The above table shows the default behavior of operators. At any point of time in
expression evaluation, the order can be altered by using parenthesis. For example

In a + b*c, the expression part b*c will be evaluated first, with multiplication as
precedence over addition. We here use parenthesis for a + b to be evaluated first, like
(a + b)*c.

Postfix Evaluation Algorithm


Step 1 − scan the expression from left to rightStep 2 − if it is an
operand push it to stack
Step 3 − if it is an operator pull operand from stack and performoperation
Step 4 − store the output of step 3, back to stack
Step 5 − scan the expression until all operands are consumedStep 6 − pop the stack
and perform operation
A postfix expression is a collection of operators and operands in which the operator is placed
after the operands. That means, in a postfix expression the operator follows the operands.

Postfix Expression has following general structure...


Operand1 Operand2 Operator

Example

Postfix Expression Evaluation using Stack


DataStructure
A postfix expression can be evaluated using the Stack data structure. To evaluate a postfix
expression using Stack data structure we can use the following steps...Read all the symbols
one by one from left to right in the given Postfix Expression

1. If the reading symbol is operand, then push it on to the Stack.


2. If the reading symbol is operator (+ , - , * , / etc.,), then perform TWO pop
operations and store the two popped oparands in two different variables
(operand1 and operand2). Then perform reading symbol operation using operand1
and operand2 and push result back on to the Stack.
3. Finally! perform a pop operation and display the popped value as final result.
Example Consider the following Expression..
Multiple Stacks and Queues:
Multiple Stacks:
Following pictures are two ways to do two stacks in array:
1. None fixed size of the stacks:
Stack 1 expands from the 0th element to the right
Stack 2 expands from the 12th element to the left
As long as the value of Top1 and Top2 are not next to each other, it has free
elements for input the data in the array
When both Stacks are full, Top1 and Top 2 will be next to each other
There is no fixed boundary between Stack 1 and Stack 2
Elements –1 and –2 are using to store the information needed to manipulate the
stack (subscript for Top 1 and Top 2)

2. Fixed size of the stacks:

Stack 1 expands from the 0th element to the right


Stack 2 expands from the 6th element to the left
As long as the value of Top 1 is less than 6 and greater than 0, Stack 1 has
free elements to input the data in the array
As long as the value of Top 2 is less than 11 and greater than 5, Stack 2 has
free elements to input the data in the array
When the value of Top 1 is 5, Stack 1 is full
When the value of Top 2 is 10, stack 2 is full
Elements –1 and –2 are using to store the size of Stack 1 and the subscript of the
array for Top 1 needed to manipulate Stack 1
Elements –3 and –4 are using to store the size of Stack 2 and the subscript of the
array for Top 2 needed to manipulate Stack 2
procedure ADD (i,X ) //add element X to the i'th stack, 1
i n// if T(i) = B(i + 1) then call STACK-FULL (i)
T(i) ← T(i) + 1
V(T(i)) ← X //add X to the i'th
stack// end ADD

procedure DELETE(i,X ) //delete topmost element of


stack i// if T(i) = B(i) then call STACK-EMPTY(i)
X ← V(T(i))
T(i) ← T(i) - 1
end DELETE
Multiple Queues:
Following pictures are two ways to do two queues in array:

1. None fixed size of the queues:

❖ Queue 1 expands from the 0th element to the right and circular back to the 0th element
❖ Queue 2 expands from the 8th element to the left and circular back to the 8th element
❖ Temporary boundary between the Queue 1 and the Queue 2; as long as there has
free elements in the array and boundary would be shift
❖ Free elements could be anywhere in the Queue such as before the front, after the rear,
and between front and rear in the Queue
❖ Queue 1’s and Queue 2 ‘s size could be change if it is necessary. When the Queue 1
is full and the Queue 2 has free space; the Queue 1 can increase the size to use that
free space from the Queue 2. Same way for the Queue 2
❖ Elements –1, –2, and –3 are using to store the size of the Queue 1, the front of the
Queue 1, and the data count for the Queue 1 needed to manipulate the Queue 1
❖ Elements –4, –5, and –6 are using to store the size of the Queue 2, the front of the
Queue 2, and the data count for the Queue 2 needed to manipulate the Queue 2
❖ Inserts data to the Queue 1, Q1Rear = (Q1Front + Q1count) % Q1Size
❖ Inserts data to the Queue 2, Q2Rear = (Q2Front + Q2count) % Q2Size + Q1Size
❖ Deletes data from the Queue 1, Q1Front = (Q1Front + 1) % Q1Size
❖ Deletes data from the Queue 2, Q2Front = (Q2Front + 1) % Q2Size + Q1Size

2. Fixed size of the queue:


❖ Queue 1 expands from the 0th element to the 4th element and circular back to 0th element
❖ Queue 2 expands from the 8th element to the 5th element and circular back to 8th element
❖ The boundary is fixed between the Queue 1 and the Queue 2
❖ Free elements could be anywhere in the Queue such as before the front, after the rear,
and between front and rear in the Queue
❖ Elements –1, –2, and –3 are using to store the size of the Queue 1, the front of the
Queue 1, and the data count for the Queue 1 needed to manipulate the Queue 1
❖ Elements –4, –5, and –6 are using to store the size of the Queue 2, the front of the
Queue 2, and the data count for the Queue 2 needed to manipulate the Queue 2
❖ Inserts data to the Queue 1, Q1Rear = (Q1Front + Q1count) % Q1Size
❖ Inserts data to the Queue 2, Q2Rear = (Q2Front + Q2count) % Q2Size + Q1Size
❖ Deletes data from the Queue 1, Q1Front = (Q1Front + 1) % Q1Size
❖ Deletes data from the Queue 2, Q2Front = (Q2Front + 1) % Q2Size + Q1Size

A linked list is a sequence of data structures, which are connected together via links.
Linked List is a sequence of links which contains items. Each link contains a
connection to another link. Linked list is the second most-used data structure after
array. Following are the important terms to understand the concept of Linked List.
Link − Each link of a linked list can store a data called an element.
Next − Each link of a linked list contains a link to the next link called Next.
LinkedList − A Linked List contains the connection link to the first link called
First.

Linked List Representation


Linked list can be visualized as a chain of nodes, where every node points to the next
node.

As per the above illustration, following are the important points to be considered.
Linked List contains a link element called first.
Each link carries a data field(s) and a link field called next.
Each link is linked with its next link using its next link.
Last link carries a link as null to mark the end of the list.

Types of Linked List


Following are the various types of linked list.
Simple Linked List − Item navigation is forward only.
Doubly Linked List − Items can be navigated forward and backward.
Circular Linked List − Last item contains link of the first element as next and
the first element has a link to the last element as previous.

Basic Operations
Following are the basic operations supported by a list.
Insertion − Adds an element at the beginning of the list.
Deletion − Deletes an element at the beginning of the list.
Display − Displays the complete list.
Search − Searches an element using the given key.
Delete − Deletes an element using the given key.

Insertion Operation
Adding a new node in linked list is a more than one step activity. We shall learn this
with diagrams here. First, create a node using the same structure and find the location
where it has to be inserted.
Imagine that we are inserting a node B (NewNode), between A (LeftNode)
and C (RightNode). Then point B.next to C −
NewNode.next −> RightNode;
It should look like this −

Now, the next node at the left should point to the new node.
LeftNode.next −> NewNode;

What is Single Linked List?


Simply a list is a sequence of data, and the linked list is a sequence of data linked with each
other.
The formal definition of a single linked list is as follows...

Single linked list is a sequence of elements in which every element has link to its next
element in the sequence.
In any single linked list, the individual element is called as "Node". Every "Node" contains two fields,
data field, and the next field. The data field is used to store actual value of the node and next field
is used to store the address of next node in the sequence. The graphical
representation of a node in a single linked list is as follows...
Example

Operations on Single Linked List


The following operations are performed on a Single Linked List

Insertion
Deletion
Display

Before we implement actual operations, first we need to set up an empty list. First, perform the
following steps before implementing actual operations.

Step 1 - Include all the header files which are used in the program.
Step 2 - Declare all the user defined functions.
Step 3 - Define a Node structure with two members data and next
Step 4 - Define a Node pointer 'head' and set it to NULL.
Step 5 - Implement the main method by displaying operations menu and make suitable
function calls in the main method to perform user selected operation.

Insertion
In a single linked list, the insertion operation can be performed in three ways. They are as
follows...

1. Inserting At Beginning of the list


2. Inserting At End of the list
3. Inserting At Specific location in the list

Inserting At Beginning of the list


We can use the following steps to insert a new node at beginning of the single linked list...

Step 1 - Create a newNode with given value.


Step 2 - Check whether list is Empty (head == NULL)
Step 3 - If it is Empty then, set newNode→next = NULL and head = newNode.
Step 4 - If it is Not Empty then, set newNode→next = head and head = newNode.
Inserting At End of the list
We can use the following steps to insert a new node at end of the single linked list...

Step 1 - Create a newNode with given value and newNode → next as NULL.
Step 2 - Check whether list is Empty (head == NULL).
Step 3 - If it is Empty then, set head = newNode.
Step 4 - If it is Not Empty then, define a node pointer temp and initialize with head.
Step 5 - Keep moving the temp to its next node until it reaches to the last node in the list
(until temp → next is equal to NULL).
Step 6 - Set temp → next = newNode.

Inserting At Specific location in the list (After a


Node)
We can use the following steps to insert a new node after a node in the single linked list...

Step 1 - Create a newNode with given value.


Step 2 - Check whether list is Empty (head == NULL)
Step 3 - If it is Empty then, set newNode → next = NULL and head = newNode.
Step 4 - If it is Not Empty then, define a node pointer temp and initialize with head.
Step 5 - Keep moving the temp to its next node until it reaches to the node after which
we want to insert the newNode (until temp1 → data is equal to location, here location is
the node value after which we want to insert the newNode).
Step 6 - Every time check whether temp is reached to last node or not. If it is reached to
last node then display 'Given node is not found in the list!!! Insertion not possible!!!'
and terminate the function. Otherwise move the temp to next node.
Step 7 - Finally, Set 'newNode → next = temp → next' and 'temp → next = newNode'

Deletion
In a single linked list, the deletion operation can be performed in three ways. They are as
follows...

1. Deleting from Beginning of the list


2. Deleting from End of the list
3. Deleting a Specific Node

Deleting from Beginning of the list


We can use the following steps to delete a node from beginning of the single linked list...

Step 1 - Check whether list is Empty (head == NULL)


Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not possible' and
terminate the function.
Step 3 - If it is Not Empty then, define a Node pointer 'temp' and initialize with head.
Step 4 - Check whether list is having only one node (temp → next == NULL)
Step 5 - If it is TRUE then set head = NULL and delete temp (Setting Empty list
conditions)
Step 6 - If it is FALSE then set head = temp → next, and delete temp.
Deleting from End of the list
We can use the following steps to delete a node from end of the single linked list...

Step 1 - Check whether list is Empty (head == NULL)


Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not possible' and
terminate the function.
Step 3 - If it is Not Empty then, define two Node pointers 'temp1' and 'temp2' and
initialize 'temp1' with head.
Step 4 - Check whether list has only one Node (temp1 → next == NULL)
Step 5 - If it is TRUE. Then, set head = NULL and delete temp1. And terminate the
function. (Setting Empty list condition)
Step 6 - If it is FALSE. Then, set 'temp2 = temp1 ' and move temp1 to its next node.
Repeat the same until it reaches to the last node in the list. (until temp1 → next
== NULL)
Step 7 - Finally, Set temp2 → next = NULL and delete temp1.

Deleting a Specific Node from the list


We can use the following steps to delete a specific node from the single linked list...

Step 1 - Check whether list is Empty (head == NULL)


Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not possible' and
terminate the function.
Step 3 - If it is Not Empty then, define two Node pointers 'temp1' and 'temp2' and
initialize 'temp1' with head.
Step 4 - Keep moving the temp1 until it reaches to the exact node to be deleted or to the
last node. And every time set 'temp2 = temp1' before moving the 'temp1' to its next node.
Step 5 - If it is reached to the last node then display 'Given node not found in the list!
Deletion not possible!!!'. And terminate the function.
Step 6 - If it is reached to the exact node which we want to delete, then check whether
list is having only one node or not
Step 7 - If list has only one node and that is the node to be deleted, then set
head = NULL and delete temp1 (free(temp1)).
Step 8 - If list contains multiple nodes, then check whether temp1 is the first node in the list
(temp1 == head).
Step 9 - If temp1 is the first node then move the head to the next node (head = head →
next) and delete temp1.
Step 10 - If temp1 is not first node then check whether it is last node in the list (temp1 →
next == NULL).
Step 11 - If temp1 is last node then set temp2 → next = NULL
anddelete temp1 (free(temp1)).
Step 12 - If temp1 is not first node and not last node then set temp2 → next = temp1 →
next and delete temp1 (free(temp1)).

Displaying a Single Linked List


We can use the following steps to display the elements of a single linked list...

Step 1 - Check whether list is Empty (head == NULL)


Step 2 - If it is Empty then, display 'List is Empty!!!' and terminate the function.
Step 3 - If it is Not Empty then, define a Node pointer 'temp' and initialize with head.
Step 4 - Keep displaying temp → data with an arrow (--->) until temp reaches to the last
node
Step 5 - Finally display temp → data with arrow pointing to NULL (temp → data ---
>NULL).

Linked stack and Queue:

Linked stack
Instead of using array, we can also use linked list to implement stack. Linked list
allocates the memory dynamically. However, time complexity in both the scenario is
same for all the operations i.e. push, pop and peek.

In linked list implementation of stack, the nodes are maintained non-contiguously inthe
memory. Each node contains a pointer to its immediate successor node in the stack.
Stack is said to be overflow if the space left in the memory heap is not enoughto create a
node.

The top most node in the stack always contains null in its address field. Lets discuss
the way in which, each operation is performed in linked list implementation of stack.

Adding a node to the stack (Push operation)


Adding a node to the stack is referred to as push operation. Pushing an element to a
stack in linked list implementation is different from that of an array implementation. In
order to push an element onto the stack, the following steps are involved.

1. Create a node first and allocate memory to it.


2. If the list is empty then the item is to be pushed as the start node of the list. This
includes assigning value to the data part of the node and assign null to the address part
of the node.
3. If there are some nodes in the list already, then we have to add the new element in the
beginning of the list (to not violate the property of the stack). For this purpose, assign
the address of the starting element to the address field of the new node and make the
new node, the starting node of the list.

Deleting a node from the stack (POP operation)


Deleting a node from the top of stack is referred to as pop operation. Deleting a
node from the linked list implementation of stack is different from that in the
array implementation. In order to pop an element from the stack, we need to
follow the following steps :

1. Check for the underflow condition: The underflow condition occurs when
we try to pop from an already empty stack. The stack will be empty if the head
pointer of the list points to null.
2. Adjust the head pointer accordingly: In stack, the elements are poppedonly
from one end, therefore, the value stored in the head pointer must be deleted
and the node must be freed. The next node of the head node now becomes the
head node.
Display the nodes (Traversing)
Displaying all the nodes of a stack needs traversing all the nodes of the linked list
organized in the form of stack. For this purpose, we need to follow the following
steps.

1. Copy the head pointer into a temporary pointer.


2. Move the temporary pointer through all the nodes of the list and print the
value field attached to every node.

Linked Queue
In a linked queue, each node of the queue consists of two parts i.e. data part and the link
part. Each element of the queue points to its immediate next element in the memory.
In the linked queue, there are two pointers maintained in the memory i.e. front pointer
and rear pointer. The front pointer contains the address of the starting element of the
queue while the rear pointer contains the address of the last element of the queue.
Insertion and deletions are performed at rear and front end respectively. If front and
rear both are NULL, it indicates that the queue is empty.
The linked representation of queue is shown in the following figure.

Operation on Linked Queue


There are two basic operations which can be implemented on the linked
queues. The operations are Insertion and Deletion.
The insert operation append the queue by adding an element to the end of the
queue. The new element will be the last element of the queue.

Firstly, allocate the memory for the new node ptr by using the following statement.There

can be the two scenario of inserting this new node ptr into the linked queue.

we insert element into an empty queue. In this case, the condition front = NULL
becomes true. Now, the new element will be added as the only element of the queue
and the next pointer of front and rear pointer both, will point to NULL.
Deletion operation removes the element that is first inserted among all the queue
elements. Firstly, we need to check either the list is empty or not. The condition front
== NULL becomes true if the list is empty, in this case , we simply write underflow on
the console and make exit. Otherwise, we will delete the element that is pointed by the
pointer front. For this purpose, copy the node pointed by the front pointer into the
pointer ptr. Now, shift the front pointer, point to its next node and free the node pointed
by the node ptr.

Unit II
Trees – Binary tree representations – Tree Traversal – Threaded Binary Trees
– Binary Tree Representation of Trees – Graphs and Representations –
Traversals, Connected Components and Spanning Trees – Shortest Paths and
Transitive closure – Activity Networks – Topological Sort and Critical Paths.

Trees: Non-Linear data structure


A data structure is said to be linear if its elements form a sequence or a linear list.
Previous
linear data structures that we have studied like an array, stacks, queues and linked lists
organize data in linear order. A data structure is said to be non linear if its elements form a
hierarchical classification where, data items appear at various levels.
Trees and Graphs are widely used non-linear data structures. Tree and graph structures
represent hierarchical relationship between individual data elements. Graphs are nothing but
trees with certain restrictions removed.
Trees represent a special case of more general structures known as graphs. In a graph,
there is no restrictions on the number of links that can enter or leave a node, and cycles may
be present in the graph. The figure shows a tree and a non-tree.

Tree is a popular data structure used in wide range of applications. A tree data structure
can be defined as follows...
Tree is a non-linear data structure which organizes data in hierarchical structure and
this is a recursive definition.
A tree data structure can also be defined as
follows... A tree is a finite set of one or more nodes
such that:
There is a specially designated node called the root. The remaining nodes are
partitioned into n>=0 disjoint sets T1, ..., Tn, where each of these sets is a tree. We call T1,
..., Tn are the subtrees of the root.
A tree is hierarchical collection of nodes. One of the nodes, known as the root, is at the top
of the hierarchy. Each node can have at most one link coming into it. The node where the
link originatesis called the parent node. The root node has no parent. The links leaving a
node (any number of links are allowed) point to child nodes. Trees are recursive structures.
Each child node is itself the root of a subtree. At the bottom of the tree are leaf nodes, which
have no children.

Advantages of trees
Trees are so useful and frequently used, because they have some very serious
advantages:Trees reflect structural relationships in the data
Trees are used to represent hierarchies
Trees provide an efficient insertion and searching
Trees are very flexible data, allowing to move sub trees around with minimum effort

Introduction Terminology
In a Tree, Every individual element is called as Node. Node in a tree data structure,
storesthe actual data of that particular element and link to next element in hierarchical
structure.
Example
1. Root
In a tree data structure, the first node is called as Root Node. Every tree must have root
node. Wecan say that root node is the origin of tree data structure. In any tree, there must
be only one root node. We never have multiple root nodes in a tree. In above tree, A is a
Root node
2. Edge
In a tree data structure, the connecting link between any two nodes is called as EDGE. In a
treewith 'N' number of nodes there will be a maximum of 'N-1' number of edges.
3. Parent
In a tree data structure, the node which is predecessor of any node is called as PARENT
NODE. In simple words, the node which has branch from it to any other node is called as
parent node. Parent node can also be defined as "The node which has child / children". e.g.,
Parent (A,B,C,D).
4. Child
In a tree data structure, the node which is descendant of any node is called as CHILD Node.
In simple words, the node which has a link from its parent node is called as child node. In a
tree, anyparent node can have any number of child nodes. In a tree, all the nodes except
root are child nodes. e.g., Children of D are (H, I,J).
5. Siblings
In a tree data structure, nodes which belong to same Parent are called as SIBLINGS. In
simplewords, the nodes with same parent are called as Sibling nodes. Ex: Siblings (B,C, D)
6. Leaf
In a tree data structure, the node which does not have a child (or) node with degree zero is
called as LEAF Node. In simple words, a leaf is a node with no child. In a tree data
structure, the leaf
nodes are also called as External Nodes. External node is also a node with no child. In a
tree, leafnode is also called as 'Terminal' node. Ex: (K,L,F,G,M,I,J)
7. Internal Nodes
In a tree data structure, the node which has atleast one child is called as INTERNAL Node.
In simple words, an internal node is a node with atleast one child. In a tree data structure,
nodes other than leaf nodes are called as Internal Nodes. The root node is also said to be
Internal Nodeif the tree has more than one node. Internal nodes are also called as 'Non-
Terminal' nodes.
Ex:B,C,D,E,H
8. Degree
In a tree data structure, the total number of children of a node (or)number of subtrees of a
node iscalled as DEGREE of that Node. In simple words, the Degree of a node is total
number of children it has. The highest degree of a node among all the nodes in a tree is
called as 'Degree of Tree'

9. Level
In a tree data structure, the root node is said to be at Level 0 and the children of root node
are at Level 1 and the children of the nodes which are at Level 1 will be at Level 2 and so
on... In simple words, in a tree each step from top to bottom is called as a Level and the
Level count startswith '0' and incremented by one at each level (Step). Some authors start
root level with 1.
10. Height
In a tree data structure, the total number of edges from leaf node to a particular node
in the longest path is called as HEIGHT of that Node. In a tree, height of the root node
is said to beheight of the tree. In a tree, height of all leaf nodes is '0'.
11. Depth
In a tree data structure, the total number of edges from root node to a particular node is
called as DEPTH of that Node. In a tree, the total number of edges from root node to a leaf
node in the longest path is said to be Depth of the tree. In simple words, the highest depth
of any leaf node ina tree is said to be depth of that tree. In a tree, depth of the root node is
'0'.
12. Path
In a tree data structure, the sequence of Nodes and Edges from one node to another node
is called as PATH between that two Nodes. Length of a Path is total number of nodes in that
path. In below example the path A - B - E - J has length 4.

13. Sub Tree


In a tree data structure, each child from a node forms a subtree recursively. Every child
node willform a subtree on its parent node

Tree Representations
A tree data structure can be represented in two methods. Those methods are as
follows...1.List Representation
2. Left Child - Right Sibling
Representation Consider the following
tree...

1. List Representation
In this representation, we use two types of nodes one for representing the node with data
and another for representing only references. We start with a node with data from root
node in the tree. Then it is linked to an internal node through a reference node and is linked
to any other node directly. This process repeats for all the nodes in the tree.
The above tree example can be represented using List representation as follows...

Fig: List representation of above Tree

Fig: Possible node structure for a tree of degree k

2. Left Child - Right Sibling Representation


In this representation, we use list with one type of node which consists of three fields
namely Data field, Left child reference field and Right sibling reference field. Data field
stores the actual
value of a node, left reference field stores the address of the left child and right reference
field stores the address of the right sibling node. Graphical representation of that node is as
follows...

In this representation, every node's data field stores the actual value of that node. If that
node hasleft child, then left reference field stores the address of that left child node
otherwise that field stores NULL. If that node has right sibling then right reference field
stores the address of right sibling node otherwise that field stores NULL. The above tree
example can be represented using Left Child - Right Sibling representation as follows...

Representation as a Degree –Two Tree


To obtain degree-two tree representation of a tree, rotate the right- sibling pointers in the
left child-right sibling tree clockwise by 45 degrees. In a degree-two representation, the
two children of anode are referred as left and right children.

Binary Trees
In a normal tree, every node can have any number of children. Binary tree is a
special type of tree data structure in which every node can have a maximum of 2 children.
One is knownas left child and the other is known as right child.
~A tree in which every node can have a maximum of two children is called as Binary Tree.
~In a binary tree, every node can have either 0 children or 1 child or 2 children but not
more than2 children. Example
There are different types of binary trees and they are...
1. Strictly Binary Tree
In a binary tree, every node can have a maximum of two children. But in strictly binary
tree, every node should have exactly two children or none. That means every internal node
must haveexactly two children. A strictly Binary Tree can be defined as follows...
A binary tree in which every node has either two or zero number of children is called
Strictly Binary Tree. Strictly binary tree is also called as Full Binary Tree or Proper
Binary Tree or 2-Tree
2. Complete Binary Tree
In a binary tree, every node can have a maximum of two children. But in strictly binary
tree, every node should have exactly two children or none and in complete binary tree all
the nodes must have exactly two children and at every level of complete binary tree there
must be 2 level number of nodes. For example at level 2 there must be 2^2 = 4 nodes and
at level 3 there must be2^3 = 8 nodes.
A binary tree in which every internal node has exactly two children and all leaf nodes are at
samelevel is called Complete Binary Tree.
Complete binary tree is also called as Perfect Binary Tree

3. Extended Binary Tree


A binary tree can be converted into Full Binary tree by adding dummy nodes to existing
nodeswherever required.
The full binary tree obtained by adding dummy nodes to a binary tree is called as
Extended Binary Tree.

Abstract Data Type


Definition: A binary tree is a finite set of nodes that is either empty or consists of a root and
twodisjoint binary trees called left subtree and right subtree.
ADT contains specification for the binary tree ADT.
Structure Binary_Tree(abbreviated BinTree) is
objects: a finite set of nodes either empty or consisting of a root node, left Binary_Tree, and
rightBinary_Tree.
Functions:
for all bt, bt1, bt2 BinTree, item element
Bintree Create()::= creates an empty binary
tree
Boolean IsEmpty(bt)::= if (bt==empty binary tree) return TRUE else return FALSE
BinTree MakeBT(bt1, item, bt2)::= return a binary tree whose left subtree is bt1, whose
rightsubtree is bt2, and whose root node contains the data item
Bintree Lchild(bt)::= if (IsEmpty(bt)) return error else return the left subtree of bt
element Data(bt)::= if (IsEmpty(bt)) return error else return the data in the root node
of btBintree Rchild(bt)::= if (IsEmpty(bt)) return error else return the right subtree of
bt

Differences between A Tree and A Binary Tree


• The subtrees of a binary tree are ordered; those of a tree are not ordered.

Above two trees are different when viewed as binary trees. But same when viewed as trees.
Properties of Binary Trees
1. Maximum Number of Nodes in BT
maximum number of nodes on level i of a binary tree is 2i-1, i>=1.
k-1, k>=1.

Proof By Induction:
Induction Base: The root is the only node on level i=1.Hence ,the maximum number of
nodes onlevel i=1 is 2i-1=20=1.
Induction Hypothesis: Let I be an arbitrary positive integer greater than 1.Assume that
maximumnumber of nodes on level i-1 is 2i-2.
Induction Step: The maximum number of nodes on level i-1 is 2i-2 by the induction
hypothesis. Since each node in a binary tree has a maximum degree of 2,the maximum
number of nodes on level i is two times the maximum number of nodes on level i-1,or 2i-1.
The maximum number of nodes in a binary tree of depth k is
2. Relation between number of leaf nodes and degree-2 nodes: For any nonempty
binary tree,T, if n0 is the number of leaf nodes and n2 the number of nodes of degree 2,
then n0=n2+1.
PROOF: Let n and B denote the total number of nodes and branches in T. Let n0, n1, n2
representthe nodes with zero children, single child, and two children respectively.
B+1=n B=n1+2n2 ==> n1+2n2+1=
n,n1+2n2+1= n0+n1+n2 ==> n0=n2+1
3. A full binary tree of depth k is a binary tree of depth k having 2 -1 nodes, k>=0.
A binary tree with n nodes and depth k is complete iff its nodes correspond to the nodes
numbered from 1 to n in the full binary tree of depth k.

Binary Tree Representation


A binary tree data structure is represented using two methods. Those methods are
1)Array Representation
2)Linked List
Representation
1)Array Representation: In array representation of binary tree, we use a one
dimensional array(1-D Array) to represent a binary tree. To represent a binary tree of
depth 'n' using array representation, we need one dimensional array with a maximum size
of
A complete binary tree with n nodes (depth = log n + 1) is represented sequentially, then for
any node with index i, 1<=i<=n, we have: a) parent(i) is at i/2 if i!=1. If i=1, i is at the root
and has no parent. b)left_child(i) ia at 2i if 2i<=n. If 2i>n, then i has no left child. c)
right_child(i) is at 2i+1 if 2i +1 <=n. If 2i +1 >n, then i has no right child

2. Linked Representation : We use linked list to represent a binary tree. In a linked


list, everynode consists of three fields. First field, for storing left child address, second for
storing actual data and third for storing right child address. In this linked list
representation, a node has the following structure...

typedef struct node


*tree_pointer;typedef struct
node
{
int data;
tree_pointer left_child, right_child;
};.

Binary Tree Traversals


When we wanted to display a binary tree, we need to follow some order in which all the
nodes ofthat binary tree must be displayed. In any binary tree displaying order of nodes
depends on the
traversal method. Displaying (or) visiting order of nodes in a binary tree is called as Binary
TreeTraversal.
There are three types of binary tree traversals.
1) In - Order Traversal 2)Pre - Order Traversal 3)Post - Order
TraversalBinary Tree Traversals
• 1. In - Order Traversal ( leftChild - root -
rightChild ) I - D - J - B - F - A - G - K - C –
H
• 2. Pre - Order Traversal ( root - leftChild -
rightChild ) A - B - D - I - J - F - C - G - K – H
• 3. Post - Order Traversal ( leftChild - rightChild -
root )I - J - D - F - B - K - G - H - C – A

1. In - Order Traversal ( leftChild - root - rightChild )


In In-Order traversal, the root node is visited between left child and right child. In this
traversal, the left child node is visited first, then the root node is visited and later we go for
visiting right child node. This in-order traversal is applicable for every root node of all
subtrees in the tree. Thisis performed recursively for all nodes in the tree. In the above
example of binary tree, first we try to visit left child of root node 'A', but A's left child is a
root node for left subtree. so we try to visit its (B's) left child 'D' and again D is a root for
subtree with nodes D, I and J. So we try to visit its left child 'I' and it is the left most child.
So first we visit 'I'then go for its root node 'D' and later we visit D's right child 'J'. With this
we have completed the left part of node B. Then visit 'B' and next B's right child 'F' is
visited. With this we have completed left part of node A. Then visit root node 'A'. With this
we have completed left and root parts of node A. Then we go for right part of the node A. In
right of A again there is a subtree with root C. So go for left child of C and again it is a
subtree with root G. But G does not have left part so we visit 'G' and then visit G's right child
K. With this we have completed the left part of node C. Then visit root node'C' and next
visit C's right child 'H' which is the right most child in the tree so we stop the process. That
means here wehave visited in the order of I - D - J - B - F - A - G - K - C - H using In-Order
Traversal.
In-Order Traversal for above example of binary
tree isI - D - J - B - F - A - G - K - C – H

Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left
subtree.Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.
void inorder(tree_pointer ptr) /* inorder tree traversal */ Recursive
{
if (ptr) {
inorder(ptr->left_child);
printf(―%d‖, ptr->data);
indorder(ptr-
>right_child);
}
}
2. Pre - Order Traversal ( root - leftChild - rightChild )
In Pre-Order traversal, the root node is visited before left child and right child nodes. In this
traversal, the root node is visited first, then its left child and later its right child. This pre-
order traversal is applicable for every root node of all subtrees in the tree. In the above
example of binary tree, first we visit root node 'A' then visit its left child 'B' which is a root
for D and F. So we visit B's left child 'D' and again D is a root for I and J. So we visit D's left
child'I' which is the left most child. So next we go for visiting D's right child 'J'. With this we
have completed root, left and right parts of node D and root, left parts of node B. Next visit
B's right child'F'. With this we have completed root and left parts of node A. So we go for
A's right child 'C' which is a root node for G and H. After visiting C, we go for its left child
'G' which is a root for node K. So next we visit left of G, but it does not have left child so we
go for G's right child 'K'. With this we havecompleted node C's root and left parts. Next visit
C's right child 'H' which is the right most child in the tree. So we stop the process. That
means here we have visited in the order of A-B-D-I-J-F- C-G-K-H using Pre-Order
Traversal.

Algorithm
Until all nodes are traversed
−Step 1 − Visit root node.
Step 2 − Recursively traverse left
subtree. Step 3 − Recursively traverse
right subtree.
void preorder(tree_pointer ptr) /* preorder tree traversal */ Recursive
{
if (ptr) {
printf(―%d‖, ptr->data);
preorder(ptr->left_child);
preorder(ptr-
>right_child);
}
}

3. Post - Order Traversal ( leftChild - rightChild - root )


In Post-Order traversal, the root node is visited after left child and right child. In this
traversal, left child node is visited first, then its right child and then its root node. This is
recursively performed until the right most node is visited. Here we have visited in the order
of I - J - D - F - B
- K - G - H - C - A using Post-Order Traversal.

Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left
subtree. Step 2 − Recursively traverse
right subtree. Step 3 − Visit root node.
void postorder(tree_pointer ptr) /* postorder tree traversal */ Recursive
{
if (ptr) {
postorder(ptr->left_child);
postorder(ptr-
>right_child);
printf(―%d‖, ptr->data);
}
}
Binary Search Trees
Binary Search Tree Representation
Binary Search tree exhibits a special behavior. A node's left child must have value less than
itsparent's value and node's right child must have value greater than it's parent value.

We're going to implement tree using node object and connecting them through references.

Definition: A binary search tree (BST) is a binary tree. It may be empty. If it is not
empty,thenall nodes follows the below mentioned properties −

keys in a nonempty left subtree (right subtree) are smaller (larger) than the key in the
rootof subtree.

left sub-tree and right sub-tree and can be defined as −


left_subtree (keys) ≤ node (key) ≤ right_subtree (keys)

Fig: Example Binary Search Trees

Graph
A graph G = (V,E) is composed of:
V: set of vertices
E: set of edges connecting the vertices in V
• An edge e = (u,v) is a pair of vertices
Example:

V= {a,b,c,d,e}
E={(a,b),(a,c),(a,d),(b,e),(c,d),(c,e),(d,e)}

Graph Terminology
Undirected Graph:
An undirected graph is one in which the pair of vertices in a edge is unordered,
(v0, v1) = (v1,v0)
Directed Graph:
A directed graph is one in which each edge is a directed pair of vertices, <v0,
v1> != <v1,v0>

Complete Graph:
A complete graph is a graph that has the maximum number of edges for undirected graph
with n vertices, the maximum number of edges is n(n-1)/2 for directed graph with n
vertices, the maximum number of edges is n(n-1)
Adjacent and Incident:
If (v0, v1) is an edge in an undirected graph,
– v0 and v1 are adjacent
– The edge (v0, v1) is incident on vertices v0 and v1

If <v0, v1> is an edge in a directed graph


– v0 is adjacent to v1, and v1 is adjacent from v0
– The edge <v0, v1> is incident on v0 and v1

Multigraph:
In a multigraph, there can be more than one edge from vertex P to
vertex Q. In a simple graph there is at most one.

Graph with self edge or graph with feedback loops:


A self loop is an edge that connects a vertex to itself. In some graph it makes sense to allow
self- loops; in some it doesn't.

Subgraph:
A subgraph of G is a graph G‟ such that V(G‟) is a subset of V(G) and E(G‟) is a
subset of E(G)

Path:
A path from vertex vp to vertex vq in a graph G, is a sequence of vertices,
vp, vi1, vi2, ..., vin, vq, such that (vp, vi1), (vi1, vi2), ..., (vin, vq) are edges in
an undirected graph
The length of a path is the number of edges on it.
Simple Path and Style:
A simple path is a path in which all vertices, except possibly the first and the
last, are distinct.
A cycle is a simple path in which the first and the last vertices are the same
In an undirected graph G, two vertices, v0 and v1, are connected if there is
a path in G from v0 to v1.
An undirected graph is connected if, for every pair of distinct vertices vi,
vj, there is a path from vi to vj

Degree
The degree of a vertex is the number of edges incident to that vertex
For directed graph,
– the in-degree of a vertex v is the number of edges that have v as the head
– the out-degree of a vertex v is the number of edges that have v as the tail
– if di is the degree of a vertex i in a graph G with n vertices and e edges,
the number of edges is

Example:
ADT for Graph
Graph ADT is
Data structures: a nonempty set of vertices and a set of undirected
edges, where each edge is a pair of vertices
Functions: for all graph Graph, v, v1 and v2 Vertices
• Graph Create()::=return an empty graph
• Graph InsertVertex(graph, v)::= return a graph with v inserted. V
has no incident edge.
• Graph InsertEdge(graph, v1,v2)::= return a graph with new edge between
v1 and v2
• Graph DeleteVertex(graph, v)::= return a graph in which v and all edges
incident to it are removed
• Graph DeleteEdge(graph, v1, v2)::=return a graph in which the edge
(v1, v2) is removed
• Boolean IsEmpty(graph)::= if (graph==empty graph) return TRUE
else return FALSE
• List Adjacent(graph,v)::= return a list of all vertices that are adjacent
to v
Graph Representations
Graph can be represented in the following ways:
a) Adjacency Matrix
b) Adjacency Lists
c) Adjacency Multilists
a) Adjacency Matrix
Let G=(V,E) be a graph with n vertices.
The adjacency matrix of G is a two-dimensional by array, say
adj_mat. If the edge (vi, vj) is in E(G), adj_mat[i][j]=1
If there is no such edge in E(G), adj_mat[i][j]=0
The adjacency matrix for an undirected graph is symmetric; the adjacency matrix for a
digraph need not be symmetric
Examples for Adjacency Matrix:
Merits of Adjacency Matrix
From the adjacency matrix, to determine the connection of vertices is easy
The degree of a vertex is
For a digraph, the row sum is the out_degree, while the column sum is the
in_degree

b) Adjacency Lists
Each row in adjacency matrix is represented as an adjacency list.
Interesting Operations
• degree of a vertex in an undirected graph
# of nodes in adjacency list
• # of edges in a graph
determined in O(n+e)
• out-degree of a vertex in a directed graph
# of nodes in its adjacency list
• in-degree of a vertex in a directed graph
traverse the whole data structure
Orthogonal representation for graph G3
c) Adjacency Multilists
An edge in an undirected graph is represented by two nodes in adjacency
list representation.
Adjacency Multilists
– lists in which nodes may be shared among several lists. (an edge is shared by
two different paths)

Example for Adjacency Multlists


Lists: vertex 0: M1->M2->M3, vertex 1: M1->M4-
>M5 vertex 2: M2->M4->M6, vertex 3: M3->M5-
>M6
Some Graph Operations
The following are some graph operations:
a) Traversal Given G=(V,E) and vertex v, find all w V, such that w connects v.
– Depth First Search (DFS) preorder tree traversal
– Breadth First Search (BFS) level order tree traversal
b) Spanning Trees
c) Connected Components
Graph G and its adjacency lists

depth first search: v0, v1, v3, v7, v4, v5, v2,
v6 breadth first search: v0, v1, v2, v3, v4, v5,
v6, v7

Depth First Search


Depth First Search (DFS) algorithm traverses a graph in a depthward motion
and uses a stack to remember to get the next vertex to start a search, when a
dead end occurs in any iteration.
As in the example given above, DFS algorithm traverses from A to B to C to D first then to
E, then to F and lastly to G. It employs the following rules.
• Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it.
Push it in a stack.
• Rule 2 − If no adjacent vertex is found, pop up a vertex from the stack. (It
will pop up all the vertices from the stack, which do not have adjacent
vertices.)
• Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.

St Travers Description
ep al

1 Initialize
the
stack.

2 Mark S as visited and put it


onto the stack. Explore any
unvisited adjacent node
from S. We have three
nodes and we can pick any
of them. For this example,
we shall take the node in
an alphabetical order.
3 Mark A as visited and put it
onto the stack. Explore any
unvisited adjacent node
from A. Both S and D are
adjacent to A but we are
concerned for unvisited
nodes only.

4 Visit D and mark it as


visited and put onto the
stack. Here, we have B
and C nodes, which are
adjacent to D and both
are unvisited. However,
we shall again choose in
an alphabetical order.

5 We choose B, mark it as
visited and put onto the
stack. Here B does not have
any unvisited adjacent
node. So, we pop B from the
stack.

6 We check the stack top for


return to the previous node
and check if it has any
unvisited nodes. Here, we
find D to be on the top of
the stack.
7 Only unvisited adjacent
node is from D is C now.
So we visit C, mark it as
visited and put it onto the
stack.

As C does not have any unvisited adjacent node so we keep popping the
stack until we find a node that has an unvisited adjacent node. In this case,
there's none and we keep popping until the stack is empty.

Breadth First Search


Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and
uses a queue to remember to get the next vertex to start a search, when a dead end
occurs in any iteration.

As in the example given above, BFS algorithm traverses from A to B to E to


F first then to C and G lastly to D. It employs the following rules.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in
a queue.

Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.

Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.


Step Traversal Description
1 Initialize the queue.

2 We start from visiting S


(starting node), and mark it
as visited.

3 We then see an unvisited


adjacent node from S. In
this example, we have three
nodes but alphabetically we
choose A, mark it as visited
and enqueue it.

4 Next, the unvisited adjacent


node from S is B. We mark
it as visited and enqueue it.

5 Next, the unvisited adjacent


node from S is C. We mark
it as visited and enqueue it.
6 Now, S is left with no
unvisited adjacent nodes.
So, we dequeue and find
A.

7 From A we have D as
unvisited adjacent node. We
mark it as visited and
enqueue it.

At this stage, we are left with no unmarked (unvisited) nodes. But as per the
algorithm we keep on dequeuing in order to get all unvisited nodes. When the
queue gets emptied, the program is over.

Spanning Trees
When graph G is connected, a depth first or breadth first search starting at any
vertex will visit all vertices in G
A spanning tree is any tree that consists solely of edges in G and that includes
all the vertices
E(G): T (tree edges) + N (nontree edges) where T: set of edges used during search
N: set of remaining edges
Examples of Spanning Tree

Either dfs or bfs can be used to create a spanning tree


– When dfs is used, the resulting spanning tree is known as a depth first
spanning tree
– When bfs is used, the resulting spanning tree is known as a breadth first
spanning tree

While adding a nontree edge into any spanning tree, this will create a cycle
DFS VS BFS Spanning Tree
A spanning tree is a minimal subgraph, G‟, of G such that V(G‟)=V(G) and G‟ is
connected.
Any connected graph with n vertices must have at least n-1 edges.
A biconnected graph is a connected graph that hasno articulation points.
biconnected component: a maximal connected subgraph H (no subgraph that
is both biconnected and properly contains H).

Minimum Cost Spanning Tree


cost of a spanning tree of a weighted undirected graph is the sum of the
costs of the edges in the spanning tree

– Kruskal
– Prim
– Sollin

Kruskal’s Algorithm
Build a minimum cost spanning tree T by adding edges to T one at a time
Select the edges for inclusion in T in nondecreasing order of the cost
An edge is added to T if it does not form a cycle
Since G is connected and has n > 0 vertices, exactly n-1 edges will be selected
Kruskal’s algorithm
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanningtree formed so far. If
cycle is not formed, include this edge. Else, discard it.
3. Repeat step#2 until there are (V-1) edges in the spanning tree.

Psuedocode for Kruskal’s Algorithm


Kruskal(G, V, E)
{
T= {};
while(T contains less than n-1 edges && E is not empty)
{
choose a least cost edge (v,w) from
E; delete (v,w) from E;
if ((v,w) does not create a cycle in T)
add (v,w) to T
else
discard (v,w);
}
if (T contains fewer than n-1 edges)
printf(“No spanning tree\n”);
}
Examples for Kruskal’s Algorithm

Prim’s Algorithm
Prim's algorithm to find minimum cost spanning tree (as Kruskal's algorithm)
uses the greedy approach. Prim's algorithm shares a similarity with the
shortest path first algorithms.
Prim's algorithm, in contrast with Kruskal's algorithm, treats the nodes as a
single tree and keeps on adding new nodes to the spanning tree from the given
graph.
To contrast with Kruskal's algorithm and to understand Prim's algorithm better,
we shall use the same example −
Steps of Prim's Algorithm: The following are the main 3 steps of the Prim's
Algorithm:
1. Begin with any vertex which you think would be suitable and add it to the
tree.
2. Find an edge that connects any vertex in the tree to any vertex that is not
in the tree. Note that, we don't have to form cycles.
3. Stop when n - 1 edges have been added to the tree.

Psuedocode of Prim’s algorithm


Prims(G,V,E)
{
T={};
TV={0};
while (T contains fewer than n-1 edges)
{
let (u,v) be a least cost edge such that and if (there is no such edge ) break;
add v to TV;
add (u,v) to T;
}
if (T contains fewer than n-1 edges)
printf(“No spanning tree\n”);
}
#define MAX_VERTICES
6 int
cost[][MAX_VERTICES]=
{{ 0, 50, 10, 1000, 45, 1000},
{1000, 0, 15, 1000, 10, 1000},
{ 20, 1000, 0, 15, 1000, 1000},
{1000, 20, 1000, 0, 35, 1000},
{1000, 1000, 30, 1000, 0, 1000},
{1000, 1000, 1000, 3, 1000, 0}};
int distance[MAX_VERTICES];
short int found{MAX_VERTICES];
int n = MAX_VERTICES;
void shortestpath(int v, int cost[][MAX_ERXTICES], int distance[],
int n, short int found[])
{
int i, u, w;
for (i=0; i<n; i++)
{
found[i] = FALSE;
distance[i] = cost[v][i];
}
found[v] = TRUE;
distance[v] = 0;
for (i=0; i<n-2; i++)
{
determine n-1 paths from v
u = choose(distance, n,
found); found[u] = TRUE;
for (w=0; w<n; w++)
if (!found[w])
if (distance[u]+cost[u][w]<distance[w])
distance[w] = distance[u]+cost[u][w];
}
}

All Pairs Shortest Paths


All pairs shortest path algorithm finds the shortest paths between all pairs of
vertices.

Solution 1
3)

Solution 2

iently large
number

vertices with an index <= k

index greater than n-1


t[i][j]

Graph with negative cycle


Algorithm for All Pairs Shortest Paths
void allcosts(int cost[][MAX_VERTICES], int distance[][MAX_VERTICES], int n)
{
int i, j, k;
for (i=0; i<n; i++)
for (j=0; j<n; j++) distance[i][j] = cost[i][j];
for (k=0; k<n; k++)
for (i=0; i<n; i++)
for (j=0; j<n; j++)
if (distance[i][k]+distance[k][j] < distance[i][j])
distance[i][j]= distance[i][k]+distance[k][j];
}
Example
Directed graph and its cost matrix
Transitive Closure
Goal: given a graph with unweighted edges, determine if there is a path
from i to j for all i and j.
(1) Require positive path (> 0) lengths. transitive closure matrix
reflexive transitive closure matrix
Unit III
Algorithms – Priority Queues - Heaps – Heap Sort – Merge Sort – Quick
Sort – Binary Search – Finding the Maximum and Minimum.

Algorithm
• An algorithm is a step-by-step procedure to solve a problem in a finite
number of steps.
• Branching and repetition are included in the steps of an algorithm.
• This branching and repetition depend on the problem for which Algorithm is
developed.
• All the steps of Algorithm during the definition should be written in a
human-understandable language which does not depend on any
programming language.
• we can choose any programming language to implement the Algorithm.
• Pseudocode and flow chart are popular ways to represent an algorithm.

An algorithm must satisfy the following criteria:


1. Input: An algorithm should have zero or more but should be a finite number of
inputs. We can also say that it is essential for any algorithm before starting. Input
should be given to it initially before the Algorithm begins.
2. Output: An algorithm must give at least one required result from the given
set of input values. These output values are known as the solution to a problem.
3. Definiteness: Each step must be clear, unambiguous, and precisely defined.
4. Finiteness: Finiteness means Algorithm should be terminated after a finite
number of steps. Also, each step should be finished in a finite amount of time.
5. Effectiveness: Each step of the Algorithm must be feasible i.e., it should be
practically possible to perform the action. Every Algorithm is generally expected to
be effective.

Divide and Conquer


Divide and Conquer is one of the best-known general algorithm design
technique. It works according to the following general plan:
Given a function to compute on „n‟ inputs the divide-and-conquer
strategy suggests splitting the inputs into „k‟ distinct subsets, 1<k<=n,
yielding „k‟ sub problems.
These sub problems must be solved, and then a method must be
found to combine sub solutions into a solution of the whole.
If the sub problems are still relatively large, then the divide-and-conquer
strategy can possibly be reapplied.
Often the sub problems resulting from a divide-and-conquer design are
of the same type as the original problem. For those cases the
reapplication of the divide-and- conquer principle is naturally expressed
by a recursive algorithm.

A typical case with k=2is diagrammatically shown below.


In the above specification,
Initially DAndC(P) is invoked, where „P‟ is the problem to be solved.
Small (P) is a Boolean-valued function that determines whether the
input size is small enough that the answer can be computed without
splitting. If this so, the function „S’ is invoked. Otherwise, the problem
P is divided into smaller sub problems. These sub problems P1, P2 …Pk
are solved by recursive application of DAndC.
Combine is a function that determines the solution to P using the
solutions to the „k‟ sub problems.

Binary Search
Problem definition: Let ai, 1 ≤ i ≤ n be a list of elements that are sorted in non-
decreasing order. The problem is to find whether a given element x is present in
the listor not. If x is present we have to determine a value j (element‘s position)
such that aj=x.If x is not in the list, then j is set to zero.
Solution: Let P = (n, ai…al , x) denote an arbitrary instance of search problem
where n is the number of elements in the list, ai…al is the list of elements and x is
the key elementto be searched for in the given list. Binary search on the list is
done as follows:
Step1: Pick an index q in the middle range [i, l] i.e. q= [(n + 1)/2] and compare x
with aq. Step 2: if x = aq i.e key element is equal to mid element, the problem is
immediatelysolved.
Step 3: if x <aqin this case x has to be searched for only in the sub-list ai, ai+1,
……, aq-Therefore, problem reduces to (q-i, ai…aq-1, x).
Step 4: if x >aq,x has to be searched for only in the sub-list aq+1, ...,., al . Therefore
problemreduces to (l-i, aq+1…al, x).
For the above solution procedure, the Algorithm can be implemented as recursive
or non-recursive algorithm.
Recursive binary search algorithm
Iterative binary search:

Finding the maximum and minimum


Problem statement: Given a list of n elements, the problem is to find the
maximum and minimum items.
StraightMaxMin: A simple and straight forward algorithm to achieve this is given below.
Explanation:
▪ StraightMaxMin requires 2(n-1) comparisons in the best, average & worst cases.
▪ By realizing the comparison of a[i]>max is false, improvement in a algorithm can be
done.Hence we can replace the contents of the for loop by,
If(a[i]>Max) then Max = a[i]; Else if (a[i]<min) min=a[i]
▪ On the average a[i] is > max half the time. So, the avg. no. of comparison is 3n/2-1.
Algorithm based on Divide and Conquer strategy
Let P = (n, a [i],……,a [j]) denote an arbitrary instance of the problem. Here ‗n‘ is the no. of
elements in the list (a[i],….,a[j]) and we are interested in finding the maximum and minimum
ofthe list. If the list has more than 2 elements, P has to be divided into smaller instances.
For example, we might divide ‗P‘ into the 2 instances, P1= ( [n/2],a[1],
a[n/2])P2= (n-[n/2], a[[n/2]+1],……., a[n])
After having divided ‗P‘ into 2 smaller sub problems, we can solve them by recursively
invokingthe same divide-and-conquer algorithm.
Algorithm:
Compared with the straight forward method (2n-2) this method saves 25% in
comparisons.

Space Complexity
Compared to the straight forward method, the MaxMin method requires extra
stack space for i, j, max, min, max1 and min1. Given n elements there will be
[log2 n] + 1 levels of recursion and we need to save seven values for each
recursive call. (6 + 1 for return address).

Merge Sort
Merge sort is a perfect example of a successful application of the divide-and
conquer technique. It sorts a given array A [O ... n - 1] by dividing it into two
halves A [0 .. \n/2]-1] and A [ 𝗁n/2] .. n-1], sorting each of them recursively,
and then merging the two smaller sorted arrays into a single sorted one.

The merging of two sorted arrays can be done as follows.


▪ Two pointers (array indices) are initialized to point to the first elements of
the arrays being merged.

▪ The elements pointed to are compared, and the smaller of them is


added to a new array being constructed

▪ After that, the index of the smaller element is incremented to point to its
immediate successor in the array it was copied from. This operation is
repeated until one of the two given arrays is exhausted, and then the
remaining elements of the other array are copied to the end of the new
array.

Example:
The operation of the algorithm on the list 8, 3, 2, 9, 7, 1, 5, 4 is illustrated in
the figure.
Analysis
Here the basic operation is key comparison. As merge sort execution does not
depend on the order of the data, best case and average case runtime are the
same as worst case runtime.
Worst case: During key comparison, neither of the two arrays becomes empty
before the
other one contains just one element leads to the worst case of merge sort.
Assuming for
simplicity that total number of elements n is a power of 2, the recurrence
relation for the
number of key comparisons C(n) is

where, Cmerge(n) is the number of key comparisons made during the merging
stage.
Let us analyze C merge(n), the number of key comparisons performed during the
merging stage. At each step, exactly one comparison is made, after which the
total number of elements in the two arrays still needing to be processed is
reduced by 1. In the worst case, neither of the two arrays becomes empty before
the other one contains just one element (e.g., smaller elements may come from
the alternating arrays).Therefore, for the worst case, Cmerge(n) = n –1.
Now,

Solving the recurrence equation using master theorem:


Here a = 2, b = 2, f (n) = n, d = 1. Therefore 2 = 21 , case 2 holds in the master
theorem
Cworst(n) = Θ (nd log n) = Θ (n1 log n) = Θ (n log n)Therefore Cworst(n) = Θ (n log
n)
Advantages:
▪ Number of comparisons performed is nearly optimal.
▪ For large n, the number of comparisons made by this algorithm in the
average case turns out to be about 0.25n less and hence is also in Θ(n log
n).
▪ Mergesort will never degrade to O (n2)
▪ Another advantage of mergesort over quicksort and heapsort is its
stability. (A sorting algorithm is said to be stable if two objects with
equal keys appear in the same order in sorted output as they appear in
the input array to be sorted.)
Limitations:
The principal shortcoming of mergesort is the linear amount [O(n) ] of extra
storage the algorithm requires. Though merging can be done in-place, the
resulting algorithm is quite complicated and of theoretical interest only.

Variations of merge sort


1. The algorithm can be implemented bottom up by merging pairs of the array‟s
elements, then merging the sorted pairs, and so on. (If n is not a power of 2,
only slight bookkeeping complications arise.) This avoids the time and space
overhead of using a stack to handle recursive calls.
2. We can divide a list to be sorted in more than two parts, sort each
recursively, and then merge them together. This scheme, which is particularly
useful for sorting files residing on secondary memory devices, is called multiway
mergesort.
Quick sort
Quicksort is the other important sorting algorithm that is based on the divide-and-conquer
approach. Unlike mergesort, which divides its input elements according to their position in
thearray, quicksort divides (or partitions) them according to their value.
A partition is an arrangement of the array‘s elements so that all the elements to the left of
someelement A[s] are less than or equal to A[s], and all the elements to the right of A[s] are
greater than or equal to it:

Obviously, after a partition is achieved, A[s] will be in its final position in the sorted array,
and we can continue sorting the two subarrays to the left and the right of A[s] independently
(e.g.,by the same method).
In quick sort, the entire work happens in the division stage, with no work required to
combine the solutions to the sub problems.

Partitioning
We start by selecting a pivot—an element with respect to whose value we are going to
dividethe subarray. There are several different strategies for selecting a pivot. We use
the sophisticated method suggested by C.A.R. Hoare, the prominent British computer
scientistwho invented quicksort.
Select the subarray‘s first element: p = A[l].Now scan the subarray from both ends,
comparingthe subarray‘s elements to the pivot.
▪ The left-to-right scan, denoted below by index pointer i, starts with the second element.
Since we want elements smaller than the pivot to be in the left part of the subarray, this
scan skips over elements that are smaller than the pivot and stops upon encountering the
first element greater than or equal to the pivot.

▪ The right-to-left scan, denoted below by index pointer j, starts with the last element of
the subarray. Since we want elements larger than the pivot to be in the right part of the
subarray, this scan skips over elements that are larger than the pivot and stops on
encountering the first element smaller than or equal to the pivot.

After both scans stop, three situations may arise, depending on whether or not
the scanning indices have crossed.
▪ If scanning indices i and j have not crossed, i.e., i< j, we simply
exchange A[i] and A[j ] and resume the scans by incrementing I and
decrementing j, respectively:

If the scanning indices have crossed over, i.e.,


i> j, we will have partitioned the subarray after exchanging the pivot with
A[j]:

If the scanning indices stop while pointing to the same element, i.e., i = j,
the value they are pointing to must be equal to p. Thus, we have the
subarray partitioned, with the split position s = i = j :
Analysis
Best Case -Here the basic operation is key comparison. Number of key
comparisons made before a partition is achieved is n + 1 if the scanning indices
cross over and n if they coincide. If all the splits happen in the middle of
corresponding subarrays, we will have the best case. The number of key
comparisons in the best case satisfies the recurrence,

According to the Master Theorem, Cbest(n) ∈Θ(n log2 n); solving it exactly for n
= 2k yields
Cbest(n) = n log2 n.
Worst Case – In the worst case, all the splits will be skewed to the extreme:
one of the two subarrays will be empty, and the size of the other will be just 1
less than the size of the subarray being partitioned. This unfortunate situation
will happen, in particular, for increasing arrays. Indeed, if A[0..n − 1] is a strictly
increasing array and we use A[0] as the pivot, the left-to-right scan will stop on
A[1] while the right-to-left scan will go all the way to reach A[0], indicating the
split at position 0:So, after making n + 1 comparisons to get to this partition
and exchanging the pivot A[0] with itself, the algorithm will be left with the
strictly increasing array A[1..n − 1] to sort. This sorting of strictly increasing
arrays of diminishing sizes will continue until the last one A[n−2.. n−1] has
been processed. The total number of key comparisons made will be equal to

Average Case - Let Cavg(n) be the average number of key comparisons made
by quicksort on a randomly ordered array of size n. A partition can happen in
any position s (0 ≤ s ≤ n−1) after n+1comparisons are made to achieve the
partition. After the partition, the left and right subarrays will have s and n − 1−
s elements, respectively. Assuming that the partition split can happen in each
position s with the same probability 1/n, we get the following recurrence
relation:

Its solution, which is much trickier than the worst- and best-case analyses,
turns out to be

Thus, on the average, quicksort makes only 39% more comparisons than in the
best case. Moreover, its innermost loop is so efficient that it usually runs faster
than mergesort on randomly ordered arrays of nontrivial sizes. This certainly
justifies the name given to the algorithm by its inventor.

Variations: Because of quicksort‟s importance, there have been persistent


efforts over the years to refine the basic algorithm. Among several
improvements discovered by researchers are:
▪ Better pivot selection methods such as randomized quicksort that uses a
random element or the median-of-three method that uses the median of the
leftmost, rightmost, and the middle element of the array
▪ Switching to insertion sort on very small subarrays (between 5 and 15
elements for most computer systems) or not sorting small subarrays at all and
finishing the algorithm with insertion sort applied to the entire nearly sorted
array
▪ Modifications of the partitioning algorithm such as the three-way partition into
segments smaller than, equal to, and larger than the pivot

Limitations: 1. It is not stable. 2. It requires a stack to store parameters of


subarrays that are yet to be sorted. 3. While Performance on randomly ordered
arrays is known to be sensitive not only to the implementation details of the
algorithm but also to both computer architecture and data type.

Unit - 4
Greedy Method : The General Method – Optimal Storage on Tapes – Knapsack
Problem – Job Sequencing with Deadlines – Optimal Merge Patterns.

GENERAL METHOD

Greedy Method
Greedy is the most straight forward design technique. Most of the problems have n inputs and require us
to obtain a subset that satisfies some constraints. Any subset that satisfies these constraints is called a
feasible solution. We need to find a feasible solutionthat either maximizes or minimizes the objective
function. A feasible solution that does this is called an optimal solution.
The greedy method is a simple strategy of progressively building up a solution, one element at a time, by
choosing the best possible element at each stage. At each stage, adecision is made regarding whether or
not a particular input is in an optimal solution.
This is done by considering the inputs in an order determined by some selection procedure. If the
inclusion of the next input, into the partially constructed optimal solution will result in an infeasible
solution then this input is not added to the partial solution. The selection procedure itself is based on
some optimization measure. Several optimization measures are plausible for a given problem. Most of
them, however, will result in algorithms that generate sub-optimal solutions. This version of greedy
technique is called subset paradigm. Some problems like Knapsack, Job sequencing withdeadlines and
minimum cost spanning trees are based on subset paradigm.
For the problems that make decisions by considering the inputs in some order, each decision is made
using an optimization criterion that can be computed using decisions already made. This version of
greedy method is ordering paradigm. Some problems likeoptimal storage on tapes, optimal merge
patterns and single source shortest path are based on ordering paradigm.

CONTROL ABSTRACTION
Algorithm Greedy (a, n)
// a(1 : n) contains the „n‟ inputs
{
solution := ; // initialize the solution to empty for i:=1 to n do
{
x := select (a);
if feasible (solution, x) then solution
:= Union (Solution, x);
}
return solution;
}
Procedure Greedy describes the essential way that a greedy based algorithm will look,once a particular
problem is chosen and the functions select, feasible and union are properly implemented.
The function select selects an input from „a‟, removes it and assigns its value to „x‟. Feasible is aBoolean valued
function, which determines if „x‟ can be included into the solution vector. The function Union combines „x‟ with
solution and updates the objective function.

KNAPSACK PROBLEM
Let us apply the greedy method to solve the knapsack problem. We are given „n‟ objectsand a knapsack.
The object „i‟ has a weight wi and the knapsack has a capacity „m‟. If afraction xi, 0 < xi < 1 of object i is
placed into the knapsack then a profit of pi xi is earned. The objective is to fill the knapsack that
maximizes the total profit earned.
Since the knapsack capacity is „m‟, we require the total weight of all chosen objects to be atmost „m‟. The
problem is stated as:

The profits and weights are positive numbers.


Algorithm
If the objects are already been sorted into non-increasing order of p[i] / w[i] then thealgorithm given
below obtains solutions corresponding to this strategy.

Algorithm GreedyKnapsack (m, n)


// P[1 : n] and w[1 : n] contain the profits and weights respectively of
// Objects ordered so that p[i] / w[i] > p[i + 1] / w[i + 1].
// m is the knapsack size and x[1: n] is the solution vector.
{
for i := 1 to n do x[i] := 0.0 // initialize x U := m; for i := 1 to
n do
{
if (w(i) > U) then break;
x [i] := 1.0; U := U – w[i];
}
if (i < n) then x[i] := U / w[i];
}
Running time:
The objects are to be sorted into non-decreasing order of pi / wi ratio. But if we disregardthe time to
initially sort the objects, the algorithm requires only O(n) time.
Example:
Consider the following instance of the knapsack problem: n = 3, m = 20, (p 1, p2, p3) = (25, 24,15) and (w1, w2, w3)
= (18, 15, 10).

OPTIMAL STORAGE ON TAPES


There are „n‟ programs that are to be stored on a computer tape of length „L‟. Each program „i‟ is of
length li, 1 ≤ i ≤ n. All the programs can be stored on the tape if andonly if the sum of the lengths of
the programs is at most „L‟.
We shall assume that whenever a program is to be retrieved from this tape, the tape isinitially positioned
at the front. If the programs are stored in the order i = i1, i2, . . . . .
, in, the time tJ needed to retrieve program iJ is proportional to
Example
Let n = 3, (l1, l2, l3) = (5, 10, 3). Then find the optimal ordering?
Solution:
There are n! = 6 possible orderings. They are:

Algorithm:
The algorithm for assigning programs to tapes is as follows:
Algorithm Store (n, m)
// n is the number of programs and m the number of tapes
{
j := 0; // next tape to store on for i :=1 to n do
{
Print („append program‟, i, „to permutation for tape‟, j); j := (j + 1) mod m;
}
}
On any given tape, the programs are stored in non-decreasing order of their lengths.

JOB SEQUENCING WITH DEADLINES


When we are given a set of „n‟ jobs. Associated with each Job i, deadline di > 0 and profit Pi > 0. For any
job „i‟ the profit pi is earned iff the job is completed by its deadline.Only one machine is available for
processing jobs. An optimal solution is the feasible solution with maximum profit.
Sort the jobs in „j‟ ordered by their deadlines. The array d [1 : n] is used to store the deadlines of the
order of their p-values. The set of jobs j [1 : k] such that j [r], 1 ≤ r ≤ k are the jobs in „j‟ and d (j [1]) ≤ d
(j[2]) ≤ . . . ≤ d (j[k]). To test whether J U {i} is feasible, we have just to insert i into J preserving the
deadline ordering and then verify that d [J[r]] ≤ r, 1 ≤ r ≤ k+1.

Example:
Let n = 4, (P1, P2, P3, P4,) = (100, 10, 15, 27) and (d1 d2 d3 d4) = (2, 1, 2, 1). The
feasible solutions and their values are:
OPTIMAL MERGE PATERNS
Given „n‟ sorted files, there are many ways to pair wise merge them into a single sortedfile. As, different
pairings require different amounts of computing time, we want to determine an optimal (i.e., one
requiring the fewest comparisons) way to pair wise merge „n‟ sorted files together. This type of merging
is called as 2-way merge patterns.To merge an n-record file and an m-record file requires possibly n + m
record moves, the obvious choice choice is, at each step merge the two smallest files together. The two-
way merge patterns can be represented by binary merge trees.

Algorithm to Generate Two-way Merge Tree:

struct treenode
{
treenode * lchild;
treenode * rchild;
};
Example 1:
Suppose we are having three sorted files X 1, X2 and X3 of length 30, 20, and 10 records each.Merging of the
files can be carried out as follows:

Example 2:
Given five files (X1, X2, X3, X4, X5) with sizes (20, 30, 10, 5, 30). Apply greedy rule tofind optimal way of
pair wise merging to give an optimal solution using binary merge tree representation.
Solution:

Merge X4 and X3 to get 15 record moves. Call this Z1.

Merge Z1 and X1 to get 35 record moves. Call this Z2.


Unit V
Back tracking: The General Method – The 8-Queens Problem – Sum of Subsets – Graph Coloring.

Backtracking
Some problems can be solved, by exhaustive search. The exhaustive-search
technique suggests generating all candidate solutions and then identifying the one (or
the ones) with a desired property.
Backtracking is a more intelligent variation of this approach. The principal idea is to
construct solutions one component at a time and evaluate such partially constructed
candidates as follows. If a partially constructed solution can be developed further
without violating the problem‘s constraints, it is done by taking the first remaining
legitimate optionfor the next component. If there is no legitimate option for the next
component, no alternatives for any remaining component need to be considered. In
this case, the algorithm
backtracks to replace the last component of the partially constructed solution with
its nextoption.
It is convenient to implement this kind of processing by constructing a tree of choices
being made, called the state-space tree. Its root represents an initial state before the
search for a solution begins. The nodes of the first level in the tree represent the choices
made for the first component of a solution; the nodes of the second level represent the
choices for the second component, and soon. A node in a state-space tree is said to be
promising if it corresponds to a partially constructed solution that may still lead to a
complete solution; otherwise, it is called non-promising. Leaves represent either non-
promising dead ends or complete solutions found by the algorithm.
In the majority of cases, a state space tree for a backtracking algorithm is constructed
in the manner of depth-first search. If the current node is promising, its child is
generated by addingthe first remaining legitimate option for the next component of a
solution, and the processing moves to this child. If the current node turns out to be non-
promising, the algorithm backtracks to the node‘s parent to consider the next possible
option for its last component; if there is no such option, it backtracks one more level up
the tree, and so on. Finally, if the algorithm reaches a complete solution to the problem,
it either stops (if just one solution is required) or continues searching for other possible
solutions.

General method
General Algorithm (Recursive)

General Algorithm (Iterative)


General Algorithm for backtracking

N-Queens problem
The problem is to place n queens on an n × n chessboard so that no two queens attack
each other by being in the same row or in the same column or on the same diagonal.
So let us consider the four-queens problem and solve it by the backtracking technique. Since
each of the four queens has to be placed in its own row, all we need to do is to assign a column for
each queen on the board presented in figure.

We start with the empty board and then place queen 1 in the first possible position of its row,
which is in column 1 of row 1. Then we place queen 2, after trying unsuccessfully columns 1 and 2,
in the first acceptable position for it, which is square (2, 3), the square in row 2 and column 3. This
proves to be a dead end because there is no acceptable position for queen 3. So, the algorithm
backtracks and puts queen 2 in the next possible position at (2, 4). Then queen 3 is placed at (3, 2),
which proves to be another dead end. The algorithm then backtracks all the way to queen 1 and
moves it to (1, 2). Queen 2 then goes to (2, 4), queen 3 to(3, 1), and queen 4 to (4, 3), which is a
solution to the problem. The state- space tree of this search is shown in figure.
Figure: State-space tree of solving the four-queens problem by backtracking. × denotes an
unsuccessful attempt to place a queen in the indicated column. The numbers above the nodes
indicate the order in which the nodes are generated.
If other solutions need to be found, the algorithm can simply resume its operations at the
leaf atwhich it stopped. Alternatively, we can use the board‘s symmetry for this purpose.
Finally, it should be pointed out that a single solution to the n-queens problem for any n ≥
4 canbe found in linear time.
Note: The algorithm NQueens() is not in the syllabus. It is given here for interested
learners. Thealgorithm is referred from textbook T2.
Sum of subsets problem
Problem definition: Find a subset of a given set A = {a 1, . . . , an} of n positive integers
whosesum is equal to a given positive integer d.
For example, for A = {1, 2, 5, 6, 8} and d = 9, there are two solutions: {1, 2, 6} and {1, 8}.
Ofcourse, some instances of this problem may have no solutions.
It is convenient to sort the set‘s elements in increasing order. So, we will
assume thata1< a2< . . . < an.
The state-space tree can be constructed as a binary tree like that in Figure shown below for
theinstance A = {3, 5, 6, 7} and d = 15.
The number inside a node is the sum of the elements already included in the subsets represented by
thenode. The inequality below a leaf indicates the reason for its termination.

The root of the tree represents the starting point, with no decisions about the given
elements madeas yet. Its left and right children represent, respectively, inclusion and
exclusion of a1 in a set being sought.
Similarly, going to the left from a node of the first level corresponds to inclusion of a 2 while
going to the right corresponds to its exclusion, and so on. Thus, a path from the root to a
node onthe ith level of the tree indicates which of the first in numbers have been included in
the subsets represented by that node.
We record the value of s, the sum of these numbers, in the node. If s is equal to d, we have a
solution to the problem. We can either report this result and stop or, if all the solutions need to be
found, continue bybacktracking to the node‘s parent. If s is not equal to d, we can terminate the
node as non-promising if either of the following two inequalities holds:

Example: Apply backtracking to solve the following instance of the subset sum problem: A
= {1, 3, 4, 5} and d = 11.

Graph coloring
Analysis

You might also like