Cd3281 Dsa Question Bank
Cd3281 Dsa Question Bank
Cd3281 Dsa Question Bank
QUESTION BANK
UNIT-I
ABSTRACT DATA TYPES
1. What is a data structure? PART-A
A data structure is a method for organizing and storing data which would allow
efficient data retrieval and usage. A data structure is a way of organizing data that considers
not only the items stored, but also their relationships to each other.
● Once data structure has been implemented, it can be used again and again in various
applications.
● Stacks
● Queues
● Lists
● Trees
● Graphs
● Tables
a. Modularity
i. Divide program into small functions
ii. Easy to debug and maintain
iii. Easy to modify
b. Reuse
i. Define some operations only once and reuse them in future
c. Easy to change the implementation
1. Add an element
2. Delete an element
3. Traverse
4. Sort the list of elements
5. Search for a data element
For example Stack, Queue, Tables, List, and Linked Lists.
9. Define class.
A class is a user-defined data type. It consists of data members and member
functions, which can be accessed and used by creating an instance of that class. It
represents the set of properties or methods that are common to all objects of one type. A
class is like a blueprint for an object.
Example: Consider the Class of Cars. There may be many cars with different names and
brands but all of them will share some common properties like all of them will have 4
wheels, Speed Limit, Mileage range, etc. So here, Car is the class, and wheels, speed limits,
mileage are their properties
The example for class of car can be :
class Car :
classPlane:
def init (self):
self.wings = 2
if x == 1:
return 1
else:
return (x * factorial(x-1))
num = 3
print("The factorial of", num, "is", factorial(num))
PART-B
classChildClassName(ParentClassName):
ChildClass implementation
.
.
classParentClass:
deffeature_1(self):
print('feature_1 from ParentClass is running...')
deffeature_2(self):
print('feature_2 from ParentClass is running...')
classChildClass(ParentClass):
deffeature_3(self):
print('feature_3 from ChildClass is running...')
obj = ChildClass()
obj.feature_1()
obj.feature_2()
We can think of class as a sketch of a parrot with labels. It contains all the details about the name, colors,
size etc. Based on these descriptions, we can study about the parrot. Here, a parrot is an object.
Object
class Parrot:
Anpass
object (instance) is an instantiation of a class. When class is defined, only the description for the object
is defined. Therefore, no memory or storage is allocated.
Here, we use the class keyword to define an empty class Parrot. From class, we construct instances. An
The example for object of parrot class can be:
instance is a specific object created from a particular class.
obj = Parrot()
8
CD3291-DATA STRUCTURES AND ALGORITHMS
Example:
class Parrot:
# class attribute
species = "bird"
# instance attribute
def init (self, name, age):
self.name = name self.age = age
if x == 1:
return 1
else:
return (x * factorial(x-1))
num = 3
print("The factorial of", num, "is", factorial(num))
A binary search is an algorithm to find a particular element in the list. Suppose we have a list of thousand
elements, and we need to get an index position of a particular element. We can find the element's index
position very fast using the binary search algorithm.
There are many searching algorithms but the binary search is most popular among them.
The elements in the list must be sorted to apply the binary search algorithm. If elements are not sorted then
sort them first.
9
CD3291-DATA STRUCTURES AND ALGORITHMS
Let's understand the concept of binary search.
10
CD3291-DATA STRUCTURES AND ALGORITHMS
Concept of Binary Search
The divide and conquer approach technique is followed by the recursive method. In this method,
a functionis called itself again and again until it found an element in the list.
A set of statements is repeated multiple times to find an element's index position in the
iterative method. The while loop is used for accomplish this task.
Binary search is more effective than the linear search because we don't need to search each list
index. The list must be sorted to achieve the binary search algorithm.
We have a sorted list of elements, and we are looking for the index position of 45.
So, we are setting two pointers in our list. One pointer is used to denote the smaller value called low
and thesecond pointer is used to denote the highest value called high.
1. mid = (low+high)/2
2. Here, the low is 0 and
the high is 7. 3. mid =
(0+7)/2
4. mid = 3 (Integer)
11
5.Brieflyexplain about the various asymptotic notations.
Asymptotic analysis of an algorithm refers to defining the mathematical boundation/framing
CD3291-DATA STRUCTURES AND ALGORITHMS
of its run-time performance. Using asymptotic analysis, we can very well conclude the best
case, average case, and worst case scenario of an algorithm.
Asymptotic analysis is input bound i.e., if there's no input to the algorithm, it is concluded to
work in a constant time. Other than the "input" all other factors are considered constant.
Asymptotic analysis refers to computing the running time of any operation in mathematical
units of computation. For example, the running time of one operation is computed as f(n)
and may be for another operation it is computed as g(n2).
This means the first operation running time will increase linearly with the increase in n and
the running time of the second operation will increase exponentially when n increases.
Similarly, the running time of both operations will be nearly the same if n is significantly
sma
12
CD3291-DATA STRUCTURES AND ALGORITHMS
Asymptotic Notations
Following are the commonly used asymptotic notations to calculate the running time
complexity of an algorithm.
Ο Notation
Ω Notation
θ Notation
Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an algorithm's running
time. It measures the worst case time complexity or the longest amount of time an algorithm
can possibly take to complete.
Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running
time. It measures the best case time complexity or the best amount of time an algorithm can
possibly take to complete.
14
CD3291-DATA STRUCTURES AND ALGORITHMS
θ(f(n)) = { g(n) if and only if g(n) = Ο(f(n)) and g(n) = Ω(f(n)) for all n > n0. }
12. List out the basic operations that can be performed on a stack
The basic operations that can be performed on a stack are
• Push operation
• Pop operation
• Peek operation
21. Define Linked Lists . State the different types of linked lists
Linked list consists of a series of structures, which are not necessarily adjacent in memory.
Each structure contains the element and a pointer to a structure containing its successor. We call this
theNext Pointer. The last cell’sNext pointer points to NULL.
The different types of linked list include singly linked list, doubly linked list andcircular linked
list.
PART-B
1. Explain in detail about Stack ADT and its operations with Python program.
Stack: It is linear data structure that uses the LIFO (Last In-First Out) rule in which the data added
last will be removed first. The addition of data element in a stack is known as a push operation, and
the deletion of data element form the list is known as pop operation.
The two basic operations associated with stacks are:
1. Push
2. Pop
While performing push and pop operations the following test must be conducted on the stack.
a) Stack is empty or not b) stack is full or not
1. Push: Push operation is used to add new elements in to the stack. At the time of addition
first check the stack is full or not. If the stack is full it generates an error message "stack
overflow".
2. Pop: Pop operation is used to delete elements from the stack. At the time of deletion first
check the stack is empty or not. If the stack is empty it generates an error message "stack
underflow".
All insertions and deletions take place at the same end, so the last element added to the
stack will be the first element removed from the stack. When a stack is created, the stack
base remains fixed while the stack top changes as elements are added and removed. The most
accessible element is the top and the least accessible element is the bottom of the stack.
Initially top=-1, we can insert an element in to the stack, increment the top value i.e top=top+1.
We can insert an element in to the stack first check the condition is stack is full or not. i.e
top>=size-1. Otherwise add the element in to the stack.
1. Pop(): When an element is taken off from the stack, the operation is performed by pop().
Below figure shows a stack initially with three elements and shows the deletion of elements
using pop().
1. display(): This operation performed display the elements in the stack. We display the
element in the stack check the condition is stack is empty or not i.e top==-1.Otherwise
display the list of elements in the stack.
2. Explain in detail about Queue ADT and its operations with Python program.
Queue: It is a data structure that uses the FIFO rule (First In-First Out). In this rule, the element
which is added first will be removed first. There are two terms used in the queue front end
and rear The insertion operation performed at the back end is known ad enqueue, and the deletion
operation performed at the front end is known as dequeue.
Operations on QUEUE:
A queue is an object or more specifically an abstract data structure (ADT) that allows the
following operations:
Enqueue or insertion: which inserts an element at the end of the queue.
Dequeue or deletion: which deletes an element at the start of the queue.
Queue operations work as follows:
1. Two pointers called FRONT and REAR are used to keep track of the first and
last elements in the queue.
2. When initializing the queue, we set the value of FRONT and REAR to 0.
3. On enqueing an element, we increase the value of REAR index and place the new
element in the position pointed to by REAR.
4. On dequeueing an element, we return the value pointed to by FRONT and increase
the FRONT index.
5. Before enqueing, we check if queue is already full.
6. Before dequeuing, we check if queue is already empty.
7. When enqueing the first element, we set the value of FRONT to 1.
8. When dequeing the last element, we reset the values of FRONT and REAR to 0.
Again insert another element 33 to the queue. The status of the queue is:
Now, delete an element. The element deleted is the element at the front of the queue.So the
status of the queue is:
Again, delete an element. The element to be deleted is always pointed to by the FRONT
pointer. So, 22 is deleted. The queue status is as follows:
Now, insert new elements 44 and 55 into the queue. The queue status is:
Next insert another element, say 66 to the queue. We cannot insert 66 to the queue as the rear
crossed the maximum size of the queue (i.e., 5). There will be queue full signal. The queue status
is as follows:
Now it is not possible to insert an element 66 even though there are two vacant positions in the
linear queue. To overcome this problem the elements of the queue are to be shifted towards the
beginning of the queue so that it creates vacant position at the rear end. Then the FRONT and
REAR are to be adjusted properly. The element 66 can be inserted at the rear end. After this
operation, the queue status is as follows:
In the New list the Node A points to the new Node N and the new node N points to the node B to
which Node A previously pointed.
Deletion:
Let list be a Linked list with node N between Nodes A and B is as shown in the following
figure.
In the new list the node N is to be deleted from the Linked List. The deletion occurs as the link
field in the Node A is made to point node B this excluding node N from its path.
4.Expalin how a Doubly Linked list is implemented with appropriate example programs.
Lptr contains the address of the before node. Rptr contains the address of next node. Data
Contains the Linked List is as follows.
In the above diagram Last and Start are pointer variables which contains the address of last node
and starting node respectively.
Insertion in to the Double Linked List:Let list be a double linked list with successive modes A
and B as shown in the following diagram. Suppose a node N is to be inserted into the list
between the node s A and B this is shown in the following diagram.
As in the new list the right pointer of node A points to the new node N ,the Lptr of the node ‘N’
points to the node A and Rptr of node ‘N’ points to the node ‘B’ and Lpts of node B points the
new node ‘N’
Deletion Of Double Linked List :- Let list be a linked list contains node N between the
nodes A and B as shown in the following diagram.
Circular Linked List:- Circular Linked List is a special type of linked list in which all the nodes
are linked in continuous circle. Circular list can be singly or doubly linked list. Note that, there
are no Nulls in Circular Linked Lists. In these types of lists, elements can be added to the back of
the list and removed from the front in constant time.
Both types of circularly-linked lists benefit from the ability to traverse the full list beginning at
any given node. This avoids the necessity of storing first Node and last node, but we need a
special representation for the empty list, such as a last node variable which points to some node
in the list or is null if it's empty. This representation significantly simplifies adding and removing
nodes with a non-empty list, but empty lists are then a special case. Circular linked lists are
most useful for describing naturally circular structures, and have the advantage of being able to
traverse the list starting at any point. They also allow quick access to the first and last records
through a single pointer (the address of the last element)
Circular linked list are one they of liner linked list. In which the link fields of last node
of the list contains the address of the first node of the list instead of contains a null
pointer.
Advantages:- Circular list are frequency used instead of ordinary linked list because in
circular list all nodes contain a valid address. The important feature of circular list is as
follows.
(1) In a circular list every node is accessible from a given node.
(2) Certain operations like concatenation and splitting becomes more efficient in
circular list.
Disadvantages: Without some conditions in processing it is possible to get into an
infinite Loop.
Circular Double Linked List :- These are one type of double linked list. In which the
rpt field of the last node of the list contain the address of the first node ad the left points
of the first node contains the address of the last node of the list instead of containing
null pointer.
UNIT-III
SORTING AND SEARCHING
PART-A
1. Define sorting
Sorting arranges the numerical and alphabetical data present in a list in a specific
order or sequence. There are a number of sorting techniques available. The algorithms can be
chosen based on the following factors
– Size of the data structure
– Algorithm efficiency
– Programmer’s knowledge of the technique.
order) element in the unsorted sublist, exchanging it with the leftmost unsorted element
(putting it in sorted order), and moving the sublist boundaries one element to the right.
Disadvantages
8. Define searching
Searching refers to determining whether an element is present in a given list of
elements or not. If the element is present, the search is considered as successful, otherwise it
is considered as an unsuccessful search. The choice of a searching technique is based on the
following factors
a. Order of elements in the list i.e., random or sorted
b. Size of the list
tiple hashing
15. Define separate chaining
It is an open hashing technique. A pointer field is added to each record location, when
an overflow occurs, this pointer is set to point to overflow blocks making a linked list.
In this method, the table can never overflow, since the linked lists are only extended upon the
arrival of new keys.
Second Iteration
Third Iteration
2
CD3291-DATA STRUCTURES AND ALGORITHMS
Fifth Iteration
[3, 2, 5, 6, 7, 8] - > [2, 3, 5, 6, 7, 8]
Check the each element and as we can see that our list is sorted using the bubble sort
technique.
Program
def bubble_sort(list1):
for i in range(0,len(list1)-1):
for j in range(len(list1)-1):
if(list1[j]>list1[j+1]):
temp = list1[j]
list1[j] = list1[j+1]
list1[j+1] = temp
return list1
list1 = [5, 3, 8, 6, 7, 2]
print("The unsorted list is: ", list1)
# Calling the bubble sort function
print("The sorted list is: ", bubble_sort(list1))
Output:
The unsorted list is: [5, 3, 8, 6, 7, 2]
The sorted list is: [2, 3, 5, 6, 7, 8]
Given a list of five elements, the following illustrate how the selection sort algorithm iterates
through the values when sorting them.
The following shows the unsorted list
Step 1)
3
CD3291-DATA STRUCTURES AND ALGORITHMS
The first value 21 is compared with the rest of the values to check if it is the minimum value.
3 is the minimum value, so the positions of 21 and 3 are swapped. The values with a green
background represent the sorted partition of the list.
Step 2)
The value 6 which is the first element in the unsorted partition is compared with the rest of
the values to find out if a lower value exists
4
CD3291-DATA STRUCTURES AND ALGORITHMS
The first element of the unsorted list with the value of 9 is compared with the rest of the
values to check if it is the minimum value.
The value 9 is the minimum value, so it maintains its position in the sorted partition.
Step 4)
The value 21 is lower than 33, so the positions are swapped to produce the above new list.
Step 5)
We only have one value left in the unpartitioned list. Therefore, it is already sorted.
def selection_sort(array):
length = len(array)
for i in range(length-1):
minIndex = i
for j in range(i+1, length):
if array[j]<array[minIndex]:
minIndex = j
5
CD3291-DATA STRUCTURES AND ALGORITHMS
3. What is searching? Explain the linear search and binary search with python program and
example.
Searching is the process of selecting particular information from a collection of
data based on specific criteria.
There are two types of searching.
1. Linear search
2. Binary search
if key in theArray :
print( "The key is in the array." )
else :
print( "The key is not in the array." )
To determine if value 31 is in the array, the search begins with the value in the first
element.
Since the first element does not contain the target value, the next element in
sequential order is compared to value 31.
6
CD3291-DATA STRUCTURES AND ALGORITHMS
This process is repeated until the item is found in the sixth position.
For example, suppose we want to search for value 8 in the sample array.
The search begins at the first entry as before, but this time every item in the array is
compared to the target value.
It cannot be determined that the value is not in the sequence until the entire array has
been traversed.
A loop is used to traverse through the sequence during which each element is
compared against the target value.
If the item is in the sequence, the loop is terminated and True is returned.
Otherwise, a full traversal is performed and False is returned after the loop terminates.
Assuming the sequence contains n items, the linear search has a worst case time of
O(n)
We have a sorted list of elements, and we are looking for the index position of 45.
So, we are setting two pointers in our list. One pointer is used to denote the smaller value
called low and the second pointer is used to denote the highest value called high.
7
CD3291-DATA STRUCTURES AND ALGORITHMS
mid = (low+high)/2
Here, the low is 0 and the high is 7.
mid = (0+7)/2
mid = 3 (Integer)
Now, we will compare the searched element to the mid index value. In this case, 32 is not
equal to 45. So we need to do further comparison to find the element.
If the number we are searching equal to the mid. Then return mid otherwise move to the
further comparison.
The number to be search is greater than the middle number, we compare the n with the
middle element of the elements on the right side of mid and set low to low = mid + 1.
Otherwise, compare the n with the middle element of the elements on the left side
of mid and set high to high = mid - 1.
The complexity of the binary search algorithm is O(1) for the best case.
The O(logn) is the worst and the average case complexity of the binary search.
8
CD3291-DATA STRUCTURES AND ALGORITHMS
9
CD3291-DATA STRUCTURES AND ALGORITHMS
A simple and efficient way for dealing with collisions is to have each bucket A[ j]
store its own secondary container, holding items (k,v) such that h(k) = j.
A natural choice for the secondary container is a small map instance implemented
using a list.This collision resolution rule is known as separate chaining.
10
CD3291-DATA STRUCTURES AND ALGORITHMS
In the worst case, operations on an individual bucket take time proportional to the size
of the bucket.
Assuming we use a good hash function to index the n items of our map in a bucket
array of capacity N, the expected size of a bucket is n/N.
Therefore, if given a good hash function, the core map operations run in O(n/N_).
The ratio λ = n/N, called the load factor of the hash table, should be bounded by a
small constant, preferably below 1. As long as λ is O(1), the core operations on the
hash table run in O(1) expected time.
This approach saves space because no auxiliary structures are employed, but it
requires a bit more complexity to deal with collisions.
There are several variants of this approach, collectively referred to as open
addressing schemes, which we discuss next.
Open addressing requires that the load factor is always at most 1 and that items are
stored directly in the cells of the bucket array itself.
5. The keys are stored in an array called a hash table and a hash function is associated with the
table.
• If our hash function is good, then we expect the entries to be uniformly distributed in
the N cells of the bucket array.
• Thus, to store n entries, the expected number of keys in a bucket would be n/N_,
which is O(1) if n is O(N).
• In the worst case, a poor hash function could map every item to the same bucket. The
basic command c = a + b involves two calls to getiting in the dictionary for the
local namespace to retrieve the values identified as a and b, and a call to setitem
to store the result associated with name c in that namespace.
11
CD3291-DATA STRUCTURES AND ALGORITHMS
6. Explain about Quick sort. Sort the following integer elements using Quick Sort 40,
20,70,14,60,61,97,30
• Divide:
• If S has a sequence of elements, select a specific element x from S, which is
called the pivot. As is common, choose the pivot x to be the last element in
S.
• Remove all the elements from S and put them into three sequences:
• L, storing the elements in S less than x
• E, storing the elements in S equal to x
G, storing the elements in S greater than x Of course, if the elements of S
are distinct, then E holds just one element— the pivot itself.
• Conquer:
• Recursively sort sequences L and G.
• Combine:
• Put back the elements into S in order by first inserting the elements of L,
then those of E, and finally those of G.
Working of Quicksort Algorithm:
a. A pointer is fixed at the pivot element. The pivot element is compared with the
elements beginning from the first index.
b. If the element is greater than the pivot element, a second pointer is set for that
element
Now, pivot is compared with other elements. If an element smaller than the pivot element is
12
CD3291-DATA STRUCTURES AND ALGORITHMS
reached, the smaller element is swapped with the greater element found earlier.
a. Again, the process is repeated to set the next greater element as the second pointer.
And, swap it with another smaller element.
b. The process goes on until the second last element is reached.
c. Finally, the pivot element is swapped with the second pointer.
13
CD3291-DATA STRUCTURES AND ALGORITHMS
UNIT-IV
TREE STRUCTURES
PART-A
1. Define a tree
A tree is a collection of nodes. The collection can be empty; otherwise, a tree
consists of a distinguished node r, called the root, and zero or more nonempty (sub) trees T1,
T2,…,Tk, each of whose roots are connected by a directed edge from r.
3. Define leaves
These are the terminal nodes of the tree. The nodes with degree 0 are always the
leaves.
14
CD3291-DATA STRUCTURES AND ALGORITHMS
A complete binary tree is a tree in which every non-leaf node has exactly two
children not necessarily to be on the same level.
15
CD3291-DATA STRUCTURES AND ALGORITHMS
• The left and right sub-trees of each node are again binary search trees
16. Traverse the given tree using Inorder, Preorder and Postorder traversals.
Inorder :
Inorder : D H B E A F C I G J
Preorder: A B D H E C F G I J
Postorder: H D E B F I J G C A
16
CD3291-DATA STRUCTURES AND ALGORITHMS
PART-B
1.What is Binary search tree. Write an algorithm to find an element from binary search tree .
A binary tree is an ordered tree with the following properties:
1. Every node has at most two children.
2. Each child node is labeled as being either a left child or a right child.
3. A left child precedes a right child in the order of children of a node.
The subtree rooted at a left or right child of an internal node v is called a left subtree
or right subtree, respectively, of v.
A binary tree is proper if each node has either zero or two children.
Some people also refer to such trees as being full binary trees. Thus, in a proper
binary tree, every internal node has exactly two children.
A binary tree that is not proper is improper.
17
CD3291-DATA STRUCTURES AND ALGORITHMS
18
CD3291-DATA STRUCTURES AND ALGORITHMS
19
CD3291-DATA STRUCTURES AND ALGORITHMS
Consider the binary tree in Figure . The dashed lines show the logical order the nodes
would be visited during the traversal: A, B, D, E, H, C, F, G, I, J. This traversal is
known as a preorder traversal since we first visit the node followed by the subtree
traversals.
In the algorithm the subtree argument will either be a null reference or a reference to
the root of a subtree in the binary tree.
If the reference is not None, the node is first visited and then the two subtrees are
traversed.
By convention, the left subtree is always visited before the right subtree.
The subtree argument will be a null reference when the binary tree is empty or we
attempt to follow a non-existent link for one or both of the children.
Inorder Traversal
Another traversal that can be performed is the inorder traversal, in which we first
traverse the left subtree and then visit the node followed by the traversal of the right
subtree.
20
CD3291-DATA STRUCTURES AND ALGORITHMS
Figure shows the logical ordering of the node visits in the example tree: D, B, H, E,
A, F, C, I, G, J
Postorder Traversal
We can also perform a postorder traversal, which can be viewed as the opposite of the
preorder traversal.
In a postorder traversal, the left and right subtrees of each node are traversed before
the node is visited.
The example tree with the logical ordering of the node visits in a postorder traversal is
shown in Figure .
The nodes are visited in this order: D, H, E, B, F, I, J, G, C, A.
You may notice that the root node is always visited first in a preorder traversal but last
in a postorder traversal.
21
CD3291-DATA STRUCTURES AND ALGORITHMS
ALgorithm:
With each node in an AVL tree, we associate a balance factor, which indicates the
height difference between the left and right branch.
The balance factor can be one of three states:
left high: When the left subtree is higher than the right subtree.
equal high: When the two subtrees have equal height.
right high: When the right subtree is higher than the left subtree.
The balance factors of the tree nodes in our illustrations are indicated by symbols: >
for a left high state, = for the equal high state, and < for a right high state.
When a node is out of balance, we will use either << or >> to indicate which subtree
is higher.
22
CD3291-DATA STRUCTURES AND ALGORITHMS
The search and traversal operations are the same with an AVL tree as with a binary
search tree.
The insertion and deletion operations have to be modified in order to maintain the
balance property of the tree as new keys are inserted and existing ones removed.
Insertions
Inserting a key into an AVL tree begins with the same process used with a binary
search tree.
We search for the new key in the tree and add a new node at the child link where we
fall of the tree.
When a new key is inserted into an AVL tree, the balance property of the tree must be
maintained. If the insertion of the new key causes any of the subtrees to become
unbalanced, they will have to be rebalanced.
For example, suppose we want to add key 120 to the sample AVL tree from Figure
14.14(a).
Following the insertion operation of the binary search tree, the new key will be
inserted as the right child of node 100, as illustrated in Figure 14.15(a). The tree
remains balanced since the insertion does not change the height of any subtree, but it
does cause a change in the balance factors.
After the key is inserted, the balance factors have to be adjusted in order to determine
if any subtree is out of balance.
Figure 14.15(b) shows the new balance factors after key 120 is added.
23
CD3291-DATA STRUCTURES AND ALGORITHMS
Assume we need to add key 28 to the AVL.The new node 28 is inserted as the left
child of node 30, as illustrated in Figure 14.16(a).
When the balance factors are recalculated, as in Figure 14.16(b), we can see all of the
subtrees along the path that are above node 30 are now out of balance, which violates
the AVL balance property.
For this example, we can correct the imbalance by rearranging the subtree rooted at
node 35, as illustrated in Figure 14.16(c).
Rotations
24
CD3291-DATA STRUCTURES AND ALGORITHMS
the pivot node. To rebalance the subtree, the pivot node has to be rotated right over its left
child. The rotation is accomplished by changing the links such that P becomes the right child
of C and the right child of C becomes the left child of P.
Case 2: This case involves three nodes: the pivot (P), the left child of the pivot (C), and the
right child (G) of C. For this case to occur, the balance factor of the pivot is left high before
the insertion and the new key is inserted into either the right subtree of C. This case, which is
illustrated in Figure 14.18, requires two rotations. Node C has to be rotated left over node V
and the pivot node has to be rotated right over its left child. The link modifications required
to accomplish this rotation include setting the right child of G as the new left child of the
pivot node, changing the left child of G to become the right child of C, and setting C to be the
new left child of G.
Cases 3 and 4: The third case is a mirror image of the first case and the fourth case is a
mirror image of the second case. The difference is the new key is inserted in the right subtree
of the pivot node or a descendant of its right subtree. The two cases are illustrated in Figure
14.19.
25
CD3291-DATA STRUCTURES AND ALGORITHMS
26
CD3291-DATA STRUCTURES AND ALGORITHMS
Heap-Order Property: In a heap T, for every position p other than the root, the key stored at
p is greater than or equal to the key stored at p’s parent. As a consequence of the heap-order
property, the keys encountered on a path from the root to a leaf of T are in nondecreasing
order. Also, a minimum key is always stored at the root of T.
Complete Binary Tree Property: A heap T with height h is a complete binary tree if levels
0,1,2, . . . ,h−1 of T have the maximum number of nodes possible (namely, level i has 2i
nodes, for 0 ≤ i ≤ h−1) and the remaining nodes at level h reside in the leftmost possible
positions at that level.
A max-heap has the property, known as the heap order property, that for each non-
leaf node V , the value in V is greater than the value of its two children.
The min-heap has the opposite property. For each non-leaf node V , the value in V is
smaller than the value of its two children.
Insertions
When a new value is inserted into a heap, the heap order property and the heap shape
property (a complete binary tree) must be maintained.
Suppose we want to add value 90 to the max-heap in Figure 13.21(b). If we are to
maintain the property of the max-heap, there are only two places in the tree where 90
can be inserted, as shown in Figure 13.22(a).
Contrast this to the possible locations if we were to add value 41 to the max-heap,
shown in Figure 13.22(b).
27
CD3291-DATA STRUCTURES AND ALGORITHMS
If we insert 90 into the heap, First, we create a new node and fil it with the new value
as shown in part (a).
The node is then attached as a leaf node at the only spot in the tree where the heap
shape property can be maintained (part (b)).
Remember, a heap is a complete tree and in such a tree, the leaf nodes on the lowest
level must be filled from left to right.
28
CD3291-DATA STRUCTURES AND ALGORITHMS
Extractions(Removing)
When a value is extracted and removed from the heap, it can only come from the root
node.
Thus, in a max-heap, we always extract the largest value and in a min-heap, we
always extract the smallest value.
After the value in the root has been removed, the binary tree is no longer a heap since
there is now a gap in the root node, as illustrated in Figure 13.25.
General trees can be used as multiway search trees.Map items stored in a search tree
are pairs of the form (k,v), where k is the key and v is the value associated with the
key.
Let w be a node of an ordered tree. We say that w is a d-node if w has d children.
We define a multiway search tree to be an ordered tree T that has the following
properties, which are illustrated in Figure
29
11.23a:
• Each internal node of T has at least two children. That is, each internal node is a d-node
CD3291-DATA STRUCTURES AND ALGORITHMS
such that d ≥ 2.
• Each internal d-node w of T with children c1, . . . ,cd stores an ordered set of d−1 key-value
pairs (k1,v1), . . ., (kd−1,vd−1), where k1 ≤ · · · ≤ kd−1.
• Let us conventionally define k0 = −∞ and kd =+∞. For each item (k,v) stored at a node in
the subtree of w rooted at ci, i = 1, . . . ,d, we have that ki−1 ≤ k ≤ ki.
30
CD3291-DATA STRUCTURES AND ALGORITHMS
31
CD3291-DATA STRUCTURES AND ALGORITHMS
Otherwise, we continue the search in the child ci of w such that ki−1 < k < ki.
If we reach an external node, then we know that there is no item with key k in T,
andthe search terminates unsuccessfully.
UNIT-IV
GRAPH STRUCTURES
PART-A
1. Define Graph.
A graph G consist of a nonempty set V which is a set of nodes of the graph, a set E
which is the set of edges of the graph, and a mapping from the set for edge E to a set of pairs
of elements of V. It can also be represented as G=(V, E).
1. Kruskal’salgorithm
2. Prim’s algorithm
32
CD3291-DATA STRUCTURES AND ALGORITHMS
7. Define biconnectivity.
A connected graph G is said to be biconnected, if it remains connected after removal
of any one vertex and the edges that are incident upon that vertex. A connected graph is
biconnected, if it has no articulation points.
9. What is DAG?
A DAG is a graph that flows in one direction, where no element can be a child of
itself. So most of us are familiar with LinkedLists, trees, and even graphs. A DAG is very
similar to the first two, and an implementation of the third.
A DAG will have 4 things:
1. Nodes: A place to store the data.
2. Directed Edges: Arrows that point in one direction (the thing that makes this data
structure different)
3. Some great ancestral node with no parents. (Fun fact: Most ancestry trees are actually
DAGs and not actually trees because cousins at some point get married to each other.)
4. Leaves: Nodes with no children
In a directed graph, for any node v, the number of edges, which have v as their initial
node, is called the out degree of the node v. Outdegree: Number of edges having the node v
as root node is the outdegree of the node v.
33
13. Name the different ways of representing
CD3291-DATA a graph?
STRUCTURES AND ALGORITHMS
a. Adjacency matrix
b. Adjacency list
PART-B
1. Explain the various representation of graph with example in detail?
Graph Representations
1. Adjacency Matrix
2. Adjacency List
3. Adjacency Multilists
1.Adjacency Matrix
In this representation, graph can be represented using a matrix of size total number of vertices by total number of
vertices; means if a graph with 4 vertices can be represented using a matrix of 4X4 size.
This matrix is filled with either 1 or 0. Here, 1 represents there is an edge from row vertex to column vertex and 0
represents there is no edge from row vertex to column vertex.
Adjacency Matrix : let G = (V, E) with n vertices, n 1. The adjacency matrix of G is a 2-dimensional n n
matrix, A, A(i, j) = 1 iff (vi, vj) E(G) (vi, vj for a diagraph), A(i, j) = 0 otherwise.
35
CD3291-DATA STRUCTURES AND ALGORITHMS
The adjacency matrix for an undirected graph is symmetric; the adjacency matrix for a digraph need not be
symmetric.
For a digraph, the row sum is the out_degree, while the column sum is the in_degree
n1 n1
ind (vi) A[ j, i]
j 0
outd(vi) A[i, j]
j 0
The space needed to represent a graph using adjacency matrix is n2 bits. To identify the edges in a graph,
adjacency matrices will require at least O(n2) time.
2. Adjacency List
In this representation, every vertex of graph contains list of its adjacent vertices. The n rows of the adjacency
matrix are represented as n chains. The nodes in chain I represent the vertices that are adjacent to vertex i.
It can be represented in two forms. In one form, array is used to store n vertices and chain is used to store its
adjacencies. Example:
So that we can access the adjacency list for any vertex in O(1) time. Adjlist[i] is a pointer to to first node in the
adjacency list for vertex i. Structure is
#define MAX_VERTICES 50
typedef struct node *node_pointer;
typedef struct node {
int vertex;
struct node *link;
};
node_pointer graph[MAX_VERTICES];
int n=0; /* vertices currently in use */
example: consider the following directed graph representation implemented using linked list
This representation can also be implemented
CD3291-DATA using array
STRUCTURES AND ALGORITHMS
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
9 11 13 15 17 18 20 22 23 2 1 3 0 0 3 1 2 5 6 4 5 7 6
Graph
Instead of chains, we can use sequential representation into an integer array with size n+2e+1. For 0<=i<n,
Array[i] gives starting point of the list for vertex I, and array[n] is set to n+2e+1. The adjacent vertices of node I
are stored sequentially from array[i].
For an undirected graph with n vertices and e edges, linked adjacency list requires an array of size n and 2e chain
nodes. For a directed graph, the number of list nodes is only e. the out degree of any vertex may be determined by
counting the number of nodes in its adjacency list. To find in-degree of vertex v, we have to traverse complete
list.
3.Adjacency Multilists
In the adjacency-list representation of an undirected graph each edge (u, v) is represented by two entries one on
the list for u and the other on tht list for v. As we shall see in some situations it is necessary to be able to determin
ie ~ nd enty for a particular edge and mark that edg as having been examined. This can be accomplished easily
if the adjacency lists are actually maintained as multilists (i.e., lists in which nodes may be shared among several
lists). For each edge there will be exactly one node but 37 this node will be in two lists (i.e. the adjacency lists for
each of the two nodes to which it is incident).
For adjacency multilists, node structure is
typedef struct edge *edge_pointer;
typedef struct edge {
short int marked;
int vertex1, vertex2;
edge_pointer path1, path2;
};
edge_pointer graph[MAX_VERTICES];
CD3291-DATA STRUCTURES AND ALGORITHMS
4. Weighted edges
In many applications the edges of a graph have weights assigned to them. These weights may represent the
distance from one vertex to another or the cost of going from one; vertex to an adjacent vertex In these
applications the adjacency matrix entries A [i][j] would keep this information too. When adjacency lists are used
the weight information may be kept in the list’nodes by including an additional field weight. A graph with
weighted edges is called a network.
1. Vertex : An individual data element of a graph is called as Vertex. Vertex is also known as node. In above
example graph, A, B, C, D & E are known as vertices.
2. Edge : An edge is a connecting link between two vertices. Edge is also known as Arc. An edge is represented as
(starting Vertex, ending Vertex).
1. Undirected Edge - An undirected edge is a bidirectional edge. If there is an undirected edge between vertices A
and B then edge (A , B) is equal to edge (B , A).
2. Directed Edge - A directed edge is a unidirectional edge. If there is a directed edge between vertices A and B
then edge (A , B) is not equal to edge (B , A).
1
3. Weighted Edge - A weighted edge is anSTRUCTURES
CD3291-DATA edge with cost on it.AND ALGORITHMS
Types of Graphs
1.Undirected
Graph
3. Directed Graph
4. Complete Graph
A graph in which any V node is adjacent to all other nodes present in the graph is known as a complete graph. An
undirected graph contains the edges that are equal to edges = n(n-1)/2 where n is the number of vertices present in
the graph. The following figure shows a complete graph.
5. Regular Graph
Regular graph is the graph in which nodes are adjacent to each other, i.e., each node is accessible from any other
node.
6. Cycle Graph
A graph having cycle is called cycle graph. In this case the first and last nodes are the same. A closed simple path
is a cycle.
2
CD3291-DATA STRUCTURES AND ALGORITHMS
7. Acyclic Graph
8. Weighted Graph
A graph is said to be weighted if there are some non negative value assigned to each edges of the graph. The
value is equal to the length between two vertices. Weighted graph is also called a network.
Outgoing Edge
Incoming Edge
Degree
Indegree
Total number of incoming edges connected to a vertex is said to be indegree of that vertex.
Outdegree
Total number of outgoing edges connected to a vertex is said to be outdegree of that vertex.
If there are two undirected edges to have the same end vertices, and for two directed edges to have the same
origin and the same destination. Such edges are called parallel edges or multiple edges.
Self-loop
Simple Graph
In an undirected graph the edge between v1 and v2 is incident on node v1 and v2.
Walk
A walk is defined as a finite alternating sequence of vertices and edges, beginning and ending with vertices, such
that each edge is incident with the vertices preceding and following it.
Closed walk
A walk which is to begin and end at the same vertex is called close walk. Otherwise it is an open walk.
If e1,e2,e3,and e4 be the edges of pair of vertices (v1,v2),(v2,v4),(v4,v3) and (v3,v1) respectively ,then v1 e1 v2
e2 v4 e3 v3 e4 v1 be its closed walk or circuit.
Path
A open walk in which no vertex appears more than once is called a path.
If e1 and e2 be the two edges between the pair of vertices (v1,v3) and (v1,v2) respectively, then v3 e1 v1 e2 v2 be
its path.
Length of a path
The number of edges in a path is called the length of that path. In the following, the length of the path is 3.
Circuit
A closed walk in which no vertex (except the initial and the final vertex) appears more than once is called a
circuit.
A circuit having three vertices and three edges.
4
CD3291-DATA STRUCTURES AND ALGORITHMS
Sub Graph
A graph S is said to be a sub graph of a graph G if all the vertices and all the edges of S are in G, and each edge of
S has the same end vertices in S as in G. A subgraph of G is a graph G’ such that V(G’) V(G) and E(G’)
E(G)
Connected Graph
A graph G is said to be connected if there is at least one path between every pair of vertices in G. Otherwise,G is
disconnected.
This graph is disconnected because the vertex v1 is not connected with the other vertices of the graph.
Degree
In an undirected graph, the number of edges connected to a node is called the degree of that node or the degree of
a node is the number of edges incident on it.
In the above graph, degree of vertex v1 is 1, degree of vertex v2 is 3, degree of v3 and v4 is 2 in a connected
graph.
Indegree
The indegree of a node is the number of edges connecting to that node or in other words edges incident to it.
In the above graph,the indegree of vertices v1, v3 is 2, indegree of vertices v2, v5 is 1 and indegree of v4 is zero.
5
Outdegree CD3291-DATA STRUCTURES AND ALGORITHMS
The outdegree of a node (or vertex) is the number of edges going outside from that node or in other words the
ADT of Graph:
Structure Graph is
objects: a nonempty set of vertices and a set of undirected edges, where each edge is a pair of vertices
Graph InsertEdge(graph, v1,v2)::= return a graph with new edge between v1 and v2
Graph DeleteVertex(graph, v)::= return a graph in which v and all edges incident to it are removed
Graph DeleteEdge(graph, v1, v2)::=return a graph in which the edge (v1, v2) is removed
Depth-First Search
We begin by visiting the start vertex v. Next an unvisited vertex w adjacent to v is selected, and a depth-first
search from w is initiated. When a vertex u is reached such that all its adjacent vertices have been visited, we back
up to the last vertex visited that has an unvisited vertex w adjacent to it and initiate a depth-first search from w.
The search terminates when no unvisited vertex can be reached from any of the visited vertices.
DFS traversal of a graph, produces a spanning tree as final result. Spanning Tree is a graph without any loops.
We use Stack data structure with maximum size of total number of vertices in the graph to implement DFS
6
traversal of a graph.CD3291-DATA STRUCTURES AND ALGORITHMS
Step 2: Select any vertex as starting point for traversal. Visit that vertex and push it on to the Stack.
Step 3: Visit any one of the adjacent vertex of the verex which is at top of the stack which is not visited and push
it on to the stack.
Step 4: Repeat step 3 until there are no new vertex to be visit from the vertex on top of the stack.
Step 5: When there is no new vertex to be visit then use back tracking and pop one vertex from the stack.
Step 7: When stack becomes Empty, then produce final spanning tree by removing unused edges from the graph
#define FALSE 0
#define TRUE 1
int visited[MAX_VERTICES];
void dfs(int v)
{
node_pointer w;
visited[v]= TRUE;
printf(“%d”, v);
for (w=graph[v]; w; w=w->link)
if (!visited[w->vertex])
dfs(w->vertex);
}
Consider the graph G of Figure 6.16(a), which is represented by its adjacency lists as in Figure 6.16(b). If a depth-
first search is initiated from vertex 0 then the vertices of G are visited in the following order: 0 1 3 7 4 5 2 6.
Since DFS(O) visits all vertices that can be reached from 0 the vertices visited, together with all edges in
G incident to these vertices form a connected component of G.
Figure: Graph and its adjacency list representation, DFS spanning tree
Analysis or DFS:
When G is represented by its adjacency lists, the vertices w adjacent to v can be determined by following a chain
of links. Since DFS examines each node in the adjacency lists at most once and there are 2e list nodes the time to
complete the search is O(e). If G is represented by its adjacency matrix then the time to determine all
vertices adjacent to v is O(n). Since at most n vertices are visited the total time is O(n2).
Breadth-First Search
In a breadth-first search, we begin by visiting the start vertex v. Next all unvisited vertices adjacent to v are
visited. Unvisited vertices adjacent to these newly visited vertices are then visited and so on. Algorithm BFS
(Program 6.2) gives the details.
typedef struct queue *queue_pointer;
typedef struct queue {
int vertex;
7
queue_pointer link;
CD3291-DATA STRUCTURES AND ALGORITHMS
};
void addq(queue_pointer *,
queue_pointer *, int);
int deleteq(queue_pointer *);
void bfs(int v)
{
node_pointer w;
queue_pointer front, rear;
front = rear = NULL;
printf(“%d”, v);
visited[v] = TRUE;
addq(&front, &rear, v);
while (front) {
v= deleteq(&front);
for (w=graph[v]; w; w=w->link)
if (!visited[w->vertex]) {
printf(“%d”, w->vertex);
addq(&front, &rear, w->vertex);
visited[w->vertex] = TRUE;
}
}
}
Steps:
BFS traversal of a graph, produces a spanning tree as final result. Spanning Tree is a graph without any loops. We
use Queue data structure with maximum size of total number of vertices in the graph to implement BFS traversal
of a graph.
9
CD3291-DATA STRUCTURES AND ALGORITHMS
10
CD3291-DATA STRUCTURES AND ALGORITHMS
Every graph can have more than one topological sorting possible. It depends on the in-degree of the
node in the graph. Also, the topological sorting of the graph starts with the node that has in-degree as
0 i.e a node with no incoming edges. Let us learn an example for a clear understanding.
Example
Consider the following directed acyclic graph with their in-degree mentioned.
Identifying vertices that have no incoming edge. Here, nodes A and F have no incoming edges.
We will choose node A as the source node and deletes this node with all its outgoing edges and put it
in the result array.
Now, update the in-degree of the adjacent nodes of the source node after deleting the outgoing
edges of node A
Now again delete the node with in-degree 0 and its outgoing edges and insert it in the result array.
CD3291-DATA STRUCTURES AND ALGORITHMS
Later update the in-degree of all its adjacent nodes.
In the second step, if we have chosen the source node as F then the topological sort of the graph will
be F, A, B, C, D, E. Therefore, there is more than one topological sort possible for every directed
acyclic graph.
Algorithm
The algorithm of the topological sort goes like this:
1. Identify the node that has no in-degree(no incoming edges) and select that node as the
source node of the graph
2. Delete the source node with zero in-degree and also delete all its outgoing edges from the
graph. Insert the deleted vertex in the result array.
3. Update the in-degree of the adjacent nodes after deleting the outgoing edges
4. Repeat step 1 to step 3 until the graph is empty.