Unit 3- Searching, Sorting and Graphs
Unit-3
Searching: Linear Search- Binary Search- Comparison of Linear and Binary Search.
Sorting: Insertion Sort- Selection Sort- Bubble sort - Quick Sort- Merge Sort.
Graphs: Introduction, Graph representations, Graph traversals DFS- BFS, Graph applications.
3.1 Searching algorithms are essential tools in computer science used to locate specific
items within a collection of data. These algorithms are designed to efficiently navigate
through data structures to find the desired information, making them fundamental in various
applications such as databases, web search engines, and more. Below are some searching
algorithms:
• Linear Search
• Binary Search
• Ternary Search
• Jump Search
• Interpolation Search
• Fibonacci Search
• Exponential Search
3.1.1 Linear Search Algorithm: Linear Search, also known as Sequential Search, is one of
the simplest and most straightforward searching algorithms. It works by sequentially
examining each element in a collection of data (array or list) until a match is found or the
entire collection has been traversed.
In Linear Search, we iterate over all the elements of the array and check if it the current
element is equal to the target element. If we find any element to be equal to the target element,
then return the index of the current element. Otherwise, if no element is equal to the target
element, then return -1 as the element is not found.
Algorithm:
int search(int arr[], int N, int x)
{
for (int i = 0; i < N; i++)
if (arr[i] == x)
return i;
return -1;
}
Applications of Linear Search Algorithm:
• Unsorted Lists: When we have an unsorted array or list, linear search is most commonly
used to find any element in the collection.
• Small Data Sets: Linear Search is preferred over binary search when we have small data
sets with
• Searching Linked Lists: In linked list implementations, linear search is commonly used to
find elements within the list. Each node is checked sequentially until the desired element
is found.
• Simple Implementation: Linear Search is much easier to understand and implement as
compared to Binary Search or Ternary Search.
Dr. Thontadari C., School of CSA, REVA University Page | 1
Unit 3- Searching, Sorting and Graphs
3.1.2 Binary Search Algorithm is a searching algorithm used in a sorted array by repeatedly
dividing the search interval in half. Binary search is a search algorithm used to find the
position of a target value within a sorted array.
It works by repeatedly dividing the search interval in half until the target value is found or
the interval is empty. The search interval is halved by comparing the target element with the
middle value of the search space.
Steps:
• Divide the search space into two halves by finding the middle index “mid”.
• mid = (low + high ) / 2;
• Compare the middle element of the search space with the key.
• If the key is found at middle element, the process is terminated.
• If the key is not found at middle element, choose which half will be used as the next
search space.
o If the key is smaller than the middle element, then the left side is used for next
search.
o If the key is larger than the middle element, then the right side is used for next
search.
• This process is continued until the key is found or the total search space is exhausted.
The idea of binary search is to use the information that the array is sorted and reduce the time
complexity to O(log N).
Algorithm:
int binarySearch(int arr[], int low, int high, int x)
{
while (low <= high) {
int mid = (low + high ) / 2;
if (arr[mid] == x)
return mid;
if (arr[mid] < x)
low = mid + 1;
else
high = mid - 1;
}
return -1;
}
Applications of Binary Search:
• Searching in sorted arrays: Binary search is used to efficiently find an element in a sorted
array.
• Database queries: Binary search can be used to quickly locate records in a database table
that is sorted by a specific key.
• Finding the closest match: Binary search can be used to find the closest value to a target
value in a sorted list.
• Interpolation search: Binary search can be used as a starting point for interpolation search,
which is an even faster search algorithm.
Dr. Thontadari C., School of CSA, REVA University Page | 2
Unit 3- Searching, Sorting and Graphs
3.1.3 Differences between Linear search and Binary search
Linear Search Binary Search
The linear search starts searching from the first It finds the position of the searched element
element and compares each element with a key by finding the middle element of the array.
element till the element is not found.
In linear search input data need not to be in In binary search input data need to be in
sorted. sorted order.
It is based on the sequential approach. It is based on the divide and conquer
approach.
The time complexity of linear search O(n). The time complexity of binary
search O(log n).
It is preferrable for the small-sized data sets. It is preferrable for the large-size data sets.
Multidimensional array can be used. Only single dimensional array is used.
Linear search performs equality comparisons Binary search performs ordering
comparisons
It is less efficient in the case of large-size data It is more efficient in the case of large-size
sets. data sets.
It is very slow process. It is very fast process.
3.2 Sorting: Sorting is the process of arranging the elements of an array so that they can be
placed either in ascending or descending order. Below are some sorting algorithms:
• Selection Sort
• Bubble Sort
• Insertion Sort
• Merge Sort
• Quick Sort
• Heap Sort
• Counting Sort
• Radix Sort
• Bucket Sort
3.2.1 Insertion Sort Algorithm:
Insertion sort works similar to the sorting of playing cards in hands. It is assumed that the first
card is already sorted in the card game, and then we select an unsorted card. If the selected
unsorted card is greater than the first card, it will be placed at the right side; otherwise, it will
be placed at the left side. Similarly, all unsorted cards are taken and put in their exact place.
The same approach is applied in insertion sort. The idea behind the insertion sort is that first
take one element, iterate it through the sorted array. Although it is simple to use, it is not
appropriate for large data sets as the time complexity of insertion sort in the average case and
worst case is O(n2), where n is the number of items.
Algorithm:
void insert(int a[], int n)
{
int i, j, temp;
for (i = 1; i < n; i++)
{
temp = a[i];
j = i - 1;
Dr. Thontadari C., School of CSA, REVA University Page | 3
Unit 3- Searching, Sorting and Graphs
while(j>=0 && temp <= a[j])
{
a[j+1] = a[j];
j = j-1;
}
a[j+1] = temp;
}
}
3.2.2 Selection Sort Algorithm
In selection sort, the smallest value among the unsorted elements of the array is selected in
every pass and inserted to its appropriate position into the array. It is also the simplest
algorithm. It is an in-place comparison sorting algorithm. In this algorithm, the array is divided
into two parts, first is sorted part, and another one is the unsorted part. Initially, the sorted part
of the array is empty, and unsorted part is the given array. Sorted part is placed at the left, while
the unsorted part is placed at the right.
In selection sort, the first smallest element is selected from the unsorted array and placed at the
first position. After that second smallest element is selected and placed in the second position.
The process continues until the array is entirely sorted.
Algorithm:
void selection(int arr[], int n)
{
int i, j, small;
for (i = 0; i < n-1; i++)
{
small = i;
for (j = i+1; j < n; j++)
if (arr[j] < arr[small])
small = j;
int temp = arr[small];
arr[small] = arr[i];
arr[i] = temp;
}
}
3.2.3 Bubble sort Algorithm
Bubble sort works on the repeatedly swapping of adjacent elements until they are not in the
intended order. It is called bubble sort because the movement of array elements is just like the
movement of air bubbles in the water. Bubbles in water rise up to the surface; similarly, the
array elements in bubble sort move to the end in each iteration.
Although it is simple to use, it is primarily used as an educational tool because the performance
of bubble sort is poor in the real world. It is not suitable for large data sets.
Algorithm:
void bubble(int a[], int n)
{
int i, j, temp;
for(i = 0; i < n; i++)
Dr. Thontadari C., School of CSA, REVA University Page | 4
Unit 3- Searching, Sorting and Graphs
{
for(j = i+1; j < n; j++)
{
if(a[j] < a[i])
{
temp = a[i];
a[i] = a[j];
a[j] = temp;
}
}
}
}
3.2.4 Quick Sort Algorithm
Quicksort is the widely used sorting algorithm that makes n log n comparisons in average case
for sorting an array of n elements. It is a faster and highly efficient sorting algorithm. This
algorithm follows the divide and conquer approach.
Divide: In Divide, first pick a pivot element. After that, partition or rearrange the array into
two sub-arrays such that each element in the left sub-array is less than or equal to the pivot
element and each element in the right sub-array is larger than the pivot element.
Conquer: Recursively, sort two subarrays with Quicksort.
Combine: Combine the already sorted array.
Quicksort picks an element as pivot, and then it partitions the given array around the picked
pivot element. In quick sort, a large array is divided into two arrays in which one holds values
that are smaller than the specified value (Pivot), and another array holds the values that are
greater than the pivot.
After that, left and right sub-arrays are also partitioned using the same approach. It will
continue until the single element remains in the sub-array.
Algorithm:
QUICKSORT (array A, start, end)
{
if (start < end)
{
p = partition(A, start, end)
QUICKSORT (A, start, p - 1)
QUICKSORT (A, p + 1, end)
}
}
PARTITION (array A, start, end)
{
pivot =A[end]
Dr. Thontadari C., School of CSA, REVA University Page | 5
Unit 3- Searching, Sorting and Graphs
i = start-1
for j = start to end -1
{
do if (A[j] < pivot)
{
then i = i + 1
swap A[i] with A[j]
}
}
swap A[i+1] with A[end]
return i+1
}
3.2.5 Merge Sort Algorithm
Merge sort is a divide-and-conquer algorithm based on the principle of breaking down a list
into numerous sub-lists until each sub-list has only one element, then merging those sub-lists
into a sorted list.
The characteristics of the merge sort algorithm are as follows –
• Subdivide the unsorted list into N sublists, each containing an element.
• Combine adjacent pairings of two singleton lists to generate a list with two elements.
N will now be converted into N/2 two-size lists.
• Repeat the process until you have a single sorted list.
Algorithm:
MERGE_SORT(arr, beg, end)
if beg < end
set mid = (beg + end)/2
MERGE_SORT(arr, beg, mid)
MERGE_SORT(arr, mid + 1, end)
MERGE (arr, beg, mid, end)
end of if
END MERGE_SORT
Dr. Thontadari C., School of CSA, REVA University Page | 6
Unit 3- Searching, Sorting and Graphs
Time Complexity:
Time Complexity is defined as the number of times a particular instruction set is executed
rather than the total time taken. It is because the total time taken also depends on some external
factors like the compiler used, the processor’s speed, etc.
Space Complexity:
Space Complexity is the total memory space required by the program for its execution.
Both are calculated as the function of input size(n). One important thing here is that despite
these parameters, the efficiency of an algorithm also depends upon the nature and size of the
input.
3.3 Non Linear data structure- Graph
In a non-linear data structure, data is connected to its previous, next, and more elements like a
complex structure. In simple terms, data is not organized sequentially in such a type of data
structure. In Non-linear data structure the data elements connect to each other hierarchically.
Ex; Tree and Graph
i. Tree: A tree data structure is an example of a nonlinear data structure. It is a collection of
nodes where each node consists of the data element. Every tree structure contains a single
root node.
• In the tree, every child node is connected to the parent node. Every parent and child
node has a parent-child node relationship.
• In the tree, Nodes remaining at the last level are called leaf nodes, and other nodes are
called internal nodes.
• Types of trees: Binary tree, Binary search tree, AVL tree and Red-black tree
• The Heap data structure is also a non-linear tree-based data structure, and it uses the
complete binary tree to store the data.
Dr. Thontadari C., School of CSA, REVA University Page | 7
Unit 3- Searching, Sorting and Graphs
ii. Graph: The graph contains vertices and edges. Every vertex of the graph stores the
data element. Edges are used to build a vertices relationship. Social networks are
real-life examples of graph data structure.
3.3.1 Graph
Graph is a non-linear data structure consisting of vertices and edges. A graph G can be defined
as an ordered set G(V, E) where V(G) represents the set of vertices and E(G) represents the set
of edges which are used to connect these vertices.
A Graph G(V, E) with 5 vertices (A, B, C, D, E) and six edges ((A,B), (B,C), (C,E), (E,D),
(D,B), (D,A)) is shown in the following figure.
• Undirected Graph: In an undirected graph, edges are not associated with the directions
with them. An undirected graph is shown in the above figure since its edges are not
attached with any of the directions. If an edge exists between vertex A and B then the
vertices can be traversed from B to A as well as A to B.
• Directed graph: In a directed graph, edges form an ordered pair. Edges represent a
specific path from some vertex A to another vertex B. Node A is called initial node
while node B is called terminal node. A directed graph is shown in the following figure.
3.3.2 Graph Terminology
• Path: A path can be defined as the sequence of nodes that are followed in order to reach
some terminal node V from the initial node U.
• Closed Path: A path will be called as closed path if the initial node is same as terminal
node. A path will be closed path if V0=VN.
• Simple Path: If all the nodes of the graph are distinct with an exception V0=VN, then such
path P is called as closed simple path.
• Cycle: A cycle can be defined as the path which has no repeated edges or vertices except
the first and last vertices.
Dr. Thontadari C., School of CSA, REVA University Page | 8
Unit 3- Searching, Sorting and Graphs
• Connected Graph: A connected graph is the one in which some path exists between every
two vertices (u, v) in V. There are no isolated nodes in connected graph.
• Complete Graph: A complete graph is the one in which every node is connected with all
other nodes. A complete graph contain n(n-1)/2 edges where n is the number of nodes in
the graph.
• Weighted Graph: In a weighted graph, each edge is assigned with some data such as length
or weight. The weight of an edge e can be given as w(e) which must be a positive (+) value
indicating the cost of traversing the edge.
3.3.3 Basic Operations on Graphs: Below are the basic operations on the graph:
• Insertion of Nodes/Edges in the graph – Insert a node into the graph.
• Deletion of Nodes/Edges in the graph – Delete a node from the graph.
• Searching on Graphs – Search an entity in the graph.
• Traversal of Graphs – Traversing all the nodes in the graph.
3.3.4 Applications of Graph: Following are the real-life applications:
• Graph data structures can be used to represent the interactions between players on a
team, such as passes, shots, and tackles.
• Commonly used to represent social networks, such as networks of friends on social
media.
• Graphs can be used to represent the topology of computer networks, such as the
connections between routers and switches.
• Graphs are used to represent the connections between different places in a
transportation network, such as roads and airports.
• Graphs are used in Neural Networks where vertices represent neurons and edges
represent the synapses between them.
3.3.5 Advantages of graphs:
• Graphs can be used to model and analyze complex systems and relationships.
• They are useful for visualizing and understanding data.
• Graph algorithms are widely used in computer science and other fields, such as social
network analysis, logistics, and transportation.
• Graphs can be used to represent a wide range of data types, including social networks,
road networks, and the internet.
3.3.6 Disadvantages of graphs:
• Large graphs can be difficult to visualize and analyze.
• Graph algorithms can be computationally expensive, especially for large graphs.
• The interpretation of graph results can be subjective and may require domain-specific
knowledge.
• Graphs can be susceptible to noise and outliers, which can impact the accuracy of
analysis results.
Dr. Thontadari C., School of CSA, REVA University Page | 9
Unit 3- Searching, Sorting and Graphs
3.3.7 Representations of Graph: The two most common ways to represent a graph :
i. Adjacency Matrix
ii. Adjacency List
i. Adjacency Matrix
An adjacency matrix is a way of representing a graph as a matrix of boolean (0’s and 1’s). Let’s
assume there are n vertices in the graph So, create a 2D matrix adjMat[n][n] having
dimension n x n.
• If there is an edge from vertex i to j, mark adjMat[i][j] as 1.
• If there is no edge from vertex i to j, mark adjMat[i][j] as 0.
Representation of Undirected Graph to Adjacency Matrix: The below figure shows an
undirected graph.
• Initially, the entire Matrix is initialized to 0.
• If there is an edge from source to destination, we insert 1 to both cases
(adjMat[destination] and adjMat[destination]) because we can go either way
Representation of Directed Graph to Adjacency Matrix: The below figure shows a
directed graph.
• Initially, the entire Matrix is initialized to 0.
• If there is an edge from source to destination, we insert 1 for that
particular adjMat[destination].
ii. Adjacency List: An array of Lists is used to store edges between two vertices. The size of
array is equal to the number of vertices (i.e, n). Each index in this array represents a specific
vertex in the graph. The entry at the index i of the array contains a linked list containing the
vertices that are adjacent to vertex i.
Let’s assume there are n vertices in the graph So, create an array of list of size n as adjList[n].
• adjList[0] will have all the nodes which are connected (neighbour) to vertex 0.
• adjList[1] will have all the nodes which are connected (neighbour) to vertex 1 and so
on.
Representation of Undirected Graph to Adjacency list:
• The below undirected graph has 3 vertices. So, an array of list will be created of size 3,
where each indices represent the vertices.
• Now, vertex 0 has two neighbours (i.e, 1 and 2). So, insert vertex 1 and 2 at indices 0
of array. Similarly, For vertex 1, it has two neighbour (i.e, 2 and 0) So, insert vertices 2
and 0 at indices 1 of array. Similarly, for vertex 2, insert its neighbours in array of list.
Dr. Thontadari C., School of CSA, REVA University Page | 10
Unit 3- Searching, Sorting and Graphs
Representation of Directed Graph to Adjacency list:
• The below directed graph has 3 vertices. So, an array of list will be created of size 3,
where each indices represent the vertices. Now, vertex 0 has no neighbours. For
vertex 1, it has two neighbour (i.e, 0 and 2) So, insert vertices 0 and 2 at indices 1 of
array. Similarly, for vertex 2, insert its neighbours in array of list.
3.3.8 Graph Traversal
To traverse a Graph means to start in one vertex, and go along the edges to visit other vertices
until all vertices, or as many as possible, have been visited. The two most common ways a
Graph can be traversed are:
i. Depth First Search (DFS)
ii. Breadth First Search (BFS)
i. Breadth First Search (BFS) is a fundamental graph traversal algorithm. It involves visiting
all the connected nodes of a graph in a level-by-level manner. Breadth First Search (BFS) is a
graph traversal algorithm that explores all the vertices in a graph at the current depth before
moving on to the vertices at the next depth level.
It starts at a specified vertex and visits all its neighbors before moving on to the next level of
neighbors. BFS is commonly used in algorithms for pathfinding, connected components, and
shortest path problems in graphs.
BFS puts every vertex of the graph into two categories - visited and non-visited. It selects a
single node in a graph and, after that, visits all the nodes adjacent to the selected node.
Algorithm:
• Initialization: Enqueue the starting node into a queue and mark it as visited.
• Exploration: While the queue is not empty:
o Dequeue a node from the queue and visit it (e.g., print its value).
o For each unvisited neighbor of the dequeued node:
• Enqueue the neighbor into the queue.
• Mark the neighbor as visited.
• Termination: Repeat step 2 until the queue is empty.
Let us understand the working of the algorithm with the help of the following example.
Dr. Thontadari C., School of CSA, REVA University Page | 11
Unit 3- Searching, Sorting and Graphs
Step 1 - First, add A to queue1 and NULL to queue2.
• QUEUE1 = {A}
• QUEUE2 = {NULL}
Step 2 - Now, delete node A from queue1 and add it into queue2. Insert all neighbors of node
A to queue1.
• QUEUE1 = {B, D}
• QUEUE2 = {A}
Step 3 - Now, delete node B from queue1 and add it into queue2. Insert all neighbors of node
B to queue1.
• QUEUE1 = {D, C, F}
• QUEUE2 = {A, B}
Step 4 - Now, delete node D from queue1 and add it into queue2. Insert all neighbors of node
D to queue1. The only neighbor of Node D is F since it is already inserted, so it will not be
inserted again.
• QUEUE1 = {C, F}
• QUEUE2 = {A, B, D}
Step 5 - Delete node C from queue1 and add it into queue2. Insert all neighbors of node C to
queue1.
• QUEUE1 = {F, E, G}
• QUEUE2 = {A, B, D, C}
Step 5 - Delete node F from queue1 and add it into queue2. Insert all neighbors of node F to
queue1. Since all the neighbors of node F are already present, we will not insert them again.
• QUEUE1 = {E, G}
• QUEUE2 = {A, B, D, C, F}
Step 6 - Delete node E from queue1. Since all of its neighbors have already been added, so
we will not insert them again. Now, all the nodes are visited, and the target node E is
encountered into queue2
• QUEUE1 = {G}
• QUEUE2 = {A, B, D, C, F, E}
Using BFS traversal is {A B D C F E}
ii. Depth First Search (DFS): The depth-first search (DFS) algorithm starts with the initial
node of graph G and goes deeper until we find the goal node or the node with no children.
Step 1 - First, push H onto the stack.
• STACK: H
Step 2 - POP the top element from the stack, i.e., H, and print it. Now, PUSH all the neighbours
of H onto the stack that are in ready state.
• Print: STACK: A
Dr. Thontadari C., School of CSA, REVA University Page | 12
Unit 3- Searching, Sorting and Graphs
Step 3 - POP the top element from the stack, i.e., A, and print it. Now, PUSH all the neighbors
of A onto the stack that are in ready state.
• Print: A
• STACK: B, D
Step 4 - POP the top element from the stack, i.e., D, and print it. Now, PUSH all the neighbors
of D onto the stack that are in ready state.
• Print: D
• STACK: B, F
Step 5 - POP the top element from the stack, i.e., F, and print it. Now, PUSH all the neighbors
of F onto the stack that are in ready state.
• Print: F
• STACK: B
Step 6 - POP the top element from the stack, i.e., B, and print it. Now, PUSH all the neighbors
of B onto the stack that are in ready state.
• 1. Print: B
• 2. STACK: C
Step 7 - POP the top element from the stack, i.e., C, and print it. Now, PUSH all the neighbors
of C onto the stack that are in ready state.
• 1. Print: C
• 2. STACK: E, G
Step 8 - POP the top element from the stack, i.e., G and PUSH all the neighbors of G onto the
stack that are in ready state.
• 1. Print: G
• 2. STACK: E
Step 9 - POP the top element from the stack, i.e., E and PUSH all the neighbors of E onto the
stack that are in ready state.
• 1. Print: E
• 2. STACK:
Now, all the graph nodes have been traversed, and the stack is empty.
3.3.9 Applications of Graphs
• Graphs are utilised in computer science to describe the flow of analysis and
computation.
• It is utilised to describe networks of communication.
• It is used to define data organisation.
• Social Networks: Graphs are unique network configurations with just one kind of edge
separating each vertex.
• Web Graphs: There are many allusions to URLs on the internet. In other terms, the
internet is a great source of network data.
• Information Graphs: Geographical data is organized in a graph-based style, and
information A is connected to information B when A specifically represents B.
• Neural Networks: Large diagrams that artificially link neurons with synapses create
neural networks. There are numerous varieties of neural networks, and the primary
distinction among them is how graphs are formed.
• Blockchains: Each block’s vertices can contain numerous deals, and the edges link the
blocks that follow. The present benchmark for historical transactions is the biggest
branch from the first block.
Dr. Thontadari C., School of CSA, REVA University Page | 13