Linear Search:
How it works: Sequentially checks each element of the list until the target value is found or the end of the list is reached.
Suitable for: Unsorted data or small datasets where the overhead of sorting is not worth the
performance gain.
Binary Search:
How it works: Requires a sorted list and repeatedly divides the search interval in half, comparing the
target value to the middle element.
Suitable for: Sorted datasets, especially large ones, where efficiency is crucial.
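To make the contrast concrete, here is a small illustrative sketch (not part of the original text) of an iterative binary search over a sorted array in C:
#include <stdio.h>
// Returns the index of target in the sorted array, or -1 if it is not present
int binarySearch(const int arr[], int n, int target) {
  int low = 0, high = n - 1;
  while (low <= high) {
    int mid = low + (high - low) / 2;   // middle of the current interval
    if (arr[mid] == target)
      return mid;
    else if (arr[mid] < target)
      low = mid + 1;                    // target lies in the right half
    else
      high = mid - 1;                   // target lies in the left half
  }
  return -1;
}
int main() {
  int data[] = {2, 5, 8, 12, 16, 23, 38};
  int n = sizeof(data) / sizeof(data[0]);
  printf("Index of 23: %d\n", binarySearch(data, n, 23));   // prints 5
  printf("Index of 7: %d\n", binarySearch(data, n, 7));     // prints -1
}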
Sorting Algorithm
A sorting algorithm is used to arrange the elements of an array or list in a specific order. For example, here we are sorting an array in ascending order.
There are various sorting algorithms that can be used to perform this operation, and we can choose an algorithm based on the requirement.
1. Time Complexity: Time complexity refers to the time taken by an algorithm to complete its
execution with respect to the size of the input. It can be represented in different forms:
Big-O notation (O)
Omega notation (Ω)
Theta notation (Θ)
2. Space Complexity: Space complexity refers to the total amount of memory used by the
algorithm for a complete execution. It includes both the auxiliary memory and the input.
The auxiliary memory is the additional space occupied by the algorithm apart from the input
data. Usually, auxiliary memory is considered for calculating the space complexity of an
algorithm.
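For instance (an illustrative snippet of my own), the nested loops below take time proportional to n² while using only a constant amount of auxiliary memory:
#include <stdio.h>
int main() {
  int n = 4;
  int pairs = 0;                 // auxiliary memory: a single counter, so O(1) space
  for (int i = 0; i < n; i++)    // outer loop runs n times
    for (int j = 0; j < n; j++)  // inner loop runs n times for each outer iteration
      pairs++;                   // executed n * n times, so the time complexity is O(n²)
  printf("%d\n", pairs);         // prints 16
}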
Sorting Algorithm    Best       Worst     Average    Space Complexity
Bubble Sort          n          n²        n²         1
Selection Sort       n²         n²        n²         1
Insertion Sort       n          n²        n²         1
Merge Sort           nlog n     nlog n    nlog n     n
Quicksort            nlog n     n²        nlog n     log n
Counting Sort        n+k        n+k       n+k        max
Bucket Sort          n+k        n²        n          n+k
Stability of Sorting Algorithms
A sorting algorithm is considered stable if two or more items with the same value maintain the same relative positions even after sorting.
For example, in the image below, there are two items with the same value 3. An unstable sorting algorithm allows two possibilities where the two positions of 3 may or may not be maintained.
Unstable sorting with two possible outcomes
However, after a stable sorting algorithm, there is always one possibility where the positions are
maintained as in the original array.
Sorting Algorithm    Stable?
Selection Sort       No
Quicksort            No
Counting Sort        Yes
Heap Sort            No
Shell Sort           No
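To see stability in running code, the following illustrative sketch (my own addition; the Record struct and stableSort name are made up for the example) sorts records by key with an insertion-sort-style pass. Because it only moves elements past strictly greater keys, records with equal keys keep their original relative order:
#include <stdio.h>
// A record: equal keys are distinguished by their tag so we can observe the order
struct Record { int key; char tag; };
// Insertion sort on records; the strict '>' comparison never reorders equal keys,
// which is what makes this sort stable
void stableSort(struct Record a[], int n) {
  for (int i = 1; i < n; i++) {
    struct Record r = a[i];
    int j = i - 1;
    while (j >= 0 && a[j].key > r.key) {
      a[j + 1] = a[j];
      --j;
    }
    a[j + 1] = r;
  }
}
int main() {
  struct Record a[] = {{3, 'a'}, {1, 'b'}, {3, 'c'}, {2, 'd'}};
  int n = sizeof(a) / sizeof(a[0]);
  stableSort(a, n);
  for (int i = 0; i < n; i++)
    printf("%d%c ", a[i].key, a[i].tag);   // prints: 1b 2d 3a 3c (3a stays before 3c)
  printf("\n");
}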
Bubble Sort
Bubble sort is a sorting algorithm that compares two adjacent elements and swaps them until
they are in the intended order.
Just like the movement of air bubbles in the water that rise up to the surface, each element of the array moves towards the end in each iteration. Therefore, it is called a bubble sort.
1. First Iteration (Compare and Swap)
1. Starting from the first index, compare the first and the second elements.
2. If the first element is greater than the second element, they are swapped.
3. Now, compare the second and the third elements. Swap them if they are not in order.
4. The above process goes on until the last element.
2. Remaining Iteration
The same process goes on for the remaining iterations.
After each iteration, the largest element among the unsorted elements is placed at the end.
The array is sorted when all the unsorted elements are placed at their correct positions.
// Bubble sort in C
#include <stdio.h>
// repeatedly compare adjacent elements and swap them if they are out of order
void bubbleSort(int array[], int size) {
  for (int step = 0; step < size - 1; ++step)
    for (int i = 0; i < size - step - 1; ++i)
      if (array[i] > array[i + 1]) {
        int temp = array[i];
        array[i] = array[i + 1];
        array[i + 1] = temp;
      }
}
// print array
void printArray(int array[], int size) {
  for (int i = 0; i < size; ++i) {
    printf("%d ", array[i]);
  }
  printf("\n");
}
int main() {
  int data[] = {-2, 45, 0, 11, -9};
  int size = sizeof(data) / sizeof(data[0]);
  bubbleSort(data, size);
  printf("Sorted Array in Ascending Order:\n");
  printArray(data, size);
}
Selection Sort
Selection sort is a sorting algorithm that, in each iteration, selects the smallest element from the unsorted part of the list and places it at the beginning of the unsorted part.
1. Set the first element as minimum.
2. Compare minimum with the second element. If the second element is smaller than minimum, assign the second element as minimum.
Compare minimum with the third element. Again, if the third element is smaller, then assign the third element as minimum; otherwise do nothing. The process goes on until the last element.
Compare minimum with the remaining elements
3. After each iteration, minimum is placed in the front of the unsorted list.
// Selection sort in C
#include <stdio.h>
// repeatedly find the minimum of the unsorted part and swap it to the front
void selectionSort(int array[], int size) {
  for (int step = 0; step < size - 1; step++) {
    int min_idx = step;
    for (int i = step + 1; i < size; i++)
      if (array[i] < array[min_idx])
        min_idx = i;
    // swap the found minimum with the first unsorted element
    int temp = array[min_idx];
    array[min_idx] = array[step];
    array[step] = temp;
  }
}
// print array
void printArray(int array[], int size) {
  for (int i = 0; i < size; ++i)
    printf("%d ", array[i]);
  printf("\n");
}
// driver code
int main() {
  int data[] = {20, 12, 10, 15, 2};
  int size = sizeof(data) / sizeof(data[0]);
  selectionSort(data, size);
  printf("Sorted array in Ascending Order:\n");
  printArray(data, size);
}
Selection Sort Applications
The selection sort is used when the cost of writing to memory matters, as in flash memory, because its number of writes/swaps is O(n) compared with O(n²) for bubble sort.
Selection Sort Complexity
Complexity = O(n²)
Also, we can analyze the complexity simply by observing the number of loops: there are two nested loops, so the complexity is n*n = n².
Time Complexities:
Worst Case Complexity: O(n²)
If we want to sort in ascending order and the array is in descending order, then the worst case occurs.
Best Case Complexity: O(n²)
It occurs when the array is already sorted.
Average Case Complexity: O(n²)
It occurs when the elements of the array are in jumbled order (neither ascending nor descending).
The time complexity of the selection sort is the same in all cases. At every step, you have to find the minimum element and put it in the right place. The minimum element is not known until the end of the array is reached.
Space Complexity:
Space complexity is O(1) because only a constant amount of extra memory (the variable min_idx) is used.
Insertion Sort
Insertion sort is a sorting algorithm that places an unsorted element at its suitable place in each iteration. It works in a similar way to how we sort cards in our hands.
We assume that the first card is already sorted; then, we select an unsorted card. If the unsorted card is greater than the card in hand, it is placed on the right; otherwise, it goes to the left. In the same way, the other unsorted cards are taken and put in their right place.
Initial array
1. The first element in the array is assumed to be sorted. Take the second element and store it
separately in key .
Compare key with the first element. If the first element is greater than key , then key is placed in
front of the first element.
2. Now, the first two elements are sorted.
Take the third element and compare it with the elements on the left of it. Place it just behind the element smaller than it. If there is no element smaller than it, then place it at the beginning of the array.
Place 1 at the beginning
3. Similarly, place every unsorted element at its correct position.
Place 4 behind 1
Place 3 behind 1 and the array is sorted
insertionSort(array)
  mark first element as sorted
  for each unsorted element X
    'extract' the element X
    for j <- lastSortedIndex down to 0
      if current element j > X
        move sorted element to the right by 1
    break loop and insert X here
end insertionSort
// Insertion sort in C
#include <stdio.h>
// print array
void printArray(int array[], int size) {
  for (int i = 0; i < size; ++i)
    printf("%d ", array[i]);
  printf("\n");
}
void insertionSort(int array[], int size) {
  for (int step = 1; step < size; step++) {
    int key = array[step];
    int j = step - 1;
    // Compare key with each element on the left of it until an element smaller than
    // it is found.
    // For descending order, change key<array[j] to key>array[j].
    while (j >= 0 && key < array[j]) {
      array[j + 1] = array[j];
      --j;
    }
    array[j + 1] = key;
  }
}
// Driver code
int main() {
  int data[] = {9, 5, 1, 4, 3};
  int size = sizeof(data) / sizeof(data[0]);
  insertionSort(data, size);
  printf("Sorted array in ascending order:\n");
  printArray(data, size);
}
Best O(n)
Worst O(n²)
Average O(n²)
Stability Yes
Time Complexities
Worst Case Complexity: O(n²)
Suppose an array is in ascending order, and you want to sort it in descending order. In this case, the worst case complexity occurs.
Each element has to be compared with each of the other elements, so for every nth element, (n-1) comparisons are made.
Best Case Complexity: O(n)
It occurs when the array is already sorted; the outer loop runs n times while the inner loop does not run at all.
Average Case Complexity: O(n²)
It occurs when the elements of an array are in jumbled order (neither ascending nor descending).
Space Complexity
Space complexity is O(1) because only a constant amount of extra memory (the variable key) is used.
Quicksort Algorithm
Quicksort is a sorting algorithm based on the divide and conquer approach where
1. An array is divided into subarrays by selecting a pivot element (element selected from the
array).
While dividing the array, the pivot element should be positioned in such a way that elements
less than pivot are kept on the left side and elements greater than pivot are on the right side of
the pivot.
2. The left and right subarrays are also divided using the same approach. This process continues
until each subarray contains a single element.
3. At this point, elements are already sorted. Finally, elements are combined to form a sorted
array.
1. Select the Pivot Element
A pivot element is chosen from the array; different variations of quicksort pick it from different positions (here the rightmost element can be taken as the pivot).
2. Rearrange the Array
The elements of the array are rearranged so that elements smaller than the pivot end up on its left and elements greater than the pivot on its right.
Comparison of pivot element with element beginning from the first index
1. A pointer is fixed at the pivot element, and the pivot element is compared with the elements beginning from the first index.
2. If the element is greater than the pivot element, a second pointer is set for that element.
3. Now, pivot is compared with other elements. If an element smaller than the pivot element is reached, the smaller element is swapped with the greater element found earlier.
4. Again, the process is repeated to set the next greater element as the second pointer. And, swap it with another smaller element.
The process is repeated to set the next greater element as the second pointer.
3. Divide Subarrays
Pivot elements are again chosen for the left and the right sub-parts separately. And, step 2 is
repeated.
Select pivot element in each half and put it at the correct place using recursion
The subarrays are divided until each subarray is formed of a single element. At this point, the array is already sorted.
Sorting the elements on the right of pivot using recursion
// Quick sort in C
#include <stdio.h>
// main function
int main() {
  int data[] = {8, 7, 2, 1, 0, 9, 6};
  int n = sizeof(data) / sizeof(data[0]);
  printf("Unsorted Array\n");
  printArray(data, n);
  // quickSort, partition and printArray are given in the full sketch below
  quickSort(data, 0, n - 1);
  printf("Sorted array in ascending order:\n");
  printArray(data, n);
}
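The fragment above omits the sorting routines themselves. A complete, runnable sketch is given below; it assumes the commonly used Lomuto partition scheme with the last element as the pivot, since the original partition and quickSort code is not shown:
// Quick sort in C (complete sketch, Lomuto partition with the last element as pivot)
#include <stdio.h>
void swap(int *a, int *b) {
  int t = *a;
  *a = *b;
  *b = t;
}
// Rearrange elements so that those smaller than the pivot are on its left,
// and return the pivot's final index
int partition(int array[], int low, int high) {
  int pivot = array[high];
  int i = low - 1;                       // boundary of the "smaller than pivot" region
  for (int j = low; j < high; j++) {
    if (array[j] <= pivot) {
      i++;
      swap(&array[i], &array[j]);
    }
  }
  swap(&array[i + 1], &array[high]);     // place the pivot between the two regions
  return i + 1;
}
// Recursively sort the subarrays on each side of the pivot
void quickSort(int array[], int low, int high) {
  if (low < high) {
    int pi = partition(array, low, high);
    quickSort(array, low, pi - 1);
    quickSort(array, pi + 1, high);
  }
}
void printArray(int array[], int size) {
  for (int i = 0; i < size; ++i)
    printf("%d ", array[i]);
  printf("\n");
}
int main() {
  int data[] = {8, 7, 2, 1, 0, 9, 6};
  int n = sizeof(data) / sizeof(data[0]);
  printf("Unsorted Array\n");
  printArray(data, n);
  quickSort(data, 0, n - 1);
  printf("Sorted array in ascending order:\n");
  printArray(data, n);
}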
Quicksort Complexity
Time Complexity
Best O(n*log n)
Worst O(n²)
Average O(n*log n)
Stability No
1. Time Complexities
Worst Case Complexity [Big-O]: O(n²)
It occurs when the pivot element picked is either the greatest or the smallest element.
This condition leads to the case in which the pivot element lies in an extreme end of the sorted
array. One sub-array is always empty and another sub-array contains n - 1 elements. Thus,
quicksort is called only on this sub-array.
However, the quicksort algorithm has better performance for scattered pivots.
Best Case Complexity [Big-omega]: O(n*log n)
It occurs when the pivot element is always the middle element or near to the middle element.
Average Case Complexity [Big-theta]: O(n*log n)
Quicksort Applications
The quicksort algorithm is used when the programming language is good for recursion and when time and space complexity matter.
Merge Sort
Merge Sort is a sorting algorithm based on the divide and conquer technique: the array is divided into halves, each half is sorted, and the sorted halves are merged back together.
Merge Sort example
Divide and Conquer Strategy
Using the Divide and Conquer technique, we divide a problem into subproblems. When the
solution to each subproblem is ready, we 'combine' the results from the subproblems to solve
the main problem.
Suppose we had to sort an array A . A subproblem would be to sort a sub-section of this array
starting at index p and ending at index r , denoted as A[p..r] .
Divide
If q is the half-way point between p and r, then we can split the subarray A[p..r] into two
arrays A[p..q] and A[q+1..r] .
Conquer
In the conquer step, we try to sort both the subarrays A[p..q] and A[q+1..r] . If we haven't yet
reached the base case, we again divide both these subarrays and try to sort them.
Combine
When the conquer step reaches the base case and we get two sorted
subarrays A[p..q] and A[q+1..r] for array A[p..r] , we combine the results by creating a sorted
array A[p..r] from the two sorted subarrays A[p..q] and A[q+1..r] .
MergeSort Algorithm
The MergeSort function repeatedly divides the array into two halves until we reach a stage
where we try to perform MergeSort on a subarray of size 1 i.e. p == r .
After that, the merge function comes into play and combines the sorted arrays into larger arrays
until the whole array is merged.
MergeSort(A, p, r):
  if p >= r
    return
  q = (p+r)/2
  mergeSort(A, p, q)
  mergeSort(A, q+1, r)
  merge(A, p, q, r)
As shown in the image below, the merge sort algorithm recursively divides the array into halves
until we reach the base case of array with 1 element. After that, the merge function picks up the
sorted sub-arrays and merges them to gradually sort the entire array.
The algorithm maintains three pointers, one for each of the two arrays and one for maintaining
the current index of the final sorted array.
Have we reached the end of either of the arrays?
No: Compare the current elements of both arrays, copy the smaller element into the sorted array, and advance the pointer of the array containing the smaller element.
Yes: Copy all the remaining elements of the non-empty array.
Merge step
Writing the Code for Merge Algorithm
A noticeable difference between the merging step we described above and the one we use for
merge sort is that we only perform the merge function on consecutive sub-arrays.
This is why we only need the array, the first index, the last index of the first subarray (we can calculate the first index of the second subarray) and the last index of the second subarray.
Our task is to merge two subarrays A[p..q] and A[q+1..r] to create a sorted array A[p..r] . So the inputs to the function are A, p, q and r.
The merge function works as follows:
1. Create copies of the subarrays L ← A[p..q] and M ← A[q+1..r].
2. Create three pointers i, j and k: i maintains the current index of L (starting at 0), j maintains the current index of M (starting at 0), and k maintains the current index of A[p..r] (starting at p).
3. Until we reach the end of either L or M, pick the smaller among the current elements of L and M and place it in the correct position at A[p..r].
4. When we run out of elements in either L or M, pick up the remaining elements and put them in A[p..r].
// Maintain current index of sub-arrays and main array
int i, j, k;
i = 0;
j = 0;
k = p;
When we reach the end of either L or M, the remaining elements of the other copy are picked up and put back into A[p..r]; this copying step would have been needed for M if the size of M was greater than that of L.
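Putting steps 1 to 4 together, the merge routine might look like the following sketch (my own completion of the fragment above, using the merge(arr, p, q, r) signature from the pseudocode and temporary copies L and M):
// Merge two sorted subarrays arr[p..q] and arr[q+1..r] into a sorted arr[p..r]
void merge(int arr[], int p, int q, int r) {
  // Create copies L = arr[p..q] and M = arr[q+1..r]
  int n1 = q - p + 1;
  int n2 = r - q;
  int L[n1], M[n2];
  for (int x = 0; x < n1; x++) L[x] = arr[p + x];
  for (int x = 0; x < n2; x++) M[x] = arr[q + 1 + x];
  // Maintain current index of sub-arrays and main array
  int i = 0, j = 0, k = p;
  // Until we reach the end of either L or M, pick the smaller element
  // and place it in the correct position at arr[p..r]
  while (i < n1 && j < n2) {
    if (L[i] <= M[j])
      arr[k++] = L[i++];
    else
      arr[k++] = M[j++];
  }
  // When we run out of elements in either L or M,
  // pick up the remaining elements and put them in arr[p..r]
  while (i < n1) arr[k++] = L[i++];
  while (j < n2) arr[k++] = M[j++];
}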
// Merge sort in C (uses the merge() routine sketched above)
#include <stdio.h>
// Divide the array into two subarrays, sort them and merge them
void mergeSort(int arr[], int l, int r) {
  if (l < r) {
    // m is the point where the array is divided into two subarrays
    int m = l + (r - l) / 2;
    mergeSort(arr, l, m);
    mergeSort(arr, m + 1, r);
    // Merge the sorted subarrays
    merge(arr, l, m, r);
  }
}
// Driver program
int main() {
  int arr[] = {6, 5, 12, 10, 9, 1};
  int size = sizeof(arr) / sizeof(arr[0]);
  mergeSort(arr, 0, size - 1);
  printf("Sorted array:\n");
  for (int i = 0; i < size; i++)
    printf("%d ", arr[i]);
  printf("\n");
}
Best O(n*log n)
Worst O(n*log n)
Average O(n*log n)
Stability Yes
Time Complexity
Best Case Complexity: O(n*log n)
Space Complexity
The space complexity of merge sort is O(n) .
Merge Sort Applications
External sorting
E-commerce applications
Counting Sort
Counting sort is a sorting algorithm that sorts the elements of an array by counting the number of occurrences of each unique element.
1. Find out the maximum element (let it be max) from the given array.
Given array
2. Initialize an array of length max+1 with all elements 0. This array is used for storing the count of the elements in the array.
Count array
3. Store the count of each element at their respective index in count array.
For example: if the count of element 3 is 2 then, 2 is stored in the 3rd position of count array. If
element "5" is not present in the array, then 0 is stored in 5th position.
Count of each element stored
4. Store cumulative sum of the elements of the count array. It helps in placing the elements into
the correct index of the sorted array.
Cumulative count
5. Find the index of each element of the original array in the count array. This gives the cumulative
count. Place the element at the index calculated as shown in figure below.
Counting sort
6. After placing each element at its correct position, decrease its count by one.
// Counting sort in C
#include <stdio.h>
void countingSort(int array[], int size) {
  int output[size];
  // Find the largest element of the array
  int max = array[0];
  for (int i = 1; i < size; i++)
    if (array[i] > max) max = array[i];
  // Initialize count array with all zeros
  int count[max + 1];
  for (int i = 0; i <= max; ++i) count[i] = 0;
  // Store the count of each element
  for (int i = 0; i < size; i++) count[array[i]]++;
  // Store the cumulative count of each element
  for (int i = 1; i <= max; i++) count[i] += count[i - 1];
  // Find the index of each element of the original array in count array, and
  // place the elements in output array
  for (int i = size - 1; i >= 0; i--) {
    output[count[array[i]] - 1] = array[i];
    count[array[i]]--;
  }
  // Copy the sorted elements back into the original array
  for (int i = 0; i < size; i++) array[i] = output[i];
}
// print array
void printArray(int array[], int size) {
  for (int i = 0; i < size; ++i)
    printf("%d ", array[i]);
  printf("\n");
}
// Driver code
int main() {
  int array[] = {4, 2, 2, 8, 3, 3, 1};
  int n = sizeof(array) / sizeof(array[0]);
  countingSort(array, n);
  printArray(array, n);
}
Complexity
Time Complexity
Best O(n+max)
Worst O(n+max)
Average O(n+max)
Stability Yes
Time Complexities
There are four main loops. (Finding the greatest value can be done outside the function.)
1st loop: O(max)
2nd loop: O(size)
3rd loop: O(max)
4th loop: O(size)
Overall complexity = O(max) + O(size) + O(max) + O(size) = O(max + size)
Space Complexity
The space complexity of Counting Sort is O(max). The larger the range of elements, the larger the space complexity.
Heap Sort
Heap sort works by visualizing the elements of the array as a special kind of complete binary tree called a heap.
Note: As a prerequisite, you must know about a complete binary tree and heap data structure.
Relationship between Array Indexes and Tree Elements
A complete binary tree has an interesting property that we can use to find the children and
parents of any node.
If the index of any element in the array is i, the element at index 2i+1 becomes the left child and the element at index 2i+2 becomes the right child. Also, the parent of any element at index i is given by the lower bound of (i-1)/2.
Relationship between array and heap indices
Let's test it out on the array {1, 12, 9, 5, 6, 10}:
Left child of 1 (index 0)
= element at index 2*0 + 1 = index 1
= 12
Right child of 1
= element at index 2*0 + 2 = index 2
= 9
Similarly,
Left child of 12 (index 1)
= element at index 2*1 + 1 = index 3
= 5
Right child of 12
= element at index 2*1 + 2 = index 4
= 6
Let us also confirm that the rules hold for finding the parent of any node:
Parent of 9 (position 2)
= (2-1)/2
= 0.5
~ index 0
= 1
Parent of 12 (position 1)
= (1-1)/2
= index 0
= 1
Understanding this mapping of array indexes to tree positions is critical to understanding how
the Heap Data Structure works and how it is used to implement Heap Sort.
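These index relations can be checked with a small illustrative snippet (my own addition) using the same sample array:
#include <stdio.h>
int main() {
  int arr[] = {1, 12, 9, 5, 6, 10};
  int n = sizeof(arr) / sizeof(arr[0]);
  for (int i = 0; i < n; i++) {
    int left = 2 * i + 1;            // index of the left child
    int right = 2 * i + 2;           // index of the right child
    int parent = (i - 1) / 2;        // integer division gives the lower bound of (i-1)/2
    printf("node %d (value %d): ", i, arr[i]);
    if (left < n) printf("left=%d ", arr[left]);
    if (right < n) printf("right=%d ", arr[right]);
    if (i > 0) printf("parent=%d", arr[parent]);
    printf("\n");
  }
}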
Since heapify uses recursion, it can be difficult to grasp. So let's first think about how you would
heapify a tree with just three elements.
heapify(array)
  Root = array[0]
  Largest = largest(array[0], array[2*0 + 1], array[2*0 + 2])
  if (Root != Largest)
    Swap(Root, Largest)
Heapify base cases
The example above shows two scenarios - one in which the root is the largest element and we
don't need to do anything. And another in which the root had a larger element as a child and we
needed to swap to maintain max-heap property.
If you've worked with recursive algorithms before, you've probably identified that this must be the
base case.
Now let's think of another scenario in which there is more than one level.
How to heapify root element when its subtrees are already max-heaps
The top element isn't a max-heap but all the sub-trees are max-heaps.
To maintain the max-heap property for the entire tree, we will have to keep pushing 2
downwards until it reaches its correct position.
How to heapify root element when its subtrees are max-heaps
Thus, to maintain the max-heap property in a tree where both sub-trees are max-heaps, we
need to run heapify on the root element repeatedly until it is larger than its children or it
becomes a leaf node.
This function works for both the base case and for a tree of any size. We can thus move the root
element to the correct position to maintain the max-heap status for any tree size as long as the
sub-trees are max-heaps.
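A minimal C sketch of this recursive heapify (my own rendering; the same routine reappears in the complete heap sort program later) could be:
// Heapify a subtree rooted at index i, assuming its subtrees are already max-heaps
void heapify(int arr[], int n, int i) {
  int largest = i;
  int left = 2 * i + 1;
  int right = 2 * i + 2;
  // Find the largest among the root, left child and right child
  if (left < n && arr[left] > arr[largest])
    largest = left;
  if (right < n && arr[right] > arr[largest])
    largest = right;
  // Swap and continue heapifying downwards if the root is not the largest
  if (largest != i) {
    int temp = arr[i];
    arr[i] = arr[largest];
    arr[largest] = temp;
    heapify(arr, n, largest);
  }
}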
Build max-heap
To build a max-heap from any tree, we can thus start heapifying each sub-tree from the bottom
up and end up with a max-heap after the function is applied to all the elements including the root
element.
In the case of a complete tree, the first index of a non-leaf node is given by n/2 - 1 . All other
nodes after that are leaf-nodes and thus don't need to be heapified.
So, we can build a maximum heap by running heapify on every non-leaf node, starting from index n/2 - 1 and moving up to the root.
As shown in the above diagram, we start by heapifying the lowest smallest trees and gradually
move up until we reach the root element.
If you've understood everything till here, congratulations, you are on your way to mastering the
Heap sort.
Working of Heap Sort
1. Since the tree satisfies the max-heap property, the largest item is stored at the root node.
2. Swap: Remove the root element and put it at the end of the array (nth position). Put the last item of the tree (heap) at the vacant place.
3. Remove: Reduce the size of the heap by 1.
4. Heapify: Heapify the root element again so that we have the highest element at root.
5. The process is repeated until all the items of the list are sorted.
Swap, Remove, and Heapify
The code below shows the operation.
// Heap sort: repeatedly move the current root (the largest element) to the end
for (int i = n - 1; i >= 0; i--) {
  swap(&arr[0], &arr[i]);
  // Heapify the reduced heap so the largest remaining element is at the root again
  heapify(arr, i, 0);
}
// Heap Sort in C
#include <stdio.h>
// Driver code (heapify, swap, heapSort and printArray appear in the full sketch below)
int main() {
  int arr[] = {1, 12, 9, 5, 6, 10};
  int n = sizeof(arr) / sizeof(arr[0]);
  heapSort(arr, n);
  printf("Sorted array is\n");
  printArray(arr, n);
}
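Since the fragments above leave out several definitions, a complete, runnable sketch assembled from the pieces described in this section might look like this (heapify, swap, heapSort and printArray are my own completions consistent with the text):
// Heap Sort in C (complete sketch)
#include <stdio.h>
void swap(int *a, int *b) {
  int temp = *a;
  *a = *b;
  *b = temp;
}
// Heapify a subtree rooted at index i (same routine as sketched earlier)
void heapify(int arr[], int n, int i) {
  int largest = i;
  int left = 2 * i + 1;
  int right = 2 * i + 2;
  if (left < n && arr[left] > arr[largest])
    largest = left;
  if (right < n && arr[right] > arr[largest])
    largest = right;
  if (largest != i) {
    swap(&arr[i], &arr[largest]);
    heapify(arr, n, largest);
  }
}
void heapSort(int arr[], int n) {
  // Build max heap: heapify every non-leaf node, bottom up
  for (int i = n / 2 - 1; i >= 0; i--)
    heapify(arr, n, i);
  // Swap the root (largest element) to the end, shrink the heap, and re-heapify
  for (int i = n - 1; i >= 0; i--) {
    swap(&arr[0], &arr[i]);
    heapify(arr, i, 0);
  }
}
void printArray(int arr[], int n) {
  for (int i = 0; i < n; ++i)
    printf("%d ", arr[i]);
  printf("\n");
}
// Driver code
int main() {
  int arr[] = {1, 12, 9, 5, 6, 10};
  int n = sizeof(arr) / sizeof(arr[0]);
  heapSort(arr, n);
  printf("Sorted array is\n");
  printArray(arr, n);
}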
Best O(nlog n)
Worst O(nlog n)
Average O(nlog n)
Stability No
Heap Sort has O(nlog n) time complexities for all the cases ( best case, average case, and
worst case).
Let us understand the reason why. The height of a complete binary tree containing n elements is log n.
As we have seen earlier, to fully heapify an element whose subtrees are already max-heaps, we
need to keep comparing the element with its left and right children and pushing it downwards
until it reaches a point where both its children are smaller than it.
In the worst case scenario, we will need to move an element from the root to the leaf node
making a multiple of log(n) comparisons and swaps.
During the build_max_heap stage, we do that for n/2 elements so the worst case complexity of
the build_heap step is n/2*log n ~ nlog n.
During the sorting step, we exchange the root element with the last element and heapify the root
element. For each element, this again takes log n worst time because we might have to bring
the element all the way from the root to the leaf. Since we repeat this n times, the heap_sort
step is also nlog n.
Also since the build_max_heap and heap_sort steps are executed one after another, the
algorithmic complexity is not multiplied and it remains in the order of nlog n.
Also, heap sort performs sorting in O(1) space complexity. Compared with Quick Sort, it has a better worst case of O(nlog n); Quick Sort's worst case is O(n^2). In the average case, however, Quick Sort is faster. Introsort is an alternative to heapsort that combines quicksort and heapsort to retain the advantages of both: the worst-case speed of heapsort and the average speed of quicksort.
Searching and sorting algorithms are fundamental in computer science, and their complexity analysis
helps determine their efficiency. Searching algorithms find specific elements, while sorting algorithms
arrange elements in a specific order, each with varying time and space complexities.
Searching Algorithms:
Linear Search: A simple algorithm that sequentially checks each element until a match is found or the
end of the list is reached. Its time complexity is O(n) in the worst case (searching for an element at the
end or not present) and O(1) in the best case (element found at the beginning).
Binary Search: An efficient algorithm for sorted lists that repeatedly divides the search interval in
half. Its time complexity is O(log n) in the worst and average cases, making it significantly faster than
linear search for large datasets.
Hash Search: Uses a hash function to map keys to indices in a hash table, providing near constant-time
(O(1)) average-case lookup. However, worst-case performance can degrade to O(n) if collisions are
frequent.
Sorting Algorithms:
Bubble Sort: A simple algorithm that repeatedly steps through the list, comparing adjacent elements
and swapping them if they are in the wrong order. Its time complexity is O(n^2) in the worst and
average cases, making it inefficient for large datasets.
Insertion Sort: Builds a sorted list by iteratively inserting elements into their correct positions. Its time
complexity is O(n^2) in the worst and average cases, but O(n) in the best case (already sorted).
Selection Sort: Repeatedly finds the minimum element from the unsorted portion and moves it to the
sorted portion. Its time complexity is O(n^2) in all cases.
Merge Sort: A divide-and-conquer algorithm that recursively divides the list into smaller sublists, sorts
them, and then merges them back together. Its time complexity is O(n log n) in all cases.
Quick Sort: Another divide-and-conquer algorithm that partitions the list around a pivot element. Its
average time complexity is O(n log n), but its worst-case complexity is O(n^2).
Heap Sort: Uses a heap data structure to efficiently sort the elements. Its time complexity is O(n log n) in
all cases.
Radix Sort: Sorts elements by their digits (or characters) in a non-comparison-based manner, which can
be very efficient for certain types of data. Its time complexity can be O(nk) where n is the number of
elements and k is the number of digits/characters.
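As a rough illustration (my own sketch, not from the original text), a least-significant-digit radix sort for non-negative integers can be built on top of a stable counting sort by digit:
#include <stdio.h>
// Stable counting sort of arr by the digit at the given place value (1, 10, 100, ...)
void countingSortByDigit(int arr[], int n, int place) {
  int output[n];
  int count[10] = {0};
  for (int i = 0; i < n; i++)
    count[(arr[i] / place) % 10]++;
  for (int d = 1; d < 10; d++)
    count[d] += count[d - 1];                 // cumulative counts give final positions
  for (int i = n - 1; i >= 0; i--) {          // traverse backwards to keep the sort stable
    int d = (arr[i] / place) % 10;
    output[count[d] - 1] = arr[i];
    count[d]--;
  }
  for (int i = 0; i < n; i++)
    arr[i] = output[i];
}
// Radix sort: sort by each digit, from least significant to most significant
void radixSort(int arr[], int n) {
  int max = arr[0];
  for (int i = 1; i < n; i++)
    if (arr[i] > max) max = arr[i];
  for (int place = 1; max / place > 0; place *= 10)
    countingSortByDigit(arr, n, place);
}
int main() {
  int data[] = {121, 432, 564, 23, 1, 45, 788};
  int n = sizeof(data) / sizeof(data[0]);
  radixSort(data, n);
  for (int i = 0; i < n; i++)
    printf("%d ", data[i]);
  printf("\n");
}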