
VIETNAM NATIONAL UNIVERSITY, HO CHI MINH CITY

UNIVERSITY OF SCIENCE


FACULTY OF INFORMATION TECHNOLOGY

PROJECT REPORT
A SURVEY OF SORTING ALGORITHMS

DATA STRUCTURES AND ALGORITHMS – 23CLC01

Ho Chi Minh City, July 4, 2024


1. Information:
Member information

Student ID | Full name
23127140 | Đặng Nguyễn Duy Tuyên
23127232 | Trần Hoàng Nam
23127364 | Đặng Nguyễn Thành Hiếu
23127419 | Cao Phạm Tuấn Minh

Computer information

Central Processing Unit (CPU): Intel Core i7-12700H

Graphics Processing Unit (GPU): NVIDIA GeForce RTX 4050

RAM: 16 GB DDR4

Storage: 512 GB SSD

OS: Windows 10

Project term: 8 June 2024 - 4 July 2024

Special Thanks: We would like to extend our heartfelt gratitude to our teachers, Bùi Huy Thông
and Phan Thị Phương Uyên, for their invaluable guidance and teaching throughout our studies.
Their dedication to imparting knowledge and fostering our understanding of sorting algorithms
has been instrumental in shaping this analysis report. Thank you for equipping us with the skills
and insights necessary to undertake this project.
2. Introduction:
Sorting is a widely used and highly applicable operation in everyday life. A sorting algorithm can
be simply understood as a procedure that arranges the elements of a given array in a specific order.
This description alone, however, is still vague. The project is therefore carried out with the goal of
gaining a comprehensive overview of sorting algorithms: understanding their working principles,
characteristics, properties, advantages, and limitations, as well as the applications of some
common algorithms.

Organizing data through the application of sorting algorithms is a fundamental technique that
enables efficient data retrieval and management. Despite the abundance of sorting algorithms
available, each individual method has its own distinct advantages and disadvantages.

In the context of this project, the team will undertake the implementation and comprehensive
exploration of 11 prominent sorting algorithms: Selection Sort, Insertion Sort, Bubble Sort,
Shaker Sort, Shell Sort, Heap Sort, Merge Sort, Quick Sort, Counting Sort, Radix Sort, and Flash
Sort.
3. Algorithm presentation:
3.1 Bubble Sort:
3.1.1 Idea:
● The bubble sort algorithm is a brute-force sorting algorithm.
● The algorithm iterates through the array, comparing and swapping adjacent
elements if they are not in the correct order.
3.1.2 Pseudo code:

1 Function BubbleSort(arr)
2 n ← length(arr)
3 for i ← 0 to n - 2 do
4 for j ← 0 to n - i - 2 do
5 if arr[j] > arr[j + 1] then
6 swap(arr[j], arr[j + 1])
7 end
8 end
9 end
10 end

3.1.3 Illustration:

Step | Array | Annotations
0 | {5, 3, 2, 1} |
1 | {3, 5, 2, 1} | swap (5, 3)
2 | {3, 2, 5, 1} | swap (5, 2)
3 | {3, 2, 1, 5} | swap (5, 1)
4 | {2, 3, 1, 5} | swap (3, 2)
5 | {2, 1, 3, 5} | swap (3, 1)
6 | {1, 2, 3, 5} | swap (2, 1)

3.1.4 Complexity:
Space:
The algorithm does not require extra arrays, so the space complexity is Θ(1) in all cases.

Time:

Algorithm | Best-case | Worst-case | Stable
Bubble Sort | O(n) | O(n²) | Yes


● Basic operation: the comparison in line 5.
● The number of times the basic operation is executed depends on n and the nature of input.

Worst-case:
● The total number of basic operations in the worst case is:

  ∑_{i=0}^{n-2} ∑_{j=0}^{n-i-2} 1 = ∑_{i=0}^{n-2} (n - i - 1) = n(n-1)/2

● Order of growth: O(n²)

Best-case
● In the best case, the input array is already sorted in ascending order.
● We can use an additional flag to track whether any swaps occurred during an iteration. If
no swaps occurred, the array is already sorted, and we can stop the algorithm.
This improvement makes the best-case time complexity of Bubble Sort O(n).

Optimize: Bubble sort with flag


● The algorithm is similar to regular Bubble Sort, but it adds a flag to keep track of whether
any swaps occurred during the last iteration.
● If no swaps occurred, it means the array is already sorted, so the algorithm can stop.

1  Function BubbleSort_with_flag(arr)
2      n ← length(arr)
3      for i ← 0 to n - 2 do
4          swapped ← false
5          for j ← 0 to n - i - 2 do
6              if arr[j] > arr[j + 1] then
7                  swap(arr[j], arr[j + 1])
8                  swapped ← true
9              end
10         end
11         if swapped = false then
12             break
13         end
14     end
15 end
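For reference, a short C++ rendering of the flagged version is given below (a sketch of ours, not necessarily the project's exact code):

#include <utility> // std::swap

// Bubble sort with an early-exit flag: stop as soon as a full pass makes no swaps.
void bubbleSortWithFlag(int arr[], int n) {
    for (int i = 0; i < n - 1; ++i) {
        bool swapped = false;
        for (int j = 0; j < n - i - 1; ++j) {
            if (arr[j] > arr[j + 1]) {
                std::swap(arr[j], arr[j + 1]);
                swapped = true;
            }
        }
        if (!swapped) break; // no swaps: the array is already sorted
    }
}

On sorted input the first pass makes no swaps, so only n - 1 comparisons are performed, matching the O(n) best case.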
3.2 Selection Sort:
3.2.1 Idea:
● The selection sort algorithm is a brute-force sorting algorithm
● The algorithm iterates through the array, finding the minimum element in the unsorted
part.
● It then swaps the minimum element with the element at the current position.

3.2.2 Pseudo code:


1 Function SelectionSort(arr)
2 n ← length(arr)
3
4 for i ← 0 to n - 2 do
5 min ← i
6 for j ← i + 1 to n - 1 do
7 if arr[j] < arr[min] then
8 min ← j
9 end
10 end
11 if min != i then
12 swap(arr[i], arr[min])
13 end
14 end
15 end

3.2.3 Illustration:
Step | Array | Annotations
0 | {5, 3, 2, 1} | Find the minimum element, which is 1. Swap it with the first element.
1 | {1, 3, 2, 5} | Find the minimum of the remaining unsorted part, which is 2. Swap it with the second element.
2 | {1, 2, 3, 5} | Find the minimum of the remaining unsorted part, which is 3. It is already in the correct position, so no swap is needed.
3 | {1, 2, 3, 5} | Find the minimum of the remaining unsorted part, which is 5. It is already in the correct position, so no swap is needed.

3.2.4 Complexity:

Space:
The algorithm does not require extra arrays, so the space complexity is Θ(1) in all cases.

Time:

Algorithm | Best-case | Worst-case | Stable
Selection Sort | O(n²) | O(n²) | No


● Basic operation: the key comparison arr[j] < arr[min] in line 7.
● The number of times the basic operation is executed depends only on n, so the running
time is the same for every input of a given size n.
● The total number of executions of the basic operation is:

  ∑_{i=0}^{n-2} ∑_{j=i+1}^{n-1} 1 = ∑_{i=0}^{n-2} (n - 1 - i) = n(n-1)/2

● Order of growth: O(n²).


3.3 Insertion Sort:

3.3.1 Idea:
● The insertion sort algorithm is a decrease and conquer sorting algorithm.
● The algorithm iterates through the array, starting from the second element.
● For each element, it finds the correct position to insert it into the already sorted subarray
on the left.

3.3.2 Pseudo code:


1  Function InsertionSort(arr)
2      n ← length(arr)
3      for i ← 1 to n - 1 do
4          key ← arr[i]
5          j ← i - 1
6          while j >= 0 and arr[j] > key do
7              arr[j + 1] ← arr[j]
8              j ← j - 1
9          end
10         arr[j + 1] ← key
11     end
12 end

3.3.3 Illustration:
Step | Array | Annotations
0 | {5, |3, 2, 1} |
1 | {3, 5, |2, 1} | Take the element (3), compare it with the sorted portion (5), and insert it before (5).
2 | {2, 3, 5, |1} | Take the next element (2). Compare it with the sorted portion (3, 5), shifting (5) and then (3) to the right, and insert it at the front.
3 | {1, 2, 3, 5} | Take the last element (1). Compare it with the sorted portion (2, 3, 5), shifting (5), (3), then (2) to the right, and insert it at the front.
3.3.4 Complexity:

Space:
The algorithm does not require extra arrays, so the space complexity is Θ(1) in all cases.

Time:

Algorithm | Best-case | Worst-case | Stable

Insertion Sort | O(n) | O(n²) | Yes


● Basic operation: key comparison arr[j] > key in line 6.
● The number of times the basic operation is executed depends on n and the nature of input.

Worst-case: a descending sorted array

● The total number of basic operations in the worst case is:

  ∑_{i=1}^{n-1} ∑_{j=0}^{i-1} 1 = ∑_{i=1}^{n-1} i = n(n-1)/2

● Order of growth: O(n²)

Best-case: an ascending sorted array

● The total number of basic operations in the best case is:

  ∑_{i=1}^{n-1} 1 = n - 1

● Order of growth: O(n)

Optimize: Insertion sort with binary search


● Instead of sequentially comparing each element, binary insertion sort uses binary search
to find the appropriate insertion position for each new element. This reduces the number
of comparisons per insertion from O(n) to O(log n).
● Like regular insertion sort, binary insertion sort maintains a sorted subarray. Each
time a new element is added, it uses binary search to find the correct insertion position.
● After finding the insertion position with binary search, binary insertion sort
shifts the larger elements to the right to make room for the new element.

1  Function binarySearch(arr, key, low, high)
2      if high <= low then
3          if key > arr[low] then
4              return low + 1
5          else
6              return low
7          end
8      end
9      mid ← (low + high) / 2
10     if key = arr[mid] then
11         return mid + 1
12     end
13     if key > arr[mid] then
14         return binarySearch(arr, key, mid + 1, high)
15     else
16         return binarySearch(arr, key, low, mid - 1)
17     end
18 end
19
20 Function insertionSort(arr, n)
21     for i ← 1 to n - 1 do
22         selected ← arr[i]
23         loc ← binarySearch(arr, selected, 0, i - 1)
24         for j ← i - 1 down to loc do
25             arr[j + 1] ← arr[j]
26         end
27         arr[loc] ← selected
28     end
29 end
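The same idea can be written compactly in C++ with the standard library (an illustrative sketch of ours; std::upper_bound performs the binary search and keeps equal keys in order, preserving stability):

#include <algorithm>
#include <cstddef>
#include <vector>

void binaryInsertionSort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        // Binary-search the sorted prefix a[0..i-1] for the insertion point.
        auto pos = std::upper_bound(a.begin(), a.begin() + i, a[i]);
        // Shift the larger elements right and drop a[i] into place.
        std::rotate(pos, a.begin() + i, a.begin() + i + 1);
    }
}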
3.4 Shaker Sort (Cocktail Sort)

3.4.1 Idea:
● The main idea of Shaker Sort is to combine Bubble Sort in two directions, from left to
right and from right to left.
● The sorting process works as follows:
● Start from the first element, compare and swap if the next element is larger.
● Then, start from the last element, compare and swap if the previous element is larger.
● Repeat this process until the entire array is sorted.
● Shaker Sort is considered a variation of Bubble Sort, with the advantage of reducing the
number of comparisons compared to the standard Bubble Sort.

3.4.2 Pseudo code:


1  Function ShakerSort(arr, n)
2      left ← 0
3      right ← n - 1
4      sorted ← true
5      while left < right do
6          for i ← left to right - 1 do
7              if arr[i] > arr[i + 1] then
8                  swap(arr[i], arr[i + 1])
9                  sorted ← false
10             end
11         end
12         if sorted = true then
13             break
14         end
15         sorted ← true
16         right ← right - 1
17         for i ← right down to left + 1 do
18             if arr[i] < arr[i - 1] then
19                 swap(arr[i], arr[i - 1])
20                 sorted ← false
21             end
22         end
23         left ← left + 1
24         if sorted = true then
25             break
26         end
27         sorted ← true
28     end
29 end

3.4.3 Illustration:
Step | Array | Annotations
0 | {5, 2, 3, 1} |
1 | {2, 5, 3, 1} | First pass, left to right: compare (5, 2), swap
2 | {2, 3, 5, 1} | Compare (5, 3), swap
3 | {2, 3, 1, 5} | Compare (5, 1), swap
4 | {2, 1, 3, 5} | Second pass, right to left: compare (3, 1), swap
5 | {1, 2, 3, 5} | Compare (2, 1), swap
6 | {1, 2, 3, 5} | Third pass, left to right: no swaps are made, so the algorithm stops

3.4.4 Complexity

Space:
The algorithm does not require extra arrays, so the space complexity is Θ(1) in all cases.
Time:

Algorithm | Best-case | Worst-case | Stable

Shaker Sort | O(n) | O(n²) | Yes


● Basic operation: the comparisons in lines 7 and 18.
● The number of times the basic operation is executed depends on n and the nature of input.

Worst-case:
● Order of growth: O(n²)

Best-case:
● In the best case, the input array is already sorted in ascending order.
● We can use an additional flag, like the BubbleSort_with_flag variant mentioned above.
This improvement makes the best-case time complexity of Shaker Sort O(n).
3.5 Shell sort

3.5.1 Idea:
● Shell Sort is an improvement over the simple Insertion Sort algorithm, with better
efficiency for larger arrays
● The main idea of Shell Sort is to compare and swap elements that are a certain gap apart,
rather than only adjacent elements.
● The algorithm starts with a large gap (usually n/2, where n is the size of the array) and
gradually reduces the gap down to 1.
● In each iteration, the algorithm uses Insertion Sort on the elements separated by the
current gap.
● When the gap reaches 1, the algorithm has sorted the entire array.

3.5.2 Pseudo code:


1  Function ShellSort(arr, n)
2      gap ← n / 2
3      while gap > 0 do
4          for i ← gap to n - 1 do
5              temp ← arr[i]
6              j ← i
7              while j >= gap and arr[j - gap] > temp do
8                  arr[j] ← arr[j - gap]
9                  j ← j - gap
10             end
11             arr[j] ← temp
12         end
13         gap ← gap / 2
14     end
15 end
3.5.3 Illustration:
Step | Array | Annotations
0 | {5, 2, 3, 1} |
1 | {3, 2, 5, 1} | For the initial gap of 2, compare (5, 3), swap
2 | {3, 1, 5, 2} | Gap 2: compare (2, 1), swap
3 | {1, 3, 5, 2} | Repeat with the new gap of 1: compare (3, 1), swap
4 | {1, 3, 2, 5} | Gap 1: compare (5, 2), swap
5 | {1, 2, 3, 5} | Gap 1: compare (3, 2), swap; the array is sorted
3.5.4 Complexity:

Space:
● The algorithm does not requires extra arrays, so space complexity is always Θ(1) for all
cases.

Time:

Algorithm | Best-case | Worst-case | Stable

Shell Sort | O(n log n) | O(n²) | No


● Basic operation: the comparison in line 7.
● The number of times the basic operation is executed depends on n and the nature of input.

Worst-case:
● The worst case occurs when the elements to be sorted are in reverse order.
● Order of growth: O(n²)

Best-case:
● In the best case, the input array is already sorted in ascending order.
● Order of growth: O(n log n).
3.6 Merge Sort

3.6.1 Idea:
● The merge sort algorithm is a divide-and-conquer sorting algorithm.
● The main idea of Merge Sort is to divide the input array recursively into smaller subarrays, sort
them, and then merge them back together.
● The mergeSort_I function is the recursive part of the algorithm: it divides the array into two
halves and recursively sorts each half.
● The Merge function is responsible for merging the two sorted subarrays back together. It
compares the elements from the two subarrays and places them in the correct order in a temporary
array.
● After the two subarrays are merged, the elements in the temporary array are copied back to the
original array.

3.6.2 Pseudo code:


1 Function mergeSort(arr, n)
2 mergeSort_I(arr, 0, n - 1)
3 end
4
5  Function mergeSort_I(arr, l, r)
6      if l >= r then
7          return
8      end
9      mid ← l + (r - l) / 2
10     mergeSort_I(arr, l, mid)
11     mergeSort_I(arr, mid + 1, r)
12     merge(arr, l, mid, r)
13 end
14
15 Function merge(arr, l, mid, r)
16     left_size ← mid - l + 1
17     right_size ← r - mid
18     leftArr ← new array of size left_size
19     rightArr ← new array of size right_size
20     for i ← 0 to left_size - 1 do
21         leftArr[i] ← arr[l + i]
22     end
23     for j ← 0 to right_size - 1 do
24         rightArr[j] ← arr[mid + 1 + j]
25     end
26     left_idx ← 0, right_idx ← 0, merged_idx ← l
27     while left_idx < left_size and right_idx < right_size do
28         if leftArr[left_idx] <= rightArr[right_idx] then
29             arr[merged_idx] ← leftArr[left_idx]
30             left_idx ← left_idx + 1
31         else
32             arr[merged_idx] ← rightArr[right_idx]
33             right_idx ← right_idx + 1
34         end
35         merged_idx ← merged_idx + 1
36     end
37     while left_idx < left_size do
38         arr[merged_idx] ← leftArr[left_idx]
39         left_idx ← left_idx + 1
40         merged_idx ← merged_idx + 1
41     end
42     while right_idx < right_size do
43         arr[merged_idx] ← rightArr[right_idx]
44         right_idx ← right_idx + 1
45         merged_idx ← merged_idx + 1
46     end
47 end

3.6.3 Illustration:
Step | Array | Annotations
0 | {5, 2, 4, 0, 6, 1} |
1 | {5, 2, 4}, {0, 6, 1} | Divide the array into two halves
2 | {5, 2}, {4}, {0, 6}, {1} | Divide each half into two halves
3 | {5}, {2}, {4}, {0}, {6}, {1} | Divide until single elements remain
4 | {2, 5}, {4}, {0, 6}, {1} | Merge pairs of subarrays
5 | {2, 4, 5}, {0, 1, 6} | Merge pairs of subarrays
6 | {0, 1, 2, 4, 5, 6} | Merge the two halves
3.6.4 Complexity

Space:
● Each merge step allocates temporary arrays to copy data into, so the space complexity is O(n)
in all cases.

Time:

Algorithm | Best-case | Worst-case | Stable

Merge Sort | O(n log n) | O(n log n) | Yes

● Basic operations: two recursive calls (mergeSort_I) plus one merge per call.
● The number of basic operations satisfies:

  C(n) = 2C(n/2) + C_merge(n) for n > 1, C(1) = 0

● The number of key comparisons in merge depends on the contents of the two subarrays. In the
worst case, every element of the array is compared at least once (alternating elements stored in
the left and right subarrays):

  C_merge(n) = n - 1

● According to the Master Theorem, C(n) ∈ O(n log n).
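The bound can also be seen directly by unfolding the recurrence (a sketch assuming n is a power of 2):

  C(n) = 2C(n/2) + (n - 1)
       = 4C(n/4) + (n - 2) + (n - 1)
       = ...
       = 2^k · C(n/2^k) + ∑_{t=0}^{k-1} (n - 2^t)

After k = log₂ n levels the recursion bottoms out at C(1) = 0, giving C(n) = n·log₂ n - (n - 1), which is in O(n log n).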



3.7 Heap Sort:

3.7.1 Idea:
● The heap sort algorithm is a transform-and-conquer sorting algorithm: it transforms the
array into a heap and exploits the heap property to sort.
● The main idea of Heap Sort is to use the Max-Heap data structure to sort the input array.
● The algorithm first builds a Max-Heap from the input array using the MaxHeapify
function.
● In the MaxHeapify function, the algorithm ensures that the root of the subtree at index i is
the largest element by comparing it with its left and right children and swapping if
necessary.
● After building the Max-Heap, the algorithm repeatedly extracts the maximum element
(the root of the heap) and places it at the end of the sorted portion of the array.
● This process is repeated until the entire array is sorted.

3.7.2 Pseudo code:


1 Function HeapSort(arr)
2 n = length(arr)
3 // Build Max Heap
4 for i from floor(n/2) - 1 to 0
5 MaxHeapify(arr, i, n)
6 end
7
8 // Second Stage
9 for i from n-1 to 1
10 swap(arr[0], arr[i])
11 MaxHeapify(arr, 0, i)
12 end
13 end
14
15 Function MaxHeapify(arr, i, n)
16 left = 2*i + 1
17 right = 2*i + 2
18 largest = i
19 if left < n and arr[left] > arr[largest] then
20 largest = left
21 end
22 if right < n and arr[right] > arr[largest] then
23 largest = right
24 end
25 if largest != i then
26 swap(arr[i], arr[largest])
27 MaxHeapify(arr, largest, n)
28 end
29 end

3.7.3 Illustration:
Node Array Annotations

2 {5, 2, 4, 0, 6, 1, 2} Build the Heap, starting at index 2 (n/2 - 1). 4 is at the right
position.
1 {5, 2, 4, 0, 6, 1, 2} 2 is smaller than 6, so 2 will be swapped with 6.

1 {5, 6, 4, 0, 2, 1, 2} 2 does not have any leaf, no swap.

0 {5, 6, 4, 0, 2, 1, 2} 5 is smaller than 6, so 5 will be swapped with 6.

0 {6, 5, 4, 0, 2, 1, 2} 5 is at right position. Extract the maximum element and rebuild the
Heap.
0 {2, 5, 4, 0, 2, 1 | 6} 2 is smaller than 5 so 2 will be swapped with 5.

0 {5, 2, 4, 0, 2, 1 | 6} 2 is at right position. Extract the maximum element and rebuild the
Heap.
0 {1, 2, 4, 0, 2 | 5, 6} 1 is smaller than 4 so 1 will be swapped with 4.

0 {4, 2, 1, 0, 2 | 5, 6} 1 does not have any leaf, no swap. Extract the maximum element and
rebuild the Heap.
0 {2, 2, 1, 0 | 4, 5, 6} 2 is at the right position so no swap. Extract the maximum element
and rebuild the Heap.
0 {0, 2, 1 | 2, 4, 5, 6} 0 is smaller than 2 so 0 will be swapped with 2.

0 {2, 0, 1 | 2, 4, 5, 6} 0 does not have any leaf, no swap. Extract the maximum element and
rebuild the Heap.
0 {1, 0 | 2, 2, 4, 5, 6} 1 is at the right position, no swap. Extract the maximum element and
rebuild the Heap.
0 {0 | 1, 2, 2, 4, 5, 6} Extract the maximum element and rebuild the Heap.

0 {0, 1, 2, 2, 4, 5, 6}
3.7.4 Complexity:

Space:
The algorithm does not require extra arrays, so the space complexity is Θ(1) in all cases.

Time:

Algorithm | Best-case | Worst-case | Stable

Heap Sort | O(n log n) | O(n log n) | No


● The number of times the basic operation is executed depends on n and the nature of the input,
but the worst-case and best-case complexities of heap sort are both O(n log n).
● The heap construction stage (MaxHeapify calls while building the max-heap): O(n log n).
● The second stage: each of the n - 1 calls to MaxHeapify costs O(log n), so the stage takes
(n - 1) · O(log n) ∈ O(n log n).
● For both stages, we get O(n log n) + O(n log n) = O(n log n).
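As a cross-check, the two stages map directly onto the C++ standard library (an illustrative sketch, not the project's implementation):

#include <algorithm>
#include <vector>

void heapSortStd(std::vector<int>& a) {
    std::make_heap(a.begin(), a.end()); // stage 1: build the max-heap
    std::sort_heap(a.begin(), a.end()); // stage 2: repeatedly pop the maximum, O(n log n)
}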
3.8 Quick Sort:

3.8.1 Idea:
● The quick sort algorithm is a divide-and-conquer sorting algorithm.
● Select a pivot element, which is typically the middle element or the first/last element of the
current partition.
● Partition the array into two parts: one part with elements less than the pivot, and the other part
with elements greater than or equal to the pivot.
● Recursively apply the quick sort algorithm to the two partitions.
● For small partitions (less than 10 elements), use insertion sort instead of recursive quick sort to
improve efficiency.

3.8.2 Pseudo code:


1  Function Sort_First_Middle_Last(arr, left, right)
2      mid ← (left + right) / 2
3      if arr[left] > arr[mid] then
4          Swap(arr[left], arr[mid])
5      end
6      if arr[mid] > arr[right] then
7          Swap(arr[mid], arr[right])
8      end
9      if arr[left] > arr[mid] then
10         Swap(arr[left], arr[mid])
11     end
12     return mid
13 end
14
15 Function Partition(arr, left, right)
16     pivot ← Sort_First_Middle_Last(arr, left, right)
17     Swap(arr[pivot], arr[right - 1])
18     pivot ← right - 1
19     left ← left + 1
20     right ← right - 2
21     while left <= right do
22         while arr[pivot] < arr[right] and left <= right do
23             right ← right - 1
24         end
25         while arr[pivot] > arr[left] and left <= right do
26             left ← left + 1
27         end
28         if left <= right then
29             Swap(arr[right], arr[left])
30             left ← left + 1
31             right ← right - 1
32         end
33     end
34     Swap(arr[pivot], arr[left])
35     return left
36 end
37
38 Function QS_Recursion(arr, left, right)
39     if right - left + 1 >= 10 then
40         pivot ← Partition(arr, left, right)
41         QS_Recursion(arr, left, pivot - 1)
42         QS_Recursion(arr, pivot + 1, right)
43     else
44         // Small partitions: insertion sort
45         for i ← left + 1 to right do
46             key ← arr[i]
47             j ← i - 1
48             while j >= left and arr[j] > key do
49                 arr[j + 1] ← arr[j]
50                 j ← j - 1
51             end
52             arr[j + 1] ← key
53         end
54     end
55 end
56
57 Function QuickSort(arr, n)
58     QS_Recursion(arr, 0, n - 1)
59 end
3.8.3 Illustration:
Step | Array | Annotations
0 | {5, 2, 4, 0, 9, 3, 7, 1, 8, 5, 10, 8, 1, 6, 1, 2} | mid = (first (0) + last (15)) / 2 = 7. The median of arr[0] = 5, arr[7] = 1 and arr[15] = 2 is 2, so 2 becomes the pivot.
1 | {1, 1, 1, 0}, 2, {3, 7, 4, 8, 5, 10, 8, 2, 6, 5, 9} | The array is partitioned around the pivot 2.
2 | {0, 1, 1, 1}, 2, {3, 7, 4, 8, 5, 10, 8, 2, 6, 5, 9} | The left sub-array has fewer than 10 elements, so it is sorted by insertion sort. For the right sub-array, first = index 5 (value 3), mid = index 10 (value 10), last = index 15 (value 9); the median is 9, so 9 becomes the pivot.
3 | {0, 1, 1, 1}, 2, {3, 7, 4, 8, 5, 5, 8, 2, 6, 10, 9} | Partitioning the right sub-array around 9: swap (10, 5).
4 | {0, 1, 1, 1}, 2, {3, 7, 4, 8, 5, 5, 8, 2, 6}, 9, {10} | The right sub-array is partitioned around the pivot 9.
5 | {0, 1, 1, 1}, 2, {2, 3, 4, 5, 5, 6, 7, 8, 8}, 9, {10} | The remaining sub-array has fewer than 10 elements, so it is sorted by insertion sort.
6 | {0, 1, 1, 1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10} | The array is sorted.
3.8.4 Complexity:

Space:
The algorithm does not require extra arrays, but its recursion stack uses O(log n) space on average and O(n) in the worst case.

Time:

Algorithm | Best-case | Worst-case | Stable

Quick Sort | O(n log n) | O(n²) | No


● Basic operations: two recursive QS_Recursion calls plus one Partition per call.
● The number of key comparisons in the Partition function depends only on n.
● The depth of the QS_Recursion calls depends on the pivot.
● The sum of basic operations is calculated for two cases:
● In the best case, every pivot splits its subarray in the middle:

  C(n) = 2C(n/2) + (n - 1) for n > 1, C(1) = 0, which is in O(n log n) by the Master Theorem

● In the worst case, one of the two subarrays is always empty:

  C(n) = (n - 1) + (n - 2) + (n - 3) + ... + 1 = n(n-1)/2 ∈ O(n²)

Optimize: Smart Pivot Selection


● Instead of choosing a random pivot or using the first/last element, select the pivot using other
strategies such as:
○ choosing the median of three elements (first, middle, last);
○ the median-of-medians algorithm.
● Choosing a good pivot helps balance the recursion tree, limiting the recursion depth and
avoiding stack overflow (see the C++ sketch below).
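A minimal C++ sketch of the median-of-three strategy (our illustration; the pseudocode's Sort_First_Middle_Last plays the same role):

#include <utility> // std::swap

// Order arr[left], arr[mid], arr[right]; afterwards arr[mid] holds their median.
int medianOfThree(int arr[], int left, int right) {
    int mid = left + (right - left) / 2;
    if (arr[left] > arr[mid])  std::swap(arr[left], arr[mid]);
    if (arr[mid] > arr[right]) std::swap(arr[mid], arr[right]);
    if (arr[left] > arr[mid])  std::swap(arr[left], arr[mid]);
    return mid;
}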
3.9 Counting Sort:

3.9.1 Idea:
● The counting sort algorithm is a non-comparative sorting algorithm.
● Find the maximum element in the input array.
● Create a frequency array to store the count of each unique element in the input array.
● Modify the frequency array so that each index stores the sum of the previous counts
(cumulative counts).
● Build the output array by iterating the input array in reverse order and using the frequency array
to decide the position of each element.
● Lastly, copy the sorted elements from the output array to the original input array.

3.9.2 Pseudo code:


1  Function CountingSort(a, n)
2      // Find the index of the maximum element in the array
3      maxIdx ← 0
4      for i ← 1 to n - 1 do
5          if a[i] > a[maxIdx] then
6              maxIdx ← i
7          end
8      end
9      temp ← new vector<int>(n, 0)
10     // Create a frequency array storing the count of each value
11     frequency ← new vector<int>(a[maxIdx] + 1, 0)
12     for i ← 0 to n - 1 do
13         frequency[a[i]] ← frequency[a[i]] + 1
14     end
15     // Compute the cumulative frequency
16     for i ← 1 to a[maxIdx] do
17         frequency[i] ← frequency[i] + frequency[i - 1]
18     end
19     // Place the elements in sorted order (the reverse pass keeps the sort stable)
20     for i ← n - 1 down to 0 do
21         temp[frequency[a[i]] - 1] ← a[i]
22         frequency[a[i]] ← frequency[a[i]] - 1
23     end
24     // Overwrite the original array
25     for i ← 0 to n - 1 do
26         a[i] ← temp[i]
27     end
28 end
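For non-negative integers, the whole procedure fits in a few lines of C++ (a sketch of ours mirroring the pseudocode):

#include <vector>

void countingSortSketch(std::vector<int>& a) {
    if (a.empty()) return;
    int maxVal = a[0];
    for (int x : a) if (x > maxVal) maxVal = x;               // find the maximum
    std::vector<int> freq(maxVal + 1, 0), out(a.size());
    for (int x : a) ++freq[x];                                // count each value
    for (int v = 1; v <= maxVal; ++v) freq[v] += freq[v - 1]; // cumulative counts
    for (int i = (int)a.size() - 1; i >= 0; --i)              // reverse pass keeps it stable
        out[--freq[a[i]]] = a[i];
    a = out;
}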
3.9.3 Complexity:

Space:
Counting sort requires an extra frequency array covering the value range (0 up to the largest element)
plus an output array of size n, so the space complexity is O(n + k), where k is the largest element.

Time:

Algorithm | Best-case | Worst-case | Stable

Counting Sort | O(n + k) | O(n + k) | Yes

● Basic operation: the assignments inside the loops.
● The number of basic operations depends on the array size n and the maximum value k of the array.
● The total number of basic operations executed is:

  C(n, k) = (k + 1) + n + k + n + n = 2k + 3n + 1

● Order of growth: O(n + k)


3.10 Radix Sort:

3.10.1 Idea:
● The radix sort algorithm is a non-comparative sorting algorithm.
● Find the maximum element in the input array to know the number of digits.
● Do counting sort for every digit. Instead of moving digits, exp is changed. exp is 1 (one's place),
10 (ten's place), 100 (hundred's place), and so on.
● Repeat the counting-sort pass for each digit position until the most significant digit of the maximum element has been processed.

3.10.2 Pseudo code:


1  Function RadixSort(arr, n)
2      max ← GetMax(arr, n)
3      exp ← 1
4      while max / exp > 0 do
5          frequency ← new vector<int>(10, 0)
6          temp ← new vector<int>(n, 0)
7          for i ← 0 to n - 1 do
8              frequency[(arr[i] / exp) % 10] ← frequency[(arr[i] / exp) % 10] + 1
9          end
10         for i ← 1 to 9 do
11             frequency[i] ← frequency[i] + frequency[i - 1]
12         end
13         for i ← n - 1 down to 0 do
14             temp[frequency[(arr[i] / exp) % 10] - 1] ← arr[i]
15             frequency[(arr[i] / exp) % 10] ← frequency[(arr[i] / exp) % 10] - 1
16         end
17         for i ← 0 to n - 1 do
18             arr[i] ← temp[i]
19         end
20         exp ← exp * 10
21     end
22 end
23
24 Function GetMax(arr, n)
25     maxIdx ← 0
26     for i ← 1 to n - 1 do
27         if arr[i] > arr[maxIdx] then
28             maxIdx ← i
29         end
30     end
31     return arr[maxIdx]
32 end
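An equivalent C++ sketch (our rendering, assuming non-negative integers), with one stable counting pass per decimal digit:

#include <algorithm>
#include <vector>

void radixSortSketch(std::vector<int>& a) {
    if (a.empty()) return;
    int maxVal = *std::max_element(a.begin(), a.end());
    std::vector<int> out(a.size());
    for (long long exp = 1; maxVal / exp > 0; exp *= 10) {
        int freq[10] = {0};
        for (int x : a) ++freq[(x / exp) % 10];              // count digit occurrences
        for (int d = 1; d < 10; ++d) freq[d] += freq[d - 1]; // cumulative counts
        for (int i = (int)a.size() - 1; i >= 0; --i)         // stable placement
            out[--freq[(a[i] / exp) % 10]] = a[i];
        a = out;
    }
}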
3.10.3 Illustration:
Step | Array | Annotations
0 | {18, 29, 59, 30, 99, 17} | The largest element (99) has 2 digits, so two passes are needed
1 | {30, 17, 18, 29, 59, 99} | The array is sorted by the last digit. The relative order of elements with the same last digit is kept
2 | {17, 18, 29, 30, 59, 99} | The array is sorted by the second-to-last digit. The relative order of elements with the same digit is kept; the array is now sorted
3.10.4 Complexity:

Space:
Radix sort requires extra arrays for the digit buckets used during each pass and for the sorted output,
so the space complexity of radix sort is O(n + k), where k = 10 is the number of digit buckets.

Time:

Algorithm | Best-case | Worst-case | Stable

Radix Sort | O(n) | O(d·(n + k)) | Yes


● The time complexity of radix sort depends on the number of digits (d) in the largest number.
● In the worst case, the time complexity is O(d·(n + k)); if d is constant, this reduces to O(n). When
elements have many digits, radix sort must process each digit of each element.
● In the best case, the time complexity is O(n), when all elements of the array have only one digit.
3.11 Flash Sort:

3.11.1 Idea:
● Find the maximum (max) and minimum (min) values in the array.
● Divide the array into m buckets, where m = 0.45 * n (n is the size of the array).
● Distribute the elements of the array into the buckets based on the formula: k = floor((m-1) * (a[i]
- min) / (max - min)), where i is the index of the element.
● Sort the elements within each bucket using a different sorting algorithm (here it's Insertion Sort).
● Merge the sorted buckets to create the final sorted array.
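The classification formula in isolation, as a small C++ sketch (names are ours): for example, with min = 1, max = 9 and m = 4 buckets, the value 5 maps to bucket (4 - 1)·(5 - 1)/(9 - 1) = 1.

// Bucket index for value v; elements equal to max land in bucket m - 1.
int bucketIndex(long long v, long long minV, long long maxV, int m) {
    return (int)(((long long)(m - 1) * (v - minV)) / (maxV - minV));
}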

3.11.2 Pseudo code:


1  Function flashSort(arr, n)
2      min_arr ← arr[0], max_arr ← arr[0]
3      for i ← 1 to n - 1 do
4          if arr[i] < min_arr then
5              min_arr ← arr[i]
6          else if arr[i] > max_arr then
7              max_arr ← arr[i]
8          end
9      end
10     if max_arr = min_arr then
11         return
12     end
13     m ← 0.45 * n
14     if m <= 2 then
15         m ← 2
16     end
17     L ← array of m elements, all initialized to 0
18     // Count the number of elements falling into each bucket
19     for i ← 0 to n - 1 do
20         k ← (m - 1) * (arr[i] - min_arr) / (max_arr - min_arr)
21         L[k] ← L[k] + 1
22     end
23     // Cumulative counts: L[k] becomes the upper boundary of bucket k
24     for k ← 1 to m - 1 do
25         L[k] ← L[k] + L[k - 1]
26     end
27     // Permutation phase: move every element into its bucket
28     i ← 0, count ← 0, k ← m - 1
29     while count < n do
30         while i >= L[k] do
31             i ← i + 1
32             k ← (m - 1) * (arr[i] - min_arr) / (max_arr - min_arr)
33         end
34         flash ← arr[i]
35         while i != L[k] do
36             k ← (m - 1) * (flash - min_arr) / (max_arr - min_arr)
37             swap(arr[L[k] - 1], flash)
38             L[k] ← L[k] - 1
39             count ← count + 1
40         end
41     end
42     // Sort the elements inside each bucket with insertion sort
43     for k ← 1 to m - 1 do
44         for i ← L[k - 1] + 1 to L[k] - 1 do
45             selected ← arr[i]
46             j ← i - 1
47             while j >= 0 and arr[j] > selected do
48                 arr[j + 1] ← arr[j]
49                 j ← j - 1
50             end
51             arr[j + 1] ← selected
52         end
53     end
54 end
3.11.3 Complexity:

Space:
Flash sort requires an extra array of m bucket counters, where m = 0.45 × n, so the space complexity of
flash sort is O(m), which is O(n).

Time:

Algorithm | Best-case | Worst-case | Stable

Flash Sort | O(n) | O(n²) | No

● The best-case time complexity is O(n).

● However, the worst-case time complexity can be O(n²). This happens when all the elements
fall into a single bucket, which reduces the algorithm to the worst case of insertion sort.
4. Experimental results and comments:
4.1 Sorted data:
Table 4.1a: Performance metrics of sorting algorithms for sorted data (part 1)

Sorted data (each cell: number of comparisons / running time)

Algorithm | n = 10000 | n = 30000 | n = 50000
Selection sort | 100019998 / 144 | 900059998 / 1227 | 2500099998 / 3882
Insertion sort | 29998 / 0.046 | 89998 / 0.14 | 149998 / 0.22
Bubble sort | 20001 / 0.05 | 60001 / 0.11 | 100001 / 0.17
Shaker sort | 20001 / 0.03 | 60001 / 0.09 | 100001 / 0.17
Shell sort | 360042 / 0.62 | 1170050 / 1.57 | 2100049 / 2.83
Heap sort | 670329 / 1.73 | 2236648 / 5.65 | 3925351 / 9.58
Merge sort | 465243 / 0.78 | 1529915 / 2.76 | 2672827 / 4.56
Quick sort | 203112 / 0.31 | 732326 / 1.09 | 1324613 / 2.04
Counting sort | 60002 / 0.15 | 180002 / 0.63 | 300002 / 1.09
Radix sort | 140056 / 0.63 | 510070 / 2.04 | 850070 / 3.3
Flash sort | 114302 / 0.31 | 342902 / 0.78 | 571502 / 1.57
Table 4.1b: Performance metrics of sorting algorithms for sorted data (part 2)

Sorted data (each cell: number of comparisons / running time)

Algorithm | n = 100000 | n = 300000 | n = 500000
Selection sort | 10000199998 / 14957 | 90000599998 / 134853 | 2.50001E+11 / 352155
Insertion sort | 299998 / 0.47 | 899998 / 1.57 | 1499998 / 2.55
Bubble sort | 200001 / 0.34 | 600001 / 1.17 | 1000001 / 1.98
Shaker sort | 200001 / 0.35 | 600001 / 1.38 | 1000001 / 2.12
Shell sort | 4500051 / 5.97 | 15300061 / 21.06 | 25500058 / 35.69
Heap sort | 8365080 / 19.97 | 27413230 / 67.96 | 47404886 / 116.99
Merge sort | 5645659 / 9.27 | 18345947 / 30.81 | 31517851 / 53.13
Quick sort | 2849191 / 4.56 | 9098354 / 13.21 | 16196654 / 24.12
Counting sort | 600002 / 2.04 | 1800002 / 6.23 | 3000002 / 10.24
Radix sort | 1700070 / 6.59 | 6000084 / 24.9 | 10000084 / 41.61
Flash sort | 1143002 / 2.98 | 3429002 / 9.69 | 5715002 / 16.17
Based on the statistical results presented above, we can construct a bar chart to compare the
number of comparisons performed by each algorithm, as illustrated below.

Figure 4.1a: Algorithm comparison counts in sorted data chart

Analysis:

- Selection sort stands out with a significantly higher number of comparisons, attributable to its
input-independent O(n²) time complexity, which results in the same number of comparisons in all cases.

- In contrast, other O(n²) algorithms such as bubble sort and shaker sort have early termination
mechanisms that reduce unnecessary comparisons and swaps. They can quickly detect when no
elements need reordering, terminating early and achieving optimal performance on sorted data.

- Insertion sort and shaker sort are also O(n²) algorithms that are efficient in this scenario,
because their best case of O(n) occurs when the data is already sorted.
To enhance clarity in presenting the other algorithms, selection sort is excluded from the analysis in
the following figures.

Figure 4.1b: Algorithm comparison counts in sorted data chart (without selection sort)

Analysis:

- All four O(n log n) algorithms have significantly higher comparison counts than the rest here,
since they lack early termination mechanisms that would pay off in this specific scenario.

- Although heap sort maintains O(n log n) complexity in all cases, it still has the highest comparison
count in this scenario, due to its rigorous requirement to restructure the entire data to maintain the
heap property, regardless of whether the data is ordered or not.
The following figures are line graphs illustrating the running times of each sorting algorithm when
processing sorted input data.

Figure 4.2a: Algorithm running times in sorted data graph

Analysis

- The elevated running time observed for selection sort mirrors its high comparison count, both
attributable to the algorithm's input-independent O(n²) time complexity, as mentioned above.

To enhance clarity in presenting other algorithms, selection sort is excluded from the analysis in
the following figures.

Figure 4.2b: Algorithm running times in sorted data graph (without selection sort)
Analysis:

- After removing selection sort, heap sort exhibits the highest running time, for the same reason
given in the comparison-count section above.

- Bubble sort and insertion sort are extremely fast in this scenario thanks to their adaptive behavior
on data that is already partially or fully sorted, as mentioned above.

- Shaker sort achieves its fastest performance on sorted input at all data sizes, in contrast to heap
sort, thanks to its bidirectional sorting strategy and the fact that it is an improved version of bubble
sort. Unlike bubble sort, which terminates when no swaps are made from left to right, shaker sort
alternates passes from left to right and right to left, stopping when no swaps are made in either
direction, making it at least as efficient as bubble sort on sorted input.
4.2 Nearly sorted data
Table 4.2a: Performance metrics of sorting algorithms for nearly sorted data (part 1)

Nearly sorted data (each cell: number of comparisons / running time)

Algorithm | n = 10000 | n = 30000 | n = 50000
Selection sort | 100019998 / 137 | 900059998 / 1214 | 2500099998 / 4052
Insertion sort | 192106 / 0.28 | 454798 / 0.69 | 503230 / 0.77
Bubble sort | 100009999 / 141 | 900029999 / 1210 | 2500049999 / 3380
Shaker sort | 87566160 / 126 | 899923840 / 1276 | 2179367720 / 3116
Shell sort | 409433 / 0.62 | 1333868 / 2.35 | 2293924 / 3.45
Heap sort | 669983 / 1.72 | 2236600 / 5.5 | 3925367 / 9.91
Merge sort | 499057 / 0.78 | 1622420 / 2.58 | 2784047 / 4.71
Quick sort | 203112 / 0.33 | 732326 / 0.93 | 1324610 / 1.89
Counting sort | 60002 / 0.15 | 180002 / 0.62 | 300002 / 1.1
Radix sort | 140056 / 0.46 | 510070 / 2.04 | 850070 / 3.78
Flash sort | 114281 / 0.31 | 342885 / 0.94 | 571477 / 1.61

Table 4.2b: Performance metrics of sorting algorithms for nearly sorted data (part 2)

Nearly sorted data (each cell: number of comparisons / running time)

Algorithm | n = 100000 | n = 300000 | n = 500000
Selection sort | 10000199998 / 14630 | 90000599998 / 130739 | 2.50001E+11 / 367284
Insertion sort | 765346 / 1.16 | 1225186 / 2.02 | 1876706 / 3.19
Bubble sort | 10000099999 / 17671 | 90000299999 / 138565 | 2.5E+11 / 609686
Shaker sort | 5310004776 / 7931 | 17089799601 / 25921 | 24365549880 / 36614
Shell sort | 4668570 / 7.07 | 15481810 / 22.21 | 25667931 / 37.11
Heap sort | 8364662 / 20.49 | 27413269 / 67.64 | 47404820 / 116.52
Merge sort | 5742963 / 9.58 | 18437021 / 30.5 | 31621456 / 53.52
Quick sort | 2849191 / 4.24 | 9098354 / 13.36 | 16196654 / 23.75
Counting sort | 600002 / 2.04 | 1800002 / 5.12 | 3000002 / 10.37
Radix sort | 1700070 / 6.53 | 6000084 / 24.85 | 10000084 / 41.51
Flash sort | 1142986 / 3.29 | 3428980 / 9.59 | 5714982 / 15.88
Based on the statistical results presented above, we can construct a bar chart to compare the
number of comparisons performed by each algorithm, as illustrated below.

Figure 4.3a: Algorithm comparison counts in nearly sorted Data chart

Analysis:

- It is no surprise that selection sort keeps its reputation as the least efficient algorithm here.
Notably, bubble sort shares this trait when handling nearly sorted input.

- This extremely high number of comparisons is explained by the fact that bubble sort
compares two elements at a time throughout its entire O(n²) loop, irrespective of the
elements' order. Although it possesses an early termination mechanism, the mechanism is not very
effective, as it requires the array to be fully sorted before termination can occur.

- Shaker sort, on the other hand, employs a more effective termination mechanism thanks to its
bidirectional sorting strategy. This approach allows it to detect sorted segments in nearly sorted
data more easily, resulting in significantly lower comparison counts.

- Even so, this termination mechanism is less effective on nearly sorted data than on fully sorted data,
so shaker sort still exhibits considerably higher comparison counts than the algorithms with
O(n log n) and O(n) complexities.
- The only O(n²) algorithm that stands out in this context is insertion sort: its comparison count
is so small that it is barely visible when the other O(n²) algorithms are included in this comparative
chart, indicating a notable efficiency advantage for insertion sort.
To enhance clarity in presenting the other algorithms, selection sort, bubble sort, and shaker sort are
excluded from the analysis in the following figures.

Figure 4.3b: Algorithm comparison counts in nearly sorted Data chart (without selection,
bubble and shaker sort)

- Insertion sort not only demonstrates the lowest comparison count among the O(n²) algorithms
but also outperforms all other algorithms in this scenario. This can be attributed to its insertion
method, which needs very few comparisons and shifts to insert each element into the sorted portion
when the data is nearly sorted. In such scenarios, insertion sort can even run in O(n) time.

- Heap sort's status as the algorithm with the highest comparison count can be explained in the
same way as its behavior on sorted input data.
The following figures are line graphs illustrating the running times of each sorting algorithm
when processing nearly sorted input data.

Figure 4.4a: Algorithm running times in nearly sorted data graph

Selection sort, bubble sort, and shaker sort are excluded from the analysis to obtain the
following figure.

Figure 4.4b: Algorithm running times in nearly sorted data graph (without selection, bubble and
shaker sort)

Analysis:
- The efficiency of the sorting algorithms in terms of running time closely reflects their comparison
counts, with selection sort, bubble sort, and shaker sort dominating the graphs that
include them. Excluding these algorithms highlights heap sort's persistently inefficient
behavior.

- This suggests that insertion sort and heap sort remain the fastest and slowest algorithms,
respectively, whether the input is nearly sorted or fully sorted.
4.3 Reverse sorted data
Table 4.3a: Performance metrics of sorting algorithms for reverse sorted data (part 1)

Reverse sorted data (each cell: number of comparisons / running time)

Algorithm | n = 10000 | n = 30000 | n = 50000
Selection sort | 100019998 / 125 | 900059998 / 1222 | 2500099998 / 4961
Insertion sort | 100009999 / 125 | 900029999 / 1194 | 2500049999 / 4557
Bubble sort | 100019998 / 282 | 900059998 / 2553 | 2500099998 / 7293
Shaker sort | 100015000 / 283 | 900045000 / 2571 | 2500075000 / 7365
Shell sort | 475175 / 0.78 | 1554051 / 2.46 | 2844628 / 4.66
Heap sort | 606771 / 1.58 | 2063324 / 5.36 | 3612724 / 9.23
Merge sort | 466442 / 0.79 | 1543466 / 2.36 | 2683946 / 4.42
Quick sort | 293854 / 0.46 | 1016716 / 1.41 | 1794416 / 2.67
Counting sort | 60002 / 0.14 | 180002 / 0.62 | 300002 / 1.09
Radix sort | 140056 / 0.47 | 510070 / 2.04 | 850070 / 3.32
Flash sort | 100704 / 0.38 | 302104 / 0.94 | 503504 / 1.41

Table 4.3b: Performance metrics of sorting algorithms for reverse sorted data (part 2)

Reverse sorted data (each cell: number of comparisons / running time)

Algorithm | n = 100000 | n = 300000 | n = 500000
Selection sort | 10000199998 / 14408 | 90000599998 / 125176 | 2.50001E+11 / 491254
Insertion sort | 10000099999 / 14087 | 90000299999 / 123956 | 2.5E+11 / 440740
Bubble sort | 10000199998 / 28740 | 90000599998 / 340185 | 2.50001E+11 / 955477
Shaker sort | 10000150000 / 28685 | 90000450000 / 353376 | 2.50001E+11 / 991395
Shell sort | 6089190 / 9.74 | 20001852 / 32.18 | 33857581 / 55.67
Heap sort | 7718943 / 19.56 | 25569379 / 66.27 | 44483348 / 114.54
Merge sort | 5667898 / 9.61 | 18408314 / 31.37 | 31836410 / 52.56
Quick sort | 3814050 / 5.68 | 12198880 / 18 | 21473014 / 33.32
Counting sort | 600002 / 2.07 | 1800002 / 6.46 | 3000002 / 10.92
Radix sort | 1700070 / 6.79 | 6000084 / 25.69 | 10000084 / 42.76
Flash sort | 1007004 / 2.84 | 3021004 / 9.3 | 5035004 / 15.35
Based on the statistical results presented above, we can construct a bar chart to compare the
number of comparisons performed by each algorithm, as illustrated below.

Figure 4.5a: Algorithm comparison counts in reverse sorted data chart

All O(n²) algorithms are excluded from the analysis to obtain the following figure.

Figure 4.5b: Algorithm comparison counts in reverse sorted data chart (without O(n²) algorithms)
Analysis:

- The advantage that insertion sort, bubble sort, and shaker sort enjoyed in the previous
scenarios (their early termination mechanisms) disappears with reverse sorted input, because their
efficiency relies heavily on already sorted portions of the data.

- Heap sort still has the highest comparison count among all the O(n log n) and O(n)
algorithms.

- Since insertion sort is no longer exceptionally efficient on reverse sorted input, counting sort
emerges as the algorithm with the lowest comparison count among all algorithms (its count is
always low, but not the lowest in the sorted and nearly sorted scenarios). This is attributed to its
ability to sort without relying on comparisons at all.

The following figures are line graphs illustrating the running times of each sorting algorithm
when processing reverse sorted input data.

Figure 4.6a: Algorithm running times in reverse sorted data graph

All O(n²) algorithms are excluded from the analysis to obtain the following figure.

Figure 4.6b: Algorithm running times in reverse sorted data graph (without O(n²) algorithms)
Analysis:

- Bubble sort and shaker sort rank as the slowest algorithms here, followed by selection sort
and insertion sort, all of which are inefficient O(n²) algorithms in this scenario.

- The efficiency of the other algorithms in terms of running time closely mirrors their comparison
counts relative to each other, with counting sort the fastest among all algorithms and heap sort
the slowest once the O(n²) algorithms are excluded.
4.4 Randomized data
Table 4.4a: Performance metrics of sorting algorithms for randomized data (part 1)

Randomized data (each cell: number of comparisons / running time)

Algorithm | n = 10000 | n = 30000 | n = 50000
Selection sort | 100019998 / 140 | 900059998 / 1259 | 2500099998 / 3525
Insertion sort | 50272632 / 62 | 451465111 / 600 | 1249102870 / 1721
Bubble sort | 99990072 / 347 | 900049800 / 3319 | 2500001405 / 9412
Shaker sort | 100015000 / 269 | 900045000 / 2441 | 2500075000 / 6918
Shell sort | 636227 / 1.7 | 2277704 / 5.9 | 4559070 / 11.3
Heap sort | 637987 / 1.9 | 2150336 / 6.4 | 3772295 / 11.18
Merge sort | 573720 / 1.4 | 1907183 / 4.6 | 3332661 / 8.09
Quick sort | 263968 / 0.8 | 872588 / 2.8 | 1536347 / 4.9
Counting sort | 60002 / 0.15 | 180001 / 0.635 | 282770 / 0.94
Radix sort | 140056 / 0.55 | 510070 / 2.056 | 850070 / 3.14
Flash sort | 96583 / 0.3 | 305972 / 1.1 | 489823 / 1.9

Table 4.4b: Performance metrics of sorting algorithms for randomized data (part 2)

Randomized data (each cell: number of comparisons / running time)

Algorithm | n = 100000 | n = 300000 | n = 500000
Selection sort | 10000199998 / 13805 | 90000599998 / 123306 | 2.50001E+11 / 346689
Insertion sort | 4999348644 / 6890 | 44973708863 / 63757 | 1.24762E+11 / 160667
Bubble sort | 9999984705 / 37314 | 89998116225 / 358214 | 2.49998E+11 / 967177
Shaker sort | 10000150000 / 28608 | 90000450000 / 254611 | 2.50001E+11 / 738371
Shell sort | 10431455 / 24.6 | 34781771 / 125 | 64630026 / 219
Heap sort | 8046043 / 23.9 | 26490065 / 135 | 45971506 / 192
Merge sort | 7065711 / 17.1 | 23082794 / 94 | 39882624 / 130
Quick sort | 3419803 / 10.38 | 11261721 / 31 | 19732260 / 77
Counting sort | 532769 / 1.7 | 1532770 / 4.7 | 2532770 / 15
Radix sort | 1700070 / 6.37 | 5100070 / 20.6 | 8500070 / 41
Flash sort | 939592 / 3.7 | 2920436 / 12.6 | 5062874 / 49
Based on the statistical results presented above, we can construct a bar chart to compare the
number of comparisons performed by each algorithm, as illustrated below.

Figure 4.7a: Algorithm comparison counts in randomized data chart

All O(n²) algorithms are excluded from the analysis to obtain the following figure.

Figure 4.7b: Algorithm comparison counts in randomized data chart (without O(n²) algorithms)
Analysis:

- Just as in the reverse sorted scenario, the O(n²) algorithms exhibit high comparison counts,
due to the input-independent O(n²) time complexity of selection sort and the diminished effect of
the early termination mechanisms of the remaining O(n²) algorithms.

- Among the O(n²) algorithms, insertion sort demonstrates a notably lower comparison count than
the others thanks to its highly adaptive nature, which is particularly effective on sorted and nearly
sorted input. In randomized scenarios, especially with sufficiently large inputs, some segments of
the data may happen to resemble sorted sequences, allowing insertion sort to perform more
efficiently than its counterparts in these cases.

- The algorithm with the lowest comparison count among all algorithms is again counting sort in
this scenario, owing to its ability to sort without relying on comparisons, leveraging direct counting
and indexing techniques instead.

- A notable feature here is that shell sort has the highest comparison count once the O(n²)
algorithms are excluded. This contrasts with the other scenarios, where heap sort typically records
the highest counts. Heap sort's strength lies in maintaining a balanced heap structure, which
guarantees O(n log n) time in all cases. Shell sort, by contrast, benefits from partially sorted data
segments (albeit less so than insertion sort does); on randomized input that benefit largely
disappears, which is what this chart shows.
The following figures are line graphs illustrating the running times of each sorting algorithm when
processing randomized input data.

Figure 4.8a: Algorithm running times in randomized data graph

All O(n²) algorithms are excluded from the analysis to obtain the following figure.

Figure 4.8b: Algorithm running times in randomized data graph (without O(n²) algorithms)
Analysis:

- In this randomized scenario, the O(n²) algorithms show visibly distinct running times from one
another, in contrast to other scenarios where their performance line graphs often overlap.

- Bubble sort is clearly the slowest here, which means selection sort is no longer the least
efficient algorithm in this context.

- Among the O(n²) algorithms, insertion sort is the fastest, for the same reason it has the lowest
comparison count among them.

- The slowest algorithm within an input size of up to 300,000 is heap sort, while beyond this
threshold, shell sort becomes the least efficient.

- Counting sort maintains its superior efficiency, as observed in both reverse sorted and
randomized input scenarios.

4.5 Overall conclusion from the experiment


Based on the statistical experiment and analysis presented above, we can draw the following
conclusions:

- Counting sort is the fastest sorting algorithm, followed by flash sort, across all input order scenarios.
Counting sort maintains its overall speed advantage over flash sort thanks to its stable linear time
complexity of O(n + k) in all cases, whereas flash sort has a more variable time complexity that can
degrade to O(n²) in worst-case scenarios with uneven distributions of elements or input
characteristics that do not match its assumptions.

- While bubble sort may perform quickly on fully sorted input, it generally ranks as the slowest
algorithm and even slower than selection sort overall. Its limited ability to handle partially
ordered data segments means its early termination mechanism offers minimal efficiency gains.
Even in nearly sorted input data scenarios, bubble sort remains among the slowest algorithms.

- In contrast to bubble sort, both shaker sort and insertion sort can capitalize significantly on
partially ordered input data, even when the ordered portion is smaller than in nearly sorted input.
Insertion sort becomes exceptionally efficient in such scenarios, outperforming even counting sort
in terms of runtime efficiency.
5. Project organization & programming notes
Project organization
All the source code files are saved in SOURCE directory:

+ Algorithms.cpp and Algorithms.h: contain the source code of the 11 sorting algorithms, one .cpp file
and one .h file, providing the algorithm implementations for testing and evaluation.
+ DataGenerator.cpp and DataGenerator.h: provide functions to generate array data with different
characteristics. These functions generate randomly ordered, ascending ordered, descending ordered,
and nearly ordered arrays.
+ Helper.cpp and Helper.h: contain support functions needed during the project, including dispatching
sorting algorithms by algorithm id, converting the data type of the algorithm and input order
arguments, printing results, and validating input and files.
+ Experiment.cpp and Experiment.h: contain the five commands and their functions, and write the
experimental results to a file with the data order, data size, sorting algorithm name, running time,
and number of comparisons.
+ In addition, there is a main.cpp file that uses command-line parameters to execute the program.

Programming notes
Some libraries used in this project are:
+ fstream: provides classes and functions for file handling operations, such as opening, reading,
and writing files.
+ string, cstring: used for string-related operations.
+ iomanip, chrono: provide functions and manipulators for formatting output and for measuring time.
+ cmath: provides a wide range of mathematical functions and operations.
+ time.h: provides functions and data types for working with time and date.
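For instance, chrono is what allows running times such as those in section 4 to be measured; a minimal sketch of such a measurement (our function names, not necessarily the project's) might look like:

#include <chrono>

// Run a sorting function on an array and return the elapsed time in milliseconds.
template <typename SortFn>
double timeSortMs(SortFn sort, int arr[], int n) {
    auto start = std::chrono::steady_clock::now();
    sort(arr, n);
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}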

Usage guide:
Open a command prompt and use cd [path to the source folder], then compile the source files into an
executable with: g++ .\*.cpp -o [exe file]
In this command:
• g++: invokes the g++ C++ compiler.
• .\*.cpp: a file pattern matching all the C++ source files in the current directory. The *
character is a wildcard that matches any file name, and .cpp restricts compilation to files with the
.cpp extension.
• -o [exe file]: specifies the name of the output executable file. Replace [exe file] with the
desired name for the executable.
6. List of references:
1. Selection Sort
   ● GeeksforGeeks. (2022, September 1). Selection Sort Algorithm. Retrieved November 5, 2022, from https://www.geeksforgeeks.org/selection-sort/
2. Insertion Sort
   ● GeeksforGeeks. (2022, October 18). Insertion Sort. Retrieved November 1, 2022, from https://www.geeksforgeeks.org/insertion-sort/
   ● GeeksforGeeks. (2022, June 20). Binary Insertion Sort. Retrieved November 2, 2022, from https://www.geeksforgeeks.org/binary-insertion-sort/
3. Bubble Sort
   ● GeeksforGeeks. (2022, November 1). Bubble Sort Algorithm. Retrieved November 5, 2022, from https://www.geeksforgeeks.org/bubble-sort/
4. Shaker Sort
   ● Wikipedia contributors. (2022, October 31). Cocktail shaker sort. Retrieved November 1, 2022, from https://en.wikipedia.org/wiki/Cocktail_shaker_sort
5. Shell Sort
   ● GeeksforGeeks. (2022, October 24). ShellSort. Retrieved October 28, 2022, from https://www.geeksforgeeks.org/shellsort/
   ● RobEdwards. (2016, December 9). Sorts 5 Shell Sort. Retrieved October 28, 2022, from https://www.youtube.com/watch?v=ddeLSDsYVp8
6. Heap Sort
   ● GeeksforGeeks. (2022, September 22). Heap Sort. Retrieved October 23, 2022, from https://www.geeksforgeeks.org/heap-sort/
7. Merge Sort
   ● GeeksforGeeks. (2022, September 23). Merge Sort Algorithm. Retrieved November 5, 2022, from https://www.geeksforgeeks.org/merge-sort/
8. Quick Sort
   ● GeeksforGeeks. (2022, September 27). QuickSort. Retrieved October 23, 2022, from https://www.geeksforgeeks.org/quick-sort/
   ● GeeksforGeeks. (2022, September 26). Hoare's vs Lomuto partition scheme in QuickSort. Retrieved October 23, 2022, from https://www.geeksforgeeks.org/hoares-vs-lomuto-partition-scheme-quicksort/
9. Counting Sort
   ● GeeksforGeeks. (2022, September 27). Counting Sort. Retrieved November 2, 2022, from https://www.geeksforgeeks.org/counting-sort/
10. Radix Sort
   ● GeeksforGeeks. (2022, August 9). Radix Sort. Retrieved November 5, 2022, from https://www.geeksforgeeks.org/radix-sort/
11. Flash Sort
   ● Wikipedia contributors. (2022, June 29). Flashsort. Retrieved October 29, 2022, from https://en.wikipedia.org/wiki/Flashsort
   ● Hawkular. (2015, March 24). Flash sort - a sorting algorithm. Retrieved October 29, 2022, from https://www.youtube.com/watch?v=kdhPnSIT7dY
