PROJECT REPORT
A SURVEY OF SORTING ALGORITHMS
Computer information:
OS: Windows 10
Special Thanks: I would like to extend my heartfelt gratitude to Teacher Bùi Huy Thông
and Teacher Phan Thị Phương Uyên for their invaluable guidance and teachings throughout my
studies. Their dedication to imparting knowledge and fostering my understanding of sorting
algorithms has been instrumental in shaping this analysis report. Thank you for equipping me
with the skills and insights necessary to undertake this project.
2. Introduction:
Sorting is a widely used and highly applicable class of algorithms in everyday life. It can be
simply understood as the process of arranging the elements of a given array in a specific order.
However, this description alone is still too vague. Therefore, this project was carried out with the
goal of gaining a comprehensive overview of sorting algorithms: their working principles,
characteristics, properties, advantages and limitations, as well as the applications of some
common algorithms.
Organizing data through the application of sorting algorithms is a fundamental technique that
enables efficient data retrieval and management. Despite the abundance of sorting algorithms
available, each individual method has its own distinct advantages and disadvantages.
In the context of this project, the team will undertake the implementation and comprehensive
exploration of 11 prominent sorting algorithms: Selection Sort, Insertion Sort, Bubble Sort,
Shaker Sort, Shell Sort, Heap Sort, Merge Sort, Quick Sort, Counting Sort, Radix Sort, and Flash
Sort.
3. Algorithm presentation:
3.1 Bubble Sort:
3.1.1 Idea:
● The bubble sort algorithm is a brute-force sorting algorithm.
● The algorithm iterates through the array, comparing and swapping adjacent
elements if they are not in the correct order.
3.1.2 Pseudo code:
Function BubbleSort(arr)
    n ← length(arr)
    for i ← 0 to n - 2 do
        for j ← 0 to n - i - 2 do
            if arr[j] > arr[j + 1] then
                swap(arr[j], arr[j + 1])
            end
        end
    end
end
3.1.3 Illustration:
3.1.4 Complexity:
Space:
The algorithm does not require extra arrays, so space complexity is always Θ(1) for all cases.
Time:
Worst-case:
● The total number of times the basic operation executes in the worst case is:
$C_{worst}(n) = \sum_{i=0}^{n-2} \sum_{j=0}^{n-i-2} 1 = \sum_{i=0}^{n-2} (n - i - 1) = \frac{n(n-1)}{2} \in O(n^2)$
Best-case:
● In the best case, the input array is already sorted in the desired (ascending) order.
● We can use an additional flag to track whether any swaps occurred during an iteration. If
no swaps occurred, it means the array is already sorted, and we can stop the algorithm.
This improvement makes the best-case time complexity of Bubble Sort O(n).
Function BubbleSort_with_flag(arr)
    n ← length(arr)
    for i ← 0 to n - 2 do
        swapped ← false
        for j ← 0 to n - i - 2 do
            if arr[j] > arr[j + 1] then
                swap(arr[j], arr[j + 1])
                swapped ← true
            end
        end
        if swapped = false then
            break
        end
    end
end
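The flagged version above can be realized as the following minimal C++ sketch; the function name bubbleSortFlag and the raw-array interface are illustrative choices, not necessarily those of the project's Algorithms.cpp:

#include <utility> // std::swap

void bubbleSortFlag(int arr[], int n) {
    for (int i = 0; i < n - 1; ++i) {
        bool swapped = false; // records whether this pass performed any swap
        for (int j = 0; j < n - i - 1; ++j) {
            if (arr[j] > arr[j + 1]) {
                std::swap(arr[j], arr[j + 1]);
                swapped = true;
            }
        }
        if (!swapped) break; // no swaps: the array is already sorted, stop early
    }
}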
3.2 Selection Sort:
3.2.1 Idea:
● The selection sort algorithm is a brute-force sorting algorithm.
● The algorithm iterates through the array, finding the minimum element in the unsorted
part.
● It then swaps the minimum element with the element at the current position.
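The idea above can be expressed as the following minimal C++ sketch (the name selectionSort and the raw-array interface are illustrative, not necessarily those of Algorithms.cpp):

#include <utility> // std::swap

void selectionSort(int arr[], int n) {
    for (int i = 0; i < n - 1; ++i) {
        int minIdx = i; // index of the smallest element in the unsorted part
        for (int j = i + 1; j < n; ++j)
            if (arr[j] < arr[minIdx])
                minIdx = j;
        std::swap(arr[i], arr[minIdx]); // move it to the end of the sorted part
    }
}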
3.2.3 Illustration:
Step Array Annotations
3.2.4 Complexity:
Space:
The algorithm does not require extra arrays, so space complexity is always Θ(1) for all cases.
Time:
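● The number of comparisons of Selection Sort does not depend on the input order; as a standard sketch of the count:
$C(n) = \sum_{i=0}^{n-2} \sum_{j=i+1}^{n-1} 1 = \sum_{i=0}^{n-2} (n - 1 - i) = \frac{n(n-1)}{2} \in \Theta(n^2)$ for all cases.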
3.3 Insertion Sort:
3.3.1 Idea:
● The insertion sort algorithm is a decrease and conquer sorting algorithm.
● The algorithm iterates through the array, starting from the second element.
● For each element, it finds the correct position to insert it into the already sorted subarray
on the left.
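A minimal C++ sketch of this shifting-and-inserting idea follows (the name insertionSort is illustrative, not necessarily the one in Algorithms.cpp):

void insertionSort(int arr[], int n) {
    for (int i = 1; i < n; ++i) {
        int key = arr[i]; // element to insert into the sorted prefix
        int j = i - 1;
        while (j >= 0 && arr[j] > key) { // shift larger elements one slot right
            arr[j + 1] = arr[j];
            --j;
        }
        arr[j + 1] = key; // insert at its correct position
    }
}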
3.3.3 Illustration:
Step Array Annotations
0 {5, |3, 2, 1}
1 {3, 5, |2, 1} Take the element (3), compare it with the sorted portion (5) and swap them.
2 {2, 3, 5, |1} Take the next element (2). Compare it with the sorted portion (3, 5): swap it with (5), giving {3, 2, 5, |1}, then swap it with (3).
3 {1, 2, 3, 5} Take the last element (1). Compare it with the sorted portion (2, 3, 5) and swap it with (5), (3), then (2).
3.3.4 Complexity:
Space:
The algorithm does not require extra arrays, so space complexity is always Θ(1) for all cases.
Time:
● Best case (the array is already sorted): each element is compared once with its left neighbour, so $C_{best}(n) = \sum_{i=1}^{n-1} 1 = n - 1 \in \Theta(n)$.
● Worst case (reverse-sorted input): element i is compared with all i preceding elements, so $C_{worst}(n) = \sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} \in O(n^2)$.
3.4 Shaker Sort:
3.4.1 Idea:
● The main idea of Shaker Sort is to combine Bubble Sort in two directions, from left to
right and from right to left.
● The sorting process works as follows:
● Start from the first element, compare and swap if the next element is larger.
● Then, start from the last element, compare and swap if the previous element is larger.
● Repeat this process until the entire array is sorted.
● Shaker Sort is considered a variation of Bubble Sort, with the advantage of reducing the
number of comparisons and swaps compared to the standard Bubble Sort.
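The bidirectional passes can be sketched in C++ as follows (names and the flag-based stopping rule are illustrative, not necessarily the project's exact implementation):

#include <utility> // std::swap

void shakerSort(int arr[], int n) {
    int left = 0, right = n - 1;
    bool swapped = true;
    while (swapped && left < right) {
        swapped = false;
        for (int i = left; i < right; ++i) // forward pass: bubble the max to the right
            if (arr[i] > arr[i + 1]) { std::swap(arr[i], arr[i + 1]); swapped = true; }
        --right;
        for (int i = right; i > left; --i) // backward pass: bubble the min to the left
            if (arr[i - 1] > arr[i]) { std::swap(arr[i - 1], arr[i]); swapped = true; }
        ++left;
    }
}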
3.4.3 Illustration:
Step Array Annotations
0 {5, 2, 3, 1}
3.4.4 Complexity
Space:
The algorithm does not require extra arrays, so space complexity is always Θ(1) for all cases.
Time:
Worst-case:
● Order of growth: O(n²).
Best-case:
● In the best case, the input array is already sorted in the desired (ascending) order.
● We can use additional flags, as in the BubbleSort_with_flag procedure mentioned earlier. This improvement makes the best-case time complexity of Shaker Sort O(n).
3.5 Shell sort
3.5.1 Idea:
● Shell Sort is an improvement over the simple Insertion Sort algorithm, with better
efficiency for larger arrays
● The main idea of Shell Sort is to compare and swap elements that are a certain gap apart, rather than only adjacent elements.
● The algorithm starts with a large gap (usually n/2, where n is the size of the array) and
gradually reduces the gap down to 1.
● In each iteration, the algorithm uses Insertion Sort on the elements separated by the
current gap.
● When the gap reaches 1, the algorithm has sorted the entire array.
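A minimal C++ sketch of this gapped insertion sort with the n/2 gap sequence described above (the name shellSort is illustrative):

void shellSort(int arr[], int n) {
    for (int gap = n / 2; gap > 0; gap /= 2) { // shrink the gap down to 1
        // gapped insertion sort: each element is compared with the one gap slots back
        for (int i = gap; i < n; ++i) {
            int key = arr[i];
            int j = i;
            while (j >= gap && arr[j - gap] > key) {
                arr[j] = arr[j - gap];
                j -= gap;
            }
            arr[j] = key;
        }
    }
}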
3.5.3 Illustration:
Step Array Annotations
0 {5, 2, 3, 1}
1 {|2, 5, 3, 1}
For the initial gap of 2, compare (5, 2), swap => {2, 5, 3, 1}
2 {2, 5, |1, 3}
Compare (3, 1), swap => {2, 5, 1, 3}
3 {2, |5, 1, 3} Repeat the sorting process with the new gap size of 1, compare
(2, 5), no swap => {2, 5, 1, 3}
4 {1, 2, |5, 3}
Compare (5), (2) with (1), swap => {1, 2, 5, 3}
5 {1, 2, 3, |5}
Compare (1), (2), (5) with (3), swap => {1, 2, 3, 5}
3.5.4 Complexity:
Space:
● The algorithm does not require extra arrays, so space complexity is always Θ(1) for all cases.
Time:
Worst-case:
● The worst case occurs when the elements to be sorted are in reverse order.
● Order of growth: O(n²).
Best-case:
● In the best case, the input array is already sorted in the desired (ascending) order.
● Order of growth: O(n log n).
3.6 Merge Sort
3.6.1 Idea:
● The merge sort algorithm is a divide-and-conquer sorting algorithm.
● The main idea of Merge Sort is to divide the input array recursively into smaller subarrays, sort
them, and then merge them back together.
● The SplitMS function is the recursive part of the algorithm, where it divides the array into two
halves and recursively sorts each half.
● The Merge function is responsible for merging the two sorted subarrays back together. It
compares the elements from the two subarrays and places them in the correct order in a temporary
array.
● After the two subarrays are merged, the elements in the temporary array are copied back to the
original array.
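The SplitMS/Merge structure described above can be sketched in C++ as follows (function names follow the report's description; the exact signatures in Algorithms.cpp may differ):

#include <vector>

// Merge the sorted halves arr[left..mid] and arr[mid+1..right] via a temporary array.
void merge(int arr[], int left, int mid, int right) {
    std::vector<int> temp;
    temp.reserve(right - left + 1);
    int i = left, j = mid + 1;
    while (i <= mid && j <= right)
        temp.push_back(arr[i] <= arr[j] ? arr[i++] : arr[j++]);
    while (i <= mid) temp.push_back(arr[i++]);
    while (j <= right) temp.push_back(arr[j++]);
    for (int k = 0; k < (int)temp.size(); ++k) // copy back to the original array
        arr[left + k] = temp[k];
}

void splitMS(int arr[], int left, int right) { // recursive divide step
    if (left >= right) return;
    int mid = left + (right - left) / 2;
    splitMS(arr, left, mid);
    splitMS(arr, mid + 1, right);
    merge(arr, left, mid, right);
}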
3.6.3 Illustration:
Step Array Annotations
0 {5, 2, 4, 0, 6, 1}
3.6.4 Complexity:
Space:
● Each recursive Merge step creates an extra temporary array to copy data, so the space complexity is O(n) for all cases.
Time:
$C_{merge}(n) = n - 1$
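● Combining this merge cost with the recursive halving gives the standard recurrence (a textbook sketch, stated here for completeness):
$C(n) = 2\,C(n/2) + C_{merge}(n), \quad C(1) = 0 \;\Rightarrow\; C(n) \in \Theta(n \log n)$ for all cases.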
3.7 Heap Sort:
3.7.1 Idea:
● The heap sort algorithm is a transform-and-conquer sorting algorithm (the array is first transformed into a heap).
● The main idea of Heap Sort is to use the Max-Heap data structure to sort the input array.
● The algorithm first builds a Max-Heap from the input array using the MaxHeapify
function.
● In the MaxHeapify function, the algorithm ensures that the root of the subtree at index i is
the largest element by comparing it with its left and right children and swapping if
necessary.
● After building the Max-Heap, the algorithm repeatedly extracts the maximum element
(the root of the heap) and places it at the end of the sorted portion of the array.
● This process is repeated until the entire array is sorted.
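The MaxHeapify-based procedure above can be sketched in C++ as follows (names follow the report's description; the exact code in Algorithms.cpp may differ):

#include <utility> // std::swap

// Sift arr[i] down so that the subtree rooted at i satisfies the max-heap property.
void maxHeapify(int arr[], int n, int i) {
    while (true) {
        int largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && arr[l] > arr[largest]) largest = l;
        if (r < n && arr[r] > arr[largest]) largest = r;
        if (largest == i) break;
        std::swap(arr[i], arr[largest]);
        i = largest; // continue sifting down from the child's position
    }
}

void heapSort(int arr[], int n) {
    for (int i = n / 2 - 1; i >= 0; --i) // build the max-heap bottom-up
        maxHeapify(arr, n, i);
    for (int i = n - 1; i > 0; --i) { // repeatedly extract the maximum
        std::swap(arr[0], arr[i]); // move the root to the sorted suffix
        maxHeapify(arr, i, 0); // restore the heap on the remaining prefix
    }
}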
3.7.3 Illustration:
Node Array Annotations
2 {5, 2, 4, 0, 6, 1, 2} Build the heap, starting at index 2 (n/2 − 1). 4 is already in the right position.
1 {5, 2, 4, 0, 6, 1, 2} 2 is smaller than 6, so 2 is swapped with 6.
0 {6, 5, 4, 0, 2, 1, 2} 5 is in the right position. Extract the maximum element and rebuild the heap.
0 {2, 5, 4, 0, 2, 1 | 6} 2 is smaller than 5, so 2 is swapped with 5.
0 {5, 2, 4, 0, 2, 1 | 6} 2 is in the right position. Extract the maximum element and rebuild the heap.
0 {1, 2, 4, 0, 2 | 5, 6} 1 is smaller than 4, so 1 is swapped with 4.
0 {4, 2, 1, 0, 2 | 5, 6} 1 does not have any children, so no swap. Extract the maximum element and rebuild the heap.
0 {2, 2, 1, 0 | 4, 5, 6} 2 is in the right position, so no swap. Extract the maximum element and rebuild the heap.
0 {0, 2, 1 | 2, 4, 5, 6} 0 is smaller than 2, so 0 is swapped with 2.
0 {2, 0, 1 | 2, 4, 5, 6} 0 does not have any children, so no swap. Extract the maximum element and rebuild the heap.
0 {1, 0 | 2, 2, 4, 5, 6} 1 is in the right position, no swap. Extract the maximum element and rebuild the heap.
0 {0 | 1, 2, 2, 4, 5, 6} Extract the maximum element and rebuild the heap.
0 {0, 1, 2, 2, 4, 5, 6} The array is sorted.
3.7.4 Complexity:
Space:
The algorithm does not require extra arrays, so space complexity is always Θ(1) for all cases.
Time:
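● As a standard sketch: building the max-heap costs O(n), and each of the n − 1 extractions costs at most the height of the heap:
$C(n) \in O(n) + (n-1) \cdot O(\log n) = O(n \log n)$ for all cases.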
3.8 Quick Sort:
3.8.1 Idea:
● The quick sort algorithm is a divide-and-conquer sorting algorithm.
● Select a pivot element, which is typically the middle element or the first/last element of the
current partition.
● Partition the array into two parts: one part with elements less than the pivot, and the other part
with elements greater than or equal to the pivot.
● Recursively apply the quick sort algorithm to the two partitions.
● For small partitions (less than 10 elements), use insertion sort instead of recursive quick sort to
improve efficiency.
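A minimal C++ sketch of this hybrid scheme follows; it uses the middle element as the pivot with a Hoare-style partition (the illustration below actually uses the median of the first, middle, and last elements, but the structure is the same), and all names are illustrative:

#include <utility> // std::swap

void insertionSortRange(int arr[], int lo, int hi) { // used for small partitions
    for (int i = lo + 1; i <= hi; ++i) {
        int key = arr[i], j = i - 1;
        while (j >= lo && arr[j] > key) { arr[j + 1] = arr[j]; --j; }
        arr[j + 1] = key;
    }
}

void quickSort(int arr[], int lo, int hi) {
    if (hi - lo + 1 < 10) { // small partition: insertion sort is cheaper
        insertionSortRange(arr, lo, hi);
        return;
    }
    int pivot = arr[lo + (hi - lo) / 2]; // middle element as the pivot
    int i = lo, j = hi;
    while (i <= j) { // Hoare-style partition around the pivot value
        while (arr[i] < pivot) ++i;
        while (arr[j] > pivot) --j;
        if (i <= j) std::swap(arr[i++], arr[j--]);
    }
    quickSort(arr, lo, j); // elements <= pivot
    quickSort(arr, i, hi); // elements >= pivot
}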
3.8.3 Illustration:
Step Array Annotations
0 {5, 2, 4, 0, 9, 3, 7, 1, 8, 5, 10, 8, 1, 6, 1, 2} Mid = 7 (value 1), where mid = (first(0) + last(15)) / 2. Since 1 is smaller than 2 and 2 is smaller than 5, the median value 2 is chosen as the pivot.
1 {1, 1, 1, 0}, 2, {3, 7, 4, 8, 5, 10, 8, 2, 6, 5, 9} The array is partitioned.
2 {0, 1, 1, 1}, 2, {3, 7, 4, 8, 5, 10, 8, 2, 6, 5, 9} The left sub-array is sorted by insertion sort. Mid = 10 (value 10), with first = 5 (value 3) and last = 15 (value 9); the median value 9 becomes the pivot.
3 {0, 1, 1, 1}, 2, {3, 7, 4, 8, 5, 10, 8, 2, 6, 5, 9} 10 is greater than the pivot (9) and the later 5 is smaller, so they will be swapped.
4 {0, 1, 1, 1}, 2, {3, 7, 4, 8, 5, 5, 8, 2, 6, 10, 9} The partition around the pivot (9) is completed.
5 {0, 1, 1, 1}, 2, {3, 7, 4, 8, 5, 5, 8, 2, 6}, 9, {10} The left sub-array (fewer than 10 elements) is sorted by insertion sort.
6 {0, 1, 1, 1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10} The array is sorted.
3.8.4 Complexity:
Space:
The algorithm does not require extra arrays; apart from the recursion stack, which is O(log n) on average, the space complexity is O(1).
Time:
Worst-case:
$C_{worst}(n) = (n-1) + (n-2) + (n-3) + \dots + 1 = \frac{n(n-1)}{2} \in O(n^2)$
3.9 Counting Sort:
3.9.1 Idea:
● The counting sort algorithm is a non-comparative sorting algorithm.
● Find the maximum element in the input array.
● Create a frequency array to store the count of each unique element in the input array.
● Modify the frequency array such that each element at each index stores the sum of the previous
counts.
● Build the output array by iterating the input array in reverse order and using the frequency array
to decide the position of each element.
● Lastly, copy the sorted elements from the output array to the original input array.
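The steps above can be sketched in C++ as follows (the sketch assumes non-negative values, as the report only searches for the maximum; names are illustrative):

#include <vector>
#include <algorithm> // std::max_element, std::copy

void countingSort(int arr[], int n) { // assumes non-negative values
    if (n <= 0) return;
    int maxVal = *std::max_element(arr, arr + n);
    std::vector<int> count(maxVal + 1, 0); // frequency array
    for (int i = 0; i < n; ++i) ++count[arr[i]];
    for (int v = 1; v <= maxVal; ++v) count[v] += count[v - 1]; // prefix sums
    std::vector<int> out(n);
    for (int i = n - 1; i >= 0; --i) // reverse pass keeps the sort stable
        out[--count[arr[i]]] = arr[i];
    std::copy(out.begin(), out.end(), arr); // copy back to the input array
}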
3.9.4 Complexity:
Space:
Counting sort requires an extra array to store the frequency of each value between the smallest and largest elements, so the space complexity is O(k), where k is the largest element.
Time:
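● As a standard sketch of the passes described above: one counting pass over the n elements, one prefix-sum pass over the k counters, and one placement pass over the n elements:
$C(n) = \Theta(n) + \Theta(k) + \Theta(n) = \Theta(n + k)$ for all cases.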
3.10 Radix Sort:
3.10.1 Idea:
● The radix sort algorithm is a non-comparative sorting algorithm.
● Find the maximum element in the input array to know the number of digits.
● Do counting sort for every digit. Instead of moving digits, exp is changed. exp is 1 (one's place),
10 (ten's place), 100 (hundred's place), and so on.
● Repeat the previous step until all digits of the maximum element have been processed (a sketch follows below).
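A minimal C++ sketch of the digit-by-digit passes (assumes non-negative values; names are illustrative):

#include <vector>
#include <algorithm> // std::max_element, std::copy

// One stable counting-sort pass on the digit selected by exp (1, 10, 100, ...).
void countingSortByDigit(int arr[], int n, int exp) {
    std::vector<int> out(n);
    int count[10] = {0};
    for (int i = 0; i < n; ++i) ++count[(arr[i] / exp) % 10];
    for (int d = 1; d < 10; ++d) count[d] += count[d - 1]; // prefix sums
    for (int i = n - 1; i >= 0; --i) // reverse pass keeps the pass stable
        out[--count[(arr[i] / exp) % 10]] = arr[i];
    std::copy(out.begin(), out.end(), arr);
}

void radixSort(int arr[], int n) { // assumes non-negative values
    if (n <= 0) return;
    int maxVal = *std::max_element(arr, arr + n);
    for (int exp = 1; maxVal / exp > 0; exp *= 10) // one pass per digit
        countingSortByDigit(arr, n, exp);
}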
3.10.3 Illustration:
Step Array Annotations
0 {18, 29, 59, 30, 99, 17} The largest element has 2 digits.
1 {30, 17, 18, 29, 59, 99} The array is sorted based on the last digit. The order of elements with the same last digit is kept.
2 {17, 18, 29, 30, 59, 99} The array is sorted based on the second-to-last digit. The order of elements with the same second-to-last digit is kept.
3 {17, 18, 29, 30, 59, 99} All digits of the largest element have been processed; the array is sorted.
3.10.4 Complexity:
Space:
Radix sort requires an extra array for the buckets used during sorting and for the sorted output, so the space complexity of Radix sort is O(n + k).
Time:
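● As a standard sketch: one stable counting-sort pass of cost Θ(n + b) is performed per digit, where b = 10 is the base and d is the number of digits of the maximum element:
$C(n) = \Theta(d \cdot (n + b))$, which is linear in n for a fixed number of digits.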
3.11 Flash Sort:
3.11.1 Idea:
● Find the maximum (max) and minimum (min) values in the array.
● Divide the array into m buckets, where m = 0.45 * n (n is the size of the array).
● Distribute the elements of the array into the buckets based on the formula: k = floor((m-1) * (a[i]
- min) / (max - min)), where i is the index of the element.
● Sort the elements within each bucket using a different sorting algorithm (here it's Insertion Sort).
● Merge the sorted buckets to create the final sorted array.
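A minimal C++ sketch of the classification and permutation steps, following Neubert's classic formulation with the m = 0.45·n bucket rule above (names are illustrative and the exact code in Algorithms.cpp may differ):

#include <vector>
#include <algorithm> // std::max
#include <utility>   // std::swap

void flashSort(int arr[], int n) {
    if (n <= 1) return;
    int minVal = arr[0], maxIdx = 0;
    for (int i = 1; i < n; ++i) { // find the minimum value and the maximum's index
        if (arr[i] < minVal) minVal = arr[i];
        if (arr[i] > arr[maxIdx]) maxIdx = i;
    }
    if (arr[maxIdx] == minVal) return; // all elements equal: already sorted
    int m = std::max(2, (int)(0.45 * n)); // number of buckets, m = 0.45 * n
    std::vector<int> L(m, 0);
    double c = (double)(m - 1) / (arr[maxIdx] - minVal);
    for (int i = 0; i < n; ++i) ++L[(int)(c * (arr[i] - minVal))]; // classify
    for (int k = 1; k < m; ++k) L[k] += L[k - 1]; // prefix sums: bucket upper bounds
    std::swap(arr[0], arr[maxIdx]); // a misplaced maximum starts the first cycle
    int moved = 0, j = 0, k = m - 1;
    while (moved < n - 1) { // cycle-leader permutation into the buckets
        while (j >= L[k]) k = (int)(c * (arr[++j] - minVal)); // find a misplaced element
        int flash = arr[j];
        while (j != L[k]) { // follow the cycle until position j is filled
            k = (int)(c * (flash - minVal));
            std::swap(flash, arr[--L[k]]);
            ++moved;
        }
    }
    for (int i = 1; i < n; ++i) { // finish with insertion sort within the buckets
        int key = arr[i], t = i - 1;
        while (t >= 0 && arr[t] > key) { arr[t + 1] = arr[t]; --t; }
        arr[t + 1] = key;
    }
}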
3.11.4 Complexity:
Space:
Flash sort requires an extra array to store the m bucket boundaries, where m = 0.45 × n, so the space complexity of flash sort is O(m).
Time:
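● As a standard sketch: classification and the cycle-leader permutation are linear in n and m, while the final insertion sort dominates only when the buckets are very uneven:
$C(n) \in O(n + m)$ on average, degrading to $O(n^2)$ in the worst case.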
4.1 Sorted data
Table 4.1a: Performance metrics of sorting algorithms for sorted data (part 1)
Sorted data
Data size | 10000 | 30000 | 50000 (each: Comparisons | Running time)
Selection sort | 100019998 | 144 | 900059998 | 1227 | 2500099998 | 3882
Insertion sort | 29998 | 0.046 | 89998 | 0.14 | 149998 | 0.22
Bubble sort | 20001 | 0.05 | 60001 | 0.11 | 100001 | 0.17
Shaker sort | 20001 | 0.03 | 60001 | 0.09 | 100001 | 0.17
Shell sort | 360042 | 0.62 | 1170050 | 1.57 | 2100049 | 2.83
Heap sort | 670329 | 1.73 | 2236648 | 5.65 | 3925351 | 9.58
Merge sort | 465243 | 0.78 | 1529915 | 2.76 | 2672827 | 4.56
Quick sort | 203112 | 0.31 | 732326 | 1.09 | 1324613 | 2.04
Counting sort | 60002 | 0.15 | 180002 | 0.63 | 300002 | 1.09
Radix sort | 140056 | 0.63 | 510070 | 2.04 | 850070 | 3.3
Flash sort | 114302 | 0.31 | 342902 | 0.78 | 571502 | 1.57
Table 4.1b: Performance metrics of sorting algorithms for sorted data (part 2)
Sorted data
Data size | 100000 | 300000 | 500000 (each: Comparisons | Running time)
Selection sort | 10000199998 | 14957 | 90000599998 | 134853 | 2.50001E+11 | 352155
Insertion sort | 299998 | 0.47 | 899998 | 1.57 | 1499998 | 2.55
Bubble sort | 200001 | 0.34 | 600001 | 1.17 | 1000001 | 1.98
Shaker sort | 200001 | 0.35 | 600001 | 1.38 | 1000001 | 2.12
Shell sort | 4500051 | 5.97 | 15300061 | 21.06 | 25500058 | 35.69
Heap sort | 8365080 | 19.97 | 27413230 | 67.96 | 47404886 | 116.99
Merge sort | 5645659 | 9.27 | 18345947 | 30.81 | 31517851 | 53.13
Quick sort | 2849191 | 4.56 | 9098354 | 13.21 | 16196654 | 24.12
Counting sort | 600002 | 2.04 | 1800002 | 6.23 | 3000002 | 10.24
Radix sort | 1700070 | 6.59 | 6000084 | 24.9 | 10000084 | 41.61
Flash sort | 1143002 | 2.98 | 3429002 | 9.69 | 5715002 | 16.17
Based on the statistical results presented above, we can construct a bar chart to compare the
number of comparisons performed by each algorithm, as illustrated below.
Analysis:
- The Selection sort algorithm stands out with a significantly higher number of comparisons, attributed to its stable O(n²) time complexity, which performs the same number of comparisons in all cases.
- In contrast, other O(n²) algorithms such as bubble sort and shaker sort have early termination mechanisms that reduce unnecessary comparisons and swaps. They can quickly detect when no elements need reordering, terminating early and achieving optimal performance on sorted data.
- Insertion sort and shaker sort are also O(n²) algorithms that are efficient in this scenario, because their best case of O(n) occurs precisely when the data is already sorted.
To enhance clarity in presenting other algorithms, selection sort is excluded from the analysis in
the following figures.
Figure 4.1b: Algorithm comparison counts in sorted data chart (without selection sort)
Analysis:
- All four O(n log n) algorithms have significantly higher comparison counts than the rest here, since they lack any early-termination mechanism that could exploit this specific scenario.
- Although heap sort maintains O(n log n) complexity in all cases, it still has the highest comparison count in this scenario, due to its rigorous requirement to restructure the entire array to maintain the heap property, regardless of whether the data is already ordered.
The following figures are line graphs illustrating the running times of each sorting algorithm when
processing sorted input data.
Analysis
- The elevated running time observed for Selection sort mirrors its high comparison count, both attributed to the algorithm's stable O(n²) time complexity, as mentioned above.
To enhance clarity in presenting other algorithms, selection sort is excluded from the analysis in
the following figures.
Figure 4.2b: Algorithm running times in sorted data graph (without selection sort)
Analysis:
- After removing selection sort, heap sort continues to exhibit the highest values, this time in running time, for the same reason given in the comparison-counts section above.
- Bubble sort and insertion sort are extremely fast in this scenario due to their adaptive nature when handling data that is already partially or fully sorted, as mentioned above.
- Shaker sort achieves its fastest performance on sorted input for all data sizes, in contrast to heap sort, thanks to its bidirectional sorting strategy; it is, after all, an improved version of bubble sort. Unlike bubble sort, which terminates when no swaps are made from left to right, shaker sort alternates passes from left to right and right to left, stopping when no swaps are made in either direction, which makes it even more efficient than bubble sort on sorted input.
4.2 Nearly sorted data
Table 4.2a: Performance metrics of sorting algorithms for nearly sorted data (part 1)
Table 4.2b: Performance metrics of sorting algorithms for nearly sorted data (part 2)
Analysis:
- It’s no surprise here that selection sort still maintains its reputation for the least efficiency.
However, it’s noteworthy that bubble sort, particularly in handling nearly sorted input, also
shares this trait.
- This extremely high number of comparisons can be explained by the fact that the Bubble Sort algorithm compares two elements at a time throughout the entire O(n²) loop, irrespective of the elements' order. Although it possesses an early termination mechanism, it is not a very effective one, as it requires the array to be fully sorted before termination can occur.
- Shaker Sort, on the other hand, employs a more effective termination mechanism due to its
bidirectional sorting strategy. This approach allows it to detect sorted segments in nearly sorted
data more easily, resulting in significantly lower comparison counts.
- However, this termination mechanism is not as effective on nearly sorted data as on sorted data, which is why shaker sort still exhibits significantly higher comparison counts than the algorithms with O(n log n) and O(n) complexities.
- The only O(n²) algorithm that stands out in this context is insertion sort: its curve is barely visible when the other O(n²) algorithms are included in this comparative chart, indicating a notable efficiency improvement from insertion sort.
To enhance clarity in presenting the other algorithms, selection sort, bubble sort, and shaker sort are excluded from the analysis in the following figures.
Figure 4.3b: Algorithm comparison counts in nearly sorted data chart (without selection, bubble and shaker sort)
Analysis:
- Insertion sort not only demonstrates superior efficiency in terms of comparison counts among O(n²) algorithms but also outperforms all other algorithms in this scenario. This can be attributed
to its efficient insertion method, which minimizes comparisons and shifts needed to insert each
element into the sorted portion since the data is nearly sorted. In such scenarios, insertion sort
can even be performed with O(n) time complexity.
- Heap sort's status as the algorithm with the highest comparison counts can be explained
similarly to its behavior in sorted input data scenarios.
The following figures are line graphs illustrating the running times of each sorting algorithm
when processing nearly sorted input data.
Selection sort, bubble sort, and shaker sort are excluded from the analysis to obtain the following figure.
Figure 4.4b: Algorithm running times in nearly sorted data graph (without selection, bubble and
shaker sort)
Analysis:
- The efficiency of sorting algorithms in terms of running time closely reflects their comparison
counts, with selection sort, bubble sort, and shaker sort being prominently visible on graphs that
include them. Excluding these algorithms highlights heap sort's persistently inefficient
characteristics.
- This suggests that the performance of insertion sort and heap sort in comparison to others
remains consistent as the fastest and slowest algorithms, whether in nearly sorted or fully sorted
scenarios.
4.3 Reverse sorted data
Table 4.3a: Performance metrics of sorting algorithms for reverse sorted data (part 1)
Table 4.3b: Performance metrics of sorting algorithms for reverse sorted data (part 2)
All O(n²) algorithms are excluded from the analysis to obtain the following figure.
Figure 4.5b: Algorithm comparison counts in reverse sorted data chart (without O(n²) algorithms)
Analysis:
- The advantage of insertion sort, bubble sort, and shaker sort seen in earlier scenarios (adaptivity and early termination mechanisms) diminishes for reverse sorted input, because their efficiency relies heavily on already-sorted portions of the data.
- Heap sort still maintains the highest comparison counts among all O(n log n) and O(n) algorithms.
- Since insertion sort is no longer exceptionally efficient in this reverse sorted input scenario, counting sort emerges as the algorithm with the best efficiency in terms of comparison counts among all algorithms (its count is always low, but it was not the lowest in the sorted and nearly sorted scenarios). This is attributed to its ability to sort without relying on comparisons to carry out the sorting process.
The following figures are line graphs illustrating the running times of each sorting algorithm
when processing reverse sorted input data.
All O(n²) algorithms are excluded from the analysis to obtain the following figure.
Figure 4.6b: Algorithm running times in reverse sorted data graph (without O(n²) algorithms)
Analysis:
- Both bubble sort and shaker sort rank as the slowest algorithms here, followed by selection sort and insertion sort, all of which are categorized as inefficient O(n²) algorithms.
- The efficiency of the other algorithms in terms of running time closely mirrors their comparison counts relative to each other, with counting sort the fastest among all algorithms and heap sort the slowest once all O(n²) algorithms are excluded.
4.4 Randomized
Table 4.4a: Performance metrics of sorting algorithms for randomized data (part 1)
Randomized data
Data size | 10000 | 30000 | 50000 (each: Comparisons | Running time)
Selection sort | 100019998 | 140 | 900059998 | 1259 | 2500099998 | 3525
Insertion sort | 50272632 | 62 | 451465111 | 600 | 1249102870 | 1721
Bubble sort | 99990072 | 347 | 900049800 | 3319 | 2500001405 | 9412
Shaker sort | 100015000 | 269 | 900045000 | 2441 | 2500075000 | 6918
Shell sort | 636227 | 1.7 | 2277704 | 5.9 | 4559070 | 11.3
Heap sort | 637987 | 1.9 | 2150336 | 6.4 | 3772295 | 11.18
Merge sort | 573720 | 1.4 | 1907183 | 4.6 | 3332661 | 8.09
Quick sort | 263968 | 0.8 | 872588 | 2.8 | 1536347 | 4.9
Counting sort | 60002 | 0.15 | 180001 | 0.635 | 282770 | 0.94
Radix sort | 140056 | 0.55 | 510070 | 2.056 | 850070 | 3.14
Flash sort | 96583 | 0.3 | 305972 | 1.1 | 489823 | 1.9
Table 4.4b: Performance metrics of sorting algorithms for randomized data (part 2)
Randomized data
Data size | 100000 | 300000 | 500000 (each: Comparisons | Running time)
Selection sort | 10000199998 | 13805 | 90000599998 | 123306 | 2.50001E+11 | 346689
Insertion sort | 4999348644 | 6890 | 44973708863 | 63757 | 1.24762E+11 | 160667
Bubble sort | 9999984705 | 37314 | 89998116225 | 358214 | 2.49998E+11 | 967177
Shaker sort | 10000150000 | 28608 | 90000450000 | 254611 | 2.50001E+11 | 738371
Shell sort | 10431455 | 24.6 | 34781771 | 125 | 64630026 | 219
Heap sort | 8046043 | 23.9 | 26490065 | 135 | 45971506 | 192
Merge sort | 7065711 | 17.1 | 23082794 | 94 | 39882624 | 130
Quick sort | 3419803 | 10.38 | 11261721 | 31 | 19732260 | 77
Counting sort | 532769 | 1.7 | 1532770 | 4.7 | 2532770 | 15
Radix sort | 1700070 | 6.37 | 5100070 | 20.6 | 8500070 | 41
Flash sort | 939592 | 3.7 | 2920436 | 12.6 | 5062874 | 49
Based on the statistical results presented above, we can construct a bar chart to compare the
number of comparisons performed by each algorithm, as illustrated below.
All O(n²) algorithms are excluded from the analysis to obtain the following figure.
Figure 4.7b: Algorithm comparison counts in randomized data chart (without O(n²) algorithms)
Analysis:
- Just like in the reverse sorted input scenario, O(n²) algorithms exhibit high comparison counts, due to the stable O(n²) time complexity of selection sort and the diminished effectiveness of the early termination mechanisms of the remaining O(n²) algorithms.
- Among the O(n²) algorithms, insertion sort demonstrates notably lower comparison counts than
others due to its highly adaptive nature, particularly effective in sorted and nearly sorted inputs.
In randomized scenarios, especially with sufficiently large inputs, there’s a chance that some
segments of the data may inadvertently resemble sorted sequences, allowing insertion sort to
perform more efficiently in these cases compared to its counterparts.
- The most efficient algorithm in terms of comparison counts among all algorithms is still
counting sort in this scenario, due to its ability to sort without relying on comparisons, leveraging
direct counting and indexing techniques instead.
- A notable feature here is that shell sort has the highest comparison count once all O(n²) algorithms are excluded. This contrasts with the other scenarios, where heap sort typically records the highest counts. Heap sort's strength lies in maintaining a balanced heap structure, ensuring O(n log n) time complexity in all cases, while shell sort can benefit from partially sorted data segments (albeit less so than algorithms like insertion sort), which usually lets it achieve greater efficiency than heap sort. In this randomized input scenario, that benefit is largely lost, which results in what is shown in this chart.
The following figures are line graphs illustrating the running times of each sorting algorithm when processing randomized input data.
All O(n²) algorithms are excluded from the analysis to obtain the following figure.
Figure 4.8b: Algorithm running times in randomized data graph (without O(n²) algorithms)
Analysis:
- In this randomized input scenario, the O(n²) algorithms show visibly distinct runtimes from one another, contrasting with other scenarios where their performance line graphs often overlap.
- Bubble sort is notably the slowest here, indicating that selection sort is no longer the least efficient algorithm in this context.
- Among the O(n²) algorithms, insertion sort emerges as the fastest, for the same reason that it records the lowest comparison count among them.
- The slowest algorithm within an input size of up to 300,000 is heap sort, while beyond this
threshold, shell sort becomes the least efficient.
- Counting sort maintains its superior efficiency, as observed in both reverse sorted and
randomized input scenarios.
- Counting sort is the fastest sorting algorithm, followed by flash sort, in all input order scenarios. Counting sort maintains its overall speed advantage over flash sort due to its stable linear time complexity of O(n + k) in all cases, while flash sort relies on a more variable time complexity that can degrade to O(n²) in worst-case scenarios with uneven element distributions or input characteristics that do not match its assumptions.
- While bubble sort may perform quickly on fully sorted input, it generally ranks as the slowest
algorithm and even slower than selection sort overall. Its limited ability to handle partially
ordered data segments means its early termination mechanism offers minimal efficiency gains.
Even in nearly sorted input data scenarios, bubble sort remains among the slowest algorithms.
- In contrast to bubble sort, both shaker sort and insertion sort can capitalize significantly on partially ordered input data, even when the ordered portion is smaller than in nearly sorted input. Insertion sort becomes exceptionally efficient in such scenarios, outperforming even counting sort in terms of runtime efficiency.
5. Project organization & programming notes
Project organization
All the source code files are saved in the SOURCE directory:
+ Algorithms.cpp and Algorithms.h: contain the source code of the 11 sorting algorithms, one .cpp file and one .h file, providing the algorithm implementations for testing and evaluation.
+ DataGenerator.cpp and DataGenerator.h: provide functions to generate array data with different characteristics: randomly ordered, ascending, descending, and nearly sorted arrays.
+ Helper.cpp and Helper.h: contain support functions used throughout the project, including running a sorting algorithm by its algorithm id, converting the algorithm name and input order between data types, printing results, and validating input and files.
+ Experiment.cpp and Experiment.h: contain the five experiment commands and the functions that write the experimental results to a file, including data order, data size, sorting algorithm name, running time, and number of comparisons.
+ In addition, there is a main.cpp file that uses command-line parameters to execute the program.
Programming notes
Some libraries used in this project are:
+ fstream library: provides classes and functions for file handling operations, such as opening, reading, and writing files.
+ string, cstring libraries: used to perform string-related operations.
+ iomanip library: provides functions and manipulators for formatting output in a desired manner.
+ chrono library: provides high-resolution clocks used to measure the running times.
+ cmath library: provides a wide range of mathematical functions and operations.
+ time.h library: provides functions and data types for working with time and dates.
Usage guide:
Open a command prompt and use cd [path to the source folder] to move into the source directory, then compile the source files into an executable with: g++ .\*.cpp -o [exe file]
In this command:
• g++: invokes the g++ C++ compiler.
• .\*.cpp: a file pattern matching all C++ source files in the current directory. The * character is a wildcard that matches any file name, and .cpp restricts the match to files with the .cpp extension.
• -o [exe file]: specifies the name of the output executable file. Replace [exe file] with the desired name for your executable.
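For example, a possible invocation (the executable name sort.exe is just a placeholder, not the project's required name):
g++ .\*.cpp -o sort.exe
.\sort.exe [command line parameters]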
6. List of references:
1. Selection Sort
GeeksforGeeks. (2022, September 1). Selection Sort Algorithm. Retrieved November 5, 2022, from https://www.geeksforgeeks.org/selection-sort/
2. Insertion Sort
GeeksforGeeks. (2022, October 18). Insertion Sort. Retrieved November 1, 2022, from https://www.geeksforgeeks.org/insertion-sort/
GeeksforGeeks. (2022a, June 20). Binary Insertion Sort. Retrieved November 2,
2022, from https://www.geeksforgeeks.org/binary-insertion-sort/
3. Bubble Sort:
GeeksforGeeks. (2022j, November 1). Bubble Sort Algorithm. Retrieved November
5, 2022, from https://www.geeksforgeeks.org/bubble-sort/
4. Shaker Sort:
Wikipedia contributors. (2022b, October 31). Cocktail shaker sort. Retrieved
November 1, 2022, from https://en.wikipedia.org/wiki/Cocktail_shaker_sort
5. Shell Sort:
GeeksforGeeks. (2022i, October 24). ShellSort. Retrieved October 28, 2022, from
https://www.geeksforgeeks.org/shellsort/
RobEdwards. (2016, December 9). Sorts 5 Shell Sort. Retrieved October 28, 2022,
from https://www.youtube.com/watch?v=ddeLSDsYVp8
6. Heap Sort:
GeeksforGeeks. (2022d, September 22). Heap Sort. Retrieved October 23, 2022, from
https://www.geeksforgeeks.org/heap-sort/
7. Merge Sort:
GeeksforGeeks. (2022e, September 23). Merge Sort Algorithm. Retrieved November
5, 2022, from https://www.geeksforgeeks.org/merge-sort/
8. Quick Sort:
GeeksforGeeks. (2022h, September 27). QuickSort. Retrieved October 23, 2022,
from https://www.geeksforgeeks.org/quick-sort/
GeeksforGeeks. (2022f, September 26). Hoare’s vs Lomuto partition scheme in
QuickSort. Retrieved October 23, 2022, from https://www.geeksforgeeks.org/hoares-
vs-lomuto-partition-scheme-quicksort/
9. Counting Sort:
GeeksforGeeks. (2022g, September 27). Counting Sort. Retrieved November 2, 2022,
from https://www.geeksforgeeks.org/counting-sort/
10. Radix Sort:
GeeksforGeeks. (2022c, August 9). Radix Sort. Retrieved November 5, 2022, from
https://www.geeksforgeeks.org/radix-sort/
11. Flash Sort:
Wikipedia contributors. (2022, June 29). Flashsort. Retrieved October 29, 2022, from https://en.wikipedia.org/wiki/Flashsort
Hawkular. (2015, March 24). Flash sort- a sorting algorithm. Retrieved October 29,
2022, from https://www.youtube.com/watch?v=kdhPnSIT7dY