CO322: DS & A
(Simple yet) efficient algorithms
Dhammika Elkaduwe
Department of Computer Engineering
Faculty of Engineering
University of Peradeniya
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
What is to come
We will look at;
I Efficient algorithms (for sorting/searching)
I Start with simple algorithms (not so efficient ones)
I Complexity analysis of recursive algorithms
I Algorithms vs. their implementation
I Features of programming languages
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Searching algorithm
Look whether the given key is there.
Assume that there is no order/configuration within the given
records.
static boolean linear_search(int [] records, int key) {
int i;
for(i=0; i < records.length; i++)
if(records[i] == key) break;
return i != records.length;
}
What is the complexity of this algorithm in best, worst and average
cases. (Note: Fixed time, you call it O(1))
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Can we improve linear search?
static boolean linear_search(int [] records, int key) {
int i;
for(i=0; i < records.length; i++)
if(records[i] == key) break;
return i != records.length;
}
I One comparison takes out one element
I We need N comparisons (so O(N) algorithms)
I Can we better this?
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Binary search
I Records should be sorted (precondition for the algorithm)
I Compare with the middle element, if found done!
I Select the left or the right part of records based on the
comparison (remember the array is sorted)
I One comparison will half the problem!!
I Question: I have 20million records. How many comparisons
in the worst case?
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Binary search
I Records should be sorted (precondition for the algorithm)
I Compare with the middle element, if found done!
I Select the left or the right part of records based on the
comparison (remember the array is sorted)
I One comparison will half the problem!!
I Question: I have 20million records. How many comparisons
in the worst case?
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
The binary search: a recursive implementation
static boolean binary_search_impl(int start, int end,
int [] records, int key) {
int mid = (int)((start + end) / 2);
if(start > end) return false;
if(records[mid] == key) return true;
if(records[mid] > key)
return binary_search_impl(start, mid-1, records,
key);
return binary_search_impl(mid+1, end, records, key);
}
/* wrapper for actual implementation */
static boolean binary_search(int [] records, int key) {
return binary_search_impl(0, records.length, records,
key);
}
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
The binary search: Analysis
static boolean binary_search_impl(int start, int end,
int [] records, int key) {
int mid = (int)((start + end) / 2);
if(start > end) return false;
if(records[mid] == key) return true;
if(records[mid] > key)
return binary_search_impl(start, mid-1, records,
key);
return binary_search_impl(mid+1, end, records, key);
}
What is the time complexity of the above algorithm?
What is the space complexity of the above algorithm? (space that
it uses)
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
The binary search: iterative implementation
static boolean binary_search_ite(int [] rec, int key) {
int high = rec.length;
int start = 0;
while(start <= high) {
int mid = (int)((high + start)/2);
if(rec[mid] == key) return true;
if(rec[mid] > key) high = mid-1;
else start = mid+1;
}
return false;
}
What is the time complexity of this?
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Complexities
Note that graph only goes to
1000!
I O(N 2 ) does not scale at all,
not useful for any real
problem
I O(N) algorithms are
somewhat OK
I O(log (n)) is the best, of
cause :)
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
‘Power of‘ function
Another example:
static int power_of(int a, int k) {
/* function should return a^k
* Do I have preconditions?
*/
int i = 1;
for(; k < 1; k--)
i = i * a;
return i;
}
What is the complexity of this algorithm?
Can we better this? (recall: problem is halved)
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
“power of”: better version
static int power_of(int a, int k) {
/* function should return a^k
* Do I have preconditions?
*/
int i = 1;
for(; k >= 1; k--)
i *= a;
return i;
}
Can we improve:
I n2k = (nk )2 and n2k+1 = n2k ∗ n
I Can you use this observation to make the above algorithm
better? (code using recursion/iteration)
I What is the time complexity of the new version
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Merge Sort: divide and conquer
in a picture
Basic Idea: It is easy to merge
two sorted arrays!
I Break the array into two
parts (in the middle)
I (merge) sort the two halves
I Merge them together
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Merge Sort: divide and conquer
pseudocode
Basic Idea: It is easy to merge mergeSort(A, p, r)
two sorted arrays! if p >= r, do nothing
if p < r then
I Break the array into two
q = |(p + r) / 2|
parts (in the middle) mergeSort(A, p, q);
I (merge) sort the two halves mergeSort(A, q+1, r);
I Merge them together combine(A, p, q, r)
Exercises:
I Implement the merge sort, using Java
I What is the time complexity of merge sort?
I What is the space complexity of merge sort? needs additional
space)
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Merge Sort: divide and conquer
pseudocode
Basic Idea: It is easy to merge mergeSort(A, p, r)
two sorted arrays! if p >= r, do nothing
if p < r then
I Break the array into two
q = |(p + r) / 2|
parts (in the middle) mergeSort(A, p, q);
I (merge) sort the two halves mergeSort(A, q+1, r);
I Merge them together combine(A, p, q, r)
Exercises:
I Implement the merge sort, using Java
I What is the time complexity of merge sort?
I What is the space complexity of merge sort? needs additional
space)
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Merge Sort: Not so good implementation
Merge function: needs new arrays each time!
static int [] combine(int [] a, int [] b) {
int total = a.length + b.length;
int [] merged = new int [total];
int i=0, j=0, k;
for(k=0; k < total && i < a.length && j < b.length;
k++) {
if(a[i] < b[j]) merged[k] = a[i++];
else merged[k] = b[j++];
} /* for(k=0; .. */
for(; k < total && i < a.length; k++, i++)
merged[k] = a[i];
for(; k < total && j < b.length; k++, j++)
merged[k] = b[j];
return merged;
}
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Merge Sort: Not so good implementation
static int [] split(int [] a, int from, int to) {
int [] newarray = new int [to - from];
for(int i=0; i < newarray.length; i++) {
newarray[i] = a[from + i];
}
return newarray;
}
static int [] merge_sort(int [] a) {
if(a.length == 1) return a;
int mid = (int)((a.length) / 2);
int [] left = split(a, 0, mid); // excluding mid
int [] right = split(a, mid, a.length);
left = merge_sort(left);
right = merge_sort(right);
return combine(left, right);
}
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Merge Sort: Analysis
What is the time complexity of the above implementation of the
merge sort?
I Time complexity = O(N.log (N))
I Space complexity = O(N.log (N)) (note that there are better
ways of doing this. In the above implementation we allocate
new arrays (while splitting) in each iteration.
I We only need O(N) additional space, if we do a good
implementation!!
Take home point: time and space efficiency
depends on the implementation. Each algorithm
has a best that it can do; but you may not get
their unless implemented correctly.
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Merge Sort: Better implementation
At most we need additional memory to hold N elements; use the
same space for all temporary merges!
static void mergeSort(int [] data) {
int [] tmp = new int [data.length];
mergeSort(data, tmp, 0, data.length-1);
}
static void mergeSort(int [] data, int [] tmp, int start,
int end){
if(start < end) {
int mid = (start + end) / 2;
mergeSort(data, tmp, start, mid);
mergeSort(data, tmp, mid+1, end);
/* merge the two sorted arrays */
merge(data, tmp, start, mid, end);
}
}
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Merge Sort: Better implementation
static void merge(int [] data, int [] tmp, int start, int
mid, int end) {
int total = end - start + 1;
int i = start, j = mid+1, k;
/* use tmp array for movements; starting from 0 */
for(k=0; (k < total) && (i <= mid) && (j <= end); k++) {
if(data[i] < data[j]) tmp[k] = data[i++];
else tmp[k] = data[j++];
}
for(;k < total && i <= mid; k++, i++) tmp[k] = data[i];
for(;k < total && j <= end; k++, j++) tmp[k] = data[j];
/* copy the merged results back to the data array */
for(k=0; k < total; k++)
data[start + k] = tmp[k];
}
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Merge Sort: Analysis
I The merge sort algorithm has, O(N.log (N)) time complexity
and needs O(N) additional memory. (need to implement
properly, of cause)
I Next algorithm: can we get rid of the additional memory?
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Quick sort
Basic algorithm:
I Pick some value from the array (call it pivot)
I pivot can be the first element in the array, last element, or
some random element.
I You can use other ways to pick the pivot. Given above are
some examples.
I Rearrange elements such that all elements less than to pivot
are in front of the array followed by the pivot and elements
which are greater than or equal to the pivot (called
partitioning)
I Partitioning is typically done by selecting an item smaller than
pivot from the right side of the array and swapping it with an
item larger than pivot which is in the left hand side of the
array.
I After partitioning, the pivot is in the correct position.
I Quick sort the left and the right portions from the pivot
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Quick sort: Partitioning example
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Quick sort: Implementation in Java
static void quickSort(int [] data) {
quickSort(data, 0, data.length-1);
/* including the last element, hence length-1 */
}
static void quickSort(int [] data, int start, int end) {
if(start < end) {
int pivotPosition = partition(data, start, end);
/* including the last element, hence length-1
* Pivot is in the correct position!
*/
quickSort(data, start, pivotPosition-1);
quickSort(data, pivotPosition+1, end);
}
}
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Quick sort: Analysis questions
Assume that the first element of the array is selected as the pivot.
Ideally we need a pivot that would halve the problem.
I What is the best case? What is the corresponding time
complexity?
I What is the worst case? What is the corresponding time
complexity?
I What is the average case? What is the corresponding time
complexity?
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Quick sort: Analysis
Case When does it occur Run time complexity
best case pivot is the median O(N.log (N))
worst case pivot is the smallest O(N 2 )
average case random data set O(N.log (N))
Note: If you are selecting the first element as pivot, a sorted array
gives the worst run time!
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Quick sort: Selecting the pivot
Run time varies from O(N.log (N)) to O(N 2 ) depending on the
pivot.
I Ideally we need a pivot that would halve the problem. i.e. the
median. But we cannot say what the median is without
sorting the array!
I Methods for selecting a pivot:
I First, last or middle element of the array
I Random element
I Pick 3 elements randomly and use median of those 3 as pivot.
I Make sure pivot selection is O(1)!
I No selection guarantees best case.
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Quick sort: What is the average case?
So what happens in the average case? Everything depends on how
the array is partitioned!
I We have N data items and 1 median value (assuming there
are no duplicates)
I giving us N1 chance of picking the median (and therefore
getting the best case). Let us call it ideal pivot
I for a large N, this chance is very small.
continues in the next slide ..
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Quick sort: What is the average case?...
We do not need the ideal pivot
I We need a good pivot which lies between 1 and 3 of the array
4 4
(see figure)
I We have 1 chance of getting a good pivot
2
I If we get a good pivot our worst case run-time is O(log 3 N)
4
(longest path)
I If we do not get the good pivot (which happens 1 the times)
2
then partitioning will not break the array at all.
I So average number of partitions will be
O(2.log 3 N) = O(log (N))
4
I So average run-time is O(N.log (N))
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Bucket sort: Sorting by partitioning
Basic idea (assume we need to sort student names):
I Group the names based on the first letter (giving us 26 groups
or buckets)
I Buckets will be kept sorted
1
I Now we have a problem which is 26 of the original
I Now sort each of the buckets with regards to the 2nd letter
I Repeat for all α letters
Exercises:
1. Find the following for N names with α letters:
1.1 the best case time complexity
1.2 the best case space complexity
2. what is suitable Java Collection for implementing this?
3. If someone asks you to pictorially represent this sorting
algorithm what will your answer looks like?
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Bucket sort: Sorting by partitioning
Basic idea (assume we need to sort student names):
I Group the names based on the first letter (giving us 26 groups
or buckets)
I Buckets will be kept sorted
1
I Now we have a problem which is 26 of the original
I Now sort each of the buckets with regards to the 2nd letter
I Repeat for all α letters
Exercises:
1. Find the following for N names with α letters:
1.1 the best case time complexity
1.2 the best case space complexity
2. what is suitable Java Collection for implementing this?
3. If someone asks you to pictorially represent this sorting
algorithm what will your answer looks like?
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Wrap-up:Practical issues
Some issues that you need to consider:
I Keys vs. records
I What to do with equal keys?
I Arrangement of data (should we pick a random permutation
of given data so our randomised algorithms will work)
I Distribution of data (how many students have names starting
with “z”? will this effect bucket sort?)
I Data in memory or in disk?
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Wrap-up:What is expected from you
You should be able to:
I Implement a given algorithm using a suitable method
(recursion/iteration) and data structures.
I Modify the simple algorithms to suite a new situation
(tutorial).
I Analyse the time/space complexity of algorithms.
I Compare algorithms (based on their complexities as well as
other considerations).
I Reason about performance of algorithms for a set of random
inputs.
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms
Wrap-up:Additional reading
I Chapter 2 & 4 of The Algorithm Design Manual (pdf is in
Moodle)
Dhammika Elkaduwe CO322: DS & A (Simple yet) efficient algorithms