Lecture 4: The Linear Time Selection
Lecture 4: The Linear Time Selection
Lecture 4: The Linear Time Selection
Selection Problem Given a sequence of numbers , and an integer , , nd the th smallest element. When , it is called the median problem. Example: Given smallest element is 19.
! #" #$ % #"!"
, the & th
Step 1: Sort the elements in ascending order with any algorithm of complexity .
Question: Can we do better? Answer: YES, but we need to recall Partition( used in Quicksort!
Denition: Rearrange the array into two (pos sibly empty) subarrays and such that
and for any .
p x
q x x x = A[r]
(1) The original is used as the pivot. (2) It is a deterministic algorithm. (3) The element for the th position is found! Note that this partition is different from the partition we used in COMP 171.
3
)
r x
i x
j unrestricted
(1) Initially
(2) Increase by 1 each time to nd a place for At the same time increase when necessary.
i x i x i x
j >x
j <x
by 1.
(B)
.
5
): Example
r
2 2
p, i
8 8 8 8
i
7 7
j
1 3 1 1
j
5 5 5 5 5
j
6 6 6 6 6 6
j
4
r
p, i j
3 3 3
j
4
r
2
p, i
7 7 7
i
4
r
2
p
1 8 8 8 8 4
4
r
2
p
1 1 1 1 1
3 7 7 7 7
4 4
r
2
p
3
i
5 5 5 5
2
p
3
i
6 6 6
4
j, r
2
p
3
i
4
j, r
The Partition(
) Algorithm
Partition(A, p, r) { // A[r] is the pivot element x = A[r]; i = p-1; for (j = p to r-1) { if (A[j] <= x) { i = i+1; exchange A[i] and A[j] } } // put pivot in position exchange A[i+1] and A[r] // q = i+1 return i+1; }
7
): 1 1
for
if
to
3 1
Randomized-Partition(
)
The Idea: In the algorithm Partition( ), is al ways used as the pivot to partition the array . In the algorithm Randomized-Partition( domly choose an , , and use
), we ran as pivot.
Randomized-Partition(A, p, r) { j = random(p, r); exchange A[r] and A[j] Partition(A, p, r); } Remark: random( , ) is a pseudorandom-number generator that returns a random number between and .
9
Randomized-Select(
),
), getting
r
Case 1:
must be the
Case 3:
must be the
10
Randomized-Select( if
),
return Randomized-Partition
if
return
else if return Randomized-Select else return Randomized-Select
11
"
and for
that
&
for all
and .
12
Proof that
&
Induction basis: " & Induction that
step: Assume . Then for all and
&
13
Proof that
&
where
&%
&
& "
&!
&
"
for
. Hence
&!
&
&%
14
)
. Since
15
Randomized-Quicksort Algorithm We make use of the Randomized-Partition idea to develop a new version of quicksort. Randomized-Quicksort(A, p, r) { if (p < r) { q = Randomized-Partition(A, p, r); Randomized-Quicksort(A, p, q-1); Randomized-Quicksort(A, q+1, r); } } Does it run faster than the original version of quicksort?
16
17
The running time of (randomized) quicksort is dominated by the time spent in (randomized) partition. In the partition procedure, the time is dominated by the number of key comparisons.
When a pivot is selected, the pivot is compared with every other elements, then the elements are partitioned into two parts accordingly.
Elements in different partition are NEVER compared with each other in all operations.
Average running time of Randomized-Quicksort Let be the input array which is a permutation of the
. distinct elements be the total number of comparisons performed Let in ALL calls to randomized-partition. Let be the number of comparisons between and , observe that can only be 0 or 1. Our goal is to compute the expected value of , i.e.,
19
is compared to
, let
).
Key observations: If or is selected as a pivot BEFORE any el ements in , and will be compared.
Conversely, if any element in other then or is selected as a pivot before and , and will be placed in DIFFERENT partitions, and hence they will NOT compare with each other in ALL randomized-partition calls. ANY element other than the elements in has no effect to . is compared to
20
Average running time of Randomized-Quicksort It remains to nd the probability that or pivot chosen from .
is the rst
is compared to or is the rst pivot chosen from is the rst pivot chosen from is the rst pivot chosen from
21
Hence, the expected number of comparisons is which is the average running time of RandomizedQuicksort.
22