Lecture 4: The Linear Time Selection

Download as pdf or txt
Download as pdf or txt
You are on page 1of 22

Lecture 4: The Linear Time Selection

Selection Problem Given a sequence of numbers , and an integer  ,      , nd the  th smallest element. When   , it is called the median problem. Example: Given  smallest element is 19.
 ! #" #$ % #"!"


, the & th

Question: How do you solve this problem?

First Solution: Selection by sorting

Step 1: Sort the elements in ascending order with any algorithm of complexity   .

Step 2: Return the  th element of the sorted array.

The complexity of this solution is

Question: Can we do better? Answer: YES, but we need to recall Partition( used in Quicksort!


Second Solution : Linear running time in average Recall of Partition(




 Denition: Rearrange the array into two (pos    sibly empty) subarrays and such that
   and    for any   .

p x

q x x x = A[r]

(1) The original is used as the pivot. (2) It is a deterministic algorithm. (3) The element for the th position is found! Note that this partition is different from the partition we used in COMP 171.
3

The Idea of Partition(


p x




)
r x

i x

j unrestricted

(1) Initially

(2) Increase by 1 each time to nd a place for At the same time increase  when necessary.




(3) The procedure stops when

One Iteration of the Procedure Partition


p x p x p x p i x

i x i x i x

j >x

r x (A) A[j] > x j r x

j <x

r x (B) A[j] < x j r

(A) Only increase

by 1.

(B)


.
5

The Operation of Partition(


i p, j

): Example
r

2 2
p, i

8 8 8 8
i

7 7
j

1 3 1 1
j

5 5 5 5 5
j

6 6 6 6 6 6
j

4
r

(1) (2) (3) (4) (5)


r

p, i j

3 3 3
j

4
r

2
p, i

7 7 7
i

4
r

2
p

1 8 8 8 8 4

4
r

2
p

1 1 1 1 1

3 7 7 7 7

4 4
r

2
p

3
i

5 5 5 5

(6) (7) (8) (9)

2
p

3
i

6 6 6

4
j, r

2
p

3
i

4
j, r

The Partition(



) Algorithm

Partition(A, p, r) { // A[r] is the pivot element x = A[r]; i = p-1; for (j = p to r-1) { if (A[j] <= x) { i = i+1; exchange A[i] and A[j] } } // put pivot in position exchange A[i+1] and A[r] // q = i+1 return i+1; }
7

The Running Time of Partition(



comparison of array elements assignment, addition, comparison of loop variables Partition(

): 1 1

  

for

if

to
 




 

exchange  exchange    return 




3 1

and  Total:  Running time is  of the array .

, that is, linear in the length

Randomized-Partition(

)


The Idea: In the algorithm Partition( ), is al  ways used as the pivot to partition the array . In the algorithm Randomized-Partition(   domly choose an ,  , and use

), we ran as pivot.



Randomized-Partition(A, p, r) { j = random(p, r); exchange A[r] and A[j] Partition(A, p, r); } Remark: random( , ) is a pseudorandom-number generator that returns a random number between  and .
9


Randomized-Select(

),


Problem: Select the  th smallest element in  where      .


Solution: Apply Randomized-Partition(

), getting
r

q kth element k = qp+1

Case 1:

, pivot is the solution.

Case 2: , the th smallest element in th smallest element in


  .

must be the

Case 3:

 , the th smallest element in


  th smallest element in  
. 

must be the

If necessary, recursively call the same procedure to the subarray.

10

Randomized-Select( if

),


return Randomized-Partition 



if


return
else if  return Randomized-Select  else return Randomized-Select 

the pivot is the answer


 


Remark: To nd the  th smallest element in call Randomized-Select(    ).


11

Running Time of Randomized-Select(

Let   be the average number of comparisons of array elements for      . Then

"

and for


we get initial partition


recursion, recursion,

 

We will prove by induction on


 


that
& 

for all


and  .

12

Proof that

&

 


Induction basis:    " & Induction that
step: Assume . Then for all  and   

&

 

       

          

                                  

 

13

Proof that

&

 

where

 

&%

&


& " 

&! 

&

"

for


 

. Hence
 

&!

for all . Therefore

&

 

&%

14

Running Time of Randomized-Select(



We proved that   we have in particular that &%

)


. Since

15

Randomized-Quicksort Algorithm We make use of the Randomized-Partition idea to develop a new version of quicksort. Randomized-Quicksort(A, p, r) { if (p < r) { q = Randomized-Partition(A, p, r); Randomized-Quicksort(A, p, q-1); Randomized-Quicksort(A, q+1, r); } } Does it run faster than the original version of quicksort?

16

Running Time of the Randomized-Quicksort Results : Worst Case:  Average Case:


 

Clearly, the worst case is still average case?

, what about the

17

Average running time of Randomized-Quicksort Key observations:

The running time of (randomized) quicksort is dominated by the time spent in (randomized) partition. In the partition procedure, the time is dominated by the number of key comparisons.

When a pivot is selected, the pivot is compared with every other elements, then the elements are partitioned into two parts accordingly.

Elements in different partition are NEVER compared with each other in all operations.

Tricks: We nd the expected number of comparisons for all randomized-partition calls.


18

Average running time of Randomized-Quicksort Let be the input array which is a permutation of the


 .  distinct elements be the total number of comparisons performed Let in ALL calls to randomized-partition. Let be the number of comparisons between and , observe that can only be 0 or 1. Our goal is to compute the expected value of , i.e.,

is compared to   is not compared to "  is compared to

19

Average running time of Randomized-Quicksort It remains to show how to nd For    


(remember

is compared to

, let

).

Key observations: If or is selected as a pivot BEFORE any el    ements in  , and will be compared.

Conversely, if any element in other then or is selected as a pivot before and , and will be placed in DIFFERENT partitions, and hence they will NOT compare with each other in ALL randomized-partition calls. ANY element other than the elements in has  no effect to . is compared to

20

Average running time of Randomized-Quicksort It remains to nd the probability that or pivot chosen from .

is the rst

 

is compared to or is the rst pivot chosen from is the rst pivot chosen from is the rst pivot chosen from
  

 

21

Average running time of Randomized-Quicksort Putting everything together, we have

 

 

Hence, the expected number of comparisons is  which is the average running time of RandomizedQuicksort.


22

You might also like