0% found this document useful (0 votes)
76 views

Lesson3 3IOAvoidingAlgorithms PDF

The document describes algorithms for sorting data that exceeds the capacity of fast memory by using a two-level memory hierarchy. It proposes an external merge sort algorithm that partitions the data into chunks that fit in fast memory, sorts each chunk, then merges the sorted chunks using a multi-way merging technique. This achieves optimal comparison complexity of O(n log n) while minimizing data transfers between fast and slow memory to O(n/L log(n/L, Z/L)).

Uploaded by

Projectlouis
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
76 views

Lesson3 3IOAvoidingAlgorithms PDF

The document describes algorithms for sorting data that exceeds the capacity of fast memory by using a two-level memory hierarchy. It proposes an external merge sort algorithm that partitions the data into chunks that fit in fast memory, sorts each chunk, then merges the sorted chunks using a multi-way merging technique. This achieves optimal comparison complexity of O(n log n) while minimizing data transfers between fast and slow memory to O(n/L log(n/L, Z/L)).

Uploaded by

Projectlouis
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

Lesson33I/OAvoidingAlgorithms

SenseofScale

GoalofLesson:developalowerboundfortheamountofcommunicationtosortonamachine
withslowandfastmemory.

nisthenumberofdatapoints
Zisthesizeofthefastmemory(inwords)
Listhenumberofwordspertransfer

QuizInformation:

Given:
50
volumeofdatatosort:r*n=1PiB(2
Bytes)

record(item)size:r=256Bytes

36
Fastmemorysze:r*Z=64GiB(2
Bytes)

15
Memorytransfersize:r*L=32KiB(2
Bytes)

Thenumberoftransferoperations:
8
43
28
7
Use:r=2
B,n=2
records,Z=2
records,L=2
records

nlog
n=185T
2
ops
nlog
(n/L)=154T
2
ops
n=4.40T

ops
n/Llog
(n/L)=1.20T
2
ops
n/Llog
(n/Z)=0.275T
2
ops
n/Llog
(n/L)=0.0523T
Z/L
ops

Notetheimprovementsrelativetothebaseline(nlog
n):
2

Abigimprovementcomesfromreducingnton/L.ThismeanstransferringthedatainLsized
fragments.

Anotherbigimprovementcomesfromgoingfrombase2tobaseZ/L.Thisimprovementcomes
fromthecapacityoffastmemory(Z).


Handylogfactoid:
log
x=(log
x)/(log
Z/L)
Z/L
2
2

ExternalMemoryMergeSort

Letssortnelementsinatwolevelmemoryhierarchy.

Phase1:
Assumeprocessorissequential.
1. DividethenelementsintochunksofsizeZ,sothateachchunkfitsintofastmemory.
2. Readachunkfromslowmemorytofastmemory.
3. Nowsortthischunk,callitarun.
4. Writethechunkbacktoslowmemory.
5. Repeatforeachchunkofdata.
6. Youwillendupwithn/(fZ)chunksofsortedruns.

Phase1:
Partitioninputinton/(fZ)chunks
foreachchunki1ton/(fZ)do
Readchunki
Sortitintoarun
Writeruni

Phase2:
Mergethen/(fZ)runsintoasinglerun

PartitionedSortingStepAnalysis
(Phase1fromabove)

Countthenumberofasymptotictransfersateachsteptoobtainatotalasymptoticcost.
Readchunki

O(n/L)transfers
Sortitintoarun

O(n*log(Z))transfers
Writeruni

O(n/L)transfers

Toderivethevalues:
Assume:L|(fZ)and(fZ)|n+optimalcomparisonsort

Readchunki(fZ/L)*(n/(f*Z))=O(n/L)
Sortit(thenumberofcomparisons)O(fZlog(fZ))*n/fZ=O(nlog
Z)
2
WriteruniO(n/L)


Theschemeisgivingussomethingproportionalton/Ztransactions

TwoWayExternalMemoryMerging

Assumemrunsofsizes.
Thenumberofitemsisn=m*s
Nowwemustmergeallofthesortedrunsintoasinglesortedrun

Onemethod:mergerpairsofruns,thenpairsofthepairs,etc.
Ifwefollowthismethod.
k1
k
Foreachmerge,therunsizegrows.s,2s,,2
s,2
s

Tovisualizethis:

k1
Wehavetworunseachofsize2
sinslowmemory.
k
GoaltomergethetworunsintoanewrunCofsize2
s
Todothis:
Infastmemorytherewillbethreebuffers,eachholdingLitems(recallListhetransaction
size)
WellusethefirsttwobufferswillholdthetworunsAandB.Thethirdbuffer,C,willholdthe
combined,sortedrun.

Tobegin:
1. MoveanLsizeblockfromAandonefromB.StoretheminthefastmemoryA^and
B^.
2. SortA^andB^intoC^
ReadLsizedblocksofA,BandstoreinA^andB^
whileanyunmergeditemsinA&Bdo
mergeA^,B^C^aspossible
ifA^orB^emptythenreadmore
ifC^fullthenflush
FlushanyunmergedinAorB

Whatisthecostofthis?
ThisschemeonlyloadsitemsfromAorBonce
k1
k1
Transfers=(2
s)/L+(2
s)/L

k
Itonlywritesagivenoutputblockonce(2
s)/L

k1
k1
k
k+1
SoTransfers=(2
s)/L+(2
s)/L+(2
s)/L=(2
s)/L

k
Numberofcomparisons (2
s)

k
Numberofpairsmergedatlevelk=n/(2
s)
Numberoflevels=log
(n/s)
2

Totalnumberoftransfersis:2(n/L)log
(n/s)
2
Totalnumberofcomparisonsis: (nlog
n/s)
2

Thequestiontoaskisisthisgoodorbad?


ExternalMemoryMergeSort

MergeSortinExternalMemory:

Phase1:
Partitioninputinto (n/Z)chunks
Sorteachchunk,producing (n/Z)runsofsizeZeach

Phase2:
Mergeallrunsusinga2waymerge

Whataretheasymptoticcosts:

Phase1:
Comparisons=O(nlog
Z)
2
Transfers=O(n/L)

Phase2:
Comparisons=O(nlog
n/Z)
2
Transfers=O(n/Llog
n/Z)
2

Total:
Comparisons:O(nlog(n))=nlogn(thisisgoodnews,theschemeisworkoptimal)
Transfers:O(n/L*log(n/Z))

Thelowerboundis:~n/Llog
(n/L)
Z/L

WhatsWrongwith2WayMerging

Thenumberoftransfersinexternalmemorymergesortwith2waymerging:

Q(nZ,L)=O(n/Llog
(n/Z))=O(n/L[log
(n/L)log
(Z/L)])
2
2
2

Thelowerbound:

Q(nZ,L)= (n/Llog
(n/L))= (n/L(log
(n/L))/(log
(Z/L))
Z/L
2
2

Thisdifferencecanbequitehigh,dependingon
thesystem.

Whydoesnt2waymergedoabetteratachievinglowercosts?
Twowaymergingisnotgoodatutilizingfastmemory(Z)capacity.Themergingprocedureonly
worksonpairsofarraysatatimeandusesablockofsizeL.Sotwowaymergeissensitiveto
LbutnotsensitivetoZ.

Youshouldbeabletofixthisissue.

MultiwayMerging
Todobetterthan2waymerging,letsmergealotofrunsatonce.

Assume:
Kruns,storedinslowmemoryandsortedinascendingorder
K+1blockswillfitinfastmemory,(K+1)L Z(Zissizeoffastmemory)
(thiswillallowoneblockforeachinputand1blockfortheoutput)

1.Loadthefastmemorywithkblocks.
2.Nowfindthesmallestvalueofalltheblocks,moveittotheoutputblock(thek+1block).
Thesmallestvaluecanbefoundanumberofwaysalinearscanorminheaparetwo
possibilities.ThelinearscanwillworkifKissmall.
Touseapriorityqueue(orminheap)
1.loadtheblocks
2.buildtheheap(costO(K)operations)
3.extractMin(costO(logK)operations)
4.insert(costO(logk)operations)
Theseareallfastmemoryoperationssowecancountthemascomparisons.
3.Whentheoutputblockiffilled,flushitbywritingittoslowmemory.
4.Ifaninputblockisempty,refillit.

WhatisthecostofaKwaymerge?

Transfers:readinputblocksonce,writeblocksonce:2Ks/L

Comparisons:initialcosttobuildtheheap,theneachitemiseitherinsertedorextracted

O(K+KslogK)..forasinglekwaymerge

CostofMultiwayMerge

Initialinputhasnelements,dividedintosortedruns,withzitemseach.
Ifyoudokwaymerges,thenumberofcomparisons=nlog(n)

Whatisthetotalnumberofasymptoticmemorytransfers?
Assumek= (z/L)<z/L

l= (log
n/L)(lismaximumnumberofmergetrees)
z/L

i
Transfersperrunati= (K
s/L)
i
#ofrunsatleveli=n/K
s
Totaltransfersatleveli= (n/L)
#oflevels= (log
n/L)
z/L

Whatisthetotalnumberofasymptoticmemorytransfers?
Totaltransfersatleveli*#oflevels=O(n/L*log(n/L,Z/L)

ALowerBoundonExternalMemorySorting

Mergesortwith (Z/L) waymerges:Q(nZ,L)= (n/Llog


(n/L))
Z/L

Thisisverygood.

#ofpossibleorderings:n!
#orderingsaftert1transfers:K(t1)
soK(0)=n!

Now,afteracertainnumberoforderings,yourealfromslowmemorytofastmemoryLitems.
SothereareL!waysoforderingthisnewitems.
SonowyouhaveZLolditems(alreadyordered)andLnewitems.
Howmanywayscanthesebeordered? (ZchooseL) L!

t
Aftertreads,thelowerboundonthenumberoforderings:K(t) K(t1)/[(ZchooseL)L!]

Thisisconservative,itassumesLisunordered.
IfLhasbeenreadbefore.

Nowanswerthequestion,whendoesonly1orderingremain?
Thisisthelowerboundonthenumberoftransfers:

You might also like