Ext Sort
Ext Sort
( ) 8,9 6
OUTPUT 1,2
Idea: Divide and conquer: 2,3
INPUT 2
sort subfiles and merge 3,4 8-page runs
4,5
Main memory buffers Disk
Disk 6,6
7,8
9
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 3 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 4
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 7 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 8
Fact: average length of a run in heapsort is 2B … longer runs often means fewer passes!
The “snowplow” analogy Actually, do I/O a page at a time
Worst-Case: In fact, read a block of pages sequentially!
What is min length of a run? Suggests we should make each buffer
How does this arise? (input/output) be a block of pages.
Best-Case: B But this will reduce fan-out during merge passes!
What is max length of a run? In practice, most files still sorted in 2-3 passes.
How does this arise?
Quicksort is faster, but ...
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 9 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 10
INPUT 1'
1,000,000 3 2 2
INPUT 2
10,000,000 4 3 3 INPUT 2'
OUTPUT
OUTPUT'
100,000,000 5 3 3 b
1,000,000,000 5 4 3 Disk INPUT k
block size
Disk
INPUT k'
* Block size = 32, initial pass produces runs of size 2B. B main memory buffers, k-way merge
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 11 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 12
Sorting Records! Using B+ Trees for Sorting
Sorting has become a blood sport! Scenario: Table to be sorted has B+ tree index on
Parallel sorting is the name of the game ...
sorting column(s).
Idea: Can retrieve records in order by traversing
Datamation: Sort 1M records of size 100 bytes
leaf pages.
Typical DBMS: 15 minutes
World record: 3.5 seconds Is this a good idea?
• 12-CPU SGI machine, 96 disks, 2GB of RAM Cases to consider:
New benchmarks proposed: B+ tree is clustered Good idea!
Minute Sort: How many can you sort in 1 minute? B+ tree is not clustered Could be a very bad idea!
Dollar Sort: How many can you sort for $1.00?
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 14
Clustered B+ Tree Used for Sorting Unclustered B+ Tree Used for Sorting
Cost: root to the left-
Alternative (2) for data entries; each data
most leaf, then retrieve
Index
(Directs search)
entry contains rid of a data record. In general,
all leaf pages one I/O per data record!
(Alternative 1)
Data Entries
If Alternative 2 is used? ("Sequence set") Index
(Directs search)
Additional cost of
retrieving data records:
each page fetched just Data Entries
("Sequence set")
once. Data Records
N Sorting p=1 p=10 p=100 External sorting is important; DBMS may dedicate
part of buffer pool for sorting!
100 200 100 1,000 10,000
External merge sort minimizes disk I/O cost:
1,000 2,000 1,000 10,000 100,000
Pass 0: Produces sorted runs of size B (# buffer pages).
10,000 40,000 10,000 100,000 1,000,000 Later passes: merge runs.
100,000 600,000 100,000 1,000,000 10,000,000 # of runs merged at a time depends on B, and block size.
1,000,000 8,000,000 1,000,000 10,000,000 100,000,000 Larger block size means less I/O cost per page.
10,000,000 80,000,000 10,000,000 100,000,000 1,000,000,000 Larger block size means smaller # runs merged.
In practice, # of runs rarely more than 2 or 3.
* p: # of records per page
* B=1,000 and block size=32 for sorting
* p=100 is the more realistic value.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 17 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 18
Summary, cont.