Skip to content

Commit f1ea98a

Browse files
committed
Reduce non-leaf keys overlap in GiST indexes produced by a sorted build
The GiST sorted build currently chooses split points according to the only page space utilization. That may lead to higher non-leaf keys overlap and, in turn, slower search query answers. This commit makes the sorted build use the opclass's picksplit method. Once four pages at the level are accumulated, the picksplit method is applied until each split partition fits the page. Some of our split algorithms could show significant performance degradation while processing 4-times more data at once. But those opclasses haven't received the sorted build support and shouldn't receive it before their split algorithms are improved. Discussion: https://postgr.es/m/CAHqSB9jqtS94e9%3D0vxqQX5dxQA89N95UKyz-%3DA7Y%2B_YJt%2BVW5A%40mail.gmail.com Author: Aliaksandr Kalenik, Sergei Shoulbakov, Andrey Borodin Reviewed-by: Björn Harrtell, Darafei Praliaskouski, Andres Freund Reviewed-by: Alexander Korotkov
1 parent 42a9e88 commit f1ea98a

File tree

4 files changed

+197
-110
lines changed

4 files changed

+197
-110
lines changed

contrib/pageinspect/expected/gist.out

Lines changed: 8 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -33,14 +33,13 @@ COMMIT;
3333
SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 0), 'test_gist_idx');
3434
itemoffset | ctid | itemlen | dead | keys
3535
------------+-----------+---------+------+-------------------
36-
1 | (1,65535) | 40 | f | (p)=((166,166))
37-
2 | (2,65535) | 40 | f | (p)=((332,332))
38-
3 | (3,65535) | 40 | f | (p)=((498,498))
39-
4 | (4,65535) | 40 | f | (p)=((664,664))
40-
5 | (5,65535) | 40 | f | (p)=((830,830))
41-
6 | (6,65535) | 40 | f | (p)=((996,996))
42-
7 | (7,65535) | 40 | f | (p)=((1000,1000))
43-
(7 rows)
36+
1 | (1,65535) | 40 | f | (p)=((185,185))
37+
2 | (2,65535) | 40 | f | (p)=((370,370))
38+
3 | (3,65535) | 40 | f | (p)=((555,555))
39+
4 | (4,65535) | 40 | f | (p)=((740,740))
40+
5 | (5,65535) | 40 | f | (p)=((870,870))
41+
6 | (6,65535) | 40 | f | (p)=((1000,1000))
42+
(6 rows)
4443

4544
SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 1), 'test_gist_idx') LIMIT 5;
4645
itemoffset | ctid | itemlen | dead | keys
@@ -63,7 +62,6 @@ SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_g
6362
4 | (4,65535) | 40
6463
5 | (5,65535) | 40
6564
6 | (6,65535) | 40
66-
7 | (7,65535) | 40
67-
(7 rows)
65+
(6 rows)
6866

6967
DROP TABLE test_gist;

src/backend/access/gist/README

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@ The current implementation of GiST supports:
2626
* Concurrency
2727
* Recovery support via WAL logging
2828
* Buffering build algorithm
29+
* Sorted build method
2930

3031
The support for concurrency implemented in PostgreSQL was developed based on
3132
the paper "Access Methods for Next-Generation Database Systems" by
@@ -414,6 +415,21 @@ emptied yet; tuples never move upwards in the tree. The final emptying loops
414415
through buffers at a given level until all buffers at that level have been
415416
emptied, and then moves down to the next level.
416417

418+
Sorted build method
419+
-------------------
420+
421+
Sort all input tuples, pack them into GiST leaf pages in the sorted order,
422+
and create downlinks and internal pages as we go. This method builds the index
423+
from the bottom up, similar to how the B-tree index is built.
424+
425+
The sorted method is used if the operator classes for all columns have a
426+
"sortsupport" defined. Otherwise, we fall back on inserting tuples one by one
427+
with optional buffering.
428+
429+
Sorting GiST build requires good linearization of the sort opclass. That is not
430+
always the case in multidimensional data. To tackle the anomalies, we buffer
431+
index tuples and apply a picksplit function that can be multidimensional-aware.
432+
417433
Bulk delete algorithm (VACUUM)
418434
------------------------------
419435

0 commit comments

Comments
 (0)