Skip to content

Commit b150a76

Browse files
Fix nbtree deduplication README commentary.
Descriptions of some aspects of how deduplication works were unclear in a couple of places.
1 parent 112b006 commit b150a76

File tree

1 file changed

+24
-21
lines changed
  • src/backend/access/nbtree

1 file changed

+24
-21
lines changed

src/backend/access/nbtree/README

Lines changed: 24 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -780,20 +780,21 @@ order. Delaying deduplication minimizes page level fragmentation.
780780
Deduplication in unique indexes
781781
-------------------------------
782782

783-
Very often, the range of values that can be placed on a given leaf page in
784-
a unique index is fixed and permanent. For example, a primary key on an
785-
identity column will usually only have page splits caused by the insertion
786-
of new logical rows within the rightmost leaf page. If there is a split
787-
of a non-rightmost leaf page, then the split must have been triggered by
788-
inserts associated with an UPDATE of an existing logical row. Splitting a
789-
leaf page purely to store multiple versions should be considered
790-
pathological, since it permanently degrades the index structure in order
791-
to absorb a temporary burst of duplicates. Deduplication in unique
792-
indexes helps to prevent these pathological page splits. Storing
793-
duplicates in a space efficient manner is not the goal, since in the long
794-
run there won't be any duplicates anyway. Rather, we're buying time for
795-
standard garbage collection mechanisms to run before a page split is
796-
needed.
783+
Very often, the number of distinct values that can ever be placed on
784+
almost any given leaf page in a unique index is fixed and permanent. For
785+
example, a primary key on an identity column will usually only have leaf
786+
page splits caused by the insertion of new logical rows within the
787+
rightmost leaf page. If there is a split of a non-rightmost leaf page,
788+
then the split must have been triggered by inserts associated with UPDATEs
789+
of existing logical rows. Splitting a leaf page purely to store multiple
790+
versions is a false economy. In effect, we're permanently degrading the
791+
index structure just to absorb a temporary burst of duplicates.
792+
793+
Deduplication in unique indexes helps to prevent these pathological page
794+
splits. Storing duplicates in a space efficient manner is not the goal,
795+
since in the long run there won't be any duplicates anyway. Rather, we're
796+
buying time for standard garbage collection mechanisms to run before a
797+
page split is needed.
797798

798799
Unique index leaf pages only get a deduplication pass when an insertion
799800
(that might have to split the page) observed an existing duplicate on the
@@ -838,13 +839,15 @@ list splits.
838839

839840
Only a few isolated extra steps are required to preserve the illusion that
840841
the new item never overlapped with an existing posting list in the first
841-
place: the heap TID of the incoming tuple is swapped with the rightmost/max
842-
heap TID from the existing/originally overlapping posting list. Also, the
843-
posting-split-with-page-split case must generate a new high key based on
844-
an imaginary version of the original page that has both the final new item
845-
and the after-list-split posting tuple (page splits usually just operate
846-
against an imaginary version that contains the new item/item that won't
847-
fit).
842+
place: the heap TID of the incoming tuple has its TID replaced with the
843+
rightmost/max heap TID from the existing/originally overlapping posting
844+
list. Similarly, the original incoming item's TID is relocated to the
845+
appropriate offset in the posting list (we usually shift TIDs out of the
846+
way to make a hole for it). Finally, the posting-split-with-page-split
847+
case must generate a new high key based on an imaginary version of the
848+
original page that has both the final new item and the after-list-split
849+
posting tuple (page splits usually just operate against an imaginary
850+
version that contains the new item/item that won't fit).
848851

849852
This approach avoids inventing an "eager" atomic posting split operation
850853
that splits the posting list without simultaneously finishing the insert

0 commit comments

Comments
 (0)