@@ -780,20 +780,21 @@ order. Delaying deduplication minimizes page level fragmentation.
780
780
Deduplication in unique indexes
781
781
-------------------------------
782
782
783
- Very often, the range of values that can be placed on a given leaf page in
784
- a unique index is fixed and permanent. For example, a primary key on an
785
- identity column will usually only have page splits caused by the insertion
786
- of new logical rows within the rightmost leaf page. If there is a split
787
- of a non-rightmost leaf page, then the split must have been triggered by
788
- inserts associated with an UPDATE of an existing logical row. Splitting a
789
- leaf page purely to store multiple versions should be considered
790
- pathological, since it permanently degrades the index structure in order
791
- to absorb a temporary burst of duplicates. Deduplication in unique
792
- indexes helps to prevent these pathological page splits. Storing
793
- duplicates in a space efficient manner is not the goal, since in the long
794
- run there won't be any duplicates anyway. Rather, we're buying time for
795
- standard garbage collection mechanisms to run before a page split is
796
- needed.
783
+ Very often, the number of distinct values that can ever be placed on
784
+ almost any given leaf page in a unique index is fixed and permanent. For
785
+ example, a primary key on an identity column will usually only have leaf
786
+ page splits caused by the insertion of new logical rows within the
787
+ rightmost leaf page. If there is a split of a non-rightmost leaf page,
788
+ then the split must have been triggered by inserts associated with UPDATEs
789
+ of existing logical rows. Splitting a leaf page purely to store multiple
790
+ versions is a false economy. In effect, we're permanently degrading the
791
+ index structure just to absorb a temporary burst of duplicates.
792
+
793
+ Deduplication in unique indexes helps to prevent these pathological page
794
+ splits. Storing duplicates in a space efficient manner is not the goal,
795
+ since in the long run there won't be any duplicates anyway. Rather, we're
796
+ buying time for standard garbage collection mechanisms to run before a
797
+ page split is needed.
797
798
798
799
Unique index leaf pages only get a deduplication pass when an insertion
799
800
(that might have to split the page) observed an existing duplicate on the
@@ -838,13 +839,15 @@ list splits.
838
839
839
840
Only a few isolated extra steps are required to preserve the illusion that
840
841
the new item never overlapped with an existing posting list in the first
841
- place: the heap TID of the incoming tuple is swapped with the rightmost/max
842
- heap TID from the existing/originally overlapping posting list. Also, the
843
- posting-split-with-page-split case must generate a new high key based on
844
- an imaginary version of the original page that has both the final new item
845
- and the after-list-split posting tuple (page splits usually just operate
846
- against an imaginary version that contains the new item/item that won't
847
- fit).
842
+ place: the heap TID of the incoming tuple has its TID replaced with the
843
+ rightmost/max heap TID from the existing/originally overlapping posting
844
+ list. Similarly, the original incoming item's TID is relocated to the
845
+ appropriate offset in the posting list (we usually shift TIDs out of the
846
+ way to make a hole for it). Finally, the posting-split-with-page-split
847
+ case must generate a new high key based on an imaginary version of the
848
+ original page that has both the final new item and the after-list-split
849
+ posting tuple (page splits usually just operate against an imaginary
850
+ version that contains the new item/item that won't fit).
848
851
849
852
This approach avoids inventing an "eager" atomic posting split operation
850
853
that splits the posting list without simultaneously finishing the insert
0 commit comments