Commit 3f44959

Avoid unneeded nbtree backwards scan buffer locks.

Teach nbtree backwards scans to avoid relocking a just-read leaf page to read its current left sibling link when it isn't truly necessary. This happened inside _bt_readnextpage whenever _bt_readpage had already determined that there'll be no further matches to the left (or at least none for the current primitive index scan, for a scan with array keys).

A new precheck inside _bt_readnextpage is all that we need to avoid these useless lock acquisitions. Arguably, using a precheck like this was a missed opportunity for commit 2ed5b87, which taught nbtree to drop leaf page pins early to avoid blocking cleanup by VACUUM. Forwards scans already managed to avoid relocking the page like this.

The optimization added by this commit is particularly helpful with backwards scans that use array keys where the scan must perform multiple primitive index scans. Such backwards scans will now avoid a useless leaf page re-lock at the end of each primitive index scan.

Note that this commit does not attempt to avoid needlessly re-locking a leaf page that was just read when the scan must follow the leaf page's left link. That more ambitious optimization could work by stashing the left link when the page is first read by a backwards scan, allowing the subsequent _bt_readnextpage call to optimistically skip re-reading the original page just to get a new copy of its left link. For now we only address cases where we don't care about our original page's left link.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=xgs7PojG=EUvhgadwENzu_mY_riNh-w9wFPsaS717ew@mail.gmail.com
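
As an illustration only, the effect described in the commit message can be modeled with a toy counter. This is a minimal sketch and not PostgreSQL code: ToyScanPos, toy_step_left, and toy_count_relocks are hypothetical names, and only the moreLeft flag mirrors a field of the real scan position. It assumes that every primitive index scan ends on a page where no further matches remain to the left.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the scan position; not the real BTScanPosData */
typedef struct ToyScanPos
{
	bool		moreLeft;	/* mirrors so->currPos.moreLeft */
	int			relocks;	/* simulated leaf page re-locks */
} ToyScanPos;

/*
 * Toy model of stepping left at the end of a page.  Returns true when the
 * scan would go on to visit the left sibling.
 */
static bool
toy_step_left(ToyScanPos *pos, bool use_precheck)
{
	/* Precheck: give up before touching the page at all */
	if (use_precheck && !pos->moreLeft)
		return false;

	/* Without the precheck, the just-read page is re-locked first */
	pos->relocks++;

	/* Older behavior: the flag is only consulted after the re-lock */
	if (!pos->moreLeft)
		return false;

	return true;
}

/* Count simulated re-locks across nprimscans primitive index scans */
static int
toy_count_relocks(int nprimscans, bool use_precheck)
{
	ToyScanPos	pos = {false, 0};

	assert(nprimscans >= 0);

	for (int i = 0; i < nprimscans; i++)
	{
		/* Assumption: each primitive scan ends with moreLeft == false */
		pos.moreLeft = false;
		(void) toy_step_left(&pos, use_precheck);
	}

	return pos.relocks;
}
```

Under these assumptions, toy_count_relocks(5, false) reports 5 simulated re-locks while toy_count_relocks(5, true) reports 0, which is the saving the commit message describes for array-key scans.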
1 parent f011e82 commit 3f44959

1 file changed: +18 -5 lines changed

src/backend/access/nbtree/nbtsearch.c

Lines changed: 18 additions & 5 deletions
@@ -1916,16 +1916,20 @@ _bt_readpage(IndexScanDesc scan, ScanDirection dir, OffsetNumber offnum,
                     }
                 }
             }
+            /* When !continuescan, there can't be any more matches, so stop */
             if (!pstate.continuescan)
-            {
-                /* there can't be any more matches, so stop */
-                so->currPos.moreLeft = false;
                 break;
-            }
 
             offnum = OffsetNumberPrev(offnum);
         }
 
+        /*
+         * We don't need to visit page to the left when no more matches will
+         * be found there
+         */
+        if (!pstate.continuescan || P_LEFTMOST(opaque))
+            so->currPos.moreLeft = false;
+
         Assert(itemIndex >= 0);
         so->currPos.firstItem = itemIndex;
         so->currPos.lastItem = MaxTIDsPerBTreePage - 1;
@@ -2240,6 +2244,15 @@ _bt_readnextpage(IndexScanDesc scan, BlockNumber blkno, ScanDirection dir)
             so->currPos.currPage = blkno;
         }
 
+        /* Done if we know that the left sibling link isn't of interest */
+        if (!so->currPos.moreLeft)
+        {
+            BTScanPosUnpinIfPinned(so->currPos);
+            _bt_parallel_done(scan);
+            BTScanPosInvalidate(so->currPos);
+            return false;
+        }
+
         /*
          * Walk left to the next page with data. This is much more complex
          * than the walk-right case because of the possibility that the page
@@ -2260,7 +2273,7 @@ _bt_readnextpage(IndexScanDesc scan, BlockNumber blkno, ScanDirection dir)
 
         for (;;)
         {
-            /* Done if we know there are no matching keys to the left */
+            /* Done if we know that the left sibling link isn't of interest */
             if (!so->currPos.moreLeft)
             {
                 _bt_relbuf(rel, so->currPos.buf);
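
The rule that the first hunk adds to _bt_readpage can be restated as a small predicate. This is a minimal sketch in plain C, outside of PostgreSQL: toy_more_left and its parameters are hypothetical, with continuescan standing in for pstate.continuescan and is_leftmost standing in for P_LEFTMOST(opaque).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy restatement of the moreLeft decision made after a backwards scan
 * reads a leaf page.  Returns the value the flag would end up with.
 */
static bool
toy_more_left(bool continuescan, bool is_leftmost)
{
	/* No more matches to the left, or no page to the left at all */
	if (!continuescan || is_leftmost)
		return false;

	/* Otherwise the left sibling may still hold matches */
	return true;
}
```

The flag stays true only when the scan can continue and the page has a left sibling. In every other case the new precheck in _bt_readnextpage can return without re-locking the page.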
