Commit 322c9b3

nbtree VACUUM: cope with right sibling link corruption.

Avoid "right sibling's left-link doesn't match" errors when vacuuming a corrupt nbtree index. Just LOG the issue and press on. That way VACUUM will have a decent chance of finishing off all required processing for the index (and for the table as a whole).

This error was seen in the field from time to time (it's more than a theoretical risk), so giving VACUUM the ability to press on like this has real value. Nothing short of a REINDEX is expected to fix the underlying index corruption, so giving up (by throwing an error) risks making a bad situation far worse. Anything that blocks forward progress by VACUUM like this might go unnoticed for a long time. This could eventually lead to a wraparound/xidStopLimit outage.

Note that _bt_unlink_halfdead_page() has always been able to bail on page deletion when the target page's left sibling page was in an inconsistent state. It now does the same thing (returns false to back out of the second phase of deletion) when it notices sibling link corruption in the target page's right sibling page.

This is similar to the work from commit 5b861ba (later backpatched as commit 43e409c), which taught nbtree to press on with vacuuming an index when page deletion fails to "re-find" a downlink in the target page's parent page. The "re-find" check seems to make VACUUM bail on page deletion more often in practice, but there is no reason to take any chances here.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CAH2-Wzko2q2kP1+UvgJyP9g0mF4hopK0NtQZcxwvMv9_ytGhkQ@mail.gmail.com
Backpatch: 11- (all supported versions).
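
The behavior described in the commit message can be illustrated with a toy model. This is only a sketch, not nbtree code: `ToyPage`, `toy_unlink_page()` and `toy_vacuum_pass()` are hypothetical names, pages are plain array slots, and there is no locking or WAL. It shows one thing: when the right sibling's left link does not point back at the target, the unlink step logs, changes nothing, and returns false, so the surrounding pass can continue with the remaining pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NO_PAGE (-1)

/* Hypothetical toy page: only the sibling links matter for this sketch. */
typedef struct ToyPage
{
    int prev;   /* block number of left sibling, or NO_PAGE */
    int next;   /* block number of right sibling, or NO_PAGE */
} ToyPage;

/*
 * Try to unlink "target" from its level.  Returns false, leaving every
 * link untouched, when the right sibling's left link does not point back
 * at target.  The caller is expected to skip the page and keep going.
 */
static bool
toy_unlink_page(ToyPage *pages, int target)
{
    int left = pages[target].prev;
    int right = pages[target].next;

    if (right == NO_PAGE)
        return false;       /* toy rule: never unlink the rightmost page */

    if (pages[right].prev != target)
    {
        /* Report the inconsistency, but do not abort the whole pass */
        fprintf(stderr,
                "LOG: right sibling %d of target %d links to non-target %d\n",
                right, target, pages[right].prev);
        return false;
    }

    if (left != NO_PAGE)
        pages[left].next = right;
    pages[right].prev = left;
    return true;
}

/* Attempt every candidate; a failed unlink is counted, not fatal. */
static int
toy_vacuum_pass(ToyPage *pages, const int *candidates, int ncandidates,
                int *nskipped)
{
    int ndeleted = 0;

    *nskipped = 0;
    for (int i = 0; i < ncandidates; i++)
    {
        if (toy_unlink_page(pages, candidates[i]))
            ndeleted++;
        else
            (*nskipped)++;
    }
    return ndeleted;
}
```

With a five-page chain in which page 2's left link has been corrupted, a pass over candidates 1 and 3 skips page 1 and still unlinks page 3. Had the check raised an error instead, the pass would have stopped at page 1.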
1 parent f8320cc commit 322c9b3

2 files changed, 45 additions and 20 deletions

src/backend/access/nbtree/nbtpage.c

Lines changed: 44 additions & 18 deletions
@@ -2415,24 +2415,22 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, BlockNumber scanblkno,
 
             if (!leftsibvalid)
             {
-                if (target != leafblkno)
-                {
-                    /* we have only a pin on target, but pin+lock on leafbuf */
-                    ReleaseBuffer(buf);
-                    _bt_relbuf(rel, leafbuf);
-                }
-                else
-                {
-                    /* we have only a pin on leafbuf */
-                    ReleaseBuffer(leafbuf);
-                }
-
+                /*
+                 * This is known to fail in the field; sibling link corruption
+                 * is relatively common.  Press on with vacuuming rather than
+                 * just throwing an ERROR.
+                 */
                 ereport(LOG,
                         (errcode(ERRCODE_INDEX_CORRUPTED),
                          errmsg_internal("valid left sibling for deletion target could not be located: "
-                                         "left sibling %u of target %u with leafblkno %u and scanblkno %u in index \"%s\"",
+                                         "left sibling %u of target %u with leafblkno %u and scanblkno %u on level %u of index \"%s\"",
                                          leftsib, target, leafblkno, scanblkno,
-                                         RelationGetRelationName(rel))));
+                                         targetlevel, RelationGetRelationName(rel))));
+
+                /* Must release all pins and locks on failure exit */
+                ReleaseBuffer(buf);
+                if (target != leafblkno)
+                    _bt_relbuf(rel, leafbuf);
 
                 return false;
             }
@@ -2507,13 +2505,40 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, BlockNumber scanblkno,
     rbuf = _bt_getbuf(rel, rightsib, BT_WRITE);
     page = BufferGetPage(rbuf);
     opaque = (BTPageOpaque) PageGetSpecialPointer(page);
+
+    /*
+     * Validate target's right sibling page.  Its left link must point back to
+     * the target page.
+     */
     if (opaque->btpo_prev != target)
-        ereport(ERROR,
+    {
+        /*
+         * This is known to fail in the field; sibling link corruption is
+         * relatively common.  Press on with vacuuming rather than just
+         * throwing an ERROR (same approach used for left-sibling's-right-link
+         * validation check a moment ago).
+         */
+        ereport(LOG,
                 (errcode(ERRCODE_INDEX_CORRUPTED),
                  errmsg_internal("right sibling's left-link doesn't match: "
-                                 "block %u links to %u instead of expected %u in index \"%s\"",
-                                 rightsib, opaque->btpo_prev, target,
-                                 RelationGetRelationName(rel))));
+                                 "right sibling %u of target %u with leafblkno %u "
+                                 "and scanblkno %u spuriously links to non-target %u "
+                                 "on level %u of index \"%s\"",
+                                 rightsib, target, leafblkno,
+                                 scanblkno, opaque->btpo_prev,
+                                 targetlevel, RelationGetRelationName(rel))));
+
+        /* Must release all pins and locks on failure exit */
+        if (BufferIsValid(lbuf))
+            _bt_relbuf(rel, lbuf);
+        _bt_relbuf(rel, rbuf);
+        _bt_relbuf(rel, buf);
+        if (target != leafblkno)
+            _bt_relbuf(rel, leafbuf);
+
+        return false;
+    }
+
     rightsib_is_rightmost = P_RIGHTMOST(opaque);
     *rightsib_empty = (P_FIRSTDATAKEY(opaque) > PageGetMaxOffsetNumber(page));
 
@@ -2737,6 +2762,7 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, BlockNumber scanblkno,
      */
     _bt_pendingfsm_add(vstate, target, safexid);
 
+    /* Success - hold on to lock on leafbuf (might also have been target) */
     return true;
 }
 
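
The new failure exit for the right-sibling check has to release every buffer held at that point, and must not release the same buffer twice when the target is itself the leaf page. The following is a minimal sketch of that bookkeeping only. `ToyBuf`, `toy_lock_and_pin()`, `toy_release()` and `toy_failure_exit()` are hypothetical stand-ins, not PostgreSQL APIs. A NULL pointer stands in for an invalid `lbuf`, and pointer equality between `buf` and `leafbuf` stands in for `target == leafblkno`; both are assumptions of this toy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer: tracks only pin and lock state. */
typedef struct ToyBuf
{
    bool pinned;
    bool locked;
} ToyBuf;

static void
toy_lock_and_pin(ToyBuf *b)
{
    b->pinned = true;
    b->locked = true;
}

/* Drop lock and pin together; releasing twice trips the assertion. */
static void
toy_release(ToyBuf *b)
{
    assert(b->pinned && b->locked);
    b->locked = false;
    b->pinned = false;
}

/*
 * Failure exit modeled on the right-sibling check.  lbuf may be absent
 * (NULL), and leafbuf may be the very same buffer as buf when the target
 * is the leaf page, in which case it must be released only once.
 */
static bool
toy_failure_exit(ToyBuf *lbuf, ToyBuf *rbuf, ToyBuf *buf, ToyBuf *leafbuf)
{
    if (lbuf != NULL)
        toy_release(lbuf);
    toy_release(rbuf);
    toy_release(buf);
    if (leafbuf != buf)
        toy_release(leafbuf);
    return false;
}
```

The guard on `leafbuf` is the part that is easy to get wrong: without it, the case where the target is the leaf page would release one buffer twice.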
src/backend/access/nbtree/nbtree.c

Lines changed: 1 addition & 2 deletions
@@ -1088,8 +1088,7 @@ btvacuumpage(BTVacState *vstate, BlockNumber scanblkno)
          * can't be half-dead because only an interrupted VACUUM process can
          * leave pages in that state, so we'd definitely have dealt with it
          * back when the page was the scanblkno page (half-dead pages are
-         * always marked fully deleted by _bt_pagedel()).  This assumes that
-         * there can be only one vacuum process running at a time.
+         * always marked fully deleted by _bt_pagedel(), barring corruption).
          */
         if (!opaque || !P_ISLEAF(opaque) || P_ISHALFDEAD(opaque))
         {
