Commit 1d84751

Avoid killing btree items that are already dead
_bt_killitems marks btree items dead when a scan leaves the page where they live, but it does so while holding only a share lock (to improve concurrency). This was historically okay, since killing a dead item has no consequences. However, with the advent of data checksums and wal_log_hints, this action incurs a WAL full-page-image record of the page. Multiple concurrent processes would write the same page several times, leading to WAL bloat.

The probability of this happening can be reduced by only killing items if they're not already dead, so change the code to do that.

The problem could be eliminated completely by having _bt_killitems upgrade to exclusive lock upon seeing a killable item, but that would reduce concurrency, so it's considered a cure worse than the disease.

Backpatch all the way back to 9.5, since wal_log_hints was introduced in 9.4.

Author: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion: https://postgr.es/m/CA+fd4k6PeRj2CkzapWNrERkja5G0-6D-YQiKfbukJV+qZGFZ_Q@mail.gmail.com
1 parent 5663844 commit 1d84751

1 file changed: 13 additions, 3 deletions

src/backend/access/nbtree/nbtutils.c

Lines changed: 13 additions & 3 deletions
@@ -1789,9 +1789,19 @@ _bt_killitems(IndexScanDesc scan)
 
                 if (ItemPointerEquals(&ituple->t_tid, &kitem->heapTid))
                 {
-                    /* found the item */
-                    ItemIdMarkDead(iid);
-                    killedsomething = true;
+                    /*
+                     * Found the item.  Mark it as dead, if it isn't already.
+                     * Since this happens while holding a buffer lock possibly in
+                     * shared mode, it's possible that multiple processes attempt
+                     * to do this simultaneously, leading to multiple full-page
+                     * images being sent to WAL (if wal_log_hints or data checksums
+                     * are enabled), which is undesirable.
+                     */
+                    if (!ItemIdIsDead(iid))
+                    {
+                        ItemIdMarkDead(iid);
+                        killedsomething = true;
+                    }
                     break;      /* out of inner search loop */
                 }
                 offnum = OffsetNumberNext(offnum);
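
The effect of the added guard can be illustrated with a small standalone model. This is only a sketch, not PostgreSQL code: `ToyPage`, `toy_kill_always`, `toy_kill_if_alive`, and `toy_count_dirtying_scans` are hypothetical names invented here, and the model assumes that a true `killedsomething` is what later causes the page to be flagged as modified (and hence possibly written to WAL as a full-page image). It runs the scans one after another, so it does not model the race between truly concurrent share-lock holders that the commit message says remains possible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_ITEMS 8

/* Hypothetical stand-in for an index page: one "dead" flag per item, plus a
 * counter of how often a scan reported that it modified the page. */
typedef struct ToyPage {
    bool dead[TOY_PAGE_ITEMS];
    int  hint_dirty_count;
} ToyPage;

/* Pre-patch shape: mark the item dead unconditionally and always report a
 * change, even if the item was already dead. */
static bool toy_kill_always(ToyPage *page, size_t idx)
{
    assert(idx < TOY_PAGE_ITEMS);
    page->dead[idx] = true;
    return true;
}

/* Post-patch shape: report a change only when the flag actually flips. */
static bool toy_kill_if_alive(ToyPage *page, size_t idx)
{
    assert(idx < TOY_PAGE_ITEMS);
    if (page->dead[idx])
        return false;           /* already dead: nothing to do */
    page->dead[idx] = true;
    return true;
}

typedef bool (*ToyKillFn)(ToyPage *page, size_t idx);

/* Let nscans successive scans try to kill the same item on a fresh page and
 * return how many of them would have flagged the page as modified. */
static int toy_count_dirtying_scans(ToyKillFn kill, int nscans, size_t idx)
{
    ToyPage page = {0};

    for (int i = 0; i < nscans; i++) {
        bool killedsomething = kill(&page, idx);

        if (killedsomething)
            page.hint_dirty_count++;
    }
    return page.hint_dirty_count;
}
```

In this model, five scans that all try to kill item 3 flag the page five times with `toy_kill_always` but only once with `toy_kill_if_alive`, which is the reduction in redundant page writes the patch is after.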
