Skip to content

Commit 1383e2a

Browse files
committed
Improve FSM management for BRIN indexes.
BRIN indexes like to propagate additions of free space into the upper pages of their free space maps as soon as the new space is known, even when it's just on one individual index page. Previously this required calling FreeSpaceMapVacuum, which is quite an expensive thing if the map is large. Use the FreeSpaceMapVacuumRange function recently added by commit c79f6df to reduce the amount of work done for this purpose. Fix a couple of places that neglected to do the upper-page vacuuming at all after recording new free space. If the policy is to be that BRIN should do that, it should do it everywhere. Do RecordPageWithFreeSpace unconditionally in brin_page_cleanup, and do FreeSpaceMapVacuum unconditionally in brin_vacuum_scan. Because of the FSM's imprecise storage of free space, the old complications here seldom bought anything, they just slowed things down. This approach also provides a predictable path for FSM corruption to be repaired. Remove premature RecordPageWithFreeSpace call in brin_getinsertbuffer where it's about to return an extended page to the caller. The caller should do that, instead, after it's inserted its new tuple. Fix the one caller that forgot to do so. Simplify logic in brin_doupdate's same-page-update case by postponing brin_initialize_empty_new_buffer to after the critical section; I see little point in doing it before. Avoid repeat calls of RelationGetNumberOfBlocks in brin_vacuum_scan. Avoid duplicate BufferGetBlockNumber and BufferGetPage calls in a couple of places where we already had the right values. Move a BRIN_elog debug logging call out of a critical section; that's pretty unsafe and I don't think it buys us anything to not wait till after the critical section. Move the "*extended = false" step in brin_getinsertbuffer into the routine's main loop. There's no actual bug there, since the loop can't iterate with *extended still true, but it doesn't seem very future-proof as coded; and it's certainly not documented as a loop invariant. This is all from follow-on investigation inspired by commit c79f6df. Discussion: https://postgr.es/m/5801.1522429460@sss.pgh.pa.us
1 parent 3de241d commit 1383e2a

File tree

3 files changed

+103
-83
lines changed

3 files changed

+103
-83
lines changed

src/backend/access/brin/brin.c

Lines changed: 18 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1121,16 +1121,22 @@ initialize_brin_buildstate(Relation idxRel, BrinRevmap *revmap,
11211121
static void
11221122
terminate_brin_buildstate(BrinBuildState *state)
11231123
{
1124-
/* release the last index buffer used */
1124+
/*
1125+
* Release the last index buffer used. We might as well ensure that
1126+
* whatever free space remains in that page is available in FSM, too.
1127+
*/
11251128
if (!BufferIsInvalid(state->bs_currentInsertBuf))
11261129
{
11271130
Page page;
1131+
Size freespace;
1132+
BlockNumber blk;
11281133

11291134
page = BufferGetPage(state->bs_currentInsertBuf);
1130-
RecordPageWithFreeSpace(state->bs_irel,
1131-
BufferGetBlockNumber(state->bs_currentInsertBuf),
1132-
PageGetFreeSpace(page));
1135+
freespace = PageGetFreeSpace(page);
1136+
blk = BufferGetBlockNumber(state->bs_currentInsertBuf);
11331137
ReleaseBuffer(state->bs_currentInsertBuf);
1138+
RecordPageWithFreeSpace(state->bs_irel, blk, freespace);
1139+
FreeSpaceMapVacuumRange(state->bs_irel, blk, blk + 1);
11341140
}
11351141

11361142
brin_free_desc(state->bs_bdesc);
@@ -1445,14 +1451,15 @@ union_tuples(BrinDesc *bdesc, BrinMemTuple *a, BrinTuple *b)
14451451
static void
14461452
brin_vacuum_scan(Relation idxrel, BufferAccessStrategy strategy)
14471453
{
1448-
bool vacuum_fsm = false;
1454+
BlockNumber nblocks;
14491455
BlockNumber blkno;
14501456

14511457
/*
14521458
* Scan the index in physical order, and clean up any possible mess in
14531459
* each page.
14541460
*/
1455-
for (blkno = 0; blkno < RelationGetNumberOfBlocks(idxrel); blkno++)
1461+
nblocks = RelationGetNumberOfBlocks(idxrel);
1462+
for (blkno = 0; blkno < nblocks; blkno++)
14561463
{
14571464
Buffer buf;
14581465

@@ -1461,15 +1468,15 @@ brin_vacuum_scan(Relation idxrel, BufferAccessStrategy strategy)
14611468
buf = ReadBufferExtended(idxrel, MAIN_FORKNUM, blkno,
14621469
RBM_NORMAL, strategy);
14631470

1464-
vacuum_fsm |= brin_page_cleanup(idxrel, buf);
1471+
brin_page_cleanup(idxrel, buf);
14651472

14661473
ReleaseBuffer(buf);
14671474
}
14681475

14691476
/*
1470-
* If we made any change to the FSM, make sure the new info is visible all
1471-
* the way to the top.
1477+
* Update all upper pages in the index's FSM, as well. This ensures not
1478+
* only that we propagate leaf-page FSM updates made by brin_page_cleanup,
1479+
* but also that any pre-existing damage or out-of-dateness is repaired.
14721480
*/
1473-
if (vacuum_fsm)
1474-
FreeSpaceMapVacuum(idxrel);
1481+
FreeSpaceMapVacuum(idxrel);
14751482
}

0 commit comments

Comments
 (0)