Commit 202b127

bufmgr: Improve stats when a buffer is read in concurrently
Previously we would have the following inaccuracies when a backend tried to
read in a buffer, but that buffer was read in concurrently by another
backend:

- the read IO was double-counted in the global buffer access stats
  (pgBufferUsage)
- the buffer hit was not accounted for in:
  - global buffer access statistics
  - pg_stat_io
  - relation level IO stats
  - vacuum cost balancing

While trying to read in a buffer that is concurrently read in by another
backend is not a common occurrence, it's also not that rare, e.g. due to
concurrent sequential scans on the same relation. This scenario has become
more likely in PG 17, due to the introduction of read streams, which can pin
multiple buffers before calling StartBufferIO() for all the buffers.

This behaviour has historically grown, but there doesn't seem to be any
reason to continue with the wrong accounting.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_Zk-B08AzPsO-6680LUHLOCGaNJYofaxTFseLa=OepV1g@mail.gmail.com
1 parent 1260459 commit 202b127

1 file changed: +22 -15 lines changed

src/backend/storage/buffer/bufmgr.c

Lines changed: 22 additions & 15 deletions
@@ -1436,19 +1436,6 @@ WaitReadBuffers(ReadBuffersOperation *operation)
 		io_object = IOOBJECT_RELATION;
 	}
 
-	/*
-	 * We count all these blocks as read by this backend. This is traditional
-	 * behavior, but might turn out to be not true if we find that someone
-	 * else has beaten us and completed the read of some of these blocks. In
-	 * that case the system globally double-counts, but we traditionally don't
-	 * count this as a "hit", and we don't have a separate counter for "miss,
-	 * but another backend completed the read".
-	 */
-	if (persistence == RELPERSISTENCE_TEMP)
-		pgBufferUsage.local_blks_read += nblocks;
-	else
-		pgBufferUsage.shared_blks_read += nblocks;
-
 	for (int i = 0; i < nblocks; ++i)
 	{
 		int io_buffers_len;
@@ -1466,15 +1453,30 @@ WaitReadBuffers(ReadBuffersOperation *operation)
 		if (!WaitReadBuffersCanStartIO(buffers[i], false))
 		{
 			/*
-			 * Report this as a 'hit' for this backend, even though it must
-			 * have started out as a miss in PinBufferForBlock().
+			 * Report and track this as a 'hit' for this backend, even though
+			 * it must have started out as a miss in PinBufferForBlock(). The
+			 * other backend will track this as a 'read'.
 			 */
 			TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, blocknum + i,
 											  operation->smgr->smgr_rlocator.locator.spcOid,
 											  operation->smgr->smgr_rlocator.locator.dbOid,
 											  operation->smgr->smgr_rlocator.locator.relNumber,
 											  operation->smgr->smgr_rlocator.backend,
 											  true);
+
+			if (persistence == RELPERSISTENCE_TEMP)
+				pgBufferUsage.local_blks_hit += 1;
+			else
+				pgBufferUsage.shared_blks_hit += 1;
+
+			if (operation->rel)
+				pgstat_count_buffer_hit(operation->rel);
+
+			pgstat_count_io_op(io_object, io_context, IOOP_HIT, 1, 0);
+
+			if (VacuumCostActive)
+				VacuumCostBalance += VacuumCostPageHit;
+
 			continue;
 		}
 
@@ -1560,6 +1562,11 @@ WaitReadBuffers(ReadBuffersOperation *operation)
 											  false);
 		}
 
+		if (persistence == RELPERSISTENCE_TEMP)
+			pgBufferUsage.local_blks_read += io_buffers_len;
+		else
+			pgBufferUsage.shared_blks_read += io_buffers_len;
+
 		if (VacuumCostActive)
 			VacuumCostBalance += VacuumCostPageMiss * io_buffers_len;
 	}
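
The effect of moving the counter updates can be illustrated with a small standalone model. This is only a sketch and not PostgreSQL code: UsageModel, account_old(), account_new() and the two cost constants are hypothetical names and values invented for the illustration, and each block is treated as its own IO, whereas the real code adds io_buffers_len per combined read. The read_by_other[] flags stand in for WaitReadBuffersCanStartIO() returning false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the counters touched by the commit. */
typedef struct UsageModel
{
	long		blks_read;				/* models pgBufferUsage.*_blks_read */
	long		blks_hit;				/* models pgBufferUsage.*_blks_hit */
	long		vacuum_cost_balance;	/* models VacuumCostBalance */
} UsageModel;

/* Illustrative cost values, assumed for this sketch only. */
enum
{
	COST_PAGE_HIT = 1,
	COST_PAGE_MISS = 2
};

/*
 * Accounting before the commit: all blocks are counted as reads up front,
 * and a block completed by another backend records no hit and no cost.
 */
static void
account_old(UsageModel *u, const bool *read_by_other, int nblocks)
{
	assert(nblocks >= 0);

	u->blks_read += nblocks;

	for (int i = 0; i < nblocks; i++)
	{
		if (read_by_other[i])
			continue;			/* other backend also counted this read */

		u->vacuum_cost_balance += COST_PAGE_MISS;
	}
}

/*
 * Accounting after the commit: a block completed by another backend is a
 * hit for this backend; only blocks this backend reads count as reads.
 */
static void
account_new(UsageModel *u, const bool *read_by_other, int nblocks)
{
	assert(nblocks >= 0);

	for (int i = 0; i < nblocks; i++)
	{
		if (read_by_other[i])
		{
			u->blks_hit += 1;
			u->vacuum_cost_balance += COST_PAGE_HIT;
			continue;
		}

		u->blks_read += 1;
		u->vacuum_cost_balance += COST_PAGE_MISS;
	}
}
```

With four blocks of which two were completed by another backend, the old model reports 4 reads and 0 hits, while the new model reports 2 reads and 2 hits, so reads summed across both backends match the IOs actually performed.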
