Commit 8191e0c
Fix corruption of pgstats shared hashtable due to OOM failures
A new pgstats entry is created as a two-step process:

- The entry is looked up in the shared hashtable of pgstats, and is
  inserted if not found.
- When not found and inserted, its fields are then initialized. This
  part includes a DSA chunk allocation for the stats data of the new
  entry.

As currently coded, if the DSA chunk allocation fails due to an
out-of-memory failure, an ERROR is generated, leaving an inconsistent
entry in the pgstats shared hashtable due to the first step, as the
entry has already been inserted in the hashtable. These broken entries
can then be found by other backends, crashing them.

There are only two callers of pgstat_init_entry(): when loading the
pgstats file at startup and when creating a new pgstats entry. This
commit changes pgstat_init_entry() so that it uses
dsa_allocate_extended() with DSA_ALLOC_NO_OOM, making it return NULL on
allocation failure instead of failing. This way, a backend failing an
entry creation can take appropriate cleanup actions in the shared
hashtable before throwing an error. Currently, this means removing the
entry from the shared hashtable before throwing the error for the
allocation failure.

Out-of-memory errors rarely happen in the wild, and we usually do not
bother with back-patches when they are fixed. However, the problem
dealt with here is a degree worse, as it breaks the shared memory state
of pgstats, impacting other processes that may look at an inconsistent
entry that a different process has failed to create.

Author: Mikhail Kot <mikhail.kot@databricks.com>
Discussion: https://postgr.es/m/CAAi9E7jELo5_-sBENftnc2E8XhW2PKZJWfTC3i2y-GMQd2bcqQ@mail.gmail.com
Backpatch-through: 15
1 parent 1f7e9ba commit 8191e0c

File tree

2 files changed: +38 -1 lines changed

src/backend/utils/activity/pgstat.c (11 additions & 0 deletions)

```diff
@@ -1975,6 +1975,17 @@ pgstat_read_statsfile(void)
 
 			header = pgstat_init_entry(key.kind, p);
 			dshash_release_lock(pgStatLocal.shared_hash, p);
+			if (header == NULL)
+			{
+				/*
+				 * It would be tempting to switch this ERROR to a
+				 * WARNING, but it would mean that all the statistics
+				 * are discarded when the environment fails on OOM.
+				 */
+				elog(ERROR, "could not allocate entry %u/%u/%" PRIu64 " of type %c",
+					 key.kind, key.dboid,
+					 key.objid, t);
+			}
 
 			if (!read_chunk(fpin,
 							pgstat_get_entry_data(key.kind, header),
```

src/backend/utils/activity/pgstat_shmem.c (27 additions & 1 deletion)

```diff
@@ -289,6 +289,13 @@ pgstat_detach_shmem(void)
  * ------------------------------------------------------------
  */
 
+/*
+ * Initialize entry newly-created.
+ *
+ * Returns NULL in the event of an allocation failure, so as callers can
+ * take cleanup actions as the entry initialized is already inserted in the
+ * shared hashtable.
+ */
 PgStatShared_Common *
 pgstat_init_entry(PgStat_Kind kind,
 				  PgStatShared_HashEntry *shhashent)
@@ -311,7 +318,12 @@ pgstat_init_entry(PgStat_Kind kind,
 	pg_atomic_init_u32(&shhashent->generation, 0);
 	shhashent->dropped = false;
 
-	chunk = dsa_allocate0(pgStatLocal.dsa, pgstat_get_kind_info(kind)->shared_size);
+	chunk = dsa_allocate_extended(pgStatLocal.dsa,
+								  pgstat_get_kind_info(kind)->shared_size,
+								  DSA_ALLOC_ZERO | DSA_ALLOC_NO_OOM);
+	if (chunk == InvalidDsaPointer)
+		return NULL;
+
 	shheader = dsa_get_address(pgStatLocal.dsa, chunk);
 	shheader->magic = 0xdeadbeef;
 
@@ -509,6 +521,20 @@ pgstat_get_entry_ref(PgStat_Kind kind, Oid dboid, uint64 objid, bool create,
 		if (!shfound)
 		{
 			shheader = pgstat_init_entry(kind, shhashent);
+			if (shheader == NULL)
+			{
+				/*
+				 * Failed the allocation of a new entry, so clean up the
+				 * shared hashtable before giving up.
+				 */
+				dshash_delete_entry(pgStatLocal.shared_hash, shhashent);
+
+				ereport(ERROR,
+						(errcode(ERRCODE_OUT_OF_MEMORY),
+						 errmsg("out of memory"),
+						 errdetail("Failed while allocating entry %u/%u/%" PRIu64 ".",
+								   key.kind, key.dboid, key.objid)));
+			}
 			pgstat_acquire_entry_ref(entry_ref, shhashent, shheader);
 
 			if (created_entry != NULL)
```
