Skip to content

Commit 305db95

Browse files
committed
Fix creation of partition descriptor during concurrent detach+drop
If a partition undergoes DETACH CONCURRENTLY immediately followed by DROP, this could cause a problem for a concurrent transaction recomputing the partition descriptor when running a prepared statement, because it tries to dereference a pointer to a tuple that's not found in a catalog scan. The existing retry logic added in commit dbca346 is sufficient to cope with the overall problem, provided we don't try to dereference a non-existant heap tuple. Arguably, the code in RelationBuildPartitionDesc() has been wrong all along, since no check was added in commit 898e5e3 against receiving a NULL tuple from the catalog scan; that bug has only become user-visible with DETACH CONCURRENTLY which was added in branch 14. Therefore, even though there's no known mechanism to cause a crash because of this, backpatch the addition of such a check to all supported branches. In branches prior to 14, this would cause the code to fail with a "missing relpartbound for relation XYZ" error instead of crashing; that's okay, because there are no reports of such behavior anyway. Author: Kuntal Ghosh <kuntalghosh.2007@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/18559-b48286d2eacd9a4e@postgresql.org
1 parent 16e67bc commit 305db95

File tree

1 file changed

+22
-8
lines changed

1 file changed

+22
-8
lines changed

src/backend/partitioning/partdesc.c

Lines changed: 22 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -210,6 +210,10 @@ RelationBuildPartitionDesc(Relation rel, bool omit_detached)
210210
* shared queue. We solve this problem by reading pg_class directly
211211
* for the desired tuple.
212212
*
213+
* If the partition recently detached is also dropped, we get no tuple
214+
* from the scan. In that case, we also retry, and next time through
215+
* here, we don't see that partition anymore.
216+
*
213217
* The other problem is that DETACH CONCURRENTLY is in the process of
214218
* removing a partition, which happens in two steps: first it marks it
215219
* as "detach pending", commits, then unsets relpartbound. If
@@ -224,8 +228,6 @@ RelationBuildPartitionDesc(Relation rel, bool omit_detached)
224228
Relation pg_class;
225229
SysScanDesc scan;
226230
ScanKeyData key[1];
227-
Datum datum;
228-
bool isnull;
229231

230232
pg_class = table_open(RelationRelationId, AccessShareLock);
231233
ScanKeyInit(&key[0],
@@ -234,17 +236,29 @@ RelationBuildPartitionDesc(Relation rel, bool omit_detached)
234236
ObjectIdGetDatum(inhrelid));
235237
scan = systable_beginscan(pg_class, ClassOidIndexId, true,
236238
NULL, 1, key);
239+
240+
/*
241+
* We could get one tuple from the scan (the normal case), or zero
242+
* tuples if the table has been dropped meanwhile.
243+
*/
237244
tuple = systable_getnext(scan);
238-
datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
239-
RelationGetDescr(pg_class), &isnull);
240-
if (!isnull)
241-
boundspec = stringToNode(TextDatumGetCString(datum));
245+
if (HeapTupleIsValid(tuple))
246+
{
247+
Datum datum;
248+
bool isnull;
249+
250+
datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
251+
RelationGetDescr(pg_class), &isnull);
252+
if (!isnull)
253+
boundspec = stringToNode(TextDatumGetCString(datum));
254+
}
242255
systable_endscan(scan);
243256
table_close(pg_class, AccessShareLock);
244257

245258
/*
246-
* If we still don't get a relpartbound value, then it must be
247-
* because of DETACH CONCURRENTLY. Restart from the top, as
259+
* If we still don't get a relpartbound value (either because
260+
* boundspec is null or because there was no tuple), then it must
261+
* be because of DETACH CONCURRENTLY. Restart from the top, as
248262
* explained above. We only do this once, for two reasons: first,
249263
* only one DETACH CONCURRENTLY session could affect us at a time,
250264
* since each of them would have to wait for the snapshot under

0 commit comments

Comments
 (0)