Skip to content

Commit c899c68

Browse files
committed
Fix creation of partition descriptor during concurrent detach+drop
If a partition undergoes DETACH CONCURRENTLY immediately followed by DROP, this could cause a problem for a concurrent transaction recomputing the partition descriptor when running a prepared statement, because it tries to dereference a pointer to a tuple that's not found in a catalog scan. The existing retry logic added in commit dbca346 is sufficient to cope with the overall problem, provided we don't try to dereference a non-existant heap tuple. Arguably, the code in RelationBuildPartitionDesc() has been wrong all along, since no check was added in commit 898e5e3 against receiving a NULL tuple from the catalog scan; that bug has only become user-visible with DETACH CONCURRENTLY which was added in branch 14. Therefore, even though there's no known mechanism to cause a crash because of this, backpatch the addition of such a check to all supported branches. In branches prior to 14, this would cause the code to fail with a "missing relpartbound for relation XYZ" error instead of crashing; that's okay, because there are no reports of such behavior anyway. Author: Kuntal Ghosh <kuntalghosh.2007@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/18559-b48286d2eacd9a4e@postgresql.org
1 parent a459ac5 commit c899c68

File tree

1 file changed

+22
-8
lines changed

1 file changed

+22
-8
lines changed

src/backend/partitioning/partdesc.c

Lines changed: 22 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -209,6 +209,10 @@ RelationBuildPartitionDesc(Relation rel, bool omit_detached)
209209
* shared queue. We solve this problem by reading pg_class directly
210210
* for the desired tuple.
211211
*
212+
* If the partition recently detached is also dropped, we get no tuple
213+
* from the scan. In that case, we also retry, and next time through
214+
* here, we don't see that partition anymore.
215+
*
212216
* The other problem is that DETACH CONCURRENTLY is in the process of
213217
* removing a partition, which happens in two steps: first it marks it
214218
* as "detach pending", commits, then unsets relpartbound. If
@@ -223,8 +227,6 @@ RelationBuildPartitionDesc(Relation rel, bool omit_detached)
223227
Relation pg_class;
224228
SysScanDesc scan;
225229
ScanKeyData key[1];
226-
Datum datum;
227-
bool isnull;
228230

229231
pg_class = table_open(RelationRelationId, AccessShareLock);
230232
ScanKeyInit(&key[0],
@@ -233,17 +235,29 @@ RelationBuildPartitionDesc(Relation rel, bool omit_detached)
233235
ObjectIdGetDatum(inhrelid));
234236
scan = systable_beginscan(pg_class, ClassOidIndexId, true,
235237
NULL, 1, key);
238+
239+
/*
240+
* We could get one tuple from the scan (the normal case), or zero
241+
* tuples if the table has been dropped meanwhile.
242+
*/
236243
tuple = systable_getnext(scan);
237-
datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
238-
RelationGetDescr(pg_class), &isnull);
239-
if (!isnull)
240-
boundspec = stringToNode(TextDatumGetCString(datum));
244+
if (HeapTupleIsValid(tuple))
245+
{
246+
Datum datum;
247+
bool isnull;
248+
249+
datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
250+
RelationGetDescr(pg_class), &isnull);
251+
if (!isnull)
252+
boundspec = stringToNode(TextDatumGetCString(datum));
253+
}
241254
systable_endscan(scan);
242255
table_close(pg_class, AccessShareLock);
243256

244257
/*
245-
* If we still don't get a relpartbound value, then it must be
246-
* because of DETACH CONCURRENTLY. Restart from the top, as
258+
* If we still don't get a relpartbound value (either because
259+
* boundspec is null or because there was no tuple), then it must
260+
* be because of DETACH CONCURRENTLY. Restart from the top, as
247261
* explained above. We only do this once, for two reasons: first,
248262
* only one DETACH CONCURRENTLY session could affect us at a time,
249263
* since each of them would have to wait for the snapshot under

0 commit comments

Comments
 (0)