Skip to content

Commit ef94805

Browse files
committed
Restore the portal-level snapshot after procedure COMMIT/ROLLBACK.
COMMIT/ROLLBACK necessarily destroys all snapshots within the session. The original implementation of intra-procedure transactions just cavalierly did that, ignoring the fact that this left us executing in a rather different environment than normal. In particular, it turns out that handling of toasted datums depends rather critically on there being an outer ActiveSnapshot: otherwise, when SPI or the core executor pop whatever snapshot they used and return, it's unsafe to dereference any toasted datums that may appear in the query result. It's possible to demonstrate "no known snapshots" and "missing chunk number N for toast value" errors as a result of this oversight. Historically this outer snapshot has been held by the Portal code, and that seems like a good plan to preserve. So add infrastructure to pquery.c to allow re-establishing the Portal-owned snapshot if it's not there anymore, and add enough bookkeeping support that we can tell whether it is or not. We can't, however, just re-establish the Portal snapshot as part of COMMIT/ROLLBACK. As in normal transaction start, acquiring the first snapshot should wait until after SET and LOCK commands. Hence, teach spi.c about doing this at the right time. (Note that this patch doesn't fix the problem for any PLs that try to run intra-procedure transactions without using SPI to execute SQL commands.) This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD, rename that to allow_nonatomic. replication/logical/worker.c also needs some fixes, because it wasn't careful to hold a snapshot open around AFTER trigger execution. That code doesn't use a Portal, which I suspect someday we're gonna have to fix. But for now, just rearrange the order of operations. This includes back-patching the recent addition of finish_estate() to centralize the cleanup logic there. This also back-patches commit 2ecfeda into v13, to improve the test coverage for worker.c (it was that test that exposed that worker.c's snapshot management is wrong). Per bug #15990 from Andreas Wicht. Back-patch to v11 where intra-procedure COMMIT was added. Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
1 parent 71787b2 commit ef94805

File tree

11 files changed

+321
-110
lines changed

11 files changed

+321
-110
lines changed

src/backend/commands/functioncmds.c

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -61,6 +61,7 @@
6161
#include "parser/parse_func.h"
6262
#include "parser/parse_type.h"
6363
#include "pgstat.h"
64+
#include "tcop/pquery.h"
6465
#include "utils/acl.h"
6566
#include "utils/builtins.h"
6667
#include "utils/fmgroids.h"
@@ -2342,6 +2343,20 @@ ExecuteCallStmt(CallStmt *stmt, ParamListInfo params, bool atomic, DestReceiver
23422343
if (fcinfo.isnull)
23432344
elog(ERROR, "procedure returned null record");
23442345

2346+
/*
2347+
* Ensure there's an active snapshot whilst we execute whatever's
2348+
* involved here. Note that this is *not* sufficient to make the
2349+
* world safe for TOAST pointers to be included in the returned data:
2350+
* the referenced data could have gone away while we didn't hold a
2351+
* snapshot. Hence, it's incumbent on PLs that can do COMMIT/ROLLBACK
2352+
* to not return TOAST pointers, unless those pointers were fetched
2353+
* after the last COMMIT/ROLLBACK in the procedure.
2354+
*
2355+
* XXX that is a really nasty, hard-to-test requirement. Is there a
2356+
* way to remove it?
2357+
*/
2358+
EnsurePortalSnapshotExists();
2359+
23452360
td = DatumGetHeapTupleHeader(retval);
23462361
tupType = HeapTupleHeaderGetTypeId(td);
23472362
tupTypmod = HeapTupleHeaderGetTypMod(td);

src/backend/executor/spi.c

Lines changed: 49 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -256,12 +256,8 @@ SPI_commit(void)
256256
/* Start the actual commit */
257257
_SPI_current->internal_xact = true;
258258

259-
/*
260-
* Before committing, pop all active snapshots to avoid error about
261-
* "snapshot %p still active".
262-
*/
263-
while (ActiveSnapshotSet())
264-
PopActiveSnapshot();
259+
/* Release snapshots associated with portals */
260+
ForgetPortalSnapshots();
265261

266262
CommitTransactionCommand();
267263
MemoryContextSwitchTo(oldcontext);
@@ -296,6 +292,9 @@ SPI_rollback(void)
296292
/* Start the actual rollback */
297293
_SPI_current->internal_xact = true;
298294

295+
/* Release snapshots associated with portals */
296+
ForgetPortalSnapshots();
297+
299298
AbortCurrentTransaction();
300299
MemoryContextSwitchTo(oldcontext);
301300

@@ -2071,6 +2070,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
20712070
Oid my_lastoid = InvalidOid;
20722071
SPITupleTable *my_tuptable = NULL;
20732072
int res = 0;
2073+
bool allow_nonatomic = plan->no_snapshots; /* legacy API name */
20742074
bool pushed_active_snap = false;
20752075
ErrorContextCallback spierrcontext;
20762076
CachedPlan *cplan = NULL;
@@ -2103,11 +2103,12 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
21032103
* In the first two cases, we can just push the snap onto the stack once
21042104
* for the whole plan list.
21052105
*
2106-
* But if the plan has no_snapshots set to true, then don't manage
2107-
* snapshots at all. The caller should then take care of that.
2106+
* Note that snapshot != InvalidSnapshot implies an atomic execution
2107+
* context.
21082108
*/
2109-
if (snapshot != InvalidSnapshot && !plan->no_snapshots)
2109+
if (snapshot != InvalidSnapshot)
21102110
{
2111+
Assert(!allow_nonatomic);
21112112
if (read_only)
21122113
{
21132114
PushActiveSnapshot(snapshot);
@@ -2182,15 +2183,39 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
21822183
stmt_list = cplan->stmt_list;
21832184

21842185
/*
2185-
* In the default non-read-only case, get a new snapshot, replacing
2186-
* any that we pushed in a previous cycle.
2186+
* If we weren't given a specific snapshot to use, and the statement
2187+
* list requires a snapshot, set that up.
21872188
*/
2188-
if (snapshot == InvalidSnapshot && !read_only && !plan->no_snapshots)
2189+
if (snapshot == InvalidSnapshot &&
2190+
(list_length(stmt_list) > 1 ||
2191+
(list_length(stmt_list) == 1 &&
2192+
PlannedStmtRequiresSnapshot(linitial_node(PlannedStmt,
2193+
stmt_list)))))
21892194
{
2190-
if (pushed_active_snap)
2191-
PopActiveSnapshot();
2192-
PushActiveSnapshot(GetTransactionSnapshot());
2193-
pushed_active_snap = true;
2195+
/*
2196+
* First, ensure there's a Portal-level snapshot. This back-fills
2197+
* the snapshot stack in case the previous operation was a COMMIT
2198+
* or ROLLBACK inside a procedure or DO block. (We can't put back
2199+
* the Portal snapshot any sooner, or we'd break cases like doing
2200+
* SET or LOCK just after COMMIT.) It's enough to check once per
2201+
* statement list, since COMMIT/ROLLBACK/CALL/DO can't appear
2202+
* within a multi-statement list.
2203+
*/
2204+
EnsurePortalSnapshotExists();
2205+
2206+
/*
2207+
* In the default non-read-only case, get a new per-statement-list
2208+
* snapshot, replacing any that we pushed in a previous cycle.
2209+
* Skip it when doing non-atomic execution, though (we rely
2210+
* entirely on the Portal snapshot in that case).
2211+
*/
2212+
if (!read_only && !allow_nonatomic)
2213+
{
2214+
if (pushed_active_snap)
2215+
PopActiveSnapshot();
2216+
PushActiveSnapshot(GetTransactionSnapshot());
2217+
pushed_active_snap = true;
2218+
}
21942219
}
21952220

21962221
foreach(lc2, stmt_list)
@@ -2203,6 +2228,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
22032228
_SPI_current->lastoid = InvalidOid;
22042229
_SPI_current->tuptable = NULL;
22052230

2231+
/* Check for unsupported cases. */
22062232
if (stmt->utilityStmt)
22072233
{
22082234
if (IsA(stmt->utilityStmt, CopyStmt))
@@ -2234,9 +2260,10 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
22342260

22352261
/*
22362262
* If not read-only mode, advance the command counter before each
2237-
* command and update the snapshot.
2263+
* command and update the snapshot. (But skip it if the snapshot
2264+
* isn't under our control.)
22382265
*/
2239-
if (!read_only && !plan->no_snapshots)
2266+
if (!read_only && pushed_active_snap)
22402267
{
22412268
CommandCounterIncrement();
22422269
UpdateActiveSnapshotCommandId();
@@ -2270,13 +2297,11 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
22702297
ProcessUtilityContext context;
22712298

22722299
/*
2273-
* If the SPI context is atomic, or we are asked to manage
2274-
* snapshots, then we are in an atomic execution context.
2275-
* Conversely, to propagate a nonatomic execution context, the
2276-
* caller must be in a nonatomic SPI context and manage
2277-
* snapshots itself.
2300+
* If the SPI context is atomic, or we were not told to allow
2301+
* nonatomic operations, tell ProcessUtility this is an atomic
2302+
* execution context.
22782303
*/
2279-
if (_SPI_current->atomic || !plan->no_snapshots)
2304+
if (_SPI_current->atomic || !allow_nonatomic)
22802305
context = PROCESS_UTILITY_QUERY;
22812306
else
22822307
context = PROCESS_UTILITY_QUERY_NONATOMIC;

src/backend/replication/logical/worker.c

Lines changed: 29 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -194,6 +194,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
194194
ResultRelInfo *resultRelInfo;
195195
RangeTblEntry *rte;
196196

197+
/*
198+
* Input functions may need an active snapshot, as may AFTER triggers
199+
* invoked during finish_estate. For safety, ensure an active snapshot
200+
* exists throughout all our usage of the executor.
201+
*/
202+
PushActiveSnapshot(GetTransactionSnapshot());
203+
197204
estate = CreateExecutorState();
198205

199206
rte = makeNode(RangeTblEntry);
@@ -221,6 +228,22 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
221228
return estate;
222229
}
223230

231+
/*
232+
* Finish any operations related to the executor state created by
233+
* create_estate_for_relation().
234+
*/
235+
static void
236+
finish_estate(EState *estate)
237+
{
238+
/* Handle any queued AFTER triggers. */
239+
AfterTriggerEndQuery(estate);
240+
241+
/* Cleanup. */
242+
ExecResetTupleTable(estate->es_tupleTable, false);
243+
FreeExecutorState(estate);
244+
PopActiveSnapshot();
245+
}
246+
224247
/*
225248
* Executes default values for columns for which we can't map to remote
226249
* relation columns.
@@ -627,9 +650,6 @@ apply_handle_insert(StringInfo s)
627650
remoteslot = ExecInitExtraTupleSlot(estate,
628651
RelationGetDescr(rel->localrel));
629652

630-
/* Input functions may need an active snapshot, so get one */
631-
PushActiveSnapshot(GetTransactionSnapshot());
632-
633653
/* Process and store remote tuple in the slot */
634654
oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
635655
slot_store_cstrings(remoteslot, rel, newtup.values);
@@ -643,13 +663,8 @@ apply_handle_insert(StringInfo s)
643663

644664
/* Cleanup. */
645665
ExecCloseIndices(estate->es_result_relation_info);
646-
PopActiveSnapshot();
647666

648-
/* Handle queued AFTER triggers. */
649-
AfterTriggerEndQuery(estate);
650-
651-
ExecResetTupleTable(estate->es_tupleTable, false);
652-
FreeExecutorState(estate);
667+
finish_estate(estate);
653668

654669
logicalrep_rel_close(rel, NoLock);
655670

@@ -760,7 +775,6 @@ apply_handle_update(StringInfo s)
760775
}
761776
}
762777

763-
PushActiveSnapshot(GetTransactionSnapshot());
764778
ExecOpenIndices(estate->es_result_relation_info, false);
765779

766780
/* Build the search tuple. */
@@ -819,15 +833,10 @@ apply_handle_update(StringInfo s)
819833
}
820834

821835
/* Cleanup. */
836+
EvalPlanQualEnd(&epqstate);
822837
ExecCloseIndices(estate->es_result_relation_info);
823-
PopActiveSnapshot();
824838

825-
/* Handle queued AFTER triggers. */
826-
AfterTriggerEndQuery(estate);
827-
828-
EvalPlanQualEnd(&epqstate);
829-
ExecResetTupleTable(estate->es_tupleTable, false);
830-
FreeExecutorState(estate);
839+
finish_estate(estate);
831840

832841
logicalrep_rel_close(rel, NoLock);
833842

@@ -878,7 +887,6 @@ apply_handle_delete(StringInfo s)
878887
RelationGetDescr(rel->localrel));
879888
EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
880889

881-
PushActiveSnapshot(GetTransactionSnapshot());
882890
ExecOpenIndices(estate->es_result_relation_info, false);
883891

884892
/* Find the tuple using the replica identity index. */
@@ -919,15 +927,10 @@ apply_handle_delete(StringInfo s)
919927
}
920928

921929
/* Cleanup. */
930+
EvalPlanQualEnd(&epqstate);
922931
ExecCloseIndices(estate->es_result_relation_info);
923-
PopActiveSnapshot();
924932

925-
/* Handle queued AFTER triggers. */
926-
AfterTriggerEndQuery(estate);
927-
928-
EvalPlanQualEnd(&epqstate);
929-
ExecResetTupleTable(estate->es_tupleTable, false);
930-
FreeExecutorState(estate);
933+
finish_estate(estate);
931934

932935
logicalrep_rel_close(rel, NoLock);
933936

@@ -950,7 +953,7 @@ apply_handle_truncate(StringInfo s)
950953
List *relids = NIL;
951954
List *relids_logged = NIL;
952955
ListCell *lc;
953-
LOCKMODE lockmode = AccessExclusiveLock;
956+
LOCKMODE lockmode = AccessExclusiveLock;
954957

955958
ensure_transaction();
956959

0 commit comments

Comments
 (0)