Skip to content

Commit 05c8482

Browse files
author
Amit Kapila
committed
Enable parallel SELECT for "INSERT INTO ... SELECT ...".
Parallel SELECT can't be utilized for INSERT in the following cases: - INSERT statement uses the ON CONFLICT DO UPDATE clause - Target table has a parallel-unsafe: trigger, index expression or predicate, column default expression or check constraint - Target table has a parallel-unsafe domain constraint on any column - Target table is a partitioned table with a parallel-unsafe partition key expression or support function The planner is updated to perform additional parallel-safety checks for the cases listed above, for determining whether it is safe to run INSERT in parallel-mode with an underlying parallel SELECT. The planner will consider using parallel SELECT for "INSERT INTO ... SELECT ...", provided nothing unsafe is found from the additional parallel-safety checks, or from the existing parallel-safety checks for SELECT. While checking parallel-safety, we need to check it for all the partitions on the table which can be costly especially when we decide not to use a parallel plan. So, in a separate patch, we will introduce a GUC and or a reloption to enable/disable parallelism for Insert statements. Prior to entering parallel-mode for the execution of INSERT with parallel SELECT, a TransactionId is acquired and assigned to the current transaction state. This is necessary to prevent the INSERT from attempting to assign the TransactionId whilst in parallel-mode, which is not allowed. This approach has a disadvantage in that if the underlying SELECT does not return any rows, then the TransactionId is not used, however that shouldn't happen in practice in many cases. Author: Greg Nancarrow, Amit Langote, Amit Kapila Reviewed-by: Amit Langote, Hou Zhijie, Takayuki Tsunakawa, Antonin Houska, Bharath Rupireddy, Dilip Kumar, Vignesh C, Zhihong Yu, Amit Kapila Tested-by: Tang, Haiying Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV=qpFJrR3AcrTS3g@mail.gmail.com Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
1 parent 0ba7110 commit 05c8482

File tree

17 files changed

+1531
-21
lines changed

17 files changed

+1531
-21
lines changed

doc/src/sgml/parallel.sgml

+3-1
Original file line numberDiff line numberDiff line change
@@ -146,7 +146,9 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
146146
a CTE, no parallel plans for that query will be generated. As an
147147
exception, the commands <literal>CREATE TABLE ... AS</literal>, <literal>SELECT
148148
INTO</literal>, and <literal>CREATE MATERIALIZED VIEW</literal> which create a new
149-
table and populate it can use a parallel plan.
149+
table and populate it can use a parallel plan. Another exception is the command
150+
<literal>INSERT INTO ... SELECT ...</literal> which can use a parallel plan for
151+
the underlying <literal>SELECT</literal> part of the query.
150152
</para>
151153
</listitem>
152154

src/backend/access/transam/xact.c

+26
Original file line numberDiff line numberDiff line change
@@ -1014,6 +1014,32 @@ IsInParallelMode(void)
10141014
return CurrentTransactionState->parallelModeLevel != 0;
10151015
}
10161016

1017+
/*
1018+
* PrepareParallelModePlanExec
1019+
*
1020+
* Prepare for entering parallel mode plan execution, based on command-type.
1021+
*/
1022+
void
1023+
PrepareParallelModePlanExec(CmdType commandType)
1024+
{
1025+
if (IsModifySupportedInParallelMode(commandType))
1026+
{
1027+
Assert(!IsInParallelMode());
1028+
1029+
/*
1030+
* Prepare for entering parallel mode by assigning a TransactionId.
1031+
* Failure to do this now would result in heap_insert() subsequently
1032+
* attempting to assign a TransactionId whilst in parallel-mode, which
1033+
* is not allowed.
1034+
*
1035+
* This approach has a disadvantage in that if the underlying SELECT
1036+
* does not return any rows, then the TransactionId is not used,
1037+
* however that shouldn't happen in practice in many cases.
1038+
*/
1039+
(void) GetCurrentTransactionId();
1040+
}
1041+
}
1042+
10171043
/*
10181044
* CommandCounterIncrement
10191045
*/

src/backend/executor/execMain.c

+3
Original file line numberDiff line numberDiff line change
@@ -1512,7 +1512,10 @@ ExecutePlan(EState *estate,
15121512

15131513
estate->es_use_parallel_mode = use_parallel_mode;
15141514
if (use_parallel_mode)
1515+
{
1516+
PrepareParallelModePlanExec(estate->es_plannedstmt->commandType);
15151517
EnterParallelMode();
1518+
}
15161519

15171520
/*
15181521
* Loop until we've processed the proper number of tuples from the plan.

src/backend/nodes/copyfuncs.c

+1
Original file line numberDiff line numberDiff line change
@@ -96,6 +96,7 @@ _copyPlannedStmt(const PlannedStmt *from)
9696
COPY_BITMAPSET_FIELD(rewindPlanIDs);
9797
COPY_NODE_FIELD(rowMarks);
9898
COPY_NODE_FIELD(relationOids);
99+
COPY_NODE_FIELD(partitionOids);
99100
COPY_NODE_FIELD(invalItems);
100101
COPY_NODE_FIELD(paramExecTypes);
101102
COPY_NODE_FIELD(utilityStmt);

src/backend/nodes/outfuncs.c

+2
Original file line numberDiff line numberDiff line change
@@ -314,6 +314,7 @@ _outPlannedStmt(StringInfo str, const PlannedStmt *node)
314314
WRITE_BITMAPSET_FIELD(rewindPlanIDs);
315315
WRITE_NODE_FIELD(rowMarks);
316316
WRITE_NODE_FIELD(relationOids);
317+
WRITE_NODE_FIELD(partitionOids);
317318
WRITE_NODE_FIELD(invalItems);
318319
WRITE_NODE_FIELD(paramExecTypes);
319320
WRITE_NODE_FIELD(utilityStmt);
@@ -2221,6 +2222,7 @@ _outPlannerGlobal(StringInfo str, const PlannerGlobal *node)
22212222
WRITE_NODE_FIELD(resultRelations);
22222223
WRITE_NODE_FIELD(appendRelations);
22232224
WRITE_NODE_FIELD(relationOids);
2225+
WRITE_NODE_FIELD(partitionOids);
22242226
WRITE_NODE_FIELD(invalItems);
22252227
WRITE_NODE_FIELD(paramExecTypes);
22262228
WRITE_UINT_FIELD(lastPHId);

src/backend/nodes/readfuncs.c

+1
Original file line numberDiff line numberDiff line change
@@ -1590,6 +1590,7 @@ _readPlannedStmt(void)
15901590
READ_BITMAPSET_FIELD(rewindPlanIDs);
15911591
READ_NODE_FIELD(rowMarks);
15921592
READ_NODE_FIELD(relationOids);
1593+
READ_NODE_FIELD(partitionOids);
15931594
READ_NODE_FIELD(invalItems);
15941595
READ_NODE_FIELD(paramExecTypes);
15951596
READ_NODE_FIELD(utilityStmt);

src/backend/optimizer/plan/planner.c

+26-11
Original file line numberDiff line numberDiff line change
@@ -305,6 +305,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
305305
glob->resultRelations = NIL;
306306
glob->appendRelations = NIL;
307307
glob->relationOids = NIL;
308+
glob->partitionOids = NIL;
308309
glob->invalItems = NIL;
309310
glob->paramExecTypes = NIL;
310311
glob->lastPHId = 0;
@@ -316,16 +317,16 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
316317
/*
317318
* Assess whether it's feasible to use parallel mode for this query. We
318319
* can't do this in a standalone backend, or if the command will try to
319-
* modify any data, or if this is a cursor operation, or if GUCs are set
320-
* to values that don't permit parallelism, or if parallel-unsafe
321-
* functions are present in the query tree.
320+
* modify any data (except for Insert), or if this is a cursor operation,
321+
* or if GUCs are set to values that don't permit parallelism, or if
322+
* parallel-unsafe functions are present in the query tree.
322323
*
323-
* (Note that we do allow CREATE TABLE AS, SELECT INTO, and CREATE
324-
* MATERIALIZED VIEW to use parallel plans, but as of now, only the leader
325-
* backend writes into a completely new table. In the future, we can
326-
* extend it to allow workers to write into the table. However, to allow
327-
* parallel updates and deletes, we have to solve other problems,
328-
* especially around combo CIDs.)
324+
* (Note that we do allow CREATE TABLE AS, INSERT INTO...SELECT, SELECT
325+
* INTO, and CREATE MATERIALIZED VIEW to use parallel plans. However, as
326+
* of now, only the leader backend writes into a completely new table. In
327+
* the future, we can extend it to allow workers to write into the table.
328+
* However, to allow parallel updates and deletes, we have to solve other
329+
* problems, especially around combo CIDs.)
329330
*
330331
* For now, we don't try to use parallel mode if we're running inside a
331332
* parallel worker. We might eventually be able to relax this
@@ -334,13 +335,14 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
334335
*/
335336
if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
336337
IsUnderPostmaster &&
337-
parse->commandType == CMD_SELECT &&
338+
(parse->commandType == CMD_SELECT ||
339+
is_parallel_allowed_for_modify(parse)) &&
338340
!parse->hasModifyingCTE &&
339341
max_parallel_workers_per_gather > 0 &&
340342
!IsParallelWorker())
341343
{
342344
/* all the cheap tests pass, so scan the query tree */
343-
glob->maxParallelHazard = max_parallel_hazard(parse);
345+
glob->maxParallelHazard = max_parallel_hazard(parse, glob);
344346
glob->parallelModeOK = (glob->maxParallelHazard != PROPARALLEL_UNSAFE);
345347
}
346348
else
@@ -521,6 +523,19 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
521523
result->rewindPlanIDs = glob->rewindPlanIDs;
522524
result->rowMarks = glob->finalrowmarks;
523525
result->relationOids = glob->relationOids;
526+
527+
/*
528+
* Register the Oids of parallel-safety-checked partitions as plan
529+
* dependencies. This is only really needed in the case of a parallel plan
530+
* so that if parallel-unsafe properties are subsequently defined on the
531+
* partitions, the cached parallel plan will be invalidated, and a
532+
* non-parallel plan will be generated.
533+
*
534+
* We also use this list to acquire locks on partitions before executing
535+
* cached plan. See AcquireExecutorLocks().
536+
*/
537+
if (glob->partitionOids != NIL && glob->parallelModeNeeded)
538+
result->partitionOids = glob->partitionOids;
524539
result->invalItems = glob->invalItems;
525540
result->paramExecTypes = glob->paramExecTypes;
526541
/* utilityStmt should be null, but we might as well copy it */

0 commit comments

Comments
 (0)