Skip to content

Commit 86dc900

Browse files
committed
Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes: 1. For UPDATE, the subplan of the ModifyTable node now only delivers the new values of the changed columns (i.e., the expressions computed in the query's SET clause) plus row identity information such as CTID. ModifyTable must re-fetch the original tuple to merge in the old values of any unchanged columns. The core advantage of this is that the changed columns are uniform across all tables of an inherited or partitioned target relation, whereas the other columns might not be. A secondary advantage, when the UPDATE involves joins, is that less data needs to pass through the plan tree. The disadvantage of course is an extra fetch of each tuple to be updated. However, that seems to be very nearly free in context; even worst-case tests don't show it to add more than a couple percent to the total query cost. At some point it might be interesting to combine the re-fetch with the tuple access that ModifyTable must do anyway to mark the old tuple dead; but that would require a good deal of refactoring and it seems it wouldn't buy all that much, so this patch doesn't attempt it. 2. For inherited UPDATE/DELETE, instead of generating a separate subplan for each target relation, we now generate a single subplan that is just exactly like a SELECT's plan, then stick ModifyTable on top of that. To let ModifyTable know which target relation a given incoming row refers to, a tableoid junk column is added to the row identity information. This gets rid of the horrid hack that was inheritance_planner(), eliminating O(N^2) planning cost and memory consumption in cases where there were many unprunable target relations. Point 2 of course requires point 1, so that there is a uniform definition of the non-junk columns to be returned by the subplan. We can't insist on uniform definition of the row identity junk columns however, if we want to keep the ability to have both plain and foreign tables in a partitioning hierarchy. Since it wouldn't scale very far to have every child table have its own row identity column, this patch includes provisions to merge similar row identity columns into one column of the subplan result. In particular, we can merge the whole-row Vars typically used as row identity by FDWs into one column by pretending they are type RECORD. (It's still okay for the actual composite Datums to be labeled with the table's rowtype OID, though.) There is more that can be done to file down residual inefficiencies in this patch, but it seems to be committable now. FDW authors should note several API changes: * The argument list for AddForeignUpdateTargets() has changed, and so has the method it must use for adding junk columns to the query. Call add_row_identity_var() instead of manipulating the parse tree directly. You might want to reconsider exactly what you're adding, too. * PlanDirectModify() must now work a little harder to find the ForeignScan plan node; if the foreign table is part of a partitioning hierarchy then the ForeignScan might not be the direct child of ModifyTable. See postgres_fdw for sample code. * To check whether a relation is a target relation, it's no longer sufficient to compare its relid to root->parse->resultRelation. Instead, check it against all_result_relids or leaf_result_relids, as appropriate. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
1 parent 055fee7 commit 86dc900

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

55 files changed

+2352
-2198
lines changed

contrib/postgres_fdw/deparse.c

+9-8
Original file line numberDiff line numberDiff line change
@@ -1275,7 +1275,7 @@ deparseLockingClause(deparse_expr_cxt *context)
12751275
* that DECLARE CURSOR ... FOR UPDATE is supported, which it isn't
12761276
* before 8.3.
12771277
*/
1278-
if (relid == root->parse->resultRelation &&
1278+
if (bms_is_member(relid, root->all_result_relids) &&
12791279
(root->parse->commandType == CMD_UPDATE ||
12801280
root->parse->commandType == CMD_DELETE))
12811281
{
@@ -1867,6 +1867,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
18671867
* 'foreignrel' is the RelOptInfo for the target relation or the join relation
18681868
* containing all base relations in the query
18691869
* 'targetlist' is the tlist of the underlying foreign-scan plan node
1870+
* (note that this only contains new-value expressions and junk attrs)
18701871
* 'targetAttrs' is the target columns of the UPDATE
18711872
* 'remote_conds' is the qual clauses that must be evaluated remotely
18721873
* '*params_list' is an output list of exprs that will become remote Params
@@ -1888,8 +1889,9 @@ deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
18881889
deparse_expr_cxt context;
18891890
int nestlevel;
18901891
bool first;
1891-
ListCell *lc;
18921892
RangeTblEntry *rte = planner_rt_fetch(rtindex, root);
1893+
ListCell *lc,
1894+
*lc2;
18931895

18941896
/* Set up context struct for recursion */
18951897
context.root = root;
@@ -1908,14 +1910,13 @@ deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
19081910
nestlevel = set_transmission_modes();
19091911

19101912
first = true;
1911-
foreach(lc, targetAttrs)
1913+
forboth(lc, targetlist, lc2, targetAttrs)
19121914
{
1913-
int attnum = lfirst_int(lc);
1914-
TargetEntry *tle = get_tle_by_resno(targetlist, attnum);
1915+
TargetEntry *tle = lfirst_node(TargetEntry, lc);
1916+
int attnum = lfirst_int(lc2);
19151917

1916-
if (!tle)
1917-
elog(ERROR, "attribute number %d not found in UPDATE targetlist",
1918-
attnum);
1918+
/* update's new-value expressions shouldn't be resjunk */
1919+
Assert(!tle->resjunk);
19191920

19201921
if (!first)
19211922
appendStringInfoString(buf, ", ");

contrib/postgres_fdw/expected/postgres_fdw.out

+193-230
Large diffs are not rendered by default.

contrib/postgres_fdw/postgres_fdw.c

+87-42
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,7 @@
2727
#include "miscadmin.h"
2828
#include "nodes/makefuncs.h"
2929
#include "nodes/nodeFuncs.h"
30+
#include "optimizer/appendinfo.h"
3031
#include "optimizer/clauses.h"
3132
#include "optimizer/cost.h"
3233
#include "optimizer/optimizer.h"
@@ -345,7 +346,8 @@ static void postgresBeginForeignScan(ForeignScanState *node, int eflags);
345346
static TupleTableSlot *postgresIterateForeignScan(ForeignScanState *node);
346347
static void postgresReScanForeignScan(ForeignScanState *node);
347348
static void postgresEndForeignScan(ForeignScanState *node);
348-
static void postgresAddForeignUpdateTargets(Query *parsetree,
349+
static void postgresAddForeignUpdateTargets(PlannerInfo *root,
350+
Index rtindex,
349351
RangeTblEntry *target_rte,
350352
Relation target_relation);
351353
static List *postgresPlanForeignModify(PlannerInfo *root,
@@ -1669,36 +1671,27 @@ postgresEndForeignScan(ForeignScanState *node)
16691671
* Add resjunk column(s) needed for update/delete on a foreign table
16701672
*/
16711673
static void
1672-
postgresAddForeignUpdateTargets(Query *parsetree,
1674+
postgresAddForeignUpdateTargets(PlannerInfo *root,
1675+
Index rtindex,
16731676
RangeTblEntry *target_rte,
16741677
Relation target_relation)
16751678
{
16761679
Var *var;
1677-
const char *attrname;
1678-
TargetEntry *tle;
16791680

16801681
/*
16811682
* In postgres_fdw, what we need is the ctid, same as for a regular table.
16821683
*/
16831684

16841685
/* Make a Var representing the desired value */
1685-
var = makeVar(parsetree->resultRelation,
1686+
var = makeVar(rtindex,
16861687
SelfItemPointerAttributeNumber,
16871688
TIDOID,
16881689
-1,
16891690
InvalidOid,
16901691
0);
16911692

1692-
/* Wrap it in a resjunk TLE with the right name ... */
1693-
attrname = "ctid";
1694-
1695-
tle = makeTargetEntry((Expr *) var,
1696-
list_length(parsetree->targetList) + 1,
1697-
pstrdup(attrname),
1698-
true);
1699-
1700-
/* ... and add it to the query's targetlist */
1701-
parsetree->targetList = lappend(parsetree->targetList, tle);
1693+
/* Register it as a row-identity column needed by this target rel */
1694+
add_row_identity_var(root, var, rtindex, "ctid");
17021695
}
17031696

17041697
/*
@@ -1886,7 +1879,7 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
18861879
rte,
18871880
resultRelInfo,
18881881
mtstate->operation,
1889-
mtstate->mt_plans[subplan_index]->plan,
1882+
outerPlanState(mtstate)->plan,
18901883
query,
18911884
target_attrs,
18921885
values_end_len,
@@ -2086,8 +2079,7 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
20862079
*/
20872080
if (plan && plan->operation == CMD_UPDATE &&
20882081
(resultRelInfo->ri_usesFdwDirectModify ||
2089-
resultRelInfo->ri_FdwState) &&
2090-
resultRelInfo > mtstate->resultRelInfo + mtstate->mt_whichplan)
2082+
resultRelInfo->ri_FdwState))
20912083
ereport(ERROR,
20922084
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
20932085
errmsg("cannot route tuples into foreign table to be updated \"%s\"",
@@ -2283,6 +2275,65 @@ postgresRecheckForeignScan(ForeignScanState *node, TupleTableSlot *slot)
22832275
return true;
22842276
}
22852277

2278+
/*
2279+
* find_modifytable_subplan
2280+
* Helper routine for postgresPlanDirectModify to find the
2281+
* ModifyTable subplan node that scans the specified RTI.
2282+
*
2283+
* Returns NULL if the subplan couldn't be identified. That's not a fatal
2284+
* error condition, we just abandon trying to do the update directly.
2285+
*/
2286+
static ForeignScan *
2287+
find_modifytable_subplan(PlannerInfo *root,
2288+
ModifyTable *plan,
2289+
Index rtindex,
2290+
int subplan_index)
2291+
{
2292+
Plan *subplan = outerPlan(plan);
2293+
2294+
/*
2295+
* The cases we support are (1) the desired ForeignScan is the immediate
2296+
* child of ModifyTable, or (2) it is the subplan_index'th child of an
2297+
* Append node that is the immediate child of ModifyTable. There is no
2298+
* point in looking further down, as that would mean that local joins are
2299+
* involved, so we can't do the update directly.
2300+
*
2301+
* There could be a Result atop the Append too, acting to compute the
2302+
* UPDATE targetlist values. We ignore that here; the tlist will be
2303+
* checked by our caller.
2304+
*
2305+
* In principle we could examine all the children of the Append, but it's
2306+
* currently unlikely that the core planner would generate such a plan
2307+
* with the children out-of-order. Moreover, such a search risks costing
2308+
* O(N^2) time when there are a lot of children.
2309+
*/
2310+
if (IsA(subplan, Append))
2311+
{
2312+
Append *appendplan = (Append *) subplan;
2313+
2314+
if (subplan_index < list_length(appendplan->appendplans))
2315+
subplan = (Plan *) list_nth(appendplan->appendplans, subplan_index);
2316+
}
2317+
else if (IsA(subplan, Result) && IsA(outerPlan(subplan), Append))
2318+
{
2319+
Append *appendplan = (Append *) outerPlan(subplan);
2320+
2321+
if (subplan_index < list_length(appendplan->appendplans))
2322+
subplan = (Plan *) list_nth(appendplan->appendplans, subplan_index);
2323+
}
2324+
2325+
/* Now, have we got a ForeignScan on the desired rel? */
2326+
if (IsA(subplan, ForeignScan))
2327+
{
2328+
ForeignScan *fscan = (ForeignScan *) subplan;
2329+
2330+
if (bms_is_member(rtindex, fscan->fs_relids))
2331+
return fscan;
2332+
}
2333+
2334+
return NULL;
2335+
}
2336+
22862337
/*
22872338
* postgresPlanDirectModify
22882339
* Consider a direct foreign table modification
@@ -2297,13 +2348,13 @@ postgresPlanDirectModify(PlannerInfo *root,
22972348
int subplan_index)
22982349
{
22992350
CmdType operation = plan->operation;
2300-
Plan *subplan;
23012351
RelOptInfo *foreignrel;
23022352
RangeTblEntry *rte;
23032353
PgFdwRelationInfo *fpinfo;
23042354
Relation rel;
23052355
StringInfoData sql;
23062356
ForeignScan *fscan;
2357+
List *processed_tlist = NIL;
23072358
List *targetAttrs = NIL;
23082359
List *remote_exprs;
23092360
List *params_list = NIL;
@@ -2321,19 +2372,17 @@ postgresPlanDirectModify(PlannerInfo *root,
23212372
return false;
23222373

23232374
/*
2324-
* It's unsafe to modify a foreign table directly if there are any local
2325-
* joins needed.
2375+
* Try to locate the ForeignScan subplan that's scanning resultRelation.
23262376
*/
2327-
subplan = (Plan *) list_nth(plan->plans, subplan_index);
2328-
if (!IsA(subplan, ForeignScan))
2377+
fscan = find_modifytable_subplan(root, plan, resultRelation, subplan_index);
2378+
if (!fscan)
23292379
return false;
2330-
fscan = (ForeignScan *) subplan;
23312380

23322381
/*
23332382
* It's unsafe to modify a foreign table directly if there are any quals
23342383
* that should be evaluated locally.
23352384
*/
2336-
if (subplan->qual != NIL)
2385+
if (fscan->scan.plan.qual != NIL)
23372386
return false;
23382387

23392388
/* Safe to fetch data about the target foreign rel */
@@ -2354,32 +2403,28 @@ postgresPlanDirectModify(PlannerInfo *root,
23542403
*/
23552404
if (operation == CMD_UPDATE)
23562405
{
2357-
int col;
2406+
ListCell *lc,
2407+
*lc2;
23582408

23592409
/*
2360-
* We transmit only columns that were explicitly targets of the
2361-
* UPDATE, so as to avoid unnecessary data transmission.
2410+
* The expressions of concern are the first N columns of the processed
2411+
* targetlist, where N is the length of the rel's update_colnos.
23622412
*/
2363-
col = -1;
2364-
while ((col = bms_next_member(rte->updatedCols, col)) >= 0)
2413+
get_translated_update_targetlist(root, resultRelation,
2414+
&processed_tlist, &targetAttrs);
2415+
forboth(lc, processed_tlist, lc2, targetAttrs)
23652416
{
2366-
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
2367-
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
2368-
TargetEntry *tle;
2417+
TargetEntry *tle = lfirst_node(TargetEntry, lc);
2418+
AttrNumber attno = lfirst_int(lc2);
2419+
2420+
/* update's new-value expressions shouldn't be resjunk */
2421+
Assert(!tle->resjunk);
23692422

23702423
if (attno <= InvalidAttrNumber) /* shouldn't happen */
23712424
elog(ERROR, "system-column update is not supported");
23722425

2373-
tle = get_tle_by_resno(subplan->targetlist, attno);
2374-
2375-
if (!tle)
2376-
elog(ERROR, "attribute number %d not found in subplan targetlist",
2377-
attno);
2378-
23792426
if (!is_foreign_expr(root, foreignrel, (Expr *) tle->expr))
23802427
return false;
2381-
2382-
targetAttrs = lappend_int(targetAttrs, attno);
23832428
}
23842429
}
23852430

@@ -2430,7 +2475,7 @@ postgresPlanDirectModify(PlannerInfo *root,
24302475
case CMD_UPDATE:
24312476
deparseDirectUpdateSql(&sql, root, resultRelation, rel,
24322477
foreignrel,
2433-
((Plan *) fscan)->targetlist,
2478+
processed_tlist,
24342479
targetAttrs,
24352480
remote_exprs, &params_list,
24362481
returningList, &retrieved_attrs);

doc/src/sgml/ddl.sgml

+1-2
Original file line numberDiff line numberDiff line change
@@ -4823,8 +4823,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
48234823
well, provided that typical queries allow the query planner to prune all
48244824
but a small number of partitions. Planning times become longer and memory
48254825
consumption becomes higher when more partitions remain after the planner
4826-
performs partition pruning. This is particularly true for the
4827-
<command>UPDATE</command> and <command>DELETE</command> commands. Another
4826+
performs partition pruning. Another
48284827
reason to be concerned about having a large number of partitions is that
48294828
the server's memory consumption may grow significantly over
48304829
time, especially if many sessions touch large numbers of partitions.

doc/src/sgml/fdwhandler.sgml

+37-28
Original file line numberDiff line numberDiff line change
@@ -424,7 +424,8 @@ GetForeignUpperPaths(PlannerInfo *root,
424424
<para>
425425
<programlisting>
426426
void
427-
AddForeignUpdateTargets(Query *parsetree,
427+
AddForeignUpdateTargets(PlannerInfo *root,
428+
Index rtindex,
428429
RangeTblEntry *target_rte,
429430
Relation target_relation);
430431
</programlisting>
@@ -440,27 +441,31 @@ AddForeignUpdateTargets(Query *parsetree,
440441
</para>
441442

442443
<para>
443-
To do that, add <structname>TargetEntry</structname> items to
444-
<literal>parsetree-&gt;targetList</literal>, containing expressions for the
445-
extra values to be fetched. Each such entry must be marked
446-
<structfield>resjunk</structfield> = <literal>true</literal>, and must have a distinct
447-
<structfield>resname</structfield> that will identify it at execution time.
448-
Avoid using names matching <literal>ctid<replaceable>N</replaceable></literal>,
449-
<literal>wholerow</literal>, or
450-
<literal>wholerow<replaceable>N</replaceable></literal>, as the core system can
451-
generate junk columns of these names.
452-
If the extra expressions are more complex than simple Vars, they
453-
must be run through <function>eval_const_expressions</function>
454-
before adding them to the target list.
455-
</para>
456-
457-
<para>
458-
Although this function is called during planning, the
459-
information provided is a bit different from that available to other
460-
planning routines.
461-
<literal>parsetree</literal> is the parse tree for the <command>UPDATE</command> or
462-
<command>DELETE</command> command, while <literal>target_rte</literal> and
463-
<literal>target_relation</literal> describe the target foreign table.
444+
To do that, construct a <structname>Var</structname> representing
445+
an extra value you need, and pass it
446+
to <function>add_row_identity_var</function>, along with a name for
447+
the junk column. (You can do this more than once if several columns
448+
are needed.) You must choose a distinct junk column name for each
449+
different <structname>Var</structname> you need, except
450+
that <structname>Var</structname>s that are identical except for
451+
the <structfield>varno</structfield> field can and should share a
452+
column name.
453+
The core system uses the junk column names
454+
<literal>tableoid</literal> for a
455+
table's <structfield>tableoid</structfield> column,
456+
<literal>ctid</literal>
457+
or <literal>ctid<replaceable>N</replaceable></literal>
458+
for <structfield>ctid</structfield>,
459+
<literal>wholerow</literal>
460+
for a whole-row <structname>Var</structname> marked with
461+
<structfield>vartype</structfield> = <type>RECORD</type>,
462+
and <literal>wholerow<replaceable>N</replaceable></literal>
463+
for a whole-row <structname>Var</structname> with
464+
<structfield>vartype</structfield> equal to the table's declared rowtype.
465+
Re-use these names when you can (the planner will combine duplicate
466+
requests for identical junk columns). If you need another kind of
467+
junk column besides these, it might be wise to choose a name prefixed
468+
with your extension name, to avoid conflicts against other FDWs.
464469
</para>
465470

466471
<para>
@@ -495,8 +500,8 @@ PlanForeignModify(PlannerInfo *root,
495500
<literal>resultRelation</literal> identifies the target foreign table by its
496501
range table index. <literal>subplan_index</literal> identifies which target of
497502
the <structname>ModifyTable</structname> plan node this is, counting from zero;
498-
use this if you want to index into <literal>plan-&gt;plans</literal> or other
499-
substructure of the <literal>plan</literal> node.
503+
use this if you want to index into per-target-relation substructures of the
504+
<literal>plan</literal> node.
500505
</para>
501506

502507
<para>
@@ -703,10 +708,14 @@ ExecForeignUpdate(EState *estate,
703708
<literal>slot</literal> contains the new data for the tuple; it will match the
704709
row-type definition of the foreign table.
705710
<literal>planSlot</literal> contains the tuple that was generated by the
706-
<structname>ModifyTable</structname> plan node's subplan; it differs from
707-
<literal>slot</literal> in possibly containing additional <quote>junk</quote>
708-
columns. In particular, any junk columns that were requested by
709-
<function>AddForeignUpdateTargets</function> will be available from this slot.
711+
<structname>ModifyTable</structname> plan node's subplan. Unlike
712+
<literal>slot</literal>, this tuple contains only the new values for
713+
columns changed by the query, so do not rely on attribute numbers of the
714+
foreign table to index into <literal>planSlot</literal>.
715+
Also, <literal>planSlot</literal> typically contains
716+
additional <quote>junk</quote> columns. In particular, any junk columns
717+
that were requested by <function>AddForeignUpdateTargets</function> will
718+
be available from this slot.
710719
</para>
711720

712721
<para>

0 commit comments

Comments
 (0)