Skip to content

Commit 9ce77d7

Browse files
committed
Reconsider the representation of join alias Vars.
The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
1 parent ed10f32 commit 9ce77d7

19 files changed

+421
-390
lines changed

src/backend/catalog/dependency.c

Lines changed: 39 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -1718,12 +1718,6 @@ recordDependencyOnSingleRelExpr(const ObjectAddress *depender,
17181718
/*
17191719
* Recursively search an expression tree for object references.
17201720
*
1721-
* Note: we avoid creating references to columns of tables that participate
1722-
* in an SQL JOIN construct, but are not actually used anywhere in the query.
1723-
* To do so, we do not scan the joinaliasvars list of a join RTE while
1724-
* scanning the query rangetable, but instead scan each individual entry
1725-
* of the alias list when we find a reference to it.
1726-
*
17271721
* Note: in many cases we do not need to create dependencies on the datatypes
17281722
* involved in an expression, because we'll have an indirect dependency via
17291723
* some other object. For instance Var nodes depend on a column which depends
@@ -1773,24 +1767,15 @@ find_expr_references_walker(Node *node,
17731767
add_object_address(OCLASS_CLASS, rte->relid, var->varattno,
17741768
context->addrs);
17751769
}
1776-
else if (rte->rtekind == RTE_JOIN)
1777-
{
1778-
/* Scan join output column to add references to join inputs */
1779-
List *save_rtables;
1780-
1781-
/* We must make the context appropriate for join's level */
1782-
save_rtables = context->rtables;
1783-
context->rtables = list_copy_tail(context->rtables,
1784-
var->varlevelsup);
1785-
if (var->varattno <= 0 ||
1786-
var->varattno > list_length(rte->joinaliasvars))
1787-
elog(ERROR, "invalid varattno %d", var->varattno);
1788-
find_expr_references_walker((Node *) list_nth(rte->joinaliasvars,
1789-
var->varattno - 1),
1790-
context);
1791-
list_free(context->rtables);
1792-
context->rtables = save_rtables;
1793-
}
1770+
1771+
/*
1772+
* Vars referencing other RTE types require no additional work. In
1773+
* particular, a join alias Var can be ignored, because it must
1774+
* reference a merged USING column. The relevant join input columns
1775+
* will also be referenced in the join qual, and any type coercion
1776+
* functions involved in the alias expression will be dealt with when
1777+
* we scan the RTE itself.
1778+
*/
17941779
return false;
17951780
}
17961781
else if (IsA(node, Const))
@@ -2147,11 +2132,13 @@ find_expr_references_walker(Node *node,
21472132

21482133
/*
21492134
* Add whole-relation refs for each plain relation mentioned in the
2150-
* subquery's rtable.
2135+
* subquery's rtable, and ensure we add refs for any type-coercion
2136+
* functions used in join alias lists.
21512137
*
21522138
* Note: query_tree_walker takes care of recursing into RTE_FUNCTION
2153-
* RTEs, subqueries, etc, so no need to do that here. But keep it
2154-
* from looking at join alias lists.
2139+
* RTEs, subqueries, etc, so no need to do that here. But we must
2140+
* tell it not to visit join alias lists, or we'll add refs for join
2141+
* input columns whether or not they are actually used in our query.
21552142
*
21562143
* Note: we don't need to worry about collations mentioned in
21572144
* RTE_VALUES or RTE_CTE RTEs, because those must just duplicate
@@ -2169,6 +2156,31 @@ find_expr_references_walker(Node *node,
21692156
add_object_address(OCLASS_CLASS, rte->relid, 0,
21702157
context->addrs);
21712158
break;
2159+
case RTE_JOIN:
2160+
2161+
/*
2162+
* Examine joinaliasvars entries only for merged JOIN
2163+
* USING columns. Only those entries could contain
2164+
* type-coercion functions. Also, their join input
2165+
* columns must be referenced in the join quals, so this
2166+
* won't accidentally add refs to otherwise-unused join
2167+
* input columns. (We want to ref the type coercion
2168+
* functions even if the merged column isn't explicitly
2169+
* used anywhere, to protect possible expansion of the
2170+
* join RTE as a whole-row var, and because it seems like
2171+
* a bad idea to allow dropping a function that's present
2172+
* in our query tree, whether or not it could get called.)
2173+
*/
2174+
context->rtables = lcons(query->rtable, context->rtables);
2175+
for (int i = 0; i < rte->joinmergedcols; i++)
2176+
{
2177+
Node *aliasvar = list_nth(rte->joinaliasvars, i);
2178+
2179+
if (!IsA(aliasvar, Var))
2180+
find_expr_references_walker(aliasvar, context);
2181+
}
2182+
context->rtables = list_delete_first(context->rtables);
2183+
break;
21722184
default:
21732185
break;
21742186
}

src/backend/nodes/copyfuncs.c

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1367,8 +1367,8 @@ _copyVar(const Var *from)
13671367
COPY_SCALAR_FIELD(vartypmod);
13681368
COPY_SCALAR_FIELD(varcollid);
13691369
COPY_SCALAR_FIELD(varlevelsup);
1370-
COPY_SCALAR_FIELD(varnoold);
1371-
COPY_SCALAR_FIELD(varoattno);
1370+
COPY_SCALAR_FIELD(varnosyn);
1371+
COPY_SCALAR_FIELD(varattnosyn);
13721372
COPY_LOCATION_FIELD(location);
13731373

13741374
return newnode;
@@ -2373,7 +2373,10 @@ _copyRangeTblEntry(const RangeTblEntry *from)
23732373
COPY_NODE_FIELD(subquery);
23742374
COPY_SCALAR_FIELD(security_barrier);
23752375
COPY_SCALAR_FIELD(jointype);
2376+
COPY_SCALAR_FIELD(joinmergedcols);
23762377
COPY_NODE_FIELD(joinaliasvars);
2378+
COPY_NODE_FIELD(joinleftcols);
2379+
COPY_NODE_FIELD(joinrightcols);
23772380
COPY_NODE_FIELD(functions);
23782381
COPY_SCALAR_FIELD(funcordinality);
23792382
COPY_NODE_FIELD(tablefunc);

src/backend/nodes/equalfuncs.c

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -168,8 +168,12 @@ _equalVar(const Var *a, const Var *b)
168168
COMPARE_SCALAR_FIELD(vartypmod);
169169
COMPARE_SCALAR_FIELD(varcollid);
170170
COMPARE_SCALAR_FIELD(varlevelsup);
171-
COMPARE_SCALAR_FIELD(varnoold);
172-
COMPARE_SCALAR_FIELD(varoattno);
171+
172+
/*
173+
* varnosyn/varattnosyn are intentionally ignored here, because Vars with
174+
* different syntactic identifiers are semantically the same as long as
175+
* their varno/varattno match.
176+
*/
173177
COMPARE_LOCATION_FIELD(location);
174178

175179
return true;
@@ -2657,7 +2661,10 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
26572661
COMPARE_NODE_FIELD(subquery);
26582662
COMPARE_SCALAR_FIELD(security_barrier);
26592663
COMPARE_SCALAR_FIELD(jointype);
2664+
COMPARE_SCALAR_FIELD(joinmergedcols);
26602665
COMPARE_NODE_FIELD(joinaliasvars);
2666+
COMPARE_NODE_FIELD(joinleftcols);
2667+
COMPARE_NODE_FIELD(joinrightcols);
26612668
COMPARE_NODE_FIELD(functions);
26622669
COMPARE_SCALAR_FIELD(funcordinality);
26632670
COMPARE_NODE_FIELD(tablefunc);

src/backend/nodes/makefuncs.c

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -80,13 +80,13 @@ makeVar(Index varno,
8080
var->varlevelsup = varlevelsup;
8181

8282
/*
83-
* Since few if any routines ever create Var nodes with varnoold/varoattno
84-
* different from varno/varattno, we don't provide separate arguments for
85-
* them, but just initialize them to the given varno/varattno. This
83+
* Only a few callers need to make Var nodes with varnosyn/varattnosyn
84+
* different from varno/varattno. We don't provide separate arguments for
85+
* them, but just initialize them to the given varno/varattno. This
8686
* reduces code clutter and chance of error for most callers.
8787
*/
88-
var->varnoold = varno;
89-
var->varoattno = varattno;
88+
var->varnosyn = varno;
89+
var->varattnosyn = varattno;
9090

9191
/* Likewise, we just set location to "unknown" here */
9292
var->location = -1;

src/backend/nodes/outfuncs.c

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1074,8 +1074,8 @@ _outVar(StringInfo str, const Var *node)
10741074
WRITE_INT_FIELD(vartypmod);
10751075
WRITE_OID_FIELD(varcollid);
10761076
WRITE_UINT_FIELD(varlevelsup);
1077-
WRITE_UINT_FIELD(varnoold);
1078-
WRITE_INT_FIELD(varoattno);
1077+
WRITE_UINT_FIELD(varnosyn);
1078+
WRITE_INT_FIELD(varattnosyn);
10791079
WRITE_LOCATION_FIELD(location);
10801080
}
10811081

@@ -3071,7 +3071,10 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
30713071
break;
30723072
case RTE_JOIN:
30733073
WRITE_ENUM_FIELD(jointype, JoinType);
3074+
WRITE_INT_FIELD(joinmergedcols);
30743075
WRITE_NODE_FIELD(joinaliasvars);
3076+
WRITE_NODE_FIELD(joinleftcols);
3077+
WRITE_NODE_FIELD(joinrightcols);
30753078
break;
30763079
case RTE_FUNCTION:
30773080
WRITE_NODE_FIELD(functions);

src/backend/nodes/readfuncs.c

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -540,8 +540,8 @@ _readVar(void)
540540
READ_INT_FIELD(vartypmod);
541541
READ_OID_FIELD(varcollid);
542542
READ_UINT_FIELD(varlevelsup);
543-
READ_UINT_FIELD(varnoold);
544-
READ_INT_FIELD(varoattno);
543+
READ_UINT_FIELD(varnosyn);
544+
READ_INT_FIELD(varattnosyn);
545545
READ_LOCATION_FIELD(location);
546546

547547
READ_DONE();
@@ -1400,7 +1400,10 @@ _readRangeTblEntry(void)
14001400
break;
14011401
case RTE_JOIN:
14021402
READ_ENUM_FIELD(jointype, JoinType);
1403+
READ_INT_FIELD(joinmergedcols);
14031404
READ_NODE_FIELD(joinaliasvars);
1405+
READ_NODE_FIELD(joinleftcols);
1406+
READ_NODE_FIELD(joinrightcols);
14041407
break;
14051408
case RTE_FUNCTION:
14061409
READ_NODE_FIELD(functions);

src/backend/optimizer/plan/setrefs.c

Lines changed: 21 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -428,6 +428,8 @@ add_rte_to_flat_rtable(PlannerGlobal *glob, RangeTblEntry *rte)
428428
newrte->tablesample = NULL;
429429
newrte->subquery = NULL;
430430
newrte->joinaliasvars = NIL;
431+
newrte->joinleftcols = NIL;
432+
newrte->joinrightcols = NIL;
431433
newrte->functions = NIL;
432434
newrte->tablefunc = NULL;
433435
newrte->values_lists = NIL;
@@ -1681,8 +1683,8 @@ fix_scan_expr_mutator(Node *node, fix_scan_expr_context *context)
16811683
Assert(var->varno != OUTER_VAR);
16821684
if (!IS_SPECIAL_VARNO(var->varno))
16831685
var->varno += context->rtoffset;
1684-
if (var->varnoold > 0)
1685-
var->varnoold += context->rtoffset;
1686+
if (var->varnosyn > 0)
1687+
var->varnosyn += context->rtoffset;
16861688
return (Node *) var;
16871689
}
16881690
if (IsA(node, Param))
@@ -2110,15 +2112,16 @@ set_dummy_tlist_references(Plan *plan, int rtoffset)
21102112
exprTypmod((Node *) oldvar),
21112113
exprCollation((Node *) oldvar),
21122114
0);
2113-
if (IsA(oldvar, Var))
2115+
if (IsA(oldvar, Var) &&
2116+
oldvar->varnosyn > 0)
21142117
{
2115-
newvar->varnoold = oldvar->varno + rtoffset;
2116-
newvar->varoattno = oldvar->varattno;
2118+
newvar->varnosyn = oldvar->varnosyn + rtoffset;
2119+
newvar->varattnosyn = oldvar->varattnosyn;
21172120
}
21182121
else
21192122
{
2120-
newvar->varnoold = 0; /* wasn't ever a plain Var */
2121-
newvar->varoattno = 0;
2123+
newvar->varnosyn = 0; /* wasn't ever a plain Var */
2124+
newvar->varattnosyn = 0;
21222125
}
21232126

21242127
tle = flatCopyTargetEntry(tle);
@@ -2242,7 +2245,7 @@ build_tlist_index_other_vars(List *tlist, Index ignore_rel)
22422245
*
22432246
* If a match is found, return a copy of the given Var with suitably
22442247
* modified varno/varattno (to wit, newvarno and the resno of the TLE entry).
2245-
* Also ensure that varnoold is incremented by rtoffset.
2248+
* Also ensure that varnosyn is incremented by rtoffset.
22462249
* If no match, return NULL.
22472250
*/
22482251
static Var *
@@ -2265,8 +2268,8 @@ search_indexed_tlist_for_var(Var *var, indexed_tlist *itlist,
22652268

22662269
newvar->varno = newvarno;
22672270
newvar->varattno = vinfo->resno;
2268-
if (newvar->varnoold > 0)
2269-
newvar->varnoold += rtoffset;
2271+
if (newvar->varnosyn > 0)
2272+
newvar->varnosyn += rtoffset;
22702273
return newvar;
22712274
}
22722275
vinfo++;
@@ -2308,8 +2311,8 @@ search_indexed_tlist_for_non_var(Expr *node,
23082311
Var *newvar;
23092312

23102313
newvar = makeVarFromTargetEntry(newvarno, tle);
2311-
newvar->varnoold = 0; /* wasn't ever a plain Var */
2312-
newvar->varoattno = 0;
2314+
newvar->varnosyn = 0; /* wasn't ever a plain Var */
2315+
newvar->varattnosyn = 0;
23132316
return newvar;
23142317
}
23152318
return NULL; /* no match */
@@ -2345,8 +2348,8 @@ search_indexed_tlist_for_sortgroupref(Expr *node,
23452348
Var *newvar;
23462349

23472350
newvar = makeVarFromTargetEntry(newvarno, tle);
2348-
newvar->varnoold = 0; /* wasn't ever a plain Var */
2349-
newvar->varoattno = 0;
2351+
newvar->varnosyn = 0; /* wasn't ever a plain Var */
2352+
newvar->varattnosyn = 0;
23502353
return newvar;
23512354
}
23522355
}
@@ -2384,7 +2387,7 @@ search_indexed_tlist_for_sortgroupref(Expr *node,
23842387
* or NULL
23852388
* 'acceptable_rel' is either zero or the rangetable index of a relation
23862389
* whose Vars may appear in the clause without provoking an error
2387-
* 'rtoffset': how much to increment varnoold by
2390+
* 'rtoffset': how much to increment varnos by
23882391
*
23892392
* Returns the new expression tree. The original clause structure is
23902393
* not modified.
@@ -2445,8 +2448,8 @@ fix_join_expr_mutator(Node *node, fix_join_expr_context *context)
24452448
{
24462449
var = copyVar(var);
24472450
var->varno += context->rtoffset;
2448-
if (var->varnoold > 0)
2449-
var->varnoold += context->rtoffset;
2451+
if (var->varnosyn > 0)
2452+
var->varnosyn += context->rtoffset;
24502453
return (Node *) var;
24512454
}
24522455

@@ -2528,7 +2531,7 @@ fix_join_expr_mutator(Node *node, fix_join_expr_context *context)
25282531
* 'node': the tree to be fixed (a target item or qual)
25292532
* 'subplan_itlist': indexed target list for subplan (or index)
25302533
* 'newvarno': varno to use for Vars referencing tlist elements
2531-
* 'rtoffset': how much to increment varnoold by
2534+
* 'rtoffset': how much to increment varnos by
25322535
*
25332536
* The resulting tree is a copy of the original in which all Var nodes have
25342537
* varno = newvarno, varattno = resno of corresponding targetlist element.

src/backend/optimizer/util/appendinfo.c

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -255,6 +255,9 @@ adjust_appendrel_attrs_mutator(Node *node,
255255
Var *var = (Var *) copyObject(node);
256256
AppendRelInfo *appinfo = NULL;
257257

258+
if (var->varlevelsup != 0)
259+
return (Node *) var; /* no changes needed */
260+
258261
for (cnt = 0; cnt < nappinfos; cnt++)
259262
{
260263
if (var->varno == appinfos[cnt]->parent_relid)
@@ -264,10 +267,12 @@ adjust_appendrel_attrs_mutator(Node *node,
264267
}
265268
}
266269

267-
if (var->varlevelsup == 0 && appinfo)
270+
if (appinfo)
268271
{
269272
var->varno = appinfo->child_relid;
270-
var->varnoold = appinfo->child_relid;
273+
/* it's now a generated Var, so drop any syntactic labeling */
274+
var->varnosyn = 0;
275+
var->varattnosyn = 0;
271276
if (var->varattno > 0)
272277
{
273278
Node *newnode;

src/backend/optimizer/util/paramassign.c

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -83,15 +83,14 @@ assign_param_for_var(PlannerInfo *root, Var *var)
8383

8484
/*
8585
* This comparison must match _equalVar(), except for ignoring
86-
* varlevelsup. Note that _equalVar() ignores the location.
86+
* varlevelsup. Note that _equalVar() ignores varnosyn,
87+
* varattnosyn, and location, so this does too.
8788
*/
8889
if (pvar->varno == var->varno &&
8990
pvar->varattno == var->varattno &&
9091
pvar->vartype == var->vartype &&
9192
pvar->vartypmod == var->vartypmod &&
92-
pvar->varcollid == var->varcollid &&
93-
pvar->varnoold == var->varnoold &&
94-
pvar->varoattno == var->varoattno)
93+
pvar->varcollid == var->varcollid)
9594
return pitem->paramId;
9695
}
9796
}

src/backend/parser/analyze.c

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1734,7 +1734,10 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt)
17341734
targetnames,
17351735
sortnscolumns,
17361736
JOIN_INNER,
1737+
0,
17371738
targetvars,
1739+
NIL,
1740+
NIL,
17381741
NULL,
17391742
false);
17401743

0 commit comments

Comments
 (0)