Skip to content

Commit 739f1d6

Browse files
committed
Fix mis-handling of outer join quals generated by EquivalenceClasses.
It's possible, in admittedly-rather-contrived cases, for an eclass to generate a derived "join" qual that constrains the post-outer-join value(s) of some RHS variable(s) without mentioning the LHS at all. While the mechanisms were set up to work for this, we fell foul of the "get_common_eclass_indexes" filter installed by commit 3373c71: it could decide that such an eclass wasn't relevant to the join, so that the required qual clause wouldn't get emitted there or anywhere else. To fix, apply get_common_eclass_indexes only at inner joins, where its rule is still valid. At an outer join, fall back to examining all eclasses that mention either input (or the OJ relid, though it should be impossible for an eclass to mention that without mentioning either input). Perhaps we can improve on that later, but the cost/benefit of adding more complexity to skip some irrelevant eclasses is dubious. To allow cheaply distinguishing outer from inner joins, pass the ojrelid to generate_join_implied_equalities as a separate argument. This also allows cleaning up some sloppiness that had crept into the definition of its join_relids argument, and it allows accurate calculation of nominal_join_relids for a child outer join. (The latter oversight seems not to have been a live bug, but it certainly could have caused problems in future.) Also fix what might be a live bug in check_index_predicates: it was being sloppy about what it passed to generate_join_implied_equalities. Per report from Richard Guo. Discussion: https://postgr.es/m/CAMbWs4-DsTBfOvXuw64GdFss2=M5cwtEhY=0DCS7t2gT7P6hSA@mail.gmail.com
1 parent 03d02f5 commit 739f1d6

File tree

7 files changed

+107
-18
lines changed

7 files changed

+107
-18
lines changed

src/backend/optimizer/path/equivclass.c

Lines changed: 42 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1366,15 +1366,19 @@ generate_base_implied_equalities_broken(PlannerInfo *root,
13661366
* commutative duplicates, i.e. if the algorithm selects "a.x = b.y" but
13671367
* we already have "b.y = a.x", we return the existing clause.
13681368
*
1369-
* join_relids should always equal bms_union(outer_relids, inner_rel->relids).
1370-
* We could simplify this function's API by computing it internally, but in
1371-
* most current uses, the caller has the value at hand anyway.
1369+
* If we are considering an outer join, ojrelid is the associated OJ relid,
1370+
* otherwise it's zero.
1371+
*
1372+
* join_relids should always equal bms_union(outer_relids, inner_rel->relids)
1373+
* plus ojrelid if that's not zero. We could simplify this function's API by
1374+
* computing it internally, but most callers have the value at hand anyway.
13721375
*/
13731376
List *
13741377
generate_join_implied_equalities(PlannerInfo *root,
13751378
Relids join_relids,
13761379
Relids outer_relids,
1377-
RelOptInfo *inner_rel)
1380+
RelOptInfo *inner_rel,
1381+
Index ojrelid)
13781382
{
13791383
List *result = NIL;
13801384
Relids inner_relids = inner_rel->relids;
@@ -1392,6 +1396,8 @@ generate_join_implied_equalities(PlannerInfo *root,
13921396
nominal_inner_relids = inner_rel->top_parent_relids;
13931397
/* ECs will be marked with the parent's relid, not the child's */
13941398
nominal_join_relids = bms_union(outer_relids, nominal_inner_relids);
1399+
if (ojrelid != 0)
1400+
nominal_join_relids = bms_add_member(nominal_join_relids, ojrelid);
13951401
}
13961402
else
13971403
{
@@ -1400,10 +1406,23 @@ generate_join_implied_equalities(PlannerInfo *root,
14001406
}
14011407

14021408
/*
1403-
* Get all eclasses that mention both inner and outer sides of the join
1409+
* Examine all potentially-relevant eclasses.
1410+
*
1411+
* If we are considering an outer join, we must include "join" clauses
1412+
* that mention either input rel plus the outer join's relid; these
1413+
* represent post-join filter clauses that have to be applied at this
1414+
* join. We don't have infrastructure that would let us identify such
1415+
* eclasses cheaply, so just fall back to considering all eclasses
1416+
* mentioning anything in nominal_join_relids.
1417+
*
1418+
* At inner joins, we can be smarter: only consider eclasses mentioning
1419+
* both input rels.
14041420
*/
1405-
matching_ecs = get_common_eclass_indexes(root, nominal_inner_relids,
1406-
outer_relids);
1421+
if (ojrelid != 0)
1422+
matching_ecs = get_eclass_indexes_for_relids(root, nominal_join_relids);
1423+
else
1424+
matching_ecs = get_common_eclass_indexes(root, nominal_inner_relids,
1425+
outer_relids);
14071426

14081427
i = -1;
14091428
while ((i = bms_next_member(matching_ecs, i)) >= 0)
@@ -1447,6 +1466,10 @@ generate_join_implied_equalities(PlannerInfo *root,
14471466
/*
14481467
* generate_join_implied_equalities_for_ecs
14491468
* As above, but consider only the listed ECs.
1469+
*
1470+
* For the sole current caller, we can assume ojrelid == 0, that is we are
1471+
* not interested in outer-join filter clauses. This might need to change
1472+
* in future.
14501473
*/
14511474
List *
14521475
generate_join_implied_equalities_for_ecs(PlannerInfo *root,
@@ -2970,6 +2993,8 @@ generate_implied_equalities_for_column(PlannerInfo *root,
29702993
* generate_join_implied_equalities(). Note it's OK to occasionally say "yes"
29712994
* incorrectly. Hence we don't bother with details like whether the lack of a
29722995
* cross-type operator might prevent the clause from actually being generated.
2996+
* False negatives are not always fatal either: they will discourage, but not
2997+
* completely prevent, investigation of particular join pathways.
29732998
*/
29742999
bool
29753000
have_relevant_eclass_joinclause(PlannerInfo *root,
@@ -2978,7 +3003,16 @@ have_relevant_eclass_joinclause(PlannerInfo *root,
29783003
Bitmapset *matching_ecs;
29793004
int i;
29803005

2981-
/* Examine only eclasses mentioning both rel1 and rel2 */
3006+
/*
3007+
* Examine only eclasses mentioning both rel1 and rel2.
3008+
*
3009+
* Note that we do not consider the possibility of an eclass generating
3010+
* "join" clauses that mention just one of the rels plus an outer join
3011+
* that could be formed from them. Although such clauses must be
3012+
* correctly enforced when we form the outer join, they don't seem like
3013+
* sufficient reason to prioritize this join over other ones. The join
3014+
* ordering rules will force the join to be made when necessary.
3015+
*/
29823016
matching_ecs = get_common_eclass_indexes(root, rel1->relids,
29833017
rel2->relids);
29843018

src/backend/optimizer/path/indxpath.c

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3349,12 +3349,15 @@ check_index_predicates(PlannerInfo *root, RelOptInfo *rel)
33493349
* relid sets for generate_join_implied_equalities is slightly tricky
33503350
* because the rel could be a child rel rather than a true baserel, and in
33513351
* that case we must subtract its parents' relid(s) from all_query_rels.
3352+
* Additionally, we mustn't consider clauses that are only computable
3353+
* after outer joins that can null the rel.
33523354
*/
33533355
if (rel->reloptkind == RELOPT_OTHER_MEMBER_REL)
33543356
otherrels = bms_difference(root->all_query_rels,
33553357
find_childrel_parents(root, rel));
33563358
else
33573359
otherrels = bms_difference(root->all_query_rels, rel->relids);
3360+
otherrels = bms_del_members(otherrels, rel->nulling_relids);
33583361

33593362
if (!bms_is_empty(otherrels))
33603363
clauselist =
@@ -3363,7 +3366,8 @@ check_index_predicates(PlannerInfo *root, RelOptInfo *rel)
33633366
bms_union(rel->relids,
33643367
otherrels),
33653368
otherrels,
3366-
rel));
3369+
rel,
3370+
0));
33673371

33683372
/*
33693373
* Normally we remove quals that are implied by a partial index's

src/backend/optimizer/plan/analyzejoins.c

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -669,7 +669,8 @@ reduce_unique_semijoins(PlannerInfo *root)
669669
list_concat(generate_join_implied_equalities(root,
670670
joinrelids,
671671
sjinfo->min_lefthand,
672-
innerrel),
672+
innerrel,
673+
0),
673674
innerrel->joininfo);
674675

675676
/* Test whether the innerrel is unique for those clauses. */

src/backend/optimizer/util/relnode.c

Lines changed: 15 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -47,7 +47,8 @@ static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
4747
static List *build_joinrel_restrictlist(PlannerInfo *root,
4848
RelOptInfo *joinrel,
4949
RelOptInfo *outer_rel,
50-
RelOptInfo *inner_rel);
50+
RelOptInfo *inner_rel,
51+
SpecialJoinInfo *sjinfo);
5152
static void build_joinrel_joinlist(RelOptInfo *joinrel,
5253
RelOptInfo *outer_rel,
5354
RelOptInfo *inner_rel);
@@ -667,7 +668,8 @@ build_join_rel(PlannerInfo *root,
667668
*restrictlist_ptr = build_joinrel_restrictlist(root,
668669
joinrel,
669670
outer_rel,
670-
inner_rel);
671+
inner_rel,
672+
sjinfo);
671673
return joinrel;
672674
}
673675

@@ -779,7 +781,8 @@ build_join_rel(PlannerInfo *root,
779781
* for set_joinrel_size_estimates().)
780782
*/
781783
restrictlist = build_joinrel_restrictlist(root, joinrel,
782-
outer_rel, inner_rel);
784+
outer_rel, inner_rel,
785+
sjinfo);
783786
if (restrictlist_ptr)
784787
*restrictlist_ptr = restrictlist;
785788
build_joinrel_joinlist(joinrel, outer_rel, inner_rel);
@@ -1220,6 +1223,7 @@ build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
12201223
* 'joinrel' is a join relation node
12211224
* 'outer_rel' and 'inner_rel' are a pair of relations that can be joined
12221225
* to form joinrel.
1226+
* 'sjinfo': join context info
12231227
*
12241228
* build_joinrel_restrictlist() returns a list of relevant restrictinfos,
12251229
* whereas build_joinrel_joinlist() stores its results in the joinrel's
@@ -1234,7 +1238,8 @@ static List *
12341238
build_joinrel_restrictlist(PlannerInfo *root,
12351239
RelOptInfo *joinrel,
12361240
RelOptInfo *outer_rel,
1237-
RelOptInfo *inner_rel)
1241+
RelOptInfo *inner_rel,
1242+
SpecialJoinInfo *sjinfo)
12381243
{
12391244
List *result;
12401245
Relids both_input_relids;
@@ -1260,7 +1265,8 @@ build_joinrel_restrictlist(PlannerInfo *root,
12601265
generate_join_implied_equalities(root,
12611266
joinrel->relids,
12621267
outer_rel->relids,
1263-
inner_rel));
1268+
inner_rel,
1269+
sjinfo->ojrelid));
12641270

12651271
return result;
12661272
}
@@ -1543,7 +1549,8 @@ get_baserel_parampathinfo(PlannerInfo *root, RelOptInfo *baserel,
15431549
generate_join_implied_equalities(root,
15441550
joinrelids,
15451551
required_outer,
1546-
baserel));
1552+
baserel,
1553+
0));
15471554

15481555
/* Compute set of serial numbers of the enforced clauses */
15491556
pserials = NULL;
@@ -1665,7 +1672,8 @@ get_joinrel_parampathinfo(PlannerInfo *root, RelOptInfo *joinrel,
16651672
eclauses = generate_join_implied_equalities(root,
16661673
join_and_req,
16671674
required_outer,
1668-
joinrel);
1675+
joinrel,
1676+
0);
16691677
/* We only want ones that aren't movable to lower levels */
16701678
dropped_ecs = NIL;
16711679
foreach(lc, eclauses)

src/include/optimizer/paths.h

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -149,7 +149,8 @@ extern void generate_base_implied_equalities(PlannerInfo *root);
149149
extern List *generate_join_implied_equalities(PlannerInfo *root,
150150
Relids join_relids,
151151
Relids outer_relids,
152-
RelOptInfo *inner_rel);
152+
RelOptInfo *inner_rel,
153+
Index ojrelid);
153154
extern List *generate_join_implied_equalities_for_ecs(PlannerInfo *root,
154155
List *eclasses,
155156
Relids join_relids,

src/test/regress/expected/join.out

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4100,6 +4100,38 @@ select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
41004100
---------+---------+---------+----------
41014101
(0 rows)
41024102

4103+
-- related case
4104+
explain (costs off)
4105+
select * from int8_tbl t1 left join int8_tbl t2 on t1.q2 = t2.q1,
4106+
lateral (select * from int8_tbl t3 where t2.q1 = t2.q2) ss;
4107+
QUERY PLAN
4108+
-------------------------------------------
4109+
Nested Loop
4110+
-> Hash Left Join
4111+
Hash Cond: (t1.q2 = t2.q1)
4112+
Filter: (t2.q1 = t2.q2)
4113+
-> Seq Scan on int8_tbl t1
4114+
-> Hash
4115+
-> Seq Scan on int8_tbl t2
4116+
-> Seq Scan on int8_tbl t3
4117+
(8 rows)
4118+
4119+
select * from int8_tbl t1 left join int8_tbl t2 on t1.q2 = t2.q1,
4120+
lateral (select * from int8_tbl t3 where t2.q1 = t2.q2) ss;
4121+
q1 | q2 | q1 | q2 | q1 | q2
4122+
------------------+------------------+------------------+------------------+------------------+-------------------
4123+
123 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 123 | 456
4124+
123 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 123 | 4567890123456789
4125+
123 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 123
4126+
123 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
4127+
123 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | -4567890123456789
4128+
4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 123 | 456
4129+
4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 123 | 4567890123456789
4130+
4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 123
4131+
4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
4132+
4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789 | -4567890123456789
4133+
(10 rows)
4134+
41034135
--
41044136
-- check handling of join aliases when flattening multiple levels of subquery
41054137
--

src/test/regress/sql/join.sql

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1386,6 +1386,15 @@ select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
13861386
from tenk1 a left join tenk1 b on b.thousand = a.unique1 left join tenk1 c on c.unique2 = coalesce(b.twothousand, a.twothousand)
13871387
where a.unique2 < 10 and coalesce(b.twothousand, a.twothousand) = 44;
13881388

1389+
-- related case
1390+
1391+
explain (costs off)
1392+
select * from int8_tbl t1 left join int8_tbl t2 on t1.q2 = t2.q1,
1393+
lateral (select * from int8_tbl t3 where t2.q1 = t2.q2) ss;
1394+
1395+
select * from int8_tbl t1 left join int8_tbl t2 on t1.q2 = t2.q1,
1396+
lateral (select * from int8_tbl t3 where t2.q1 = t2.q2) ss;
1397+
13891398
--
13901399
-- check handling of join aliases when flattening multiple levels of subquery
13911400
--

0 commit comments

Comments
 (0)