Skip to content

Commit 6a65225

Browse files
committed
Fix some planner issues found while investigating Kevin Grittner's report
of poorer planning in 8.3 than 8.2: 1. After pushing a constant across an outer join --- ie, given "a LEFT JOIN b ON (a.x = b.y) WHERE a.x = 42", we can deduce that b.y is sort of equal to 42, in the sense that we needn't fetch any b rows where it isn't 42 --- loop to see if any additional deductions can be made. Previous releases did that by recursing, but I had mistakenly thought that this was no longer necessary given the EquivalenceClass machinery. 2. Allow pushing constants across outer join conditions even if the condition is outerjoin_delayed due to a lower outer join. This is safe as long as the condition is strict and we re-test it at the upper join. 3. Keep the outer-join clause even if we successfully push a constant across it. This is *necessary* in the outerjoin_delayed case, but even in the simple case, it seems better to do this to ensure that the join search order heuristics will consider the join as reasonable to make. Mark such a clause as having selectivity 1.0, though, since it's not going to eliminate very many rows after application of the constant condition. 4. Tweak have_relevant_eclass_joinclause to report that two relations are joinable when they have vars that are equated to the same constant. We won't actually generate any joinclause from such an EquivalenceClass, but again it seems that in such a case it's a good idea to consider the join as worth costing out. 5. Fix a bug in select_mergejoin_clauses that was exposed by these changes: we have to reject candidate mergejoin clauses if either side was equated to a constant, because we can't construct a canonical pathkey list for such a clause. This is an implementation restriction that might be worth fixing someday, but it doesn't seem critical to get it done for 8.3.
1 parent 8d546c7 commit 6a65225

File tree

9 files changed

+336
-93
lines changed

9 files changed

+336
-93
lines changed

src/backend/optimizer/path/equivclass.c

Lines changed: 171 additions & 50 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@
1010
* Portions Copyright (c) 1994, Regents of the University of California
1111
*
1212
* IDENTIFICATION
13-
* $PostgreSQL: pgsql/src/backend/optimizer/path/equivclass.c,v 1.8 2008/01/01 19:45:50 momjian Exp $
13+
* $PostgreSQL: pgsql/src/backend/optimizer/path/equivclass.c,v 1.9 2008/01/09 20:42:27 tgl Exp $
1414
*
1515
*-------------------------------------------------------------------------
1616
*/
@@ -52,10 +52,10 @@ static RestrictInfo *create_join_clause(PlannerInfo *root,
5252
EquivalenceMember *leftem,
5353
EquivalenceMember *rightem,
5454
EquivalenceClass *parent_ec);
55-
static void reconsider_outer_join_clause(PlannerInfo *root,
55+
static bool reconsider_outer_join_clause(PlannerInfo *root,
5656
RestrictInfo *rinfo,
5757
bool outer_on_left);
58-
static void reconsider_full_join_clause(PlannerInfo *root,
58+
static bool reconsider_full_join_clause(PlannerInfo *root,
5959
RestrictInfo *rinfo);
6060

6161

@@ -1094,8 +1094,9 @@ create_join_clause(PlannerInfo *root,
10941094
/*
10951095
* reconsider_outer_join_clauses
10961096
* Re-examine any outer-join clauses that were set aside by
1097-
* distribute_qual_to_rels(), and either create EquivalenceClasses
1098-
* to replace them or push them out into the regular join-clause lists.
1097+
* distribute_qual_to_rels(), and see if we can derive any
1098+
* EquivalenceClasses from them. Then, if they were not made
1099+
* redundant, push them out into the regular join-clause lists.
10991100
*
11001101
* When we have mergejoinable clauses A = B that are outer-join clauses,
11011102
* we can't blindly combine them with other clauses A = C to deduce B = C,
@@ -1112,8 +1113,8 @@ create_join_clause(PlannerInfo *root,
11121113
* equivalence clause.)
11131114
*
11141115
* Note that the above rule does not work for full outer joins; nor is it
1115-
* very interesting to consider cases where the equivalence clause involves
1116-
* relations entirely outside the outer join, since such clauses couldn't
1116+
* very interesting to consider cases where the generated equivalence clause
1117+
* would involve relations outside the outer join, since such clauses couldn't
11171118
* be pushed into the inner side's scan anyway. So the restriction to
11181119
* outervar = pseudoconstant is not really giving up anything.
11191120
*
@@ -1141,57 +1142,161 @@ create_join_clause(PlannerInfo *root,
11411142
* that could result in propagating constant restrictions from
11421143
* INNERVAR to OUTERVAR, which would be very wrong.
11431144
*
1145+
* It's possible that the INNERVAR is also an OUTERVAR for some other
1146+
* outer-join clause, in which case the process can be repeated. So we repeat
1147+
* looping over the lists of clauses until no further deductions can be made.
1148+
* Whenever we do make a deduction, we remove the generating clause from the
1149+
* lists, since we don't want to make the same deduction twice.
1150+
*
11441151
* If we don't find any match for a set-aside outer join clause, we must
11451152
* throw it back into the regular joinclause processing by passing it to
1146-
* distribute_restrictinfo_to_rels().
1153+
* distribute_restrictinfo_to_rels(). If we do generate a derived clause,
1154+
* however, the outer-join clause is redundant. We still throw it back,
1155+
* because otherwise the join will be seen as a clauseless join and avoided
1156+
* during join order searching; but we mark it as redundant to keep from
1157+
* messing up the joinrel's size estimate. (This behavior means that the
1158+
* API for this routine is uselessly complex: we could have just put all
1159+
* the clauses into the regular processing initially. We keep it because
1160+
* someday we might want to do something else, such as inserting "dummy"
1161+
* joinclauses instead of real ones.)
1162+
*
1163+
* Outer join clauses that are marked outerjoin_delayed are special: this
1164+
* condition means that one or both VARs might go to null due to a lower
1165+
* outer join. We can still push a constant through the clause, but only
1166+
* if its operator is strict; and we *have to* throw the clause back into
1167+
* regular joinclause processing. By keeping the strict join clause,
1168+
* we ensure that any null-extended rows that are mistakenly generated due
1169+
* to suppressing rows not matching the constant will be rejected at the
1170+
* upper outer join. (This doesn't work for full-join clauses.)
11471171
*/
11481172
void
11491173
reconsider_outer_join_clauses(PlannerInfo *root)
11501174
{
1151-
ListCell *lc;
1175+
bool found;
1176+
ListCell *cell;
1177+
ListCell *prev;
1178+
ListCell *next;
1179+
1180+
/* Outer loop repeats until we find no more deductions */
1181+
do
1182+
{
1183+
found = false;
1184+
1185+
/* Process the LEFT JOIN clauses */
1186+
prev = NULL;
1187+
for (cell = list_head(root->left_join_clauses); cell; cell = next)
1188+
{
1189+
RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);
1190+
1191+
next = lnext(cell);
1192+
if (reconsider_outer_join_clause(root, rinfo, true))
1193+
{
1194+
found = true;
1195+
/* remove it from the list */
1196+
root->left_join_clauses =
1197+
list_delete_cell(root->left_join_clauses, cell, prev);
1198+
/* we throw it back anyway (see notes above) */
1199+
/* but the thrown-back clause has no extra selectivity */
1200+
rinfo->this_selec = 1.0;
1201+
distribute_restrictinfo_to_rels(root, rinfo);
1202+
}
1203+
else
1204+
prev = cell;
1205+
}
1206+
1207+
/* Process the RIGHT JOIN clauses */
1208+
prev = NULL;
1209+
for (cell = list_head(root->right_join_clauses); cell; cell = next)
1210+
{
1211+
RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);
1212+
1213+
next = lnext(cell);
1214+
if (reconsider_outer_join_clause(root, rinfo, false))
1215+
{
1216+
found = true;
1217+
/* remove it from the list */
1218+
root->right_join_clauses =
1219+
list_delete_cell(root->right_join_clauses, cell, prev);
1220+
/* we throw it back anyway (see notes above) */
1221+
/* but the thrown-back clause has no extra selectivity */
1222+
rinfo->this_selec = 1.0;
1223+
distribute_restrictinfo_to_rels(root, rinfo);
1224+
}
1225+
else
1226+
prev = cell;
1227+
}
1228+
1229+
/* Process the FULL JOIN clauses */
1230+
prev = NULL;
1231+
for (cell = list_head(root->full_join_clauses); cell; cell = next)
1232+
{
1233+
RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);
1234+
1235+
next = lnext(cell);
1236+
if (reconsider_full_join_clause(root, rinfo))
1237+
{
1238+
found = true;
1239+
/* remove it from the list */
1240+
root->full_join_clauses =
1241+
list_delete_cell(root->full_join_clauses, cell, prev);
1242+
/* we throw it back anyway (see notes above) */
1243+
/* but the thrown-back clause has no extra selectivity */
1244+
rinfo->this_selec = 1.0;
1245+
distribute_restrictinfo_to_rels(root, rinfo);
1246+
}
1247+
else
1248+
prev = cell;
1249+
}
1250+
} while (found);
11521251

1153-
/* Process the LEFT JOIN clauses */
1154-
foreach(lc, root->left_join_clauses)
1252+
/* Now, any remaining clauses have to be thrown back */
1253+
foreach(cell, root->left_join_clauses)
11551254
{
1156-
RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
1255+
RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);
11571256

1158-
reconsider_outer_join_clause(root, rinfo, true);
1257+
distribute_restrictinfo_to_rels(root, rinfo);
11591258
}
1160-
/* And the RIGHT JOIN clauses */
1161-
foreach(lc, root->right_join_clauses)
1259+
foreach(cell, root->right_join_clauses)
11621260
{
1163-
RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
1261+
RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);
11641262

1165-
reconsider_outer_join_clause(root, rinfo, false);
1263+
distribute_restrictinfo_to_rels(root, rinfo);
11661264
}
1167-
/* And the FULL JOIN clauses */
1168-
foreach(lc, root->full_join_clauses)
1265+
foreach(cell, root->full_join_clauses)
11691266
{
1170-
RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
1267+
RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);
11711268

1172-
reconsider_full_join_clause(root, rinfo);
1269+
distribute_restrictinfo_to_rels(root, rinfo);
11731270
}
11741271
}
11751272

11761273
/*
11771274
* reconsider_outer_join_clauses for a single LEFT/RIGHT JOIN clause
1275+
*
1276+
* Returns TRUE if we were able to propagate a constant through the clause.
11781277
*/
1179-
static void
1278+
static bool
11801279
reconsider_outer_join_clause(PlannerInfo *root, RestrictInfo *rinfo,
11811280
bool outer_on_left)
11821281
{
11831282
Expr *outervar,
11841283
*innervar;
1185-
Oid left_type,
1284+
Oid opno,
1285+
left_type,
11861286
right_type,
11871287
inner_datatype;
11881288
Relids inner_relids;
11891289
ListCell *lc1;
11901290

1191-
/* Extract needed info from the clause */
11921291
Assert(is_opclause(rinfo->clause));
1193-
op_input_types(((OpExpr *) rinfo->clause)->opno,
1194-
&left_type, &right_type);
1292+
opno = ((OpExpr *) rinfo->clause)->opno;
1293+
1294+
/* If clause is outerjoin_delayed, operator must be strict */
1295+
if (rinfo->outerjoin_delayed && !op_strict(opno))
1296+
return false;
1297+
1298+
/* Extract needed info from the clause */
1299+
op_input_types(opno, &left_type, &right_type);
11951300
if (outer_on_left)
11961301
{
11971302
outervar = (Expr *) get_leftop(rinfo->clause);
@@ -1266,41 +1371,46 @@ reconsider_outer_join_clause(PlannerInfo *root, RestrictInfo *rinfo,
12661371
}
12671372

12681373
/*
1269-
* If we were able to equate INNERVAR to any constant, we're done, and
1270-
* we can throw away the outer-join clause as redundant. Otherwise,
1271-
* fall out of the search loop, since we know the OUTERVAR appears in
1272-
* at most one EC.
1374+
* If we were able to equate INNERVAR to any constant, report success.
1375+
* Otherwise, fall out of the search loop, since we know the OUTERVAR
1376+
* appears in at most one EC.
12731377
*/
12741378
if (match)
1275-
return;
1379+
return true;
12761380
else
12771381
break;
12781382
}
12791383

1280-
/* We did not find a match, so throw it back into regular processing */
1281-
distribute_restrictinfo_to_rels(root, rinfo);
1384+
return false; /* failed to make any deduction */
12821385
}
12831386

12841387
/*
12851388
* reconsider_outer_join_clauses for a single FULL JOIN clause
1389+
*
1390+
* Returns TRUE if we were able to propagate a constant through the clause.
12861391
*/
1287-
static void
1392+
static bool
12881393
reconsider_full_join_clause(PlannerInfo *root, RestrictInfo *rinfo)
12891394
{
12901395
Expr *leftvar;
12911396
Expr *rightvar;
1292-
Oid left_type,
1397+
Oid opno,
1398+
left_type,
12931399
right_type;
12941400
Relids left_relids,
12951401
right_relids;
12961402
ListCell *lc1;
12971403

1404+
/* Can't use an outerjoin_delayed clause here */
1405+
if (rinfo->outerjoin_delayed)
1406+
return false;
1407+
12981408
/* Extract needed info from the clause */
12991409
Assert(is_opclause(rinfo->clause));
1410+
opno = ((OpExpr *) rinfo->clause)->opno;
1411+
op_input_types(opno, &left_type, &right_type);
13001412
leftvar = (Expr *) get_leftop(rinfo->clause);
13011413
rightvar = (Expr *) get_rightop(rinfo->clause);
1302-
op_input_types(((OpExpr *) rinfo->clause)->opno,
1303-
&left_type, &right_type);
13041414
left_relids = rinfo->left_relids;
13051415
right_relids = rinfo->right_relids;
13061416

@@ -1412,7 +1522,7 @@ reconsider_full_join_clause(PlannerInfo *root, RestrictInfo *rinfo)
14121522
if (matchleft && matchright)
14131523
{
14141524
cur_ec->ec_members = list_delete_ptr(cur_ec->ec_members, coal_em);
1415-
return;
1525+
return true;
14161526
}
14171527

14181528
/*
@@ -1423,8 +1533,7 @@ reconsider_full_join_clause(PlannerInfo *root, RestrictInfo *rinfo)
14231533
break;
14241534
}
14251535

1426-
/* We did not find a match, so throw it back into regular processing */
1427-
distribute_restrictinfo_to_rels(root, rinfo);
1536+
return false; /* failed to make any deduction */
14281537
}
14291538

14301539

@@ -1651,16 +1760,22 @@ have_relevant_eclass_joinclause(PlannerInfo *root,
16511760
ListCell *lc2;
16521761

16531762
/*
1654-
* Won't generate joinclauses if const or single-member (the latter
1655-
* test covers the volatile case too)
1763+
* Won't generate joinclauses if single-member (this test covers the
1764+
* volatile case too)
16561765
*/
1657-
if (ec->ec_has_const || list_length(ec->ec_members) <= 1)
1766+
if (list_length(ec->ec_members) <= 1)
16581767
continue;
16591768

16601769
/*
16611770
* Note we don't test ec_broken; if we did, we'd need a separate code
16621771
* path to look through ec_sources. Checking the members anyway is OK
16631772
* as a possibly-overoptimistic heuristic.
1773+
*
1774+
* We don't test ec_has_const either, even though a const eclass
1775+
* won't generate real join clauses. This is because if we had
1776+
* "WHERE a.x = b.y and a.x = 42", it is worth considering a join
1777+
* between a and b, since the join result is likely to be small even
1778+
* though it'll end up being an unqualified nestloop.
16641779
*/
16651780

16661781
/* Needn't scan if it couldn't contain members from each rel */
@@ -1674,8 +1789,8 @@ have_relevant_eclass_joinclause(PlannerInfo *root,
16741789
{
16751790
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
16761791

1677-
if (cur_em->em_is_child)
1678-
continue; /* ignore children here */
1792+
if (cur_em->em_is_const || cur_em->em_is_child)
1793+
continue; /* ignore consts and children here */
16791794
if (bms_is_subset(cur_em->em_relids, rel1->relids))
16801795
{
16811796
has_rel1 = true;
@@ -1719,16 +1834,22 @@ has_relevant_eclass_joinclause(PlannerInfo *root, RelOptInfo *rel1)
17191834
ListCell *lc2;
17201835

17211836
/*
1722-
* Won't generate joinclauses if const or single-member (the latter
1723-
* test covers the volatile case too)
1837+
* Won't generate joinclauses if single-member (this test covers the
1838+
* volatile case too)
17241839
*/
1725-
if (ec->ec_has_const || list_length(ec->ec_members) <= 1)
1840+
if (list_length(ec->ec_members) <= 1)
17261841
continue;
17271842

17281843
/*
17291844
* Note we don't test ec_broken; if we did, we'd need a separate code
17301845
* path to look through ec_sources. Checking the members anyway is OK
17311846
* as a possibly-overoptimistic heuristic.
1847+
*
1848+
* We don't test ec_has_const either, even though a const eclass
1849+
* won't generate real join clauses. This is because if we had
1850+
* "WHERE a.x = b.y and a.x = 42", it is worth considering a join
1851+
* between a and b, since the join result is likely to be small even
1852+
* though it'll end up being an unqualified nestloop.
17321853
*/
17331854

17341855
/* Needn't scan if it couldn't contain members from each rel */
@@ -1742,8 +1863,8 @@ has_relevant_eclass_joinclause(PlannerInfo *root, RelOptInfo *rel1)
17421863
{
17431864
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
17441865

1745-
if (cur_em->em_is_child)
1746-
continue; /* ignore children here */
1866+
if (cur_em->em_is_const || cur_em->em_is_child)
1867+
continue; /* ignore consts and children here */
17471868
if (bms_is_subset(cur_em->em_relids, rel1->relids))
17481869
{
17491870
has_rel1 = true;

0 commit comments

Comments
 (0)