Skip to content

Commit 841c9b6

Browse files
committed
Postpone creation of pathkeys lists to fix bug #8049.
This patch gets rid of the concept of, and infrastructure for, non-canonical PathKeys; we now only ever create canonical pathkey lists. The need for non-canonical pathkeys came from the desire to have grouping_planner initialize query_pathkeys and related pathkey lists before calling query_planner. However, since query_planner didn't actually *do* anything with those lists before they'd been made canonical, we can get rid of the whole mess by just not creating the lists at all until the point where we formerly canonicalized them. There are several ways in which we could implement that without making query_planner itself deal with grouping/sorting features (which are supposed to be the province of grouping_planner). I chose to add a callback function to query_planner's API; other alternatives would have required adding more fields to PlannerInfo, which while not bad in itself would create an ABI break for planner-related plugins in the 9.2 release series. This still breaks ABI for anything that calls query_planner directly, but it seems somewhat unlikely that there are any such plugins. I had originally conceived of this change as merely a step on the way to fixing bug #8049 from Teun Hoogendoorn; but it turns out that this fixes that bug all by itself, as per the added regression test. The reason is that now get_eclass_for_sort_expr is adding the ORDER BY expression at the end of EquivalenceClass creation not the start, and so anything that is in a multi-member EquivalenceClass has already been created with correct em_nullable_relids. I am suspicious that there are related scenarios in which we still need to teach get_eclass_for_sort_expr to compute correct nullable_relids, but am not eager to risk destabilizing either 9.2 or 9.3 to fix bugs that are only hypothetical. So for the moment, do this and stop here. Back-patch to 9.2 but not to earlier branches, since they don't exhibit this bug for lack of join-clause-movement logic that depends on em_nullable_relids being correct. (We might have to revisit that choice if any related bugs turn up.) In 9.2, don't change the signature of make_pathkeys_for_sortclauses nor remove canonicalize_pathkeys, so as not to risk more plugin breakage than we have to.
1 parent 95909f3 commit 841c9b6

File tree

11 files changed

+250
-281
lines changed

11 files changed

+250
-281
lines changed

src/backend/nodes/equalfuncs.c

Lines changed: 2 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -726,22 +726,8 @@ _equalFromExpr(const FromExpr *a, const FromExpr *b)
726726
static bool
727727
_equalPathKey(const PathKey *a, const PathKey *b)
728728
{
729-
/*
730-
* This is normally used on non-canonicalized PathKeys, so must chase up
731-
* to the topmost merged EquivalenceClass and see if those are the same
732-
* (by pointer equality).
733-
*/
734-
EquivalenceClass *a_eclass;
735-
EquivalenceClass *b_eclass;
736-
737-
a_eclass = a->pk_eclass;
738-
while (a_eclass->ec_merged)
739-
a_eclass = a_eclass->ec_merged;
740-
b_eclass = b->pk_eclass;
741-
while (b_eclass->ec_merged)
742-
b_eclass = b_eclass->ec_merged;
743-
if (a_eclass != b_eclass)
744-
return false;
729+
/* We assume pointer equality is sufficient to compare the eclasses */
730+
COMPARE_SCALAR_FIELD(pk_eclass);
745731
COMPARE_SCALAR_FIELD(pk_opfamily);
746732
COMPARE_SCALAR_FIELD(pk_strategy);
747733
COMPARE_SCALAR_FIELD(pk_nulls_first);

src/backend/optimizer/README

Lines changed: 5 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -589,16 +589,6 @@ since they are easily compared to the pathkeys of a potential candidate
589589
path. So, SortGroupClause lists are turned into pathkeys lists for use
590590
inside the optimizer.
591591

592-
Because we have to generate pathkeys lists from the sort clauses before
593-
we've finished EquivalenceClass merging, we cannot use the pointer-equality
594-
method of comparing PathKeys in the earliest stages of the planning
595-
process. Instead, we generate "non canonical" PathKeys that reference
596-
single-element EquivalenceClasses that might get merged later. After we
597-
complete EquivalenceClass merging, we replace these with "canonical"
598-
PathKeys that reference only fully-merged classes, and after that we make
599-
sure we don't generate more than one copy of each "canonical" PathKey.
600-
Then it is safe to use pointer comparison on canonical PathKeys.
601-
602592
An additional refinement we can make is to insist that canonical pathkey
603593
lists (sort orderings) do not mention the same EquivalenceClass more than
604594
once. For example, in all these cases the second sort column is redundant,
@@ -651,12 +641,11 @@ mergejoinable clauses found in the quals. At the end of this process,
651641
we know all we can know about equivalence of different variables, so
652642
subsequently there will be no further merging of EquivalenceClasses.
653643
At that point it is possible to consider the EquivalenceClasses as
654-
"canonical" and build canonical PathKeys that reference them. Before
655-
we reach that point (actually, before entering query_planner at all)
656-
we also ensure that we have constructed EquivalenceClasses for all the
657-
expressions used in the query's ORDER BY and related clauses. These
658-
classes might or might not get merged together, depending on what we
659-
find in the quals.
644+
"canonical" and build canonical PathKeys that reference them. At this
645+
time we construct PathKeys for the query's ORDER BY and related clauses.
646+
(Any ordering expressions that do not appear elsewhere will result in
647+
the creation of new EquivalenceClasses, but this cannot result in merging
648+
existing classes, so canonical-ness is not lost.)
660649

661650
Because all the EquivalenceClasses are known before we begin path
662651
generation, we can use them as a guide to which indexes are of interest:

src/backend/optimizer/path/equivclass.c

Lines changed: 13 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -285,11 +285,19 @@ process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
285285
}
286286

287287
/*
288-
* Case 2: need to merge ec1 and ec2. We add ec2's items to ec1, then
289-
* set ec2's ec_merged link to point to ec1 and remove ec2 from the
290-
* eq_classes list. We cannot simply delete ec2 because that could
291-
* leave dangling pointers in existing PathKeys. We leave it behind
292-
* with a link so that the merged EC can be found.
288+
* Case 2: need to merge ec1 and ec2. This should never happen after
289+
* we've built any canonical pathkeys; if it did, those pathkeys might
290+
* be rendered non-canonical by the merge.
291+
*/
292+
if (root->canon_pathkeys != NIL)
293+
elog(ERROR, "too late to merge equivalence classes");
294+
295+
/*
296+
* We add ec2's items to ec1, then set ec2's ec_merged link to point
297+
* to ec1 and remove ec2 from the eq_classes list. We cannot simply
298+
* delete ec2 because that could leave dangling pointers in existing
299+
* PathKeys. We leave it behind with a link so that the merged EC can
300+
* be found.
293301
*/
294302
ec1->ec_members = list_concat(ec1->ec_members, ec2->ec_members);
295303
ec1->ec_sources = list_concat(ec1->ec_sources, ec2->ec_sources);

src/backend/optimizer/path/pathkeys.c

Lines changed: 37 additions & 106 deletions
Original file line numberDiff line numberDiff line change
@@ -28,8 +28,6 @@
2828
#include "utils/lsyscache.h"
2929

3030

31-
static PathKey *makePathKey(EquivalenceClass *eclass, Oid opfamily,
32-
int strategy, bool nulls_first);
3331
static PathKey *make_canonical_pathkey(PlannerInfo *root,
3432
EquivalenceClass *eclass, Oid opfamily,
3533
int strategy, bool nulls_first);
@@ -41,35 +39,16 @@ static bool right_merge_direction(PlannerInfo *root, PathKey *pathkey);
4139
* PATHKEY CONSTRUCTION AND REDUNDANCY TESTING
4240
****************************************************************************/
4341

44-
/*
45-
* makePathKey
46-
* create a PathKey node
47-
*
48-
* This does not promise to create a canonical PathKey, it's merely a
49-
* convenience routine to build the specified node.
50-
*/
51-
static PathKey *
52-
makePathKey(EquivalenceClass *eclass, Oid opfamily,
53-
int strategy, bool nulls_first)
54-
{
55-
PathKey *pk = makeNode(PathKey);
56-
57-
pk->pk_eclass = eclass;
58-
pk->pk_opfamily = opfamily;
59-
pk->pk_strategy = strategy;
60-
pk->pk_nulls_first = nulls_first;
61-
62-
return pk;
63-
}
64-
6542
/*
6643
* make_canonical_pathkey
6744
* Given the parameters for a PathKey, find any pre-existing matching
6845
* pathkey in the query's list of "canonical" pathkeys. Make a new
6946
* entry if there's not one already.
7047
*
7148
* Note that this function must not be used until after we have completed
72-
* merging EquivalenceClasses.
49+
* merging EquivalenceClasses. (We don't try to enforce that here; instead,
50+
* equivclass.c will complain if a merge occurs after root->canon_pathkeys
51+
* has become nonempty.)
7352
*/
7453
static PathKey *
7554
make_canonical_pathkey(PlannerInfo *root,
@@ -100,7 +79,12 @@ make_canonical_pathkey(PlannerInfo *root,
10079
*/
10180
oldcontext = MemoryContextSwitchTo(root->planner_cxt);
10281

103-
pk = makePathKey(eclass, opfamily, strategy, nulls_first);
82+
pk = makeNode(PathKey);
83+
pk->pk_eclass = eclass;
84+
pk->pk_opfamily = opfamily;
85+
pk->pk_strategy = strategy;
86+
pk->pk_nulls_first = nulls_first;
87+
10488
root->canon_pathkeys = lappend(root->canon_pathkeys, pk);
10589

10690
MemoryContextSwitchTo(oldcontext);
@@ -112,8 +96,7 @@ make_canonical_pathkey(PlannerInfo *root,
11296
* pathkey_is_redundant
11397
* Is a pathkey redundant with one already in the given list?
11498
*
115-
* Both the given pathkey and the list members must be canonical for this
116-
* to work properly. We detect two cases:
99+
* We detect two cases:
117100
*
118101
* 1. If the new pathkey's equivalence class contains a constant, and isn't
119102
* below an outer join, then we can disregard it as a sort key. An example:
@@ -135,6 +118,12 @@ make_canonical_pathkey(PlannerInfo *root,
135118
* Note in particular that we need not compare opfamily (all the opfamilies
136119
* of the EC have the same notion of equality) nor sort direction.
137120
*
121+
* Both the given pathkey and the list members must be canonical for this
122+
* to work properly, but that's okay since we no longer ever construct any
123+
* non-canonical pathkeys. (Note: the notion of a pathkey *list* being
124+
* canonical includes the additional requirement of no redundant entries,
125+
* which is exactly what we are checking for here.)
126+
*
138127
* Because the equivclass.c machinery forms only one copy of any EC per query,
139128
* pointer comparison is enough to decide whether canonical ECs are the same.
140129
*/
@@ -144,9 +133,6 @@ pathkey_is_redundant(PathKey *new_pathkey, List *pathkeys)
144133
EquivalenceClass *new_ec = new_pathkey->pk_eclass;
145134
ListCell *lc;
146135

147-
/* Assert we've been given canonical pathkeys */
148-
Assert(!new_ec->ec_merged);
149-
150136
/* Check for EC containing a constant --- unconditionally redundant */
151137
if (EC_MUST_BE_REDUNDANT(new_ec))
152138
return true;
@@ -156,9 +142,6 @@ pathkey_is_redundant(PathKey *new_pathkey, List *pathkeys)
156142
{
157143
PathKey *old_pathkey = (PathKey *) lfirst(lc);
158144

159-
/* Assert we've been given canonical pathkeys */
160-
Assert(!old_pathkey->pk_eclass->ec_merged);
161-
162145
if (new_ec == old_pathkey->pk_eclass)
163146
return true;
164147
}
@@ -170,53 +153,22 @@ pathkey_is_redundant(PathKey *new_pathkey, List *pathkeys)
170153
* canonicalize_pathkeys
171154
* Convert a not-necessarily-canonical pathkeys list to canonical form.
172155
*
173-
* Note that this function must not be used until after we have completed
174-
* merging EquivalenceClasses.
156+
* This is a no-op now that all pathkeys are canonical, but we keep it
157+
* to avoid possibly breaking plug-ins in the 9.2 release cycle.
158+
*
159+
* Since the previous implementation made a fresh List, we use list_copy
160+
* (NOT list_copy_deep) to preserve that behavior.
175161
*/
176162
List *
177163
canonicalize_pathkeys(PlannerInfo *root, List *pathkeys)
178164
{
179-
List *new_pathkeys = NIL;
180-
ListCell *l;
181-
182-
foreach(l, pathkeys)
183-
{
184-
PathKey *pathkey = (PathKey *) lfirst(l);
185-
EquivalenceClass *eclass;
186-
PathKey *cpathkey;
187-
188-
/* Find the canonical (merged) EquivalenceClass */
189-
eclass = pathkey->pk_eclass;
190-
while (eclass->ec_merged)
191-
eclass = eclass->ec_merged;
192-
193-
/*
194-
* If we can tell it's redundant just from the EC, skip.
195-
* pathkey_is_redundant would notice that, but we needn't even bother
196-
* constructing the node...
197-
*/
198-
if (EC_MUST_BE_REDUNDANT(eclass))
199-
continue;
200-
201-
/* OK, build a canonicalized PathKey struct */
202-
cpathkey = make_canonical_pathkey(root,
203-
eclass,
204-
pathkey->pk_opfamily,
205-
pathkey->pk_strategy,
206-
pathkey->pk_nulls_first);
207-
208-
/* Add to list unless redundant */
209-
if (!pathkey_is_redundant(cpathkey, new_pathkeys))
210-
new_pathkeys = lappend(new_pathkeys, cpathkey);
211-
}
212-
return new_pathkeys;
165+
return list_copy(pathkeys);
213166
}
214167

215168
/*
216169
* make_pathkey_from_sortinfo
217170
* Given an expression and sort-order information, create a PathKey.
218-
* If canonicalize = true, the result is a "canonical" PathKey,
219-
* otherwise not. (But note it might be redundant anyway.)
171+
* The result is always a "canonical" PathKey, but it might be redundant.
220172
*
221173
* If the PathKey is being generated from a SortGroupClause, sortref should be
222174
* the SortGroupClause's SortGroupRef; otherwise zero.
@@ -229,9 +181,6 @@ canonicalize_pathkeys(PlannerInfo *root, List *pathkeys)
229181
* create_it is TRUE if we should create any missing EquivalenceClass
230182
* needed to represent the sort key. If it's FALSE, we return NULL if the
231183
* sort key isn't already present in any EquivalenceClass.
232-
*
233-
* canonicalize should always be TRUE after EquivalenceClass merging has
234-
* been performed, but FALSE if we haven't done EquivalenceClass merging yet.
235184
*/
236185
static PathKey *
237186
make_pathkey_from_sortinfo(PlannerInfo *root,
@@ -251,6 +200,8 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
251200
List *opfamilies;
252201
EquivalenceClass *eclass;
253202

203+
Assert(canonicalize);
204+
254205
strategy = reverse_sort ? BTGreaterStrategyNumber : BTLessStrategyNumber;
255206

256207
/*
@@ -281,11 +232,8 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
281232
return NULL;
282233

283234
/* And finally we can find or create a PathKey node */
284-
if (canonicalize)
285-
return make_canonical_pathkey(root, eclass, opfamily,
286-
strategy, nulls_first);
287-
else
288-
return makePathKey(eclass, opfamily, strategy, nulls_first);
235+
return make_canonical_pathkey(root, eclass, opfamily,
236+
strategy, nulls_first);
289237
}
290238

291239
/*
@@ -341,9 +289,8 @@ make_pathkey_from_sortop(PlannerInfo *root,
341289
* Compare two pathkeys to see if they are equivalent, and if not whether
342290
* one is "better" than the other.
343291
*
344-
* This function may only be applied to canonicalized pathkey lists.
345-
* In the canonical representation, pathkeys can be checked for equality
346-
* by simple pointer comparison.
292+
* We assume the pathkeys are canonical, and so they can be checked for
293+
* equality by simple pointer comparison.
347294
*/
348295
PathKeysComparison
349296
compare_pathkeys(List *keys1, List *keys2)
@@ -364,15 +311,6 @@ compare_pathkeys(List *keys1, List *keys2)
364311
PathKey *pathkey1 = (PathKey *) lfirst(key1);
365312
PathKey *pathkey2 = (PathKey *) lfirst(key2);
366313

367-
/*
368-
* XXX would like to check that we've been given canonicalized input,
369-
* but PlannerInfo not accessible here...
370-
*/
371-
#ifdef NOT_USED
372-
Assert(list_member_ptr(root->canon_pathkeys, pathkey1));
373-
Assert(list_member_ptr(root->canon_pathkeys, pathkey2));
374-
#endif
375-
376314
if (pathkey1 != pathkey2)
377315
return PATHKEYS_DIFFERENT; /* no need to keep looking */
378316
}
@@ -414,7 +352,7 @@ pathkeys_contained_in(List *keys1, List *keys2)
414352
* Return NULL if no such path.
415353
*
416354
* 'paths' is a list of possible paths that all generate the same relation
417-
* 'pathkeys' represents a required ordering (already canonicalized!)
355+
* 'pathkeys' represents a required ordering (in canonical form!)
418356
* 'required_outer' denotes allowable outer relations for parameterized paths
419357
* 'cost_criterion' is STARTUP_COST or TOTAL_COST
420358
*/
@@ -455,7 +393,7 @@ get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
455393
* parameter.
456394
*
457395
* 'paths' is a list of possible paths that all generate the same relation
458-
* 'pathkeys' represents a required ordering (already canonicalized!)
396+
* 'pathkeys' represents a required ordering (in canonical form!)
459397
* 'required_outer' denotes allowable outer relations for parameterized paths
460398
* 'fraction' is the fraction of the total tuples expected to be retrieved
461399
*/
@@ -829,12 +767,8 @@ build_join_pathkeys(PlannerInfo *root,
829767
* Generate a pathkeys list that represents the sort order specified
830768
* by a list of SortGroupClauses
831769
*
832-
* If canonicalize is TRUE, the resulting PathKeys are all in canonical form;
833-
* otherwise not. canonicalize should always be TRUE after EquivalenceClass
834-
* merging has been performed, but FALSE if we haven't done EquivalenceClass
835-
* merging yet. (We provide this option because grouping_planner() needs to
836-
* be able to represent requested pathkeys before the equivalence classes have
837-
* been created for the query.)
770+
* The resulting PathKeys are always in canonical form. (Actually, there
771+
* is no longer any code anywhere that creates non-canonical PathKeys.)
838772
*
839773
* 'sortclauses' is a list of SortGroupClause nodes
840774
* 'tlist' is the targetlist to find the referenced tlist entries in
@@ -848,6 +782,8 @@ make_pathkeys_for_sortclauses(PlannerInfo *root,
848782
List *pathkeys = NIL;
849783
ListCell *l;
850784

785+
Assert(canonicalize);
786+
851787
foreach(l, sortclauses)
852788
{
853789
SortGroupClause *sortcl = (SortGroupClause *) lfirst(l);
@@ -862,15 +798,10 @@ make_pathkeys_for_sortclauses(PlannerInfo *root,
862798
sortcl->nulls_first,
863799
sortcl->tleSortGroupRef,
864800
true,
865-
canonicalize);
801+
true);
866802

867803
/* Canonical form eliminates redundant ordering keys */
868-
if (canonicalize)
869-
{
870-
if (!pathkey_is_redundant(pathkey, pathkeys))
871-
pathkeys = lappend(pathkeys, pathkey);
872-
}
873-
else
804+
if (!pathkey_is_redundant(pathkey, pathkeys))
874805
pathkeys = lappend(pathkeys, pathkey);
875806
}
876807
return pathkeys;

0 commit comments

Comments
 (0)