Commit 9e21537

Fix planner's use of Result Cache with unique joins
When the planner considered using a Result Cache node to cache results from the inner side of a Nested Loop Join, it failed to consider that the inner path's parameterization may not be the entire join condition. If the join was marked as inner_unique then we may accidentally put the cache in singlerow mode. This meant that entries would be marked as complete after caching the first row. That was wrong as if only part of the join condition was parameterized then the uniqueness of the unique join was not guaranteed at the Result Cache's level. The uniqueness is only guaranteed after Nested Loop applies the join filter. If subsequent rows were found, this would lead to:

    ERROR: cache entry already complete

This could have been fixed by only putting the cache in singlerow mode if the entire join condition was parameterized. However, Nested Loop will only read its inner side so far as the first matching row when the join is unique, so that might mean we never get an opportunity to mark cache entries as complete. Since non-complete cache entries are useless for subsequent lookups, we just don't bother considering a Result Cache path in this case.

In passing, remove the XXX comment that claimed the above ERROR might be better suited to be an Assert. After there being an actual case which triggered it, it seems better to keep it an ERROR.

Reported-by: David Christensen
Discussion: https://postgr.es/m/CAOxo6X+dy-V58iEPFgst8ahPKEU+38NZzUuc+a7wDBZd4TrHMQ@mail.gmail.com

1 parent 0cdaa05 commit 9e21537

2 files changed, +32 -1 lines changed

src/backend/executor/nodeResultCache.c

+1 -1

@@ -760,7 +760,7 @@ ExecResultCache(PlanState *pstate)
     /*
      * Validate if the planner properly set the singlerow flag. It
      * should only set that if each cache entry can, at most,
-     * return 1 row. XXX maybe this should be an Assert?
+     * return 1 row.
      */
     if (unlikely(entry->complete))
         elog(ERROR, "cache entry already complete");

src/backend/optimizer/path/joinpath.c

+31

@@ -503,6 +503,37 @@ get_resultcache_path(PlannerInfo *root, RelOptInfo *innerrel,
          jointype == JOIN_ANTI))
         return NULL;
 
+    /*
+     * Result Cache normally marks cache entries as complete when it runs out
+     * of tuples to read from its subplan. However, with unique joins, Nested
+     * Loop will skip to the next outer tuple after finding the first matching
+     * inner tuple. This means that we may not read the inner side of the
+     * join to completion which leaves no opportunity to mark the cache entry
+     * as complete. To work around that, when the join is unique we
+     * automatically mark cache entries as complete after fetching the first
+     * tuple. This works when the entire join condition is parameterized.
+     * Otherwise, when the parameterization is only a subset of the join
+     * condition, we can't be sure which part of it causes the join to be
+     * unique. This means there are no guarantees that only 1 tuple will be
+     * read. We cannot mark the cache entry as complete after reading the
+     * first tuple without that guarantee. This means the scope of Result
+     * Cache's usefulness is limited to only outer rows that have no join
+     * partner as this is the only case where Nested Loop would exhaust the
+     * inner scan of a unique join. Since the scope is limited to that, we
+     * just don't bother making a result cache path in this case.
+     *
+     * Lateral vars needn't be considered here as they're not considered when
+     * determining if the join is unique.
+     *
+     * XXX this could be enabled if the remaining join quals were made part of
+     * the inner scan's filter instead of the join filter. Maybe it's worth
+     * considering doing that?
+     */
+    if (extra->inner_unique &&
+        list_length(inner_path->param_info->ppi_clauses) <
+        list_length(extra->restrictlist))
+        return NULL;
+
     /*
      * We can't use a result cache if there are volatile functions in the
      * inner rel's target list or restrict list. A cache hit could reduce the
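
The new planner test can be read as a small predicate over three inputs. The sketch below restates it in isolation, under the assumption that two plain integers can stand in for `list_length(inner_path->param_info->ppi_clauses)` and `list_length(extra->restrictlist)`. The function name `toy_resultcache_allowed` and its parameters are hypothetical and do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy restatement of the check added to get_resultcache_path().
 *
 * inner_unique    - the join was proven to match at most one inner row
 * n_param_clauses - number of join clauses pushed down as parameters of
 *                   the inner path (stand-in for ppi_clauses)
 * n_join_clauses  - number of clauses in the whole join condition
 *                   (stand-in for restrictlist)
 *
 * Returns false when a Result Cache path should not be considered.
 */
static bool
toy_resultcache_allowed(bool inner_unique,
                        int n_param_clauses,
                        int n_join_clauses)
{
    assert(n_param_clauses >= 0 && n_join_clauses >= 0);

    /*
     * With a unique join, uniqueness is only guaranteed once every join
     * clause has been applied.  If some clauses are left to the join
     * filter, the cache key may map to more than one inner row.
     */
    if (inner_unique && n_param_clauses < n_join_clauses)
        return false;

    return true;
}
```

For example, a unique join with two join clauses of which only one is parameterized gives `toy_resultcache_allowed(true, 1, 2) == false`, while the same join with both clauses parameterized, or a non-unique join, is still allowed.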
