Skip to content

Commit 72bd38c

Browse files
committed
Transform OR clauses to ANY expression
Replace (expr op C1) OR (expr op C2) ... with expr op ANY(ARRAY[C1, C2, ...]) on the preliminary stage of optimization when we are still working with the expression tree. Here Cn is a n-th constant expression, 'expr' is non-constant expression, 'op' is an operator which returns boolean result and has a commuter (for the case of reverse order of constant and non-constant parts of the expression, like 'Cn op expr'). Sometimes it can lead to not optimal plan. This is why there is a or_to_any_transform_limit GUC. It specifies a threshold value of length of arguments in an OR expression that triggers the OR-to-ANY transformation. Generally, more groupable OR arguments mean that transformation will be more likely to win than to lose. Discussion: https://postgr.es/m/567ED6CA.2040504%40sigaev.ru Author: Alena Rybakina <lena.ribackina@yandex.ru> Author: Andrey Lepikhov <a.lepikhov@postgrespro.ru> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Ranier Vilela <ranier.vf@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com>
1 parent 75a47b6 commit 72bd38c

File tree

14 files changed

+785
-20
lines changed

14 files changed

+785
-20
lines changed

doc/src/sgml/config.sgml

Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6304,6 +6304,63 @@ SELECT * FROM parent WHERE key = 2400;
63046304
</listitem>
63056305
</varlistentry>
63066306

6307+
<varlistentry id="guc-or-to-any-transform-limit" xreflabel="or_to_any_transform_limit">
6308+
<term><varname>or_to_any_transform_limit</varname> (<type>boolean</type>)
6309+
<indexterm>
6310+
<primary><varname>or_to_any_transform_limit</varname> configuration parameter</primary>
6311+
</indexterm>
6312+
</term>
6313+
<listitem>
6314+
<para>
6315+
Sets the minimum length of arguments in an <literal>OR</literal>
6316+
expression exceeding which planner will try to lookup and group
6317+
multiple similar <literal>OR</literal> expressions to
6318+
<literal>ANY</literal> (<xref linkend="functions-comparisons-any-some"/>)
6319+
expressions. The grouping technique of this transformation is based
6320+
on the equivalence of variable sides. One side of such an expression
6321+
must be a constant clause, and the other must contain a variable
6322+
clause. The default value is <literal>5</literal>. The value of
6323+
<literal>-1</literal> completely disables the transformation.
6324+
</para>
6325+
<para>
6326+
The advantage of this <literal>OR-to-ANY</literal> transformation is
6327+
faster query planning and execution. In certain cases, this
6328+
transformation also leads to more effective plans containing
6329+
a single index scan instead of multiple bitmap scans. However, it
6330+
may also cause a planning regression when distinct
6331+
<literal>OR</literal> arguments are better to match to distinct indexes.
6332+
This may happen when they have different matching partial indexes or
6333+
have different distributions of other columns used in the query.
6334+
Generally, more groupable <literal>OR</literal> arguments mean that
6335+
transformation will be more likely to win than to lose.
6336+
</para>
6337+
<para>
6338+
For example, this query has its set of five <literal>OR</literal>
6339+
expressions transformed to <literal>ANY</literal> with the default
6340+
value of <varname>or_to_any_transform_limit</varname>. But not with
6341+
the increased value.
6342+
<programlisting>
6343+
# EXPLAIN SELECT * FROM tbl WHERE key = 1 OR key = 2 OR key = 3 OR key = 4 OR key = 5;
6344+
QUERY PLAN
6345+
-----------------------------------------------------
6346+
Seq Scan on tbl (cost=0.00..51.44 rows=64 width=4)
6347+
Filter: (key = ANY ('{1,2,3,4,5}'::integer[]))
6348+
(2 rows)
6349+
6350+
# SET or_to_any_transform_limit = 6;
6351+
SET
6352+
6353+
# EXPLAIN SELECT * FROM tbl WHERE key = 1 OR key = 2 OR key = 3 OR key = 4 OR key = 5;
6354+
QUERY PLAN
6355+
---------------------------------------------------------------------------
6356+
Seq Scan on tbl (cost=0.00..67.38 rows=63 width=4)
6357+
Filter: ((key = 1) OR (key = 2) OR (key = 3) OR (key = 4) OR (key = 5))
6358+
(2 rows)
6359+
</programlisting>
6360+
</para>
6361+
</listitem>
6362+
</varlistentry>
6363+
63076364
<varlistentry id="guc-plan-cache-mode" xreflabel="plan_cache_mode">
63086365
<term><varname>plan_cache_mode</varname> (<type>enum</type>)
63096366
<indexterm>

src/backend/nodes/queryjumblefuncs.c

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -141,6 +141,33 @@ JumbleQuery(Query *query)
141141
return jstate;
142142
}
143143

144+
JumbleState *
145+
JumbleExpr(Expr *expr, uint64 *exprId)
146+
{
147+
JumbleState *jstate = NULL;
148+
149+
Assert(exprId != NULL);
150+
151+
jstate = (JumbleState *) palloc(sizeof(JumbleState));
152+
153+
/* Set up workspace for query jumbling */
154+
jstate->jumble = (unsigned char *) palloc(JUMBLE_SIZE);
155+
jstate->jumble_len = 0;
156+
jstate->clocations_buf_size = 32;
157+
jstate->clocations = (LocationLen *)
158+
palloc(jstate->clocations_buf_size * sizeof(LocationLen));
159+
jstate->clocations_count = 0;
160+
jstate->highest_extern_param_id = 0;
161+
162+
/* Compute query ID */
163+
_jumbleNode(jstate, (Node *) expr);
164+
*exprId = DatumGetUInt64(hash_any_extended(jstate->jumble,
165+
jstate->jumble_len,
166+
0));
167+
168+
return jstate;
169+
}
170+
144171
/*
145172
* Enables query identifier computation.
146173
*

0 commit comments

Comments
 (0)