Skip to content

Commit f0fedfe

Browse files
committed
Allow polymorphic aggregates to have non-polymorphic state data types.
Before 9.4, such an aggregate couldn't be declared, because its final function would have to have polymorphic result type but no polymorphic argument, which CREATE FUNCTION would quite properly reject. The ordered-set-aggregate patch found a workaround: allow the final function to be declared as accepting additional dummy arguments that have types matching the aggregate's regular input arguments. However, we failed to notice that this problem applies just as much to regular aggregates, despite the fact that we had a built-in regular aggregate array_agg() that was known to be undeclarable in SQL because its final function had an illegal signature. So what we should have done, and what this patch does, is to decouple the extra-dummy-arguments behavior from ordered-set aggregates and make it generally available for all aggregate declarations. We have to put this into 9.4 rather than waiting till later because it slightly alters the rules for declaring ordered-set aggregates. The patch turned out a bit bigger than I'd hoped because it proved necessary to record the extra-arguments option in a new pg_aggregate column. I'd thought we could just look at the final function's pronargs at runtime, but that didn't work well for variadic final functions. It's probably just as well though, because it simplifies life for pg_dump to record the option explicitly. While at it, fix array_agg() to have a valid final-function signature, and add an opr_sanity test to notice future deviations from polymorphic consistency. I also marked the percentile_cont() aggregates as not needing extra arguments, since they don't.
1 parent 125ba29 commit f0fedfe

File tree

17 files changed

+509
-298
lines changed

17 files changed

+509
-298
lines changed

doc/src/sgml/catalogs.sgml

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -404,6 +404,18 @@
404404
<entry><literal><link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.oid</literal></entry>
405405
<entry>Final function for moving-aggregate mode (zero if none)</entry>
406406
</row>
407+
<row>
408+
<entry><structfield>aggfinalextra</structfield></entry>
409+
<entry><type>bool</type></entry>
410+
<entry></entry>
411+
<entry>True to pass extra dummy arguments to aggfinalfn</entry>
412+
</row>
413+
<row>
414+
<entry><structfield>aggmfinalextra</structfield></entry>
415+
<entry><type>bool</type></entry>
416+
<entry></entry>
417+
<entry>True to pass extra dummy arguments to aggmfinalfn</entry>
418+
</row>
407419
<row>
408420
<entry><structfield>aggsortop</structfield></entry>
409421
<entry><type>oid</type></entry>

doc/src/sgml/ref/create_aggregate.sgml

Lines changed: 40 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -26,12 +26,14 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> ( [ <replacea
2626
STYPE = <replaceable class="PARAMETER">state_data_type</replaceable>
2727
[ , SSPACE = <replaceable class="PARAMETER">state_data_size</replaceable> ]
2828
[ , FINALFUNC = <replaceable class="PARAMETER">ffunc</replaceable> ]
29+
[ , FINALFUNC_EXTRA ]
2930
[ , INITCOND = <replaceable class="PARAMETER">initial_condition</replaceable> ]
3031
[ , MSFUNC = <replaceable class="PARAMETER">msfunc</replaceable> ]
3132
[ , MINVFUNC = <replaceable class="PARAMETER">minvfunc</replaceable> ]
3233
[ , MSTYPE = <replaceable class="PARAMETER">mstate_data_type</replaceable> ]
3334
[ , MSSPACE = <replaceable class="PARAMETER">mstate_data_size</replaceable> ]
3435
[ , MFINALFUNC = <replaceable class="PARAMETER">mffunc</replaceable> ]
36+
[ , MFINALFUNC_EXTRA ]
3537
[ , MINITCOND = <replaceable class="PARAMETER">minitial_condition</replaceable> ]
3638
[ , SORTOP = <replaceable class="PARAMETER">sort_operator</replaceable> ]
3739
)
@@ -42,6 +44,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> ( [ [ <replac
4244
STYPE = <replaceable class="PARAMETER">state_data_type</replaceable>
4345
[ , SSPACE = <replaceable class="PARAMETER">state_data_size</replaceable> ]
4446
[ , FINALFUNC = <replaceable class="PARAMETER">ffunc</replaceable> ]
47+
[ , FINALFUNC_EXTRA ]
4548
[ , INITCOND = <replaceable class="PARAMETER">initial_condition</replaceable> ]
4649
[ , HYPOTHETICAL ]
4750
)
@@ -54,12 +57,14 @@ CREATE AGGREGATE <replaceable class="PARAMETER">name</replaceable> (
5457
STYPE = <replaceable class="PARAMETER">state_data_type</replaceable>
5558
[ , SSPACE = <replaceable class="PARAMETER">state_data_size</replaceable> ]
5659
[ , FINALFUNC = <replaceable class="PARAMETER">ffunc</replaceable> ]
60+
[ , FINALFUNC_EXTRA ]
5761
[ , INITCOND = <replaceable class="PARAMETER">initial_condition</replaceable> ]
5862
[ , MSFUNC = <replaceable class="PARAMETER">sfunc</replaceable> ]
5963
[ , MINVFUNC = <replaceable class="PARAMETER">invfunc</replaceable> ]
6064
[ , MSTYPE = <replaceable class="PARAMETER">state_data_type</replaceable> ]
6165
[ , MSSPACE = <replaceable class="PARAMETER">state_data_size</replaceable> ]
6266
[ , MFINALFUNC = <replaceable class="PARAMETER">ffunc</replaceable> ]
67+
[ , MFINALFUNC_EXTRA ]
6368
[ , MINITCOND = <replaceable class="PARAMETER">initial_condition</replaceable> ]
6469
[ , SORTOP = <replaceable class="PARAMETER">sort_operator</replaceable> ]
6570
)
@@ -166,12 +171,25 @@ CREATE AGGREGATE <replaceable class="PARAMETER">name</replaceable> (
166171
input rows.
167172
</para>
168173

174+
<para>
175+
Sometimes it is useful to declare the final function as taking not just
176+
the state value, but extra parameters corresponding to the aggregate's
177+
input values. The main reason for doing this is if the final function
178+
is polymorphic and the state value's data type would be inadequate to
179+
pin down the result type. These extra parameters are always passed as
180+
NULL (and so the final function must not be strict when
181+
the <literal>FINALFUNC_EXTRA</> option is used), but nonetheless they
182+
are valid parameters. The final function could for example make use
183+
of <function>get_fn_expr_argtype</> to identify the actual argument type
184+
in the current call.
185+
</para>
186+
169187
<para>
170188
An aggregate can optionally support <firstterm>moving-aggregate mode</>,
171189
as described in <xref linkend="xaggr-moving-aggregates">. This requires
172190
specifying the <literal>MSFUNC</>, <literal>MINVFUNC</>,
173191
and <literal>MSTYPE</> parameters, and optionally
174-
the <literal>MSPACE</>, <literal>MFINALFUNC</>,
192+
the <literal>MSPACE</>, <literal>MFINALFUNC</>, <literal>MFINALFUNC_EXTRA</>,
175193
and <literal>MINITCOND</> parameters. Except for <literal>MINVFUNC</>,
176194
these parameters work like the corresponding simple-aggregate parameters
177195
without <literal>M</>; they define a separate implementation of the
@@ -361,12 +379,16 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
361379
<para>
362380
For ordered-set (including hypothetical-set) aggregates, the
363381
final function receives not only the final state value,
364-
but also the values of all the direct arguments, followed by
365-
null values corresponding to each aggregated argument.
366-
(The reason for including the aggregated arguments in the function
367-
signature is that this may be necessary to allow correct resolution
368-
of the aggregate result type, when a polymorphic aggregate is
369-
being defined.)
382+
but also the values of all the direct arguments.
383+
</para>
384+
385+
<para>
386+
If <literal>FINALFUNC_EXTRA</> is specified, then in addition to the
387+
final state value and any direct arguments, the final function
388+
receives extra NULL values corresponding to the aggregate's regular
389+
(aggregated) arguments. This is mainly useful to allow correct
390+
resolution of the aggregate result type when a polymorphic aggregate
391+
is being defined.
370392
</para>
371393
</listitem>
372394
</varlistentry>
@@ -438,9 +460,11 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
438460
The name of the final function called to compute the aggregate's
439461
result after all input rows have been traversed, when using
440462
moving-aggregate mode. This works the same as <replaceable>ffunc</>,
441-
except that its input type is <replaceable>mstate_data_type</>.
463+
except that its first argument's type
464+
is <replaceable>mstate_data_type</> and extra dummy arguments are
465+
specified by writing <literal>MFINALFUNC_EXTRA</>.
442466
The aggregate result type determined by <replaceable>mffunc</>
443-
and <replaceable>mstate_data_type</> must match that determined by the
467+
or <replaceable>mstate_data_type</> must match that determined by the
444468
aggregate's regular implementation.
445469
</para>
446470
</listitem>
@@ -494,6 +518,13 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
494518
<refsect1>
495519
<title>Notes</title>
496520

521+
<para>
522+
In parameters that specify support function names, you can write
523+
a schema name if needed, for example <literal>SFUNC = public.sum</>.
524+
Do not write argument types there, however &mdash; the argument types
525+
of the support functions are determined from other parameters.
526+
</para>
527+
497528
<para>
498529
If an aggregate supports moving-aggregate mode, it will improve
499530
calculation efficiency when the aggregate is used as a window function

doc/src/sgml/xaggr.sgml

Lines changed: 59 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -127,7 +127,7 @@ CREATE AGGREGATE avg (float8)
127127
<function>float8_accum</> requires a three-element array, not just
128128
two elements, because it accumulates the sum of squares as well as
129129
the sum and count of the inputs. This is so that it can be used for
130-
some other aggregates besides <function>avg</>.
130+
some other aggregates as well as <function>avg</>.
131131
</para>
132132
</note>
133133

@@ -182,7 +182,7 @@ CREATE AGGREGATE avg (float8)
182182
The inverse transition function is passed the current state value and the
183183
aggregate input value(s) for the earliest row included in the current
184184
state. It must reconstruct what the state value would have been if the
185-
given input value had never been aggregated, but only the rows following
185+
given input row had never been aggregated, but only the rows following
186186
it. This sometimes requires that the forward transition function keep
187187
more state than is needed for plain aggregation mode. Therefore, the
188188
moving-aggregate mode uses a completely separate implementation from the
@@ -339,6 +339,47 @@ SELECT attrelid::regclass, array_accum(atttypid::regtype)
339339
</programlisting>
340340
</para>
341341

342+
<para>
343+
Ordinarily, an aggregate function with a polymorphic result type has a
344+
polymorphic state type, as in the above example. This is necessary
345+
because otherwise the final function cannot be declared sensibly: it
346+
would need to have a polymorphic result type but no polymorphic argument
347+
type, which <command>CREATE FUNCTION</> will reject on the grounds that
348+
the result type cannot be deduced from a call. But sometimes it is
349+
inconvenient to use a polymorphic state type. The most common case is
350+
where the aggregate support functions are to be written in C and the
351+
state type should be declared as <type>internal</> because there is
352+
no SQL-level equivalent for it. To address this case, it is possible to
353+
declare the final function as taking extra <quote>dummy</> arguments
354+
that match the input arguments of the aggregate. Such dummy arguments
355+
are always passed as NULLs since no specific value is available when the
356+
final function is called. Their only use is to allow a polymorphic
357+
final function's result type to be connected to the aggregate's input
358+
type(s). For example, the definition of the built-in
359+
aggregate <function>array_agg</> is equivalent to
360+
361+
<programlisting>
362+
CREATE FUNCTION array_agg_transfn(internal, anyelement)
363+
RETURNS internal ...;
364+
CREATE FUNCTION array_agg_finalfn(internal, anyelement)
365+
RETURNS anyarray ...;
366+
367+
CREATE AGGREGATE array_agg (anyelement)
368+
(
369+
sfunc = array_agg_transfn,
370+
stype = internal,
371+
finalfunc = array_agg_finalfn,
372+
finalfunc_extra
373+
);
374+
</programlisting>
375+
376+
Here, the <literal>finalfunc_extra</> option specifies that the final
377+
function receives, in addition to the state value, extra dummy
378+
argument(s) corresponding to the aggregate's input argument(s).
379+
The extra <type>anyelement</> argument allows the declaration
380+
of <function>array_agg_finalfn</> to be valid.
381+
</para>
382+
342383
<para>
343384
An aggregate function can be made to accept a varying number of arguments
344385
by declaring its last argument as a <literal>VARIADIC</> array, in much
@@ -401,15 +442,23 @@ SELECT myaggregate(a, b, c ORDER BY a) FROM ...
401442
definition of <function>percentile_disc</> is equivalent to:
402443

403444
<programlisting>
445+
CREATE FUNCTION ordered_set_transition(internal, anyelement)
446+
RETURNS internal ...;
447+
CREATE FUNCTION percentile_disc_final(internal, float8, anyelement)
448+
RETURNS anyelement ...;
449+
404450
CREATE AGGREGATE percentile_disc (float8 ORDER BY anyelement)
405451
(
406452
sfunc = ordered_set_transition,
407453
stype = internal,
408-
finalfunc = percentile_disc_final
454+
finalfunc = percentile_disc_final,
455+
finalfunc_extra
409456
);
410457
</programlisting>
411458

412-
which could be used to obtain a median household income like this:
459+
This aggregate takes a <type>float8</> direct argument (the percentile
460+
fraction) and an aggregated input that can be of any sortable data type.
461+
It could be used to obtain a median household income like this:
413462

414463
<programlisting>
415464
SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;
@@ -447,25 +496,12 @@ SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;
447496
same definition as for normal aggregates, but note that the direct
448497
arguments (if any) are not provided. The final function receives
449498
the last state value, the values of the direct arguments if any,
450-
and null values corresponding to the aggregated input(s). While the
451-
null values seem useless at first sight, they are important because
452-
they make it possible to include the data types of the aggregated
453-
input(s) in the final function's signature, which may be necessary
454-
to resolve the output type of a polymorphic aggregate. For example,
455-
the built-in <function>mode()</> ordered-set aggregate takes a
456-
single aggregated column of any sortable data type and returns a
457-
value of that same type. This is possible because the final function
458-
is declared as <literal>mode_final(internal, anyelement) returns
459-
anyelement</>, with the <type>anyelement</> parameter corresponding
460-
to the dummy null argument that represents the aggregated column.
461-
The actual data is conveyed in the <type>internal</>-type state
462-
value, but type resolution needs a parse-time indication of what the
463-
result data type will be, and the dummy argument provides that.
464-
In the example of <function>percentile_disc</>, the support functions
465-
are respectively declared as
466-
<literal>ordered_set_transition(internal, "any") returns internal</>
467-
and <literal>percentile_disc_final(internal, float8, anyelement)
468-
returns anyelement</>.
499+
and (if <literal>finalfunc_extra</> is specified) NULL values
500+
corresponding to the aggregated input(s). As with normal
501+
aggregates, <literal>finalfunc_extra</> is only really useful if the
502+
aggregate is polymorphic; then the extra dummy argument(s) are needed
503+
to connect the final function's result type to the aggregate's input
504+
type(s).
469505
</para>
470506

471507
<para>

src/backend/catalog/pg_aggregate.c

Lines changed: 54 additions & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -60,6 +60,8 @@ AggregateCreate(const char *aggName,
6060
List *aggmtransfnName,
6161
List *aggminvtransfnName,
6262
List *aggmfinalfnName,
63+
bool finalfnExtraArgs,
64+
bool mfinalfnExtraArgs,
6365
List *aggsortopName,
6466
Oid aggTransType,
6567
int32 aggTransSpace,
@@ -344,48 +346,46 @@ AggregateCreate(const char *aggName,
344346
ReleaseSysCache(tup);
345347
}
346348

347-
/*
348-
* Set up fnArgs for looking up finalfn(s)
349-
*
350-
* For ordinary aggs, the finalfn just takes the transtype. For
351-
* ordered-set aggs, it takes the transtype plus all args. (The
352-
* aggregated args are useless at runtime, and are actually passed as
353-
* NULLs, but we may need them in the function signature to allow
354-
* resolution of a polymorphic agg's result type.)
355-
*/
356-
fnArgs[0] = aggTransType;
357-
if (AGGKIND_IS_ORDERED_SET(aggKind))
358-
{
359-
nargs_finalfn = numArgs + 1;
360-
memcpy(fnArgs + 1, aggArgTypes, numArgs * sizeof(Oid));
361-
}
362-
else
363-
{
364-
nargs_finalfn = 1;
365-
/* variadic-ness of the aggregate doesn't affect finalfn */
366-
variadicArgType = InvalidOid;
367-
}
368-
369349
/* handle finalfn, if supplied */
370350
if (aggfinalfnName)
371351
{
352+
/*
353+
* If finalfnExtraArgs is specified, the transfn takes the transtype
354+
* plus all args; otherwise, it just takes the transtype plus any
355+
* direct args. (Non-direct args are useless at runtime, and are
356+
* actually passed as NULLs, but we may need them in the function
357+
* signature to allow resolution of a polymorphic agg's result type.)
358+
*/
359+
Oid ffnVariadicArgType = variadicArgType;
360+
361+
fnArgs[0] = aggTransType;
362+
memcpy(fnArgs + 1, aggArgTypes, numArgs * sizeof(Oid));
363+
if (finalfnExtraArgs)
364+
nargs_finalfn = numArgs + 1;
365+
else
366+
{
367+
nargs_finalfn = numDirectArgs + 1;
368+
if (numDirectArgs < numArgs)
369+
{
370+
/* variadic argument doesn't affect finalfn */
371+
ffnVariadicArgType = InvalidOid;
372+
}
373+
}
374+
372375
finalfn = lookup_agg_function(aggfinalfnName, nargs_finalfn,
373-
fnArgs, variadicArgType,
376+
fnArgs, ffnVariadicArgType,
374377
&finaltype);
375378

376379
/*
377-
* The finalfn of an ordered-set agg will certainly be passed at least
378-
* one null argument, so complain if it's strict. Nothing bad would
379-
* happen at runtime (you'd just get a null result), but it's surely
380-
* not what the user wants, so let's complain now.
381-
*
382-
* Note: it's likely that a strict transfn would also be a mistake,
383-
* but the case isn't quite so airtight, so we let that pass.
380+
* When finalfnExtraArgs is specified, the finalfn will certainly be
381+
* passed at least one null argument, so complain if it's strict.
382+
* Nothing bad would happen at runtime (you'd just get a null result),
383+
* but it's surely not what the user wants, so let's complain now.
384384
*/
385-
if (AGGKIND_IS_ORDERED_SET(aggKind) && func_strict(finalfn))
385+
if (finalfnExtraArgs && func_strict(finalfn))
386386
ereport(ERROR,
387387
(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
388-
errmsg("final function of an ordered-set aggregate must not be declared STRICT")));
388+
errmsg("final function with extra arguments must not be declared STRICT")));
389389
}
390390
else
391391
{
@@ -434,21 +434,34 @@ AggregateCreate(const char *aggName,
434434
if (aggmfinalfnName)
435435
{
436436
/*
437-
* The arguments are the same as for the regular finalfn, except
438-
* that the transition data type might be different. So re-use
439-
* the fnArgs values set up above, except for that one.
437+
* The arguments are figured the same way as for the regular
438+
* finalfn, but using aggmTransType and mfinalfnExtraArgs.
440439
*/
440+
Oid ffnVariadicArgType = variadicArgType;
441+
441442
fnArgs[0] = aggmTransType;
443+
memcpy(fnArgs + 1, aggArgTypes, numArgs * sizeof(Oid));
444+
if (mfinalfnExtraArgs)
445+
nargs_finalfn = numArgs + 1;
446+
else
447+
{
448+
nargs_finalfn = numDirectArgs + 1;
449+
if (numDirectArgs < numArgs)
450+
{
451+
/* variadic argument doesn't affect finalfn */
452+
ffnVariadicArgType = InvalidOid;
453+
}
454+
}
442455

443456
mfinalfn = lookup_agg_function(aggmfinalfnName, nargs_finalfn,
444-
fnArgs, variadicArgType,
457+
fnArgs, ffnVariadicArgType,
445458
&rettype);
446459

447-
/* As above, check strictness if it's an ordered-set agg */
448-
if (AGGKIND_IS_ORDERED_SET(aggKind) && func_strict(mfinalfn))
460+
/* As above, check strictness if mfinalfnExtraArgs is given */
461+
if (mfinalfnExtraArgs && func_strict(mfinalfn))
449462
ereport(ERROR,
450463
(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
451-
errmsg("final function of an ordered-set aggregate must not be declared STRICT")));
464+
errmsg("final function with extra arguments must not be declared STRICT")));
452465
}
453466
else
454467
{
@@ -554,6 +567,8 @@ AggregateCreate(const char *aggName,
554567
values[Anum_pg_aggregate_aggmtransfn - 1] = ObjectIdGetDatum(mtransfn);
555568
values[Anum_pg_aggregate_aggminvtransfn - 1] = ObjectIdGetDatum(minvtransfn);
556569
values[Anum_pg_aggregate_aggmfinalfn - 1] = ObjectIdGetDatum(mfinalfn);
570+
values[Anum_pg_aggregate_aggfinalextra - 1] = BoolGetDatum(finalfnExtraArgs);
571+
values[Anum_pg_aggregate_aggmfinalextra - 1] = BoolGetDatum(mfinalfnExtraArgs);
557572
values[Anum_pg_aggregate_aggsortop - 1] = ObjectIdGetDatum(sortop);
558573
values[Anum_pg_aggregate_aggtranstype - 1] = ObjectIdGetDatum(aggTransType);
559574
values[Anum_pg_aggregate_aggtransspace - 1] = Int32GetDatum(aggTransSpace);

0 commit comments

Comments
 (0)