Skip to content

Commit 7151e72

Browse files
committed
Improve speed of aggregates that use array_append as transition function.
In the previous coding, if an aggregate's transition function returned an expanded array, nodeAgg.c and nodeWindowAgg.c would always copy it and thus force it into the flat representation. This led to ping-ponging between flat and expanded formats, which costs a lot. For an aggregate using array_append as transition function, I measured about a 15X slowdown compared to the pre-9.5 code, when working on simple int[] arrays. Of course, the old code was already O(N^2) in this usage due to copying flat arrays all the time, but it wasn't quite this inefficient. To fix, teach nodeAgg.c and nodeWindowAgg.c to allow expanded transition values without copying, so long as the transition function takes care to return the transition value already properly parented under the aggcontext. That puts a bit of extra responsibility on the transition function, but doing it this way allows us to not need any extra logic in the fast path of advance_transition_function (ie, with a pass-by-value transition value, or with a modified-in-place pass-by-reference value). We already know that that's a hot spot so I'm loath to add any cycles at all there. Also, while only array_append currently knows how to follow this convention, this solution allows other transition functions to opt-in without needing to have a whitelist in the core aggregation code. (The reason we would need a whitelist is that currently, if you pass a R/W expanded-object pointer to an arbitrary function, it's allowed to do anything with it including deleting it; that breaks the core agg code's assumption that it should free discarded values. Returning a value under aggcontext is the transition function's signal that it knows it is an aggregate transition function and will play nice. Possibly the API rules for expanded objects should be refined, but that would not be a back-patchable change.) With this fix, an aggregate using array_append is no longer O(N^2), so it's much faster than pre-9.5 code rather than much slower. It's still a bit slower than the bespoke infrastructure for array_agg, but the differential seems to be only about 10%-20% rather than orders of magnitude. Discussion: <6315.1477677885@sss.pgh.pa.us>
1 parent 0cbd199 commit 7151e72

File tree

5 files changed

+116
-26
lines changed

5 files changed

+116
-26
lines changed

doc/src/sgml/xaggr.sgml

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -530,14 +530,28 @@ if (AggCheckCallContext(fcinfo, NULL))
530530
function, the first input
531531
must be a temporary state value and can therefore safely be modified
532532
in-place rather than allocating a new copy.
533-
See <literal>int8inc()</> for an example.
533+
See <function>int8inc()</> for an example.
534534
(This is the <emphasis>only</>
535535
case where it is safe for a function to modify a pass-by-reference input.
536536
In particular, final functions for normal aggregates must not
537537
modify their inputs in any case, because in some cases they will be
538538
re-executed on the same final state value.)
539539
</para>
540540

541+
<para>
542+
The second argument of <function>AggCheckCallContext</> can be used to
543+
retrieve the memory context in which aggregate state values are being kept.
544+
This is useful for transition functions that wish to use <quote>expanded</>
545+
objects (see <xref linkend="xtypes-toast">) as their state values.
546+
On first call, the transition function should return an expanded object
547+
whose memory context is a child of the aggregate state context, and then
548+
keep returning the same expanded object on subsequent calls. See
549+
<function>array_append()</> for an example. (<function>array_append()</>
550+
is not the transition function of any built-in aggregate, but it is written
551+
to behave efficiently when used as transition function of a custom
552+
aggregate.)
553+
</para>
554+
541555
<para>
542556
Another support routine available to aggregate functions written in C
543557
is <function>AggGetAggref</>, which returns the <literal>Aggref</>

src/backend/executor/nodeAgg.c

Lines changed: 32 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -72,10 +72,13 @@
7272
* transition value or a previous function result, and in either case its
7373
* value need not be preserved. See int8inc() for an example. Notice that
7474
* advance_transition_function() is coded to avoid a data copy step when
75-
* the previous transition value pointer is returned. Also, some
76-
* transition functions want to store working state in addition to the
77-
* nominal transition value; they can use the memory context returned by
78-
* AggCheckCallContext() to do that.
75+
* the previous transition value pointer is returned. It is also possible
76+
* to avoid repeated data copying when the transition value is an expanded
77+
* object: to do that, the transition function must take care to return
78+
* an expanded object that is in a child context of the memory context
79+
* returned by AggCheckCallContext(). Also, some transition functions want
80+
* to store working state in addition to the nominal transition value; they
81+
* can use the memory context returned by AggCheckCallContext() to do that.
7982
*
8083
* Note: AggCheckCallContext() is available as of PostgreSQL 9.0. The
8184
* AggState is available as context in earlier releases (back to 8.1),
@@ -694,21 +697,36 @@ advance_transition_function(AggState *aggstate,
694697

695698
/*
696699
* If pass-by-ref datatype, must copy the new value into aggcontext and
697-
* pfree the prior transValue. But if transfn returned a pointer to its
698-
* first input, we don't need to do anything.
700+
* free the prior transValue. But if transfn returned a pointer to its
701+
* first input, we don't need to do anything. Also, if transfn returned a
702+
* pointer to a R/W expanded object that is already a child of the
703+
* aggcontext, assume we can adopt that value without copying it.
699704
*/
700705
if (!peraggstate->transtypeByVal &&
701706
DatumGetPointer(newVal) != DatumGetPointer(pergroupstate->transValue))
702707
{
703708
if (!fcinfo->isnull)
704709
{
705710
MemoryContextSwitchTo(aggstate->aggcontexts[aggstate->current_set]->ecxt_per_tuple_memory);
706-
newVal = datumCopy(newVal,
707-
peraggstate->transtypeByVal,
708-
peraggstate->transtypeLen);
711+
if (DatumIsReadWriteExpandedObject(newVal,
712+
false,
713+
peraggstate->transtypeLen) &&
714+
MemoryContextGetParent(DatumGetEOHP(newVal)->eoh_context) == CurrentMemoryContext)
715+
/* do nothing */ ;
716+
else
717+
newVal = datumCopy(newVal,
718+
peraggstate->transtypeByVal,
719+
peraggstate->transtypeLen);
709720
}
710721
if (!pergroupstate->transValueIsNull)
711-
pfree(DatumGetPointer(pergroupstate->transValue));
722+
{
723+
if (DatumIsReadWriteExpandedObject(pergroupstate->transValue,
724+
false,
725+
peraggstate->transtypeLen))
726+
DeleteExpandedObject(pergroupstate->transValue);
727+
else
728+
pfree(DatumGetPointer(pergroupstate->transValue));
729+
}
712730
}
713731

714732
pergroupstate->transValue = newVal;
@@ -1059,7 +1077,9 @@ finalize_aggregate(AggState *aggstate,
10591077
(void *) aggstate, NULL);
10601078

10611079
/* Fill in the transition state value */
1062-
fcinfo.arg[0] = pergroupstate->transValue;
1080+
fcinfo.arg[0] = MakeExpandedObjectReadOnly(pergroupstate->transValue,
1081+
pergroupstate->transValueIsNull,
1082+
peraggstate->transtypeLen);
10631083
fcinfo.argnull[0] = pergroupstate->transValueIsNull;
10641084
anynull |= pergroupstate->transValueIsNull;
10651085

@@ -1086,6 +1106,7 @@ finalize_aggregate(AggState *aggstate,
10861106
}
10871107
else
10881108
{
1109+
/* Don't need MakeExpandedObjectReadOnly; datumCopy will copy it */
10891110
*resultVal = pergroupstate->transValue;
10901111
*resultIsNull = pergroupstate->transValueIsNull;
10911112
}

src/backend/executor/nodeWindowAgg.c

Lines changed: 46 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -362,21 +362,36 @@ advance_windowaggregate(WindowAggState *winstate,
362362

363363
/*
364364
* If pass-by-ref datatype, must copy the new value into aggcontext and
365-
* pfree the prior transValue. But if transfn returned a pointer to its
366-
* first input, we don't need to do anything.
365+
* free the prior transValue. But if transfn returned a pointer to its
366+
* first input, we don't need to do anything. Also, if transfn returned a
367+
* pointer to a R/W expanded object that is already a child of the
368+
* aggcontext, assume we can adopt that value without copying it.
367369
*/
368370
if (!peraggstate->transtypeByVal &&
369371
DatumGetPointer(newVal) != DatumGetPointer(peraggstate->transValue))
370372
{
371373
if (!fcinfo->isnull)
372374
{
373375
MemoryContextSwitchTo(peraggstate->aggcontext);
374-
newVal = datumCopy(newVal,
375-
peraggstate->transtypeByVal,
376-
peraggstate->transtypeLen);
376+
if (DatumIsReadWriteExpandedObject(newVal,
377+
false,
378+
peraggstate->transtypeLen) &&
379+
MemoryContextGetParent(DatumGetEOHP(newVal)->eoh_context) == CurrentMemoryContext)
380+
/* do nothing */ ;
381+
else
382+
newVal = datumCopy(newVal,
383+
peraggstate->transtypeByVal,
384+
peraggstate->transtypeLen);
377385
}
378386
if (!peraggstate->transValueIsNull)
379-
pfree(DatumGetPointer(peraggstate->transValue));
387+
{
388+
if (DatumIsReadWriteExpandedObject(peraggstate->transValue,
389+
false,
390+
peraggstate->transtypeLen))
391+
DeleteExpandedObject(peraggstate->transValue);
392+
else
393+
pfree(DatumGetPointer(peraggstate->transValue));
394+
}
380395
}
381396

382397
MemoryContextSwitchTo(oldContext);
@@ -513,8 +528,10 @@ advance_windowaggregate_base(WindowAggState *winstate,
513528

514529
/*
515530
* If pass-by-ref datatype, must copy the new value into aggcontext and
516-
* pfree the prior transValue. But if invtransfn returned a pointer to
517-
* its first input, we don't need to do anything.
531+
* free the prior transValue. But if invtransfn returned a pointer to its
532+
* first input, we don't need to do anything. Also, if invtransfn
533+
* returned a pointer to a R/W expanded object that is already a child of
534+
* the aggcontext, assume we can adopt that value without copying it.
518535
*
519536
* Note: the checks for null values here will never fire, but it seems
520537
* best to have this stanza look just like advance_windowaggregate.
@@ -525,12 +542,25 @@ advance_windowaggregate_base(WindowAggState *winstate,
525542
if (!fcinfo->isnull)
526543
{
527544
MemoryContextSwitchTo(peraggstate->aggcontext);
528-
newVal = datumCopy(newVal,
529-
peraggstate->transtypeByVal,
530-
peraggstate->transtypeLen);
545+
if (DatumIsReadWriteExpandedObject(newVal,
546+
false,
547+
peraggstate->transtypeLen) &&
548+
MemoryContextGetParent(DatumGetEOHP(newVal)->eoh_context) == CurrentMemoryContext)
549+
/* do nothing */ ;
550+
else
551+
newVal = datumCopy(newVal,
552+
peraggstate->transtypeByVal,
553+
peraggstate->transtypeLen);
531554
}
532555
if (!peraggstate->transValueIsNull)
533-
pfree(DatumGetPointer(peraggstate->transValue));
556+
{
557+
if (DatumIsReadWriteExpandedObject(peraggstate->transValue,
558+
false,
559+
peraggstate->transtypeLen))
560+
DeleteExpandedObject(peraggstate->transValue);
561+
else
562+
pfree(DatumGetPointer(peraggstate->transValue));
563+
}
534564
}
535565

536566
MemoryContextSwitchTo(oldContext);
@@ -568,7 +598,9 @@ finalize_windowaggregate(WindowAggState *winstate,
568598
numFinalArgs,
569599
perfuncstate->winCollation,
570600
(void *) winstate, NULL);
571-
fcinfo.arg[0] = peraggstate->transValue;
601+
fcinfo.arg[0] = MakeExpandedObjectReadOnly(peraggstate->transValue,
602+
peraggstate->transValueIsNull,
603+
peraggstate->transtypeLen);
572604
fcinfo.argnull[0] = peraggstate->transValueIsNull;
573605
anynull = peraggstate->transValueIsNull;
574606

@@ -596,6 +628,7 @@ finalize_windowaggregate(WindowAggState *winstate,
596628
}
597629
else
598630
{
631+
/* Don't need MakeExpandedObjectReadOnly; datumCopy will copy it */
599632
*result = peraggstate->transValue;
600633
*isnull = peraggstate->transValueIsNull;
601634
}

src/backend/optimizer/util/clauses.c

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -558,6 +558,16 @@ count_agg_clauses_walker(Node *node, count_agg_clauses_context *context)
558558
/* Use average width if aggregate definition gave one */
559559
if (aggtransspace > 0)
560560
avgwidth = aggtransspace;
561+
else if (aggtransfn == F_ARRAY_APPEND)
562+
{
563+
/*
564+
* If the transition function is array_append(), it'll use an
565+
* expanded array as transvalue, which will occupy at least
566+
* ALLOCSET_SMALL_INITSIZE and possibly more. Use that as the
567+
* estimate for lack of a better idea.
568+
*/
569+
avgwidth = ALLOCSET_SMALL_INITSIZE;
570+
}
561571
else
562572
{
563573
/*

src/backend/utils/adt/array_userfuncs.c

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -32,13 +32,18 @@ static Datum array_position_common(FunctionCallInfo fcinfo);
3232
* Caution: if the input is a read/write pointer, this returns the input
3333
* argument; so callers must be sure that their changes are "safe", that is
3434
* they cannot leave the array in a corrupt state.
35+
*
36+
* If we're being called as an aggregate function, make sure any newly-made
37+
* expanded array is allocated in the aggregate state context, so as to save
38+
* copying operations.
3539
*/
3640
static ExpandedArrayHeader *
3741
fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno)
3842
{
3943
ExpandedArrayHeader *eah;
4044
Oid element_type;
4145
ArrayMetaState *my_extra;
46+
MemoryContext resultcxt;
4247

4348
/* If first time through, create datatype cache struct */
4449
my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
@@ -51,10 +56,17 @@ fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno)
5156
fcinfo->flinfo->fn_extra = my_extra;
5257
}
5358

59+
/* Figure out which context we want the result in */
60+
if (!AggCheckCallContext(fcinfo, &resultcxt))
61+
resultcxt = CurrentMemoryContext;
62+
5463
/* Now collect the array value */
5564
if (!PG_ARGISNULL(argno))
5665
{
66+
MemoryContext oldcxt = MemoryContextSwitchTo(resultcxt);
67+
5768
eah = PG_GETARG_EXPANDED_ARRAYX(argno, my_extra);
69+
MemoryContextSwitchTo(oldcxt);
5870
}
5971
else
6072
{
@@ -72,7 +84,7 @@ fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno)
7284
errmsg("input data type is not an array")));
7385

7486
eah = construct_empty_expanded_array(element_type,
75-
CurrentMemoryContext,
87+
resultcxt,
7688
my_extra);
7789
}
7890

0 commit comments

Comments
 (0)