Commit c7aba7c

Support subscripting of arbitrary types, not only arrays.

This patch generalizes the subscripting infrastructure so that any data type can be subscripted, if it provides a handler function to define what that means. Traditional variable-length (varlena) arrays all use array_subscript_handler(), while the existing fixed-length types that support subscripting use raw_array_subscript_handler(). It's expected that other types that want to use subscripting notation will define their own handlers. (This patch provides no such new features, though; it only lays the foundation for them.)

To do this, move the parser's semantic processing of subscripts (including coercion to whatever data type is required) into a method callback supplied by the handler. On the execution side, replace the ExecEvalSubscriptingRef* layer of functions with direct calls to callback-supplied execution routines. (Thus, essentially no new run-time overhead should be caused by this patch. Indeed, there is room to remove some overhead by supplying specialized execution routines. This patch does a little bit in that line, but more could be done.)

Additional work is required here and there to remove formerly hard-wired assumptions about the result type, collation, etc of a SubscriptingRef expression node; and to remove assumptions that the subscript values must be integers.

One useful side-effect of this is that we now have a less squishy mechanism for identifying whether a data type is a "true" array: instead of wiring in weird rules about typlen, we can look to see if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this to be bulletproof, we have to forbid user-defined types from using that handler directly; but there seems no good reason for them to do so.

This patch also removes assumptions that the number of subscripts is limited to MAXDIM (6), or indeed has any hard-wired limit. That limit still applies to types handled by array_subscript_handler or raw_array_subscript_handler, but to discourage other dependencies on this constant, I've moved it from c.h to utils/array.h.

Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov, Peter Eisentraut, Pavel Stehule

Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com
Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
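The commit message describes a handler function that hands back a set of callbacks: one for parse-time processing of subscripts and others for execution. The shape of that idea can be sketched in plain C as below. This is only an illustrative model: `SubscriptMethodsSketch`, `sketch_transform`, `sketch_fetch`, and `sketch_subscript_handler` are hypothetical names, and the real method table (declared in src/include/nodes/subscripting.h) is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a table of subscripting methods. */
typedef struct SubscriptMethodsSketch
{
    /* Parse-time step: validate/coerce one subscript; returns 0 on success. */
    int (*transform)(long raw_subscript, long *coerced);
    /* Run-time step: fetch one element; returns 0 on success, -1 otherwise. */
    int (*fetch)(const int *container, size_t nelems, long subscript, int *result);
} SubscriptMethodsSketch;

int
sketch_transform(long raw_subscript, long *coerced)
{
    assert(coerced != NULL);
    *coerced = raw_subscript;   /* integer subscripts need no coercion here */
    return 0;
}

int
sketch_fetch(const int *container, size_t nelems, long subscript, int *result)
{
    assert(container != NULL && result != NULL);
    if (subscript < 0 || (size_t) subscript >= nelems)
        return -1;              /* out of range (a choice made by this sketch) */
    *result = container[subscript];
    return 0;
}

/* The "handler": its only job is to return a pointer to the method table. */
const SubscriptMethodsSketch *
sketch_subscript_handler(void)
{
    static const SubscriptMethodsSketch methods = {
        sketch_transform,
        sketch_fetch
    };

    return &methods;
}
```

A caller would look up the handler once, keep the returned pointer, and then call `methods->fetch(...)` directly, which mirrors the "direct calls to callback-supplied execution routines" mentioned above.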
1 parent 8b069ef commit c7aba7c

52 files changed, +1552 -711 lines changed

contrib/postgres_fdw/deparse.c

+10 -5
@@ -426,23 +426,28 @@ foreign_expr_walker(Node *node,
                 return false;
 
             /*
-             * Recurse to remaining subexpressions. Since the container
-             * subscripts must yield (noncollatable) integers, they won't
-             * affect the inner_cxt state.
+             * Recurse into the remaining subexpressions. The container
+             * subscripts will not affect collation of the SubscriptingRef
+             * result, so do those first and reset inner_cxt afterwards.
              */
             if (!foreign_expr_walker((Node *) sr->refupperindexpr,
                                      glob_cxt, &inner_cxt))
                 return false;
+            inner_cxt.collation = InvalidOid;
+            inner_cxt.state = FDW_COLLATE_NONE;
             if (!foreign_expr_walker((Node *) sr->reflowerindexpr,
                                      glob_cxt, &inner_cxt))
                 return false;
+            inner_cxt.collation = InvalidOid;
+            inner_cxt.state = FDW_COLLATE_NONE;
             if (!foreign_expr_walker((Node *) sr->refexpr,
                                      glob_cxt, &inner_cxt))
                 return false;
 
             /*
-             * Container subscripting should yield same collation as
-             * input, but for safety use same logic as for function nodes.
+             * Container subscripting typically yields same collation as
+             * refexpr's, but in case it doesn't, use same logic as for
+             * function nodes.
              */
             collation = sr->refcollid;
             if (collation == InvalidOid)
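The reason for resetting `inner_cxt` after each subscript walk can be illustrated with a small standalone model. The sketch below is an assumption-laden simplification, not the postgres_fdw logic: `WalkCxtSketch`, `sketch_merge`, and `sketch_walk_subscripting` are hypothetical, and the merge rule is reduced to "two different collations conflict". It shows that once subscripts may be collatable, leaving their collation in the shared context could produce a spurious conflict with the collation of `refexpr`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the postgres_fdw definitions. */
typedef unsigned int OidSketch;
#define SKETCH_INVALID_OID ((OidSketch) 0)

typedef enum
{
    SKETCH_COLLATE_NONE,    /* no collation seen yet */
    SKETCH_COLLATE_SAFE,    /* one consistent collation seen */
    SKETCH_COLLATE_UNSAFE   /* conflicting collations seen */
} CollateStateSketch;

typedef struct
{
    OidSketch           collation;
    CollateStateSketch  state;
} WalkCxtSketch;

/* Fold one child's collation into the running context (simplified rule). */
void
sketch_merge(WalkCxtSketch *cxt, OidSketch child_collation)
{
    assert(cxt != NULL);
    if (child_collation == SKETCH_INVALID_OID)
        return;                 /* noncollatable child: no effect */
    if (cxt->state == SKETCH_COLLATE_NONE)
    {
        cxt->collation = child_collation;
        cxt->state = SKETCH_COLLATE_SAFE;
    }
    else if (cxt->collation != child_collation)
        cxt->state = SKETCH_COLLATE_UNSAFE;
}

/* Forget whatever the previous child contributed. */
void
sketch_reset(WalkCxtSketch *cxt)
{
    assert(cxt != NULL);
    cxt->collation = SKETCH_INVALID_OID;
    cxt->state = SKETCH_COLLATE_NONE;
}

/* Walk upper subscripts, lower subscripts, then the container expression. */
WalkCxtSketch
sketch_walk_subscripting(OidSketch upper, OidSketch lower, OidSketch refexpr,
                         int reset_between)
{
    WalkCxtSketch cxt = { SKETCH_INVALID_OID, SKETCH_COLLATE_NONE };

    sketch_merge(&cxt, upper);
    if (reset_between)
        sketch_reset(&cxt);
    sketch_merge(&cxt, lower);
    if (reset_between)
        sketch_reset(&cxt);
    sketch_merge(&cxt, refexpr);
    return cxt;
}
```

With the resets in place, only the container expression's collation survives in the context; without them, a collatable subscript would leak into the result.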

doc/src/sgml/catalogs.sgml

+25 -13
@@ -8740,26 +8740,38 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>typsubscript</structfield> <type>regproc</type>
+       (references <link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.<structfield>oid</structfield>)
+      </para>
+      <para>
+       Subscripting handler function's OID, or zero if this type doesn't
+       support subscripting. Types that are <quote>true</quote> array
+       types have <structfield>typsubscript</structfield>
+       = <function>array_subscript_handler</function>, but other types may
+       have other handler functions to implement specialized subscripting
+       behavior.
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>typelem</structfield> <type>oid</type>
        (references <link linkend="catalog-pg-type"><structname>pg_type</structname></link>.<structfield>oid</structfield>)
       </para>
       <para>
        If <structfield>typelem</structfield> is not 0 then it
-       identifies another row in <structname>pg_type</structname>.
-       The current type can then be subscripted like an array yielding
-       values of type <structfield>typelem</structfield>. A
-       <quote>true</quote> array type is variable length
-       (<structfield>typlen</structfield> = -1),
-       but some fixed-length (<structfield>typlen</structfield> &gt; 0) types
-       also have nonzero <structfield>typelem</structfield>, for example
-       <type>name</type> and <type>point</type>.
-       If a fixed-length type has a <structfield>typelem</structfield> then
-       its internal representation must be some number of values of the
-       <structfield>typelem</structfield> data type with no other data.
-       Variable-length array types have a header defined by the array
-       subroutines.
+       identifies another row in <structname>pg_type</structname>,
+       defining the type yielded by subscripting. This should be 0
+       if <structfield>typsubscript</structfield> is 0. However, it can
+       be 0 when <structfield>typsubscript</structfield> isn't 0, if the
+       handler doesn't need <structfield>typelem</structfield> to
+       determine the subscripting result type.
+       Note that a <structfield>typelem</structfield> dependency is
+       considered to imply physical containment of the element type in
+       this type; so DDL changes on the element type might be restricted
+       by the presence of this type.
       </para></entry>
      </row>
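The relationship between the two columns described in this hunk can be written down as a small predicate. The following is a minimal sketch under stated assumptions: `TypeRowSketch` and `sketch_typelem_is_consistent` are hypothetical, OIDs are modeled as plain unsigned integers with 0 meaning "none", and nothing here claims that PostgreSQL enforces the rule in this form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two pg_type columns discussed above. */
typedef unsigned int OidSketch;

typedef struct
{
    OidSketch typsubscript;     /* handler function OID, or 0 if none */
    OidSketch typelem;          /* element type OID, or 0 if none */
} TypeRowSketch;

/*
 * typelem should be 0 when typsubscript is 0.  When typsubscript is set,
 * typelem may be either set or 0, depending on whether the handler needs it.
 */
bool
sketch_typelem_is_consistent(const TypeRowSketch *row)
{
    assert(row != NULL);
    if (row->typsubscript == 0)
        return row->typelem == 0;
    return true;
}
```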

doc/src/sgml/ref/create_type.sgml

+65 -11
@@ -43,6 +43,7 @@ CREATE TYPE <replaceable class="parameter">name</replaceable> (
     [ , TYPMOD_IN = <replaceable class="parameter">type_modifier_input_function</replaceable> ]
     [ , TYPMOD_OUT = <replaceable class="parameter">type_modifier_output_function</replaceable> ]
     [ , ANALYZE = <replaceable class="parameter">analyze_function</replaceable> ]
+    [ , SUBSCRIPT = <replaceable class="parameter">subscript_function</replaceable> ]
     [ , INTERNALLENGTH = { <replaceable class="parameter">internallength</replaceable> | VARIABLE } ]
     [ , PASSEDBYVALUE ]
     [ , ALIGNMENT = <replaceable class="parameter">alignment</replaceable> ]
@@ -196,8 +197,9 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    <replaceable class="parameter">receive_function</replaceable>,
    <replaceable class="parameter">send_function</replaceable>,
    <replaceable class="parameter">type_modifier_input_function</replaceable>,
-   <replaceable class="parameter">type_modifier_output_function</replaceable> and
-   <replaceable class="parameter">analyze_function</replaceable>
+   <replaceable class="parameter">type_modifier_output_function</replaceable>,
+   <replaceable class="parameter">analyze_function</replaceable>, and
+   <replaceable class="parameter">subscript_function</replaceable>
    are optional. Generally these functions have to be coded in C
    or another low-level language.
   </para>
@@ -318,6 +320,26 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    in <filename>src/include/commands/vacuum.h</filename>.
   </para>
 
+  <para>
+   The optional <replaceable class="parameter">subscript_function</replaceable>
+   allows the data type to be subscripted in SQL commands. Specifying this
+   function does not cause the type to be considered a <quote>true</quote>
+   array type; for example, it will not be a candidate for the result type
+   of <literal>ARRAY[]</literal> constructs. But if subscripting a value
+   of the type is a natural notation for extracting data from it, then
+   a <replaceable class="parameter">subscript_function</replaceable> can
+   be written to define what that means. The subscript function must be
+   declared to take a single argument of type <type>internal</type>, and
+   return an <type>internal</type> result, which is a pointer to a struct
+   of methods (functions) that implement subscripting.
+   The detailed API for subscript functions appears
+   in <filename>src/include/nodes/subscripting.h</filename>;
+   it may also be useful to read the array implementation
+   in <filename>src/backend/utils/adt/arraysubs.c</filename>.
+   Additional information appears in
+   <xref linkend="sql-createtype-array"/> below.
+  </para>
+
   <para>
    While the details of the new type's internal representation are only
    known to the I/O functions and other functions you create to work with
@@ -428,11 +450,12 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
   </para>
 
   <para>
-   To indicate that a type is an array, specify the type of the array
+   To indicate that a type is a fixed-length array type,
+   specify the type of the array
    elements using the <literal>ELEMENT</literal> key word. For example, to
    define an array of 4-byte integers (<type>int4</type>), specify
-   <literal>ELEMENT = int4</literal>. More details about array types
-   appear below.
+   <literal>ELEMENT = int4</literal>. For more details,
+   see <xref linkend="sql-createtype-array"/> below.
   </para>
 
   <para>
@@ -456,7 +479,7 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
   </para>
  </refsect2>
 
- <refsect2>
+ <refsect2 id="sql-createtype-array" xreflabel="Array Types">
   <title>Array Types</title>
 
   <para>
@@ -469,14 +492,16 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    repeated until a non-colliding name is found.)
    This implicitly-created array type is variable length and uses the
    built-in input and output functions <literal>array_in</literal> and
-   <literal>array_out</literal>. The array type tracks any changes in its
+   <literal>array_out</literal>. Furthermore, this type is what the system
+   uses for constructs such as <literal>ARRAY[]</literal> over the
+   user-defined type. The array type tracks any changes in its
    element type's owner or schema, and is dropped if the element type is.
   </para>
 
   <para>
    You might reasonably ask why there is an <option>ELEMENT</option>
    option, if the system makes the correct array type automatically.
-   The only case where it's useful to use <option>ELEMENT</option> is when you are
+   The main case where it's useful to use <option>ELEMENT</option> is when you are
    making a fixed-length type that happens to be internally an array of a number of
    identical things, and you want to allow these things to be accessed
    directly by subscripting, in addition to whatever operations you plan
@@ -485,13 +510,32 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    using <literal>point[0]</literal> and <literal>point[1]</literal>.
    Note that
    this facility only works for fixed-length types whose internal form
-   is exactly a sequence of identical fixed-length fields. A subscriptable
-   variable-length type must have the generalized internal representation
-   used by <literal>array_in</literal> and <literal>array_out</literal>.
+   is exactly a sequence of identical fixed-length fields.
    For historical reasons (i.e., this is clearly wrong but it's far too
    late to change it), subscripting of fixed-length array types starts from
    zero, rather than from one as for variable-length arrays.
   </para>
+
+  <para>
+   Specifying the <option>SUBSCRIPT</option> option allows a data type to
+   be subscripted, even though the system does not otherwise regard it as
+   an array type. The behavior just described for fixed-length arrays is
+   actually implemented by the <option>SUBSCRIPT</option> handler
+   function <function>raw_array_subscript_handler</function>, which is
+   used automatically if you specify <option>ELEMENT</option> for a
+   fixed-length type without also writing <option>SUBSCRIPT</option>.
+  </para>
+
+  <para>
+   When specifying a custom <option>SUBSCRIPT</option> function, it is
+   not necessary to specify <option>ELEMENT</option> unless
+   the <option>SUBSCRIPT</option> handler function needs to
+   consult <structfield>typelem</structfield> to find out what to return.
+   Be aware that specifying <option>ELEMENT</option> causes the system to
+   assume that the new type contains, or is somehow physically dependent on,
+   the element type; thus for example changing properties of the element
+   type won't be allowed if there are any columns of the dependent type.
+  </para>
  </refsect2>
 </refsect1>

@@ -654,6 +698,16 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">subscript_function</replaceable></term>
+    <listitem>
+     <para>
+      The name of a function that defines what subscripting a value of the
+      data type does.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">internallength</replaceable></term>
     <listitem>
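The Array Types text in this file describes fixed-length types whose internal form is exactly a sequence of identical fixed-length fields, subscripted from zero (as in `point[0]` and `point[1]`). A minimal sketch of that access pattern in plain C follows. `PointSketch` and `sketch_raw_fetch` are hypothetical names, this is not the code behind `raw_array_subscript_handler`, and returning false for an out-of-range subscript is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A point-like value: exactly two identical fixed-length fields, no header. */
typedef struct
{
    double x;
    double y;
} PointSketch;

/*
 * Fetch element number "subscript" (zero-based) from a value that is nothing
 * but value_len / elem_len consecutive elements.  Returns false if the
 * subscript is out of range.
 */
bool
sketch_raw_fetch(const void *value, size_t value_len, size_t elem_len,
                 long subscript, void *out)
{
    size_t nelems;

    assert(value != NULL && out != NULL && elem_len > 0);
    nelems = value_len / elem_len;
    if (subscript < 0 || (size_t) subscript >= nelems)
        return false;
    memcpy(out, (const char *) value + (size_t) subscript * elem_len, elem_len);
    return true;
}
```

Because the value carries no header, the element count is derived purely from the two lengths, which is why this scheme only fits types laid out as a bare sequence of identical fields.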

src/backend/catalog/aclchk.c

+2 -2
@@ -3114,7 +3114,7 @@ ExecGrant_Type(InternalGrant *istmt)
 
     pg_type_tuple = (Form_pg_type) GETSTRUCT(tuple);
 
-    if (pg_type_tuple->typelem != 0 && pg_type_tuple->typlen == -1)
+    if (IsTrueArrayType(pg_type_tuple))
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_GRANT_OPERATION),
                  errmsg("cannot set privileges of array types"),
@@ -4392,7 +4392,7 @@ pg_type_aclmask(Oid type_oid, Oid roleid, AclMode mask, AclMaskHow how)
      * "True" array types don't manage permissions of their own; consult the
      * element type instead.
      */
-    if (OidIsValid(typeForm->typelem) && typeForm->typlen == -1)
+    if (IsTrueArrayType(typeForm))
     {
         Oid         elttype_oid = typeForm->typelem;
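Both hunks in this file swap the old `typelem`/`typlen` test for `IsTrueArrayType()`. The macro's definition is not part of this excerpt, so the sketch below only models the idea stated in the commit message (compare `typsubscript` against the array handler) next to the old rule taken from the removed lines. All names and OID values are placeholders invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder OIDs for illustration only; not real catalog values. */
typedef unsigned int OidSketch;
#define SKETCH_ARRAY_SUBSCRIPT_HANDLER      ((OidSketch) 1001)
#define SKETCH_RAW_ARRAY_SUBSCRIPT_HANDLER  ((OidSketch) 1002)
#define SKETCH_CUSTOM_SUBSCRIPT_HANDLER     ((OidSketch) 1003)

typedef struct
{
    OidSketch typelem;          /* element type OID, or 0 */
    int       typlen;           /* -1 means variable length */
    OidSketch typsubscript;     /* subscripting handler OID, or 0 */
} TypeRowSketch;

/* Old rule, as in the removed lines: nonzero typelem and varlena. */
bool
sketch_old_is_true_array(const TypeRowSketch *t)
{
    assert(t != NULL);
    return t->typelem != 0 && t->typlen == -1;
}

/* New rule, as described in the commit message: check the handler. */
bool
sketch_new_is_true_array(const TypeRowSketch *t)
{
    assert(t != NULL);
    return t->typsubscript == SKETCH_ARRAY_SUBSCRIPT_HANDLER;
}
```

The two rules agree on ordinary arrays and on fixed-length types such as `name`, but a variable-length type that sets `ELEMENT` together with a custom handler would have passed the old test while failing the new one, which is the "less squishy" identification the commit message refers to.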

src/backend/catalog/dependency.c

+16
@@ -2074,6 +2074,22 @@ find_expr_references_walker(Node *node,
                            context->addrs);
         /* fall through to examine arguments */
     }
+    else if (IsA(node, SubscriptingRef))
+    {
+        SubscriptingRef *sbsref = (SubscriptingRef *) node;
+
+        /*
+         * The refexpr should provide adequate dependency on refcontainertype,
+         * and that type in turn depends on refelemtype. However, a custom
+         * subscripting handler might set refrestype to something different
+         * from either of those, in which case we'd better record it.
+         */
+        if (sbsref->refrestype != sbsref->refcontainertype &&
+            sbsref->refrestype != sbsref->refelemtype)
+            add_object_address(OCLASS_TYPE, sbsref->refrestype, 0,
+                               context->addrs);
+        /* fall through to examine arguments */
+    }
     else if (IsA(node, SubPlan))
     {
         /* Extra work needed here if we ever need this case */
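The condition added in this hunk is easy to isolate and check on its own. The sketch below restates it as a standalone predicate; `OidSketch` and `sketch_needs_restype_dependency` are hypothetical names, and the OID values in the usage note are arbitrary examples.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int OidSketch;

/*
 * Mirror of the test in the hunk above: an extra dependency on the result
 * type is needed only when it differs from both the container type and the
 * element type, since those two are already covered by other dependencies.
 */
bool
sketch_needs_restype_dependency(OidSketch refrestype,
                                OidSketch refcontainertype,
                                OidSketch refelemtype)
{
    return refrestype != refcontainertype &&
           refrestype != refelemtype;
}
```

For example, fetching one element of an array (result type equals element type) or a slice (result type equals container type) needs nothing extra, while a handler that yields some third type does.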

src/backend/catalog/heap.c

+2
@@ -1079,6 +1079,7 @@ AddNewRelationType(const char *typeName,
                InvalidOid,     /* typmodin procedure - none */
                InvalidOid,     /* typmodout procedure - none */
                InvalidOid,     /* analyze procedure - default */
+               InvalidOid,     /* subscript procedure - none */
                InvalidOid,     /* array element type - irrelevant */
                false,          /* this is not an array type */
                new_array_type, /* array type if any */
@@ -1358,6 +1359,7 @@ heap_create_with_catalog(const char *relname,
                InvalidOid,     /* typmodin procedure - none */
                InvalidOid,     /* typmodout procedure - none */
                F_ARRAY_TYPANALYZE, /* array analyze procedure */
+               F_ARRAY_SUBSCRIPT_HANDLER,  /* array subscript procedure */
                new_type_oid,   /* array element type - the rowtype */
                true,           /* yes, this is an array type */
                InvalidOid,     /* this has no array type */

src/backend/catalog/pg_type.c

+10 -1
@@ -103,6 +103,7 @@ TypeShellMake(const char *typeName, Oid typeNamespace, Oid ownerId)
     values[Anum_pg_type_typisdefined - 1] = BoolGetDatum(false);
     values[Anum_pg_type_typdelim - 1] = CharGetDatum(DEFAULT_TYPDELIM);
     values[Anum_pg_type_typrelid - 1] = ObjectIdGetDatum(InvalidOid);
+    values[Anum_pg_type_typsubscript - 1] = ObjectIdGetDatum(InvalidOid);
     values[Anum_pg_type_typelem - 1] = ObjectIdGetDatum(InvalidOid);
     values[Anum_pg_type_typarray - 1] = ObjectIdGetDatum(InvalidOid);
     values[Anum_pg_type_typinput - 1] = ObjectIdGetDatum(F_SHELL_IN);
@@ -208,6 +209,7 @@ TypeCreate(Oid newTypeOid,
            Oid typmodinProcedure,
            Oid typmodoutProcedure,
            Oid analyzeProcedure,
+           Oid subscriptProcedure,
            Oid elementType,
            bool isImplicitArray,
            Oid arrayType,
@@ -357,6 +359,7 @@ TypeCreate(Oid newTypeOid,
     values[Anum_pg_type_typisdefined - 1] = BoolGetDatum(true);
     values[Anum_pg_type_typdelim - 1] = CharGetDatum(typDelim);
     values[Anum_pg_type_typrelid - 1] = ObjectIdGetDatum(relationOid);
+    values[Anum_pg_type_typsubscript - 1] = ObjectIdGetDatum(subscriptProcedure);
     values[Anum_pg_type_typelem - 1] = ObjectIdGetDatum(elementType);
     values[Anum_pg_type_typarray - 1] = ObjectIdGetDatum(arrayType);
     values[Anum_pg_type_typinput - 1] = ObjectIdGetDatum(inputProcedure);
@@ -667,7 +670,7 @@ GenerateTypeDependencies(HeapTuple typeTuple,
         recordDependencyOnCurrentExtension(&myself, rebuild);
     }
 
-    /* Normal dependencies on the I/O functions */
+    /* Normal dependencies on the I/O and support functions */
     if (OidIsValid(typeForm->typinput))
     {
         ObjectAddressSet(referenced, ProcedureRelationId, typeForm->typinput);
@@ -710,6 +713,12 @@ GenerateTypeDependencies(HeapTuple typeTuple,
         add_exact_object_address(&referenced, addrs_normal);
     }
 
+    if (OidIsValid(typeForm->typsubscript))
+    {
+        ObjectAddressSet(referenced, ProcedureRelationId, typeForm->typsubscript);
+        add_exact_object_address(&referenced, addrs_normal);
+    }
+
     /* Normal dependency from a domain to its base type. */
     if (OidIsValid(typeForm->typbasetype))
     {
