|
78 | 78 |
|
79 | 79 | <para>
|
80 | 80 | All it takes to get a <acronym>GiST</acronym> access method up and running
|
81 |
| - is to implement seven user-defined methods, which define the behavior of |
| 81 | + is to implement several user-defined methods, which define the behavior of |
82 | 82 | keys in the tree. Of course these methods have to be pretty fancy to
|
83 | 83 | support fancy queries, but for all the standard queries (B-trees,
|
84 | 84 | R-trees, etc.) they're relatively straightforward. In short,
|
|
93 | 93 |
|
94 | 94 | <para>
|
95 | 95 | There are seven methods that an index operator class for
|
96 |
| - <acronym>GiST</acronym> must provide. Correctness of the index is ensured |
| 96 | + <acronym>GiST</acronym> must provide, and an eighth that is optional. |
| 97 | + Correctness of the index is ensured |
97 | 98 | by proper implementation of the <function>same</>, <function>consistent</>
|
98 | 99 | and <function>union</> methods, while efficiency (size and speed) of the
|
99 | 100 | index will depend on the <function>penalty</> and <function>picksplit</>
|
100 | 101 | methods.
|
101 |
| - The remaining two methods are <function>compress</> and |
| 102 | + The remaining two basic methods are <function>compress</> and |
102 | 103 | <function>decompress</>, which allow an index to have internal tree data of
|
103 | 104 | a different type than the data it indexes. The leaves are to be of the
|
104 | 105 | indexed data type, while the other tree nodes can be of any C struct (but
|
105 | 106 | you still have to follow <productname>PostgreSQL</> data type rules here,
|
106 | 107 | see about <literal>varlena</> for variable sized data). If the tree's
|
107 | 108 | internal data type exists at the SQL level, the <literal>STORAGE</> option
|
108 | 109 | of the <command>CREATE OPERATOR CLASS</> command can be used.
|
| 110 | + The optional eighth method is <function>distance</>, which is needed |
| 111 | + if the operator class wishes to support ordered scans (nearest-neighbor |
| 112 | + searches). |
109 | 113 | </para>
|
110 | 114 |
|
111 | 115 | <variablelist>
|
@@ -567,6 +571,73 @@ my_same(PG_FUNCTION_ARGS)
|
567 | 571 | </listitem>
|
568 | 572 | </varlistentry>
|
569 | 573 |
|
| 574 | + <varlistentry> |
| 575 | + <term><function>distance</></term> |
| 576 | + <listitem> |
| 577 | + <para> |
| 578 | + Given an index entry <literal>p</> and a query value <literal>q</>, |
| 579 | + this function determines the index entry's |
| 580 | + <quote>distance</> from the query value. This function must be |
| 581 | + supplied if the operator class contains any ordering operators. |
| 582 | + A query using the ordering operator will be implemented by returning |
| 583 | + index entries with the smallest <quote>distance</> values first, |
| 584 | + so the results must be consistent with the operator's semantics. |
| 585 | + For a leaf index entry the result just represents the distance to |
| 586 | + the index entry; for an internal tree node, the result must be the |
| 587 | + smallest distance that any child entry could have. |
| 588 | + </para> |
| 589 | + |
| 590 | + <para> |
| 591 | + The <acronym>SQL</> declaration of the function must look like this: |
| 592 | + |
| 593 | +<programlisting> |
| 594 | +CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid) |
| 595 | +RETURNS float8 |
| 596 | +AS 'MODULE_PATHNAME' |
| 597 | +LANGUAGE C STRICT; |
| 598 | +</programlisting> |
| 599 | + |
| 600 | + And the matching code in the C module could then follow this skeleton: |
| 601 | + |
| 602 | +<programlisting> |
| 603 | +Datum my_distance(PG_FUNCTION_ARGS); |
| 604 | +PG_FUNCTION_INFO_V1(my_distance); |
| 605 | + |
| 606 | +Datum |
| 607 | +my_distance(PG_FUNCTION_ARGS) |
| 608 | +{ |
| 609 | + GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0); |
| 610 | + data_type *query = PG_GETARG_DATA_TYPE_P(1); |
| 611 | + StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2); |
| 612 | + /* Oid subtype = PG_GETARG_OID(3); */ |
| 613 | + data_type *key = DatumGetDataType(entry->key); |
| 614 | + double retval; |
| 615 | + |
| 616 | + /* |
| 617 | + * determine return value as a function of strategy, key and query. |
| 618 | + */ |
| 619 | + |
| 620 | + PG_RETURN_FLOAT8(retval); |
| 621 | +} |
| 622 | +</programlisting> |
| 623 | + |
| 624 | + The arguments to the <function>distance</> function are identical to |
| 625 | + the arguments of the <function>consistent</> function, except that no |
| 626 | + recheck flag is used. The distance to a leaf index entry must always |
| 627 | + be determined exactly, since there is no way to re-order the tuples |
| 628 | + once they are returned. Some approximation is allowed when determining |
| 629 | + the distance to an internal tree node, so long as the result is never |
| 630 | + greater than any child's actual distance. Thus, for example, distance |
| 631 | + to a bounding box is usually sufficient in geometric applications. The |
| 632 | + result value can be any finite <type>float8</> value. (Infinity and |
| 633 | + minus infinity are used internally to handle cases such as nulls, so it |
| 634 | + is not recommended that <function>distance</> functions return these |
| 635 | + values.) |
| 636 | + </para> |
| 637 | + |
| 638 | + </listitem> |
| 639 | + </varlistentry> |
| 640 | + |
570 | 641 | </variablelist>
|
571 | 642 |
|
572 | 643 | </sect1>
|
|
0 commit comments