Skip to content

Commit 27cb66f

Browse files
committed
Multi-column GIN indexes. Teodor Sigaev
1 parent 2d6599f commit 27cb66f

File tree

15 files changed

+468
-185
lines changed

15 files changed

+468
-185
lines changed

doc/src/sgml/indices.sgml

Lines changed: 22 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.73 2008/05/27 00:13:08 tgl Exp $ -->
1+
<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.74 2008/07/11 21:06:28 tgl Exp $ -->
22

33
<chapter id="indexes">
44
<title id="indexes-title">Indexes</title>
@@ -198,7 +198,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
198198
after a database crash.
199199
For these reasons, hash index use is presently discouraged.
200200
</para>
201-
</note>
201+
</note>
202202

203203
<para>
204204
<indexterm>
@@ -250,9 +250,9 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
250250
</indexterm>
251251
GIN indexes are inverted indexes which can handle values that contain more
252252
than one key, arrays for example. Like GiST, GIN can support
253-
many different user-defined indexing strategies and the particular
254-
operators with which a GIN index can be used vary depending on the
255-
indexing strategy.
253+
many different user-defined indexing strategies and the particular
254+
operators with which a GIN index can be used vary depending on the
255+
indexing strategy.
256256
As an example, the standard distribution of
257257
<productname>PostgreSQL</productname> includes GIN operator classes
258258
for one-dimensional arrays, which support indexed
@@ -306,7 +306,7 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
306306
</para>
307307

308308
<para>
309-
Currently, only the B-tree and GiST index types support multicolumn
309+
Currently, only the B-tree, GiST and GIN index types support multicolumn
310310
indexes. Up to 32 columns can be specified. (This limit can be
311311
altered when building <productname>PostgreSQL</productname>; see the
312312
file <filename>pg_config_manual.h</filename>.)
@@ -336,14 +336,21 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
336336

337337
<para>
338338
A multicolumn GiST index can be used with query conditions that
339-
involve any subset of the index's columns. Conditions on additional
340-
columns restrict the entries returned by the index, but the condition on
341-
the first column is the most important one for determining how much of
342-
the index needs to be scanned. A GiST index will be relatively
343-
ineffective if its first column has only a few distinct values, even if
339+
involve any subset of the index's columns. Conditions on additional
340+
columns restrict the entries returned by the index, but the condition on
341+
the first column is the most important one for determining how much of
342+
the index needs to be scanned. A GiST index will be relatively
343+
ineffective if its first column has only a few distinct values, even if
344344
there are many distinct values in additional columns.
345345
</para>
346346

347+
<para>
348+
A multicolumn GIN index can be used with query conditions that
349+
involve any subset of the index's columns. Unlike B-tree or GiST,
350+
index search effectiveness is the same regardless of which index column(s)
351+
the query conditions use.
352+
</para>
353+
347354
<para>
348355
Of course, each column must be used with operators appropriate to the index
349356
type; clauses that involve other operators will not be considered.
@@ -551,7 +558,7 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
551558
<para>
552559
<productname>PostgreSQL</productname> automatically creates a unique
553560
index when a unique constraint or a primary key is defined for a table.
554-
The index covers the columns that make up the primary key or unique
561+
The index covers the columns that make up the primary key or unique
555562
columns (a multicolumn index, if appropriate), and is the mechanism
556563
that enforces the constraint.
557564
</para>
@@ -798,9 +805,9 @@ SELECT * FROM orders WHERE order_nr = 3501;
798805
or the index will not be recognized to be usable. Matching takes
799806
place at query planning time, not at run time. As a result,
800807
parameterized query clauses will not work with a partial index. For
801-
example a prepared query with a parameter might specify
802-
<quote>x &lt; ?</quote> which will never imply
803-
<quote>x &lt; 2</quote> for all possible values of the parameter.
808+
example a prepared query with a parameter might specify
809+
<quote>x &lt; ?</quote> which will never imply
810+
<quote>x &lt; 2</quote> for all possible values of the parameter.
804811
</para>
805812

806813
<para>

doc/src/sgml/ref/create_index.sgml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
<!--
2-
$PostgreSQL: pgsql/doc/src/sgml/ref/create_index.sgml,v 1.67 2008/03/16 23:57:51 tgl Exp $
2+
$PostgreSQL: pgsql/doc/src/sgml/ref/create_index.sgml,v 1.68 2008/07/11 21:06:29 tgl Exp $
33
PostgreSQL documentation
44
-->
55

@@ -394,7 +394,7 @@ Indexes:
394394
</para>
395395

396396
<para>
397-
Currently, only the B-tree and GiST index methods support
397+
Currently, only the B-tree, GiST and GIN index methods support
398398
multicolumn indexes. Up to 32 fields can be specified by default.
399399
(This limit can be altered when building
400400
<productname>PostgreSQL</productname>.) Only B-tree currently
@@ -423,7 +423,7 @@ Indexes:
423423
the optional clauses <literal>ASC</>, <literal>DESC</>, <literal>NULLS
424424
FIRST</>, and/or <literal>NULLS LAST</> can be specified to reverse
425425
the normal sort direction of the index. Since an ordered index can be
426-
scanned either forward or backward, it is not normally useful to create a
426+
scanned either forward or backward, it is not normally useful to create a
427427
single-column <literal>DESC</> index &mdash; that sort ordering is already
428428
available with a regular index. The value of these options is that
429429
multicolumn indexes can be created that match the sort ordering requested

src/backend/access/gin/ginbulk.c

Lines changed: 21 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
* Portions Copyright (c) 1994, Regents of the University of California
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/access/gin/ginbulk.c,v 1.12 2008/06/29 21:04:01 tgl Exp $
11+
* $PostgreSQL: pgsql/src/backend/access/gin/ginbulk.c,v 1.13 2008/07/11 21:06:29 tgl Exp $
1212
*-------------------------------------------------------------------------
1313
*/
1414

@@ -82,16 +82,16 @@ ginInsertData(BuildAccumulator *accum, EntryAccumulator *entry, ItemPointer heap
8282
* palloc'd space in accum.
8383
*/
8484
static Datum
85-
getDatumCopy(BuildAccumulator *accum, Datum value)
85+
getDatumCopy(BuildAccumulator *accum, OffsetNumber attnum, Datum value)
8686
{
87-
Form_pg_attribute *att = accum->ginstate->tupdesc->attrs;
87+
Form_pg_attribute att = accum->ginstate->origTupdesc->attrs[ attnum - 1 ];
8888
Datum res;
8989

90-
if (att[0]->attbyval)
90+
if (att->attbyval)
9191
res = value;
9292
else
9393
{
94-
res = datumCopy(value, false, att[0]->attlen);
94+
res = datumCopy(value, false, att->attlen);
9595
accum->allocatedMemory += GetMemoryChunkSpace(DatumGetPointer(res));
9696
}
9797
return res;
@@ -101,7 +101,7 @@ getDatumCopy(BuildAccumulator *accum, Datum value)
101101
* Find/store one entry from indexed value.
102102
*/
103103
static void
104-
ginInsertEntry(BuildAccumulator *accum, ItemPointer heapptr, Datum entry)
104+
ginInsertEntry(BuildAccumulator *accum, ItemPointer heapptr, OffsetNumber attnum, Datum entry)
105105
{
106106
EntryAccumulator *ea = accum->entries,
107107
*pea = NULL;
@@ -110,7 +110,7 @@ ginInsertEntry(BuildAccumulator *accum, ItemPointer heapptr, Datum entry)
110110

111111
while (ea)
112112
{
113-
res = compareEntries(accum->ginstate, entry, ea->value);
113+
res = compareAttEntries(accum->ginstate, attnum, entry, ea->attnum, ea->value);
114114
if (res == 0)
115115
break; /* found */
116116
else
@@ -132,7 +132,8 @@ ginInsertEntry(BuildAccumulator *accum, ItemPointer heapptr, Datum entry)
132132
ea = EAAllocate(accum);
133133

134134
ea->left = ea->right = NULL;
135-
ea->value = getDatumCopy(accum, entry);
135+
ea->attnum = attnum;
136+
ea->value = getDatumCopy(accum, attnum, entry);
136137
ea->length = DEF_NPTR;
137138
ea->number = 1;
138139
ea->shouldSort = FALSE;
@@ -160,23 +161,24 @@ ginInsertEntry(BuildAccumulator *accum, ItemPointer heapptr, Datum entry)
160161
* then calls itself for each parts
161162
*/
162163
static void
163-
ginChooseElem(BuildAccumulator *accum, ItemPointer heapptr, Datum *entries, uint32 nentry,
164+
ginChooseElem(BuildAccumulator *accum, ItemPointer heapptr, OffsetNumber attnum,
165+
Datum *entries, uint32 nentry,
164166
uint32 low, uint32 high, uint32 offset)
165167
{
166168
uint32 pos;
167169
uint32 middle = (low + high) >> 1;
168170

169171
pos = (low + middle) >> 1;
170172
if (low != middle && pos >= offset && pos - offset < nentry)
171-
ginInsertEntry(accum, heapptr, entries[pos - offset]);
173+
ginInsertEntry(accum, heapptr, attnum, entries[pos - offset]);
172174
pos = (high + middle + 1) >> 1;
173175
if (middle + 1 != high && pos >= offset && pos - offset < nentry)
174-
ginInsertEntry(accum, heapptr, entries[pos - offset]);
176+
ginInsertEntry(accum, heapptr, attnum, entries[pos - offset]);
175177

176178
if (low != middle)
177-
ginChooseElem(accum, heapptr, entries, nentry, low, middle, offset);
179+
ginChooseElem(accum, heapptr, attnum, entries, nentry, low, middle, offset);
178180
if (high != middle + 1)
179-
ginChooseElem(accum, heapptr, entries, nentry, middle + 1, high, offset);
181+
ginChooseElem(accum, heapptr, attnum, entries, nentry, middle + 1, high, offset);
180182
}
181183

182184
/*
@@ -185,7 +187,8 @@ ginChooseElem(BuildAccumulator *accum, ItemPointer heapptr, Datum *entries, uint
185187
* next middle on left part and middle of right part.
186188
*/
187189
void
188-
ginInsertRecordBA(BuildAccumulator *accum, ItemPointer heapptr, Datum *entries, int32 nentry)
190+
ginInsertRecordBA(BuildAccumulator *accum, ItemPointer heapptr, OffsetNumber attnum,
191+
Datum *entries, int32 nentry)
189192
{
190193
uint32 i,
191194
nbit = 0,
@@ -201,8 +204,8 @@ ginInsertRecordBA(BuildAccumulator *accum, ItemPointer heapptr, Datum *entries,
201204
nbit = 1 << nbit;
202205
offset = (nbit - nentry) / 2;
203206

204-
ginInsertEntry(accum, heapptr, entries[(nbit >> 1) - offset]);
205-
ginChooseElem(accum, heapptr, entries, nentry, 0, nbit, offset);
207+
ginInsertEntry(accum, heapptr, attnum, entries[(nbit >> 1) - offset]);
208+
ginChooseElem(accum, heapptr, attnum, entries, nentry, 0, nbit, offset);
206209
}
207210

208211
static int
@@ -259,7 +262,7 @@ walkTree(BuildAccumulator *accum)
259262
}
260263

261264
ItemPointerData *
262-
ginGetEntry(BuildAccumulator *accum, Datum *value, uint32 *n)
265+
ginGetEntry(BuildAccumulator *accum, OffsetNumber *attnum, Datum *value, uint32 *n)
263266
{
264267
EntryAccumulator *entry;
265268
ItemPointerData *list;
@@ -299,6 +302,7 @@ ginGetEntry(BuildAccumulator *accum, Datum *value, uint32 *n)
299302
return NULL;
300303

301304
*n = entry->number;
305+
*attnum = entry->attnum;
302306
*value = entry->value;
303307
list = entry->list;
304308

src/backend/access/gin/ginentrypage.c

Lines changed: 33 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
* Portions Copyright (c) 1994, Regents of the University of California
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/access/gin/ginentrypage.c,v 1.16 2008/06/19 00:46:03 alvherre Exp $
11+
* $PostgreSQL: pgsql/src/backend/access/gin/ginentrypage.c,v 1.17 2008/07/11 21:06:29 tgl Exp $
1212
*-------------------------------------------------------------------------
1313
*/
1414

@@ -38,14 +38,27 @@
3838
* - ItemPointerGetBlockNumber(&itup->t_tid) contains block number of
3939
* root of posting tree
4040
* - ItemPointerGetOffsetNumber(&itup->t_tid) contains magic number GIN_TREE_POSTING
41+
*
42+
* Storage of attributes of tuple are different for single and multicolumn index.
43+
* For single-column index tuple stores only value to be indexed and for
44+
* multicolumn variant it stores two attributes: column number of value and value.
4145
*/
4246
IndexTuple
43-
GinFormTuple(GinState *ginstate, Datum key, ItemPointerData *ipd, uint32 nipd)
47+
GinFormTuple(GinState *ginstate, OffsetNumber attnum, Datum key, ItemPointerData *ipd, uint32 nipd)
4448
{
45-
bool isnull = FALSE;
49+
bool isnull[2] = {FALSE,FALSE};
4650
IndexTuple itup;
4751

48-
itup = index_form_tuple(ginstate->tupdesc, &key, &isnull);
52+
if ( ginstate->oneCol )
53+
itup = index_form_tuple(ginstate->origTupdesc, &key, isnull);
54+
else
55+
{
56+
Datum datums[2];
57+
58+
datums[0] = UInt16GetDatum(attnum);
59+
datums[1] = key;
60+
itup = index_form_tuple(ginstate->tupdesc[attnum-1], datums, isnull);
61+
}
4962

5063
GinSetOrigSizePosting(itup, IndexTupleSize(itup));
5164

@@ -88,28 +101,20 @@ getRightMostTuple(Page page)
88101
return (IndexTuple) PageGetItem(page, PageGetItemId(page, maxoff));
89102
}
90103

91-
Datum
92-
ginGetHighKey(GinState *ginstate, Page page)
93-
{
94-
IndexTuple itup;
95-
bool isnull;
96-
97-
itup = getRightMostTuple(page);
98-
99-
return index_getattr(itup, FirstOffsetNumber, ginstate->tupdesc, &isnull);
100-
}
101-
102104
static bool
103105
entryIsMoveRight(GinBtree btree, Page page)
104106
{
105-
Datum highkey;
107+
IndexTuple itup;
106108

107109
if (GinPageRightMost(page))
108110
return FALSE;
109111

110-
highkey = ginGetHighKey(btree->ginstate, page);
112+
itup = getRightMostTuple(page);
111113

112-
if (compareEntries(btree->ginstate, btree->entryValue, highkey) > 0)
114+
if (compareAttEntries(btree->ginstate,
115+
btree->entryAttnum, btree->entryValue,
116+
gintuple_get_attrnum(btree->ginstate, itup),
117+
gin_index_getattr(btree->ginstate, itup)) > 0)
113118
return TRUE;
114119

115120
return FALSE;
@@ -154,11 +159,11 @@ entryLocateEntry(GinBtree btree, GinBtreeStack *stack)
154159
result = -1;
155160
else
156161
{
157-
bool isnull;
158-
159162
itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, mid));
160-
result = compareEntries(btree->ginstate, btree->entryValue,
161-
index_getattr(itup, FirstOffsetNumber, btree->ginstate->tupdesc, &isnull));
163+
result = compareAttEntries(btree->ginstate,
164+
btree->entryAttnum, btree->entryValue,
165+
gintuple_get_attrnum(btree->ginstate, itup),
166+
gin_index_getattr(btree->ginstate, itup));
162167
}
163168

164169
if (result == 0)
@@ -217,13 +222,13 @@ entryLocateLeafEntry(GinBtree btree, GinBtreeStack *stack)
217222
while (high > low)
218223
{
219224
OffsetNumber mid = low + ((high - low) / 2);
220-
bool isnull;
221225
int result;
222226

223227
itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, mid));
224-
result = compareEntries(btree->ginstate, btree->entryValue,
225-
index_getattr(itup, FirstOffsetNumber, btree->ginstate->tupdesc, &isnull));
226-
228+
result = compareAttEntries(btree->ginstate,
229+
btree->entryAttnum, btree->entryValue,
230+
gintuple_get_attrnum(btree->ginstate, itup),
231+
gin_index_getattr(btree->ginstate, itup));
227232
if (result == 0)
228233
{
229234
stack->off = mid;
@@ -587,7 +592,7 @@ entryFillRoot(GinBtree btree, Buffer root, Buffer lbuf, Buffer rbuf)
587592
}
588593

589594
void
590-
prepareEntryScan(GinBtree btree, Relation index, Datum value, GinState *ginstate)
595+
prepareEntryScan(GinBtree btree, Relation index, OffsetNumber attnum, Datum value, GinState *ginstate)
591596
{
592597
memset(btree, 0, sizeof(GinBtreeData));
593598

@@ -603,6 +608,7 @@ prepareEntryScan(GinBtree btree, Relation index, Datum value, GinState *ginstate
603608

604609
btree->index = index;
605610
btree->ginstate = ginstate;
611+
btree->entryAttnum = attnum;
606612
btree->entryValue = value;
607613

608614
btree->isDelete = FALSE;

0 commit comments

Comments
 (0)