<!--
- $PostgreSQL: pgsql/doc/src/sgml/indexam.sgml,v 2.2 2005/03/21 01:23:55 tgl Exp $
+ $PostgreSQL: pgsql/doc/src/sgml/indexam.sgml,v 2.3 2005/03/27 23:52:51 tgl Exp $
-->

<chapter id="indexam">
@@ -252,6 +252,28 @@ amgettuple (IndexScanDesc scan,

<para>
<programlisting>
+ boolean
+ amgetmulti (IndexScanDesc scan,
+ ItemPointer tids,
+ int32 max_tids,
+ int32 *returned_tids);
+ </programlisting>
+ Fetch multiple tuples in the given scan. Returns TRUE if the scan should
+ continue, FALSE if no matching tuples remain. <literal>tids</> points to
+ a caller-supplied array of <literal>max_tids</>
+ <structname>ItemPointerData</> records, which the call fills with TIDs of
+ matching tuples. <literal>*returned_tids</> is set to the number of TIDs
+ actually returned. This can be less than <literal>max_tids</>, or even
+ zero, even when the return value is TRUE. (This provision allows the
+ access method to choose the most efficient stopping points in its scan,
+ for example index page boundaries.) <function>amgetmulti</> and
+ <function>amgettuple</> cannot be used in the same index scan; there
+ are other restrictions too when using <function>amgetmulti</>, as explained
+ in <xref linkend="index-scanning">.
+ </para>
+
+ <para>
+ <programlisting>
void
amrescan (IndexScanDesc scan,
ScanKey key);
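The calling convention added in the hunk above can be illustrated with a small caller loop. This is only a sketch under stated assumptions: `MockTid`, `MockScan`, `mock_getmulti`, and `collect_all` are hypothetical stand-ins invented for illustration and are not PostgreSQL types or functions. The mock stops each batch at a made-up "index page" boundary, and the loop assumes that any TIDs delivered together with a FALSE return must still be consumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ItemPointerData; NOT the PostgreSQL definition. */
typedef struct { uint32_t block; uint16_t offset; } MockTid;

/* Hypothetical scan state: a flat list of matches split into fixed-size "pages". */
typedef struct {
    const MockTid *matches;   /* all matching TIDs, in index order */
    size_t nmatches;          /* number of entries in matches */
    size_t next;              /* index of the next match to hand out */
    size_t page_size;         /* matches per mock index page (assumption) */
} MockScan;

/*
 * Mock access-method routine following the amgetmulti-style contract:
 * fill tids[] with at most max_tids entries, set *returned_tids, and
 * return true while matches remain. A batch ends early at a mock page
 * boundary, so it may hold fewer than max_tids entries.
 */
static bool
mock_getmulti(MockScan *scan, MockTid *tids, int32_t max_tids,
              int32_t *returned_tids)
{
    assert(max_tids > 0 && scan->page_size > 0);
    *returned_tids = 0;

    size_t page_end = (scan->next / scan->page_size + 1) * scan->page_size;
    if (page_end > scan->nmatches)
        page_end = scan->nmatches;

    while (scan->next < page_end && *returned_tids < max_tids)
        tids[(*returned_tids)++] = scan->matches[scan->next++];

    return scan->next < scan->nmatches;
}

/*
 * Caller loop: keep calling until the routine reports no more matches.
 * A short or empty batch is not treated as end of scan; only the
 * return value decides that.
 */
static size_t
collect_all(MockScan *scan, MockTid *out, size_t out_cap)
{
    MockTid batch[4];
    size_t total = 0;
    bool more = true;

    while (more) {
        int32_t n = 0;
        more = mock_getmulti(scan, batch, 4, &n);
        for (int32_t i = 0; i < n && total < out_cap; i++)
            out[total++] = batch[i];
    }
    return total;
}
```

With seven matches and a mock page size of three, `collect_all` receives batches of 3, 3, and 1 even though it asks for up to 4 each time, which is the "fewer than `max_tids`" case the new paragraph describes.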
@@ -297,7 +319,6 @@ amrestrpos (IndexScanDesc scan);
<programlisting>
void
amcostestimate (Query *root,
- RelOptInfo *rel,
IndexOptInfo *index,
List *indexQuals,
Cost *indexStartupCost,
@@ -407,6 +428,25 @@ amcostestimate (Query *root,
true, insertions or deletions from other backends must be handled as well.)
</para>

+ <para>
+ Instead of using <function>amgettuple</>, an index scan can be done with
+ <function>amgetmulti</> to fetch multiple tuples per call. This can be
+ noticeably more efficient than <function>amgettuple</> because it allows
+ avoiding lock/unlock cycles within the access method. In principle
+ <function>amgetmulti</> should have the same effects as repeated
+ <function>amgettuple</> calls, but we impose several restrictions to
+ simplify matters. In the first place, <function>amgetmulti</> does not
+ take a <literal>direction</> argument, and therefore it does not support
+ backwards scan nor intrascan reversal of direction. The access method
+ need not support marking or restoring scan positions during an
+ <function>amgetmulti</> scan, either. (These restrictions cost little
+ since it would be difficult to use these features in an
+ <function>amgetmulti</> scan anyway: adjusting the caller's buffered
+ list of TIDs would be complex.) Finally, <function>amgetmulti</> does
+ not guarantee any locking of the returned tuples, with implications
+ spelled out in <xref linkend="index-locking">.
+ </para>
+
</sect1>

<sect1 id="index-locking">
@@ -515,10 +555,15 @@ amcostestimate (Query *root,
and only visit the heap tuples sometime later, requires much less index
locking overhead and may allow a more efficient heap access pattern.
Per the above analysis, we must use the synchronous approach for
- non-MVCC-compliant snapshots, but an asynchronous scan would be safe
- for a query using an MVCC snapshot. This possibility is not exploited
- as of <productname>PostgreSQL</productname> 8.0, but it is likely to be
- investigated soon.
+ non-MVCC-compliant snapshots, but an asynchronous scan is workable
+ for a query using an MVCC snapshot.
+ </para>
+
+ <para>
+ In an <function>amgetmulti</> index scan, the access method need not
+ guarantee to keep an index pin on any of the returned tuples. (It would be
+ impractical to pin more than the last one anyway.) Therefore
+ it is only safe to use such scans with MVCC-compliant snapshots.
</para>

</sect1>
@@ -611,7 +656,6 @@ amcostestimate (Query *root,
<programlisting>
void
amcostestimate (Query *root,
- RelOptInfo *rel,
IndexOptInfo *index,
List *indexQuals,
Cost *indexStartupCost,
@@ -632,20 +676,11 @@ amcostestimate (Query *root,
</listitem>
</varlistentry>

- <varlistentry>
- <term>rel</term>
- <listitem>
- <para>
- The relation the index is on.
- </para>
- </listitem>
- </varlistentry>
-
<varlistentry>
<term>index</term>
<listitem>
<para>
- The index itself.
+ The index being considered.
</para>
</listitem>
</varlistentry>
@@ -714,19 +749,19 @@ amcostestimate (Query *root,

<para>
The index access costs should be computed in the units used by
- <filename>src/backend/optimizer/path/costsize.c</filename>: a sequential disk block fetch
- has cost 1.0, a nonsequential fetch has cost random_page_cost, and
- the cost of processing one index row should usually be taken as
- cpu_index_tuple_cost (which is a user-adjustable optimizer parameter).
- In addition, an appropriate multiple of cpu_operator_cost should be charged
+ <filename>src/backend/optimizer/path/costsize.c</filename>: a sequential
+ disk block fetch has cost 1.0, a nonsequential fetch has cost
+ <varname>random_page_cost</>, and the cost of processing one index row
+ should usually be taken as <varname>cpu_index_tuple_cost</>. In addition,
+ an appropriate multiple of <varname>cpu_operator_cost</> should be charged
for any comparison operators invoked during index processing (especially
evaluation of the indexQuals themselves).
</para>

<para>
The access costs should include all disk and CPU costs associated with
- scanning the index itself, but NOT the costs of retrieving or processing
- the parent-table rows that are identified by the index.
+ scanning the index itself, but <emphasis>not</> the costs of retrieving or
+ processing the parent-table rows that are identified by the index.
</para>

<para>
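The cost units described in the hunk above can be made concrete with a little arithmetic. The following is a rough sketch, not the planner's code: `MockIndexWork`, `MockCostParams`, and `mock_index_access_cost` are hypothetical names invented here, and the page and row counts are assumed to be supplied by the caller. It charges 1.0 per sequential page, `random_page_cost` per nonsequential page, and per-row CPU cost plus a multiple of `cpu_operator_cost`, and it deliberately leaves out any parent-table cost.

```c
#include <assert.h>

/* Hypothetical description of the work an index scan does (assumed inputs). */
typedef struct {
    double seq_pages;      /* index pages fetched sequentially */
    double random_pages;   /* index pages fetched nonsequentially */
    double index_rows;     /* index rows processed */
    double ops_per_row;    /* comparison operators evaluated per index row */
} MockIndexWork;

/* Hypothetical container for the three cost parameters named in the text. */
typedef struct {
    double random_page_cost;
    double cpu_index_tuple_cost;
    double cpu_operator_cost;
} MockCostParams;

/*
 * Cost of scanning the index itself, in units where one sequential
 * disk block fetch costs 1.0. Heap (parent-table) access is NOT included.
 */
static double
mock_index_access_cost(const MockIndexWork *w, const MockCostParams *p)
{
    assert(w->seq_pages >= 0 && w->random_pages >= 0);
    assert(w->index_rows >= 0 && w->ops_per_row >= 0);

    double disk = w->seq_pages * 1.0
                + w->random_pages * p->random_page_cost;
    double cpu = w->index_rows
               * (p->cpu_index_tuple_cost
                  + w->ops_per_row * p->cpu_operator_cost);

    return disk + cpu;
}
```

For example, with 10 sequential pages, 2 random pages, 1000 index rows, one operator per row, and illustrative parameter values of 4.0, 0.001, and 0.0025, the disk part is 18 and the CPU part is about 3.5, for a total of roughly 21.5.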
@@ -764,7 +799,7 @@ amcostestimate (Query *root,

<programlisting>
*indexSelectivity = clauselist_selectivity(root, indexQuals,
- rel->relid, JOIN_INNER);
+ index->rel->relid, JOIN_INNER);
</programlisting>
</para>
</step>