|
1 | 1 | <!--
|
2 |
| -$Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.23 2003/01/12 18:42:59 tgl Exp $ |
| 2 | +$Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.24 2003/01/25 23:10:27 tgl Exp $ |
3 | 3 | -->
|
4 | 4 |
|
5 | 5 | <chapter id="performance-tips">
|
@@ -591,53 +591,93 @@ SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
|
591 | 591 | </para>
|
592 | 592 |
|
593 | 593 | <para>
|
594 |
| - The <productname>PostgreSQL</productname> query planner treats all |
595 |
| - explicit <literal>JOIN</> syntaxes as constraining the join order, even though |
596 |
| - it is not logically necessary to make such a constraint for inner |
597 |
| - joins. Therefore, although all of these queries give the same result: |
| 594 | + Explicit inner join syntax (<literal>INNER JOIN</>, <literal>CROSS |
| 595 | + JOIN</>, or unadorned <literal>JOIN</>) is semantically the same as |
| 596 | + listing the input relations in <literal>FROM</>, so it does not need to |
| 597 | + constrain the join order. But it is possible to instruct the |
| 598 | + <productname>PostgreSQL</productname> query planner to treat |
| 599 | + explicit inner <literal>JOIN</>s as constraining the join order anyway. |
| 600 | + For example, these three queries are logically equivalent: |
598 | 601 | <programlisting>
|
599 | 602 | SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
|
600 | 603 | SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE a.id = b.id AND b.ref = c.id;
|
601 | 604 | SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
|
602 | 605 | </programlisting>
|
| 606 | + But if we tell the planner to honor the <literal>JOIN</> order, |
603 | 607 | the second and third take less time to plan than the first. This effect
|
604 | 608 | is not worth worrying about for only three tables, but it can be a
|
605 | 609 | lifesaver with many tables.
|
606 | 610 | </para>
|
607 | 611 |
|
| 612 | + <para> |
| 613 | + To force the planner to follow the <literal>JOIN</> order for inner joins, |
| 614 | + set the <varname>JOIN_COLLAPSE_LIMIT</> run-time parameter to 1. |
| 615 | + (Other possible values are discussed below.) |
| 616 | + </para> |
| 617 | + |
608 | 618 | <para>
|
609 | 619 | You do not need to constrain the join order completely in order to
|
610 |
| - cut search time, because it's OK to use <literal>JOIN</> operators in a plain |
611 |
| - <literal>FROM</> list. For example, |
| 620 | + cut search time, because it's OK to use <literal>JOIN</> operators |
| 621 | + within items of a plain <literal>FROM</> list. For example, consider |
612 | 622 | <programlisting>
|
613 | 623 | SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
|
614 | 624 | </programlisting>
|
| 625 | + With <varname>JOIN_COLLAPSE_LIMIT</> = 1, this |
615 | 626 | forces the planner to join A to B before joining them to other tables,
|
616 | 627 | but doesn't constrain its choices otherwise. In this example, the
|
617 | 628 | number of possible join orders is reduced by a factor of 5.
|
618 | 629 | </para>
|
619 | 630 |
|
620 | 631 | <para>
|
621 |
| - If you have a mix of outer and inner joins in a complex query, you |
622 |
| - might not want to constrain the planner's search for a good ordering |
623 |
| - of inner joins inside an outer join. You can't do that directly in the |
624 |
| - <literal>JOIN</> syntax, but you can get around the syntactic limitation by using |
625 |
| - subselects. For example, |
| 632 | + Constraining the planner's search in this way is a useful technique |
| 633 | + both for reducing planning time and for directing the planner to a |
| 634 | + good query plan. If the planner chooses a bad join order by default, |
| 635 | + you can force it to choose a better order via <literal>JOIN</> syntax |
| 636 | + --- assuming that you know of a better order, that is. Experimentation |
| 637 | + is recommended. |
| 638 | + </para> |
| 639 | + |
| 640 | + <para> |
| 641 | + A closely related issue that affects planning time is collapsing of |
| 642 | + sub-SELECTs into their parent query. For example, consider |
| 643 | +<programlisting> |
| 644 | +SELECT * |
| 645 | +FROM x, y, |
| 646 | + (SELECT * FROM a, b, c WHERE something) AS ss |
| 647 | +WHERE somethingelse |
| 648 | +</programlisting> |
| 649 | + This situation might arise from use of a view that contains a join; |
| 650 | + the view's SELECT rule will be inserted in place of the view reference, |
| 651 | + yielding a query much like the above. Normally, the planner will try |
| 652 | + to collapse the sub-query into the parent, yielding |
626 | 653 | <programlisting>
|
627 |
| -SELECT * FROM d LEFT JOIN |
628 |
| - (SELECT * FROM a, b, c WHERE ...) AS ss |
629 |
| - ON (...); |
| 654 | +SELECT * FROM x, y, a, b, c WHERE something AND somethingelse |
630 | 655 | </programlisting>
|
631 |
| - Here, joining to D must be the last step in the query plan, but the |
632 |
| - planner is free to consider various join orders for A, B, and C. |
| 656 | + This usually results in a better plan than planning the sub-query |
| 657 | + separately. (For example, the outer WHERE conditions might be such that |
| 658 | + joining X to A first eliminates many rows of A, thus avoiding the need to |
| 659 | + form the full logical output of the sub-select.) But at the same time, |
| 660 | + we have increased the planning time; here, we have a five-way join |
| 661 | + problem replacing two separate three-way join problems. Because of the |
| 662 | + exponential growth of the number of possibilities, this makes a big |
| 663 | + difference. The planner tries to avoid getting stuck in huge join search |
| 664 | + problems by not collapsing a sub-query if more than |
| 665 | + <varname>FROM_COLLAPSE_LIMIT</> FROM-items would result in the parent |
| 666 | + query. You can trade off planning time against quality of plan by |
| 667 | + adjusting this run-time parameter up or down. |
633 | 668 | </para>
|
634 | 669 |
|
635 | 670 | <para>
|
636 |
| - Constraining the planner's search in this way is a useful technique |
637 |
| - both for reducing planning time and for directing the planner to a |
638 |
| - good query plan. If the planner chooses a bad join order by default, |
639 |
| - you can force it to choose a better order via <literal>JOIN</> syntax --- assuming |
640 |
| - that you know of a better order, that is. Experimentation is recommended. |
| 671 | + <varname>FROM_COLLAPSE_LIMIT</> and <varname>JOIN_COLLAPSE_LIMIT</> |
| 672 | + are similarly named because they do almost the same thing: one controls |
| 673 | + when the planner will <quote>flatten out</> sub-SELECTs, and the |
| 674 | + other controls when it will flatten out explicit inner JOINs. Typically |
| 675 | + you would either set <varname>JOIN_COLLAPSE_LIMIT</> equal to |
| 676 | + <varname>FROM_COLLAPSE_LIMIT</> (so that explicit JOINs and sub-SELECTs |
| 677 | + act similarly) or set <varname>JOIN_COLLAPSE_LIMIT</> to 1 (if you want |
| 678 | + to control join order with explicit JOINs). But you might set them |
| 679 | + differently if you are trying to fine-tune the tradeoff between planning |
| 680 | + time and run time. |
641 | 681 | </para>
|
642 | 682 | </sect1>
|
643 | 683 |
|
|
0 commit comments