@@ -2011,6 +2011,10 @@ GROUP BY region, product;
2011
2011
but we'd have needed two levels of nested sub-<command>SELECT</command>s. It's a bit
2012
2012
easier to follow this way.
2013
2013
</para>
2014
+ </sect2>
2015
+
2016
+ <sect2 id="queries-with-recursive">
2017
+ <title>Recursive Queries</title>
2014
2018
2015
2019
<para>
2016
2020
<indexterm>
@@ -2114,6 +2118,120 @@ GROUP BY sub_part
2114
2118
</programlisting>
2115
2119
</para>
2116
2120
2121
+ <sect3 id="queries-with-search">
2122
+ <title>Search Order</title>
2123
+
2124
+ <para>
2125
+ When computing a tree traversal using a recursive query, you might want to
2126
+ order the results in either depth-first or breadth-first order. This can
2127
+ be done by computing an ordering column alongside the other data columns
2128
+ and using that to sort the results at the end. Note that this does not
2129
+ actually control in which order the query evaluation visits the rows; that
2130
+ is as always in SQL implementation-dependent. This approach merely
2131
+ provides a convenient way to order the results afterwards.
2132
+ </para>
2133
+
2134
+ <para>
2135
+ To create a depth-first order, we compute for each result row an array of
2136
+ rows that we have visited so far. For example, consider the following
2137
+ query that searches a table <structname>tree</structname> using a
2138
+ <structfield>link</structfield> field:
2139
+
2140
+ <programlisting>
2141
+ WITH RECURSIVE search_tree(id, link, data) AS (
2142
+ SELECT t.id, t.link, t.data
2143
+ FROM tree t
2144
+ UNION ALL
2145
+ SELECT t.id, t.link, t.data
2146
+ FROM tree t, search_tree st
2147
+ WHERE t.id = st.link
2148
+ )
2149
+ SELECT * FROM search_tree;
2150
+ </programlisting>
2151
+
2152
+ To add depth-first ordering information, you can write this:
2153
+
2154
+ <programlisting>
2155
+ WITH RECURSIVE search_tree(id, link, data, <emphasis>path</emphasis>) AS (
2156
+ SELECT t.id, t.link, t.data, <emphasis>ARRAY[t.id]</emphasis>
2157
+ FROM tree t
2158
+ UNION ALL
2159
+ SELECT t.id, t.link, t.data, <emphasis>path || t.id</emphasis>
2160
+ FROM tree t, search_tree st
2161
+ WHERE t.id = st.link
2162
+ )
2163
+ SELECT * FROM search_tree <emphasis>ORDER BY path</emphasis>;
2164
+ </programlisting>
2165
+ </para>
2166
+
2167
+ <para>
2168
+ In the general case where more than one field needs to be used to identify
2169
+ a row, use an array of rows. For example, if we needed to track fields
2170
+ <structfield>f1</structfield> and <structfield>f2</structfield>:
2171
+
2172
+ <programlisting>
2173
+ WITH RECURSIVE search_tree(id, link, data, <emphasis>path</emphasis>) AS (
2174
+ SELECT t.id, t.link, t.data, <emphasis>ARRAY[ROW(t.f1, t.f2)]</emphasis>
2175
+ FROM tree t
2176
+ UNION ALL
2177
+ SELECT t.id, t.link, t.data, <emphasis>path || ROW(t.f1, t.f2)</emphasis>
2178
+ FROM tree t, search_tree st
2179
+ WHERE t.id = st.link
2180
+ )
2181
+ SELECT * FROM search_tree <emphasis>ORDER BY path</emphasis>;
2182
+ </programlisting>
2183
+ </para>
2184
+
2185
+ <note>
2186
+ <para>
2187
+ The queries shown in this and the following section involving
2188
+ <literal>ROW</literal> constructors in the target list only support
2189
+ <literal>UNION ALL</literal> (not plain <literal>UNION</literal>) in the
2190
+ current implementation.
2191
+ </para>
2192
+ </note>
2193
+
2194
+ <tip>
2195
+ <para>
2196
+ Omit the <literal>ROW()</literal> syntax in the common case where only one
2197
+ field needs to be tracked. This allows a simple array rather than a
2198
+ composite-type array to be used, gaining efficiency.
2199
+ </para>
2200
+ </tip>
2201
+
2202
+ <para>
2203
+ To create a breadth-first order, you can add a column that tracks the depth
2204
+ of the search, for example:
2205
+
2206
+ <programlisting>
2207
+ WITH RECURSIVE search_tree(id, link, data, <emphasis>depth</emphasis>) AS (
2208
+ SELECT t.id, t.link, t.data, <emphasis>0</emphasis>
2209
+ FROM tree t
2210
+ UNION ALL
2211
+ SELECT t.id, t.link, t.data, <emphasis>depth + 1</emphasis>
2212
+ FROM tree t, search_tree st
2213
+ WHERE t.id = st.link
2214
+ )
2215
+ SELECT * FROM search_tree <emphasis>ORDER BY depth</emphasis>;
2216
+ </programlisting>
2217
+
2218
+ To get a stable sort, add data columns as secondary sorting columns.
2219
+ </para>
2220
+
2221
+ <tip>
2222
+ <para>
2223
+ The recursive query evaluation algorithm produces its output in
2224
+ breadth-first search order. However, this is an implementation detail and
2225
+ it is perhaps unsound to rely on it. The order of the rows within each
2226
+ level is certainly undefined, so some explicit ordering might be desired
2227
+ in any case.
2228
+ </para>
2229
+ </tip>
2230
+ </sect3>
2231
+
2232
+ <sect3 id="queries-with-cycle">
2233
+ <title>Cycle Detection</title>
2234
+
2117
2235
<para>
2118
2236
When working with recursive queries it is important to be sure that
2119
2237
the recursive part of the query will eventually return no tuples,
@@ -2123,13 +2241,13 @@ GROUP BY sub_part
2123
2241
cycle does not involve output rows that are completely duplicate: it may be
2124
2242
necessary to check just one or a few fields to see if the same point has
2125
2243
been reached before. The standard method for handling such situations is
2126
- to compute an array of the already-visited values. For example, consider
2244
+ to compute an array of the already-visited values. For example, consider again
2127
2245
the following query that searches a table <structname>graph</structname> using a
2128
2246
<structfield>link</structfield> field:
2129
2247
2130
2248
<programlisting>
2131
2249
WITH RECURSIVE search_graph(id, link, data, depth) AS (
2132
- SELECT g.id, g.link, g.data, 1
2250
+ SELECT g.id, g.link, g.data, 0
2133
2251
FROM graph g
2134
2252
UNION ALL
2135
2253
SELECT g.id, g.link, g.data, sg.depth + 1
@@ -2147,17 +2265,17 @@ SELECT * FROM search_graph;
2147
2265
<structfield>is_cycle</structfield> and <structfield>path</structfield> to the loop-prone query:
2148
2266
2149
2267
<programlisting>
2150
- WITH RECURSIVE search_graph(id, link, data, depth, is_cycle, path) AS (
2151
- SELECT g.id, g.link, g.data, 1 ,
2152
- false,
2153
- ARRAY[g.id]
2268
+ WITH RECURSIVE search_graph(id, link, data, depth, <emphasis> is_cycle, path</emphasis> ) AS (
2269
+ SELECT g.id, g.link, g.data, 0 ,
2270
+ <emphasis> false,
2271
+ ARRAY[g.id]</emphasis>
2154
2272
FROM graph g
2155
2273
UNION ALL
2156
2274
SELECT g.id, g.link, g.data, sg.depth + 1,
2157
- g.id = ANY(path),
2158
- path || g.id
2275
+ <emphasis> g.id = ANY(path),
2276
+ path || g.id</emphasis>
2159
2277
FROM graph g, search_graph sg
2160
- WHERE g.id = sg.link AND NOT is_cycle
2278
+ WHERE g.id = sg.link <emphasis> AND NOT is_cycle</emphasis>
2161
2279
)
2162
2280
SELECT * FROM search_graph;
2163
2281
</programlisting>
@@ -2172,17 +2290,17 @@ SELECT * FROM search_graph;
2172
2290
compare fields <structfield>f1</structfield> and <structfield>f2</structfield>:
2173
2291
2174
2292
<programlisting>
2175
- WITH RECURSIVE search_graph(id, link, data, depth, is_cycle, path) AS (
2176
- SELECT g.id, g.link, g.data, 1 ,
2177
- false,
2178
- ARRAY[ROW(g.f1, g.f2)]
2293
+ WITH RECURSIVE search_graph(id, link, data, depth, <emphasis> is_cycle, path</emphasis> ) AS (
2294
+ SELECT g.id, g.link, g.data, 0 ,
2295
+ <emphasis> false,
2296
+ ARRAY[ROW(g.f1, g.f2)]</emphasis>
2179
2297
FROM graph g
2180
2298
UNION ALL
2181
2299
SELECT g.id, g.link, g.data, sg.depth + 1,
2182
- ROW(g.f1, g.f2) = ANY(path),
2183
- path || ROW(g.f1, g.f2)
2300
+ <emphasis> ROW(g.f1, g.f2) = ANY(path),
2301
+ path || ROW(g.f1, g.f2)</emphasis>
2184
2302
FROM graph g, search_graph sg
2185
- WHERE g.id = sg.link AND NOT is_cycle
2303
+ WHERE g.id = sg.link <emphasis> AND NOT is_cycle</emphasis>
2186
2304
)
2187
2305
SELECT * FROM search_graph;
2188
2306
</programlisting>
@@ -2198,10 +2316,8 @@ SELECT * FROM search_graph;
2198
2316
2199
2317
<tip>
2200
2318
<para>
2201
- The recursive query evaluation algorithm produces its output in
2202
- breadth-first search order. You can display the results in depth-first
2203
- search order by making the outer query <literal>ORDER BY</literal> a
2204
- <quote>path</quote> column constructed in this way.
2319
+ The cycle path column is computed in the same way as the depth-first
2320
+ ordering column show in the previous section.
2205
2321
</para>
2206
2322
</tip>
2207
2323
@@ -2217,7 +2333,7 @@ WITH RECURSIVE t(n) AS (
2217
2333
UNION ALL
2218
2334
SELECT n+1 FROM t
2219
2335
)
2220
- SELECT n FROM t LIMIT 100;
2336
+ SELECT n FROM t <emphasis> LIMIT 100</emphasis> ;
2221
2337
</programlisting>
2222
2338
2223
2339
This works because <productname>PostgreSQL</productname>'s implementation
@@ -2229,6 +2345,11 @@ SELECT n FROM t LIMIT 100;
2229
2345
outer query will usually try to fetch all of the <literal>WITH</literal> query's
2230
2346
output anyway.
2231
2347
</para>
2348
+ </sect3>
2349
+ </sect2>
2350
+
2351
+ <sect2>
2352
+ <title>Common Table Expression Materialization</title>
2232
2353
2233
2354
<para>
2234
2355
A useful property of <literal>WITH</literal> queries is that they are
0 commit comments