Commit 3963574

XPath fixes:
- Function renamed to "xpath".
- Function is now strict, per discussion.
- Return empty array in case when XPath expression detects nothing
  (previously, NULL was returned in such case), per discussion.
- (bugfix) Work with fragments with prologue:
  select xpath('/a', '<?xml version="1.0"?><a /><b />');
  // now XML datum is always wrapped with dummy <x>...</x>, XML prologue
  simply goes away (if any).
- Some cleanup.

Nikolay Samokhvalov

Some code cleanup and documentation work by myself.
1 parent 0c644d2 commit 3963574

9 files changed

+238
-205
lines changed

doc/src/sgml/datatype.sgml

+43-3
Original file line number | Diff line number | Diff line change
@@ -1,4 +1,4 @@
1-
<!-- $PostgreSQL: pgsql/doc/src/sgml/datatype.sgml,v 1.200 2007/05/08 17:02:59 tgl Exp $ -->
1+
<!-- $PostgreSQL: pgsql/doc/src/sgml/datatype.sgml,v 1.201 2007/05/21 17:10:28 petere Exp $ -->
22

33
<chapter id="datatype">
44
<title id="datatype-title">Data Types</title>
@@ -3213,7 +3213,7 @@ SELECT * FROM test;
32133213
<sect1 id="datatype-uuid">
32143214
<title><acronym>UUID</acronym> Type</title>
32153215

3216-
<indexterm zone="datatype-xml">
3216+
<indexterm zone="datatype-uuid">
32173217
<primary>UUID</primary>
32183218
</indexterm>
32193219

@@ -3289,6 +3289,8 @@ a0eebc999c0b4ef8bb6d6bb9bd380a11
32893289
value is a full document or only a content fragment.
32903290
</para>
32913291

3292+
<sect2>
3293+
<title>Creating XML Values</title>
32923294
<para>
32933295
To produce a value of type <type>xml</type> from character data,
32943296
use the function
@@ -3299,7 +3301,7 @@ XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>)
32993301
Examples:
33003302
<programlisting><![CDATA[
33013303
XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter><book>')
3302-
XMLPARSE (CONTENT 'abc<foo>bar</bar><bar>foo</foo>')
3304+
XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')
33033305
]]></programlisting>
33043306
While this is the only way to convert character strings into XML
33053307
values according to the SQL standard, the PostgreSQL-specific
@@ -3351,7 +3353,10 @@ SET xmloption TO { DOCUMENT | CONTENT };
33513353
The default is <literal>CONTENT</literal>, so all forms of XML
33523354
data are allowed.
33533355
</para>
3356+
</sect2>
33543357

3358+
<sect2>
3359+
<title>Encoding Handling</title>
33553360
<para>
33563361
Care must be taken when dealing with multiple character encodings
33573362
on the client, server, and in the XML data passed through them.
@@ -3398,6 +3403,41 @@ SET xmloption TO { DOCUMENT | CONTENT };
33983403
processed in UTF-8, computations will be most efficient if the
33993404
server encoding is also UTF-8.
34003405
</para>
3406+
</sect2>
3407+
3408+
<sect2>
3409+
<title>Accessing XML Values</title>
3410+
3411+
<para>
3412+
The <type>xml</type> data type is unusual in that it does not
3413+
provide any comparison operators. This is because there is no
3414+
well-defined and universally useful comparison algorithm for XML
3415+
data. One consequence of this is that you cannot retrieve rows by
3416+
comparing an <type>xml</type> column against a search value. XML
3417+
values should therefore typically be accompanied by a separate key
3418+
field such as an ID. An alternative solution for comparing XML
3419+
values is to convert them to character strings first, but note
3420+
that character string comparison has little to do with a useful
3421+
XML comparison method.
3422+
</para>
3423+
3424+
<para>
3425+
Since there are no comparison operators for the <type>xml</type>
3426+
data type, it is not possible to create an index directly on a
3427+
column of this type. If speedy searches in XML data are desired,
3428+
possible workarounds would be casting the expression to a
3429+
character string type and indexing that, or indexing an XPath
3430+
expression. The actual query would of course have to be adjusted
3431+
to search by the indexed expression.
3432+
</para>
3433+
3434+
<para>
3435+
The full-text search module Tsearch2 could also be used to speed
3436+
up full-document searches in XML data. The necessary
3437+
preprocessing support is, however, not available in the PostgreSQL
3438+
distribution in this release.
3439+
</para>
3440+
</sect2>
34013441
</sect1>
34023442

34033443
&array;
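
The indexing workarounds described in the new "Accessing XML Values" text above can be sketched roughly as follows. This is an illustration only, not part of the commit: the table docs, its columns id and body, the index names, and the XPath expression are all hypothetical, and it assumes a server built with libxml on which xpath and the xml-to-text cast are accepted in index expressions.

```sql
-- Hypothetical table; the names docs, id, and body are illustrative only.
CREATE TABLE docs (
    id   integer PRIMARY KEY,  -- separate key field, as the text recommends
    body xml
);

-- Workaround 1: index the character-string form of the XML value.
-- (Assumption: values are small enough to fit in a B-tree index entry.)
CREATE INDEX docs_body_text_idx ON docs ((body::text));

-- Workaround 2: index a single value extracted by an XPath expression.
CREATE INDEX docs_title_idx
    ON docs (((xpath('/book/title/text()', body))[1]::text));

-- The query has to repeat the indexed expression so that the planner
-- can consider the index.
SELECT id
FROM docs
WHERE (xpath('/book/title/text()', body))[1]::text = 'Manual';
```

Whether the planner actually uses either index depends on the data and statistics; the point is only that the search condition is written in terms of the indexed expression rather than the xml column itself.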

doc/src/sgml/func.sgml

+47-71
Original file line number | Diff line number | Diff line change
@@ -1,4 +1,4 @@
1-
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.379 2007/05/07 07:53:26 petere Exp $ -->
1+
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.380 2007/05/21 17:10:28 petere Exp $ -->
22

33
<chapter id="functions">
44
<title>Functions and Operators</title>
@@ -7512,7 +7512,7 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
75127512
type. The function-like expressions <function>xmlparse</function>
75137513
and <function>xmlserialize</function> for converting to and from
75147514
type <type>xml</type> are not repeated here. Use of many of these
7515-
<type>xml</type> functions requires the installation to have been built
7515+
functions requires the installation to have been built
75167516
with <command>configure --with-libxml</>.
75177517
</para>
75187518

@@ -7848,6 +7848,51 @@ SELECT xmlroot(xmlparse(document '<?xml version="1.1"?><content>abc</content>'),
78487848
</sect3>
78497849
</sect2>
78507850

7851+
<sect2 id="functions-xml-processing">
7852+
<title>Processing XML</title>
7853+
7854+
<indexterm>
7855+
<primary>XPath</primary>
7856+
</indexterm>
7857+
7858+
<para>
7859+
To process values of data type <type>xml</type>, PostgreSQL offers
7860+
the function <function>xpath</function>, which evaluates XPath 1.0
7861+
expressions.
7862+
</para>
7863+
7864+
<synopsis>
7865+
<function>xpath</function>(<replaceable>xpath</replaceable>, <replaceable>xml</replaceable><optional>, <replaceable>nsarray</replaceable></optional>)
7866+
</synopsis>
7867+
7868+
<para>
7869+
The function <function>xpath</function> evaluates the XPath
7870+
expression <replaceable>xpath</replaceable> against the XML value
7871+
<replaceable>xml</replaceable>. It returns an array of XML values
7872+
corresponding to the node set produced by the XPath expression.
7873+
</para>
7874+
7875+
<para>
7876+
The third argument of the function is an array of namespace
7877+
mappings. This array should be a two-dimensional array with the
7878+
length of the second axis being equal to 2 (i.e., it should be an
7879+
array of arrays, each of which consists of exactly 2 elements).
7880+
The first element of each array entry is the namespace name, the
7881+
second the namespace URI.
7882+
</para>
7883+
7884+
<para>
7885+
Example:
7886+
<screen><![CDATA[
7887+
SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>', ARRAY[ARRAY['my', 'http://example.com']]);
7888+
xpath
7889+
--------
7890+
{test}
7891+
(1 row)
7892+
]]></screen>
7893+
</para>
7894+
</sect2>
7895+
78517896
<sect2 id="functions-xml-mapping">
78527897
<title>Mapping Tables to XML</title>
78537898

@@ -8097,75 +8142,6 @@ table2-mapping
80978142
]]></programlisting>
80988143
</figure>
80998144
</sect2>
8100-
8101-
<sect2>
8102-
<title>Processing XML</title>
8103-
8104-
<para>
8105-
<acronym>XML</> support is not just the existence of an
8106-
<type>xml</type> data type, but a variety of features supported by
8107-
a database system. These capabilities include import/export,
8108-
indexing, searching, transforming, and <acronym>XML</> to
8109-
<acronym>SQL</> mapping. <productname>PostgreSQL</> supports some
8110-
but not all of these <acronym>XML</> capabilities. For an
8111-
overview of <acronym>XML</> use in databases, see <ulink
8112-
url="http://www.rpbourret.com/xml/XMLAndDatabases.htm"></>.
8113-
</para>
8114-
8115-
<variablelist>
8116-
<varlistentry>
8117-
<term>Indexing</term>
8118-
<listitem>
8119-
8120-
<para>
8121-
<filename>contrib/xml2/</> functions can be used in expression
8122-
indexes to index specific <acronym>XML</> fields. To index the
8123-
full contents of <acronym>XML</> documents, the full-text
8124-
indexing tool <filename>contrib/tsearch2/</> can be used. Of
8125-
course, Tsearch2 indexes have no <acronym>XML</> awareness so
8126-
additional <filename>contrib/xml2/</> checks should be added to
8127-
queries.
8128-
</para>
8129-
</listitem>
8130-
</varlistentry>
8131-
8132-
<varlistentry>
8133-
<term>Searching</term>
8134-
<listitem>
8135-
8136-
<para>
8137-
XPath searches are implemented using <filename>contrib/xml2/</>.
8138-
It processes <acronym>XML</> text documents and returns results
8139-
based on the requested query.
8140-
</para>
8141-
</listitem>
8142-
</varlistentry>
8143-
8144-
<varlistentry>
8145-
<term>Transforming</term>
8146-
<listitem>
8147-
8148-
<para>
8149-
<filename>contrib/xml2/</> supports <acronym>XSLT</> (Extensible
8150-
Stylesheet Language Transformation).
8151-
</para>
8152-
</listitem>
8153-
</varlistentry>
8154-
8155-
<varlistentry>
8156-
<term>XML to SQL Mapping</term>
8157-
<listitem>
8158-
8159-
<para>
8160-
This involves converting <acronym>XML</> data to and from
8161-
relational structures. <productname>PostgreSQL</> has no
8162-
internal support for such mapping, and relies on external tools
8163-
to do such conversions.
8164-
</para>
8165-
</listitem>
8166-
</varlistentry>
8167-
</variablelist>
8168-
</sect2>
81698145
</sect1>
81708146

81718147
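
The behavior changes listed in the commit message could be exercised with queries along these lines. This is a hedged sketch rather than a test from the commit: the sample XML literals other than the one quoted in the commit message are made up, and results are deliberately not shown because they depend on the server version.

```sql
-- No match: per the commit message, an empty array is returned
-- (previously NULL).
SELECT xpath('/nosuch', '<a>1</a>');

-- Strict function: a NULL argument yields NULL.
SELECT xpath(NULL, '<a>1</a>') IS NULL;

-- Fragment with a prologue, the bugfix case quoted in the commit message.
SELECT xpath('/a', '<?xml version="1.0"?><a /><b />');
```

Later PostgreSQL releases may treat content fragments differently, so the third query in particular should be read as describing the behavior at the time of this commit.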
