Commit a44af6d

Document that the regexp split functions ignore zero-length matches in
certain corner cases. Per discussion, the code does what we want, but it really needs to be documented that these functions act differently from regexp_matches.
1 parent b70d4a6 commit a44af6d

1 file changed: +14 -5 lines changed

doc/src/sgml/func.sgml

Lines changed: 14 additions & 5 deletions
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.384 2007/08/11 03:56:24 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.385 2007/08/13 01:18:47 tgl Exp $ -->
 
 <chapter id="functions">
 <title>Functions and Operators</title>
@@ -3383,17 +3383,26 @@ SELECT foo FROM regexp_split_to_table('the quick brown fox', E'\\s*') AS foo;
 </para>
 
 <para>
-<productname>PostgreSQL</productname>'s regular expressions are implemented
-using a package written by Henry Spencer. Much of
-the description of regular expressions below is copied verbatim from his
-manual entry.
+As the last example demonstrates, the regexp split functions ignore
+zero-length matches that occur at the start or end of the string
+or immediately after a previous match. This is contrary to the strict
+definition of regexp matching that is implemented by
+<function>regexp_matches</>, but is usually the most convenient behavior
+in practice. Other software systems such as Perl use similar definitions.
 </para>
 
 <!-- derived from the re_syntax.n man page -->
 
 <sect3 id="posix-syntax-details">
 <title>Regular Expression Details</title>
 
+<para>
+<productname>PostgreSQL</productname>'s regular expressions are implemented
+using a package written by Henry Spencer. Much of
+the description of regular expressions below is copied verbatim from his
+manual entry.
+</para>
+
 <para>
 Regular expressions (<acronym>RE</acronym>s), as defined in
 <acronym>POSIX</acronym> 1003.2, come in two forms:
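
The rule added by this commit (zero-length matches are ignored at the start or end of the string, or immediately after a previous match) can be modeled in a few lines. The following is only a rough sketch of the documented behavior, not PostgreSQL's implementation: it uses Python's `re` module as a stand-in for the Henry Spencer regex package, and the function name `split_ignoring_degenerate` is hypothetical.

```python
import re


def split_ignoring_degenerate(text, pattern):
    """Split text on pattern, skipping zero-length matches at the start or
    end of the string or immediately after a previous match.

    A model of the rule described in the documentation change above;
    Python's regex engine is an assumption standing in for PostgreSQL's.
    """
    pieces = []
    piece_start = 0   # where the piece currently being collected begins
    prev_end = None   # end offset of the previous match, if any

    for m in re.finditer(pattern, text):
        start, end = m.span()
        if start == end:
            at_edge = start == 0 or start == len(text)
            after_prev = prev_end is not None and start == prev_end
            if at_edge or after_prev:
                prev_end = end
                continue  # degenerate match: does not act as a separator
        pieces.append(text[piece_start:start])
        piece_start = end
        prev_end = end

    pieces.append(text[piece_start:])  # trailing piece after the last separator
    return pieces


if __name__ == "__main__":
    # Mirrors the example in the hunk header: one piece per letter.
    print(split_ignoring_degenerate("the quick brown fox", r"\s*"))
    # A pattern that cannot match the empty string is unaffected by the rule.
    print(split_ignoring_degenerate("the quick brown fox", r"\s+"))
```

With `\s*`, the empty match at offset 0, the empty match at the end of the string, and the empty match directly after each run of whitespace are all skipped, so no empty pieces are produced around the spaces.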