Commit 77ede89

Create a GUC variable REGEX_FLAVOR to control the type of regular expression accepted by the regex operators, per discussion yesterday.

Along the way, reduce deadlock_timeout from PGC_POSTMASTER to PGC_SIGHUP category. It is probably best to insist that all backends share the same setting, but that doesn't mean it has to be frozen at startup.
1 parent 465ed56 commit 77ede89

7 files changed, with 114 additions and 53 deletions

doc/src/sgml/func.sgml

Lines changed: 33 additions & 15 deletions
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/func.sgml,v 1.137 2003/02/05 17:41:32 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/func.sgml,v 1.138 2003/02/06 20:25:31 tgl Exp $
 PostgreSQL documentation
 -->

@@ -2665,10 +2665,24 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
 due to their availability in programming languages such as Perl and Tcl.
 <acronym>RE</acronym>s using these non-POSIX extensions are called
 <firstterm>advanced</> <acronym>RE</acronym>s or <acronym>ARE</>s
-in this documentation. We first describe the ERE/ARE flavor and then
-mention the restrictions of the BRE form.
+in this documentation. AREs are almost an exact superset of EREs,
+but BREs have several notational incompatibilities (as well as being
+much more limited).
+We first describe the ARE and ERE forms, noting features that apply
+only to AREs, and then describe how BREs differ.
 </para>
 
+<note>
+<para>
+The form of regular expressions accepted by <productname>PostgreSQL</>
+can be chosen by setting the <varname>REGEX_FLAVOR</> run-time parameter
+(described in the &cite-admin;). The usual setting is
+<literal>advanced</>, but one might choose <literal>extended</> for
+maximum backwards compatibility with pre-7.4 releases of
+<productname>PostgreSQL</>.
+</para>
+</note>
+
 <para>
 A regular expression is defined as one or more
 <firstterm>branches</firstterm>, separated by
@@ -2784,7 +2798,7 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
 meaning in <productname>PostgreSQL</> string literals.
 To write a pattern constant that contains a backslash,
 you must write two backslashes in the query.
-</para>
+</para>
 </note>
 
 <table id="posix-quantifiers-table">
@@ -3392,11 +3406,11 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
 </para>
 
 <para>
-Normally the flavor of RE being used is specified by
-application-dependent means.
-However, this can be overridden by a <firstterm>director</>.
+Normally the flavor of RE being used is determined by
+<varname>REGEX_FLAVOR</>.
+However, this can be overridden by a <firstterm>director</> prefix.
 If an RE of any flavor begins with <literal>***:</>,
-the rest of the RE is an ARE.
+the rest of the RE is taken as an ARE.
 If an RE of any flavor begins with <literal>***=</>,
 the rest of the RE is taken to be a literal string,
 with all characters considered ordinary characters.
@@ -3407,8 +3421,8 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
 a sequence <literal>(?</><replaceable>xyz</><literal>)</>
 (where <replaceable>xyz</> is one or more alphabetic characters)
 specifies options affecting the rest of the RE.
-These supplement, and can override,
-any options specified externally.
+These options override any previously determined options (including
+both the RE flavor and case sensitivity).
 The available option letters are
 shown in <xref linkend="posix-embedded-options-table">.
 </para>
@@ -3432,7 +3446,7 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
 
 <row>
 <entry> <literal>c</> </entry>
-<entry> case-sensitive matching (usual default) </entry>
+<entry> case-sensitive matching (overrides operator type) </entry>
 </row>
 
 <row>
@@ -3443,7 +3457,7 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
 <row>
 <entry> <literal>i</> </entry>
 <entry> case-insensitive matching (see
-<xref linkend="posix-matching-rules">) </entry>
+<xref linkend="posix-matching-rules">) (overrides operator type) </entry>
 </row>
 
 <row>
@@ -3471,12 +3485,12 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
 
 <row>
 <entry> <literal>s</> </entry>
-<entry> non-newline-sensitive matching (usual default) </entry>
+<entry> non-newline-sensitive matching (default) </entry>
 </row>
 
 <row>
 <entry> <literal>t</> </entry>
-<entry> tight syntax (usual default; see below) </entry>
+<entry> tight syntax (default; see below) </entry>
 </row>
 
 <row>
@@ -3696,7 +3710,7 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
 </para>
 
 <para>
-Two significant incompatibilites exist between AREs and the ERE syntax
+Two significant incompatibilities exist between AREs and the ERE syntax
 recognized by pre-7.4 releases of <productname>PostgreSQL</>:
 
 <itemizedlist>
@@ -3717,6 +3731,10 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
 </para>
 </listitem>
 </itemizedlist>
+
+While these differences are unlikely to create a problem for most
+applications, you can avoid them if necessary by
+setting <varname>REGEX_FLAVOR</> to <literal>extended</>.
 </para>
 </sect3>

doc/src/sgml/runtime.sgml

Lines changed: 16 additions & 3 deletions
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.167 2003/01/25 23:10:27 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.168 2003/02/06 20:25:31 tgl Exp $
 -->
 
 <Chapter Id="runtime">
@@ -1447,8 +1447,7 @@ env PGOPTIONS='-c geqo=off' psql
 practice. On a heavily loaded server you might want to raise it.
 Ideally the setting should exceed your typical transaction time,
 so as to improve the odds that the lock will be released before
-the waiter decides to check for deadlock. This option can only
-be set at server start.
+the waiter decides to check for deadlock.
 </para>
 </listitem>
 </varlistentry>
@@ -1781,6 +1780,20 @@ dynamic_library_path = '/usr/local/lib/postgresql:/home/my_project/lib:$libdir'
 </listitem>
 </varlistentry>
 
+<varlistentry>
+<term><varname>REGEX_FLAVOR</varname> (<type>string</type>)</term>
+<indexterm><primary>regular expressions</></>
+<listitem>
+<para>
+The regular expression <quote>flavor</> can be set to
+<literal>advanced</>, <literal>extended</>, or <literal>basic</>.
+The usual default is <literal>advanced</>. The <literal>extended</>
+setting may be useful for exact backwards compatibility with
+pre-7.4 releases of <productname>PostgreSQL</>.
+</para>
+</listitem>
+</varlistentry>
+
 <varlistentry>
 <term><varname>SEARCH_PATH</varname> (<type>string</type>)</term>
 <indexterm><primary>search_path</></>

src/backend/utils/adt/regexp.c

Lines changed: 42 additions & 10 deletions
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- *    $Header: /cvsroot/pgsql/src/backend/utils/adt/regexp.c,v 1.44 2003/02/05 17:41:32 tgl Exp $
+ *    $Header: /cvsroot/pgsql/src/backend/utils/adt/regexp.c,v 1.45 2003/02/06 20:25:33 tgl Exp $
  *
  * Alistair Crooks added the code for the regex caching
  * agc - cached the regular expressions used - there's a good chance
@@ -34,6 +34,10 @@
 #include "utils/builtins.h"
 
 
+/* GUC-settable flavor parameter */
+static int regex_flavor = REG_ADVANCED;
+
+
 /*
  * We cache precompiled regular expressions using a "self organizing list"
  * structure, in which recently-used items tend to be near the front.
@@ -216,6 +220,34 @@ RE_compile_and_execute(text *text_re, unsigned char *dat, int dat_len,
 }
 
 
+/*
+ * assign_regex_flavor - GUC hook to validate and set REGEX_FLAVOR
+ */
+const char *
+assign_regex_flavor(const char *value,
+                    bool doit, bool interactive)
+{
+    if (strcasecmp(value, "advanced") == 0)
+    {
+        if (doit)
+            regex_flavor = REG_ADVANCED;
+    }
+    else if (strcasecmp(value, "extended") == 0)
+    {
+        if (doit)
+            regex_flavor = REG_EXTENDED;
+    }
+    else if (strcasecmp(value, "basic") == 0)
+    {
+        if (doit)
+            regex_flavor = REG_BASIC;
+    }
+    else
+        return NULL;    /* fail */
+    return value;       /* OK */
+}
+
+
 /*
  * interface routines called by the function manager
  */
@@ -229,7 +261,7 @@ nameregexeq(PG_FUNCTION_ARGS)
     PG_RETURN_BOOL(RE_compile_and_execute(p,
             (unsigned char *) NameStr(*n),
             strlen(NameStr(*n)),
-            REG_ADVANCED,
+            regex_flavor,
             0, NULL));
 }

@@ -242,7 +274,7 @@ nameregexne(PG_FUNCTION_ARGS)
     PG_RETURN_BOOL(!RE_compile_and_execute(p,
             (unsigned char *) NameStr(*n),
             strlen(NameStr(*n)),
-            REG_ADVANCED,
+            regex_flavor,
             0, NULL));
 }

@@ -255,7 +287,7 @@ textregexeq(PG_FUNCTION_ARGS)
     PG_RETURN_BOOL(RE_compile_and_execute(p,
             (unsigned char *) VARDATA(s),
             VARSIZE(s) - VARHDRSZ,
-            REG_ADVANCED,
+            regex_flavor,
             0, NULL));
 }

@@ -268,7 +300,7 @@ textregexne(PG_FUNCTION_ARGS)
     PG_RETURN_BOOL(!RE_compile_and_execute(p,
             (unsigned char *) VARDATA(s),
             VARSIZE(s) - VARHDRSZ,
-            REG_ADVANCED,
+            regex_flavor,
             0, NULL));
 }

@@ -288,7 +320,7 @@ nameicregexeq(PG_FUNCTION_ARGS)
     PG_RETURN_BOOL(RE_compile_and_execute(p,
             (unsigned char *) NameStr(*n),
             strlen(NameStr(*n)),
-            REG_ICASE | REG_ADVANCED,
+            regex_flavor | REG_ICASE,
             0, NULL));
 }

@@ -301,7 +333,7 @@ nameicregexne(PG_FUNCTION_ARGS)
     PG_RETURN_BOOL(!RE_compile_and_execute(p,
             (unsigned char *) NameStr(*n),
             strlen(NameStr(*n)),
-            REG_ICASE | REG_ADVANCED,
+            regex_flavor | REG_ICASE,
             0, NULL));
 }

@@ -314,7 +346,7 @@ texticregexeq(PG_FUNCTION_ARGS)
     PG_RETURN_BOOL(RE_compile_and_execute(p,
             (unsigned char *) VARDATA(s),
             VARSIZE(s) - VARHDRSZ,
-            REG_ICASE | REG_ADVANCED,
+            regex_flavor | REG_ICASE,
             0, NULL));
 }

@@ -327,7 +359,7 @@ texticregexne(PG_FUNCTION_ARGS)
     PG_RETURN_BOOL(!RE_compile_and_execute(p,
             (unsigned char *) VARDATA(s),
             VARSIZE(s) - VARHDRSZ,
-            REG_ICASE | REG_ADVANCED,
+            regex_flavor | REG_ICASE,
             0, NULL));
 }

@@ -353,7 +385,7 @@ textregexsubstr(PG_FUNCTION_ARGS)
     match = RE_compile_and_execute(p,
             (unsigned char *) VARDATA(s),
             VARSIZE(s) - VARHDRSZ,
-            REG_ADVANCED,
+            regex_flavor,
             2, pmatch);
 
     /* match? then return the substring matching the pattern */
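
The validate-then-assign pattern used by `assign_regex_flavor` in the regexp.c changes above can be tried outside the backend. What follows is a minimal standalone sketch, not the committed code. Every `demo_` name and every numeric flag value is a placeholder invented for illustration (the real `REG_*` constants come from the backend's regex header), and the hook's `interactive` argument is left out. It illustrates two properties visible in the diff: the name is matched case-insensitively, and the variable is written only when `doit` is true, so a validation-only call leaves the setting untouched.

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>            /* strcasecmp (POSIX) */

/* Placeholder flag values, invented for this sketch only. */
enum
{
    DEMO_REG_BASIC = 0,
    DEMO_REG_EXTENDED = 1,
    DEMO_REG_ADVANCED = 3,
    DEMO_REG_ICASE = 8
};

/* Stand-in for the session setting; starts out as the advanced flavor. */
static int demo_regex_flavor = DEMO_REG_ADVANCED;

/*
 * Validate a proposed flavor name.  Returns the input string on success
 * and NULL on failure.  The setting is changed only when doit is true.
 */
static const char *
demo_assign_regex_flavor(const char *value, bool doit)
{
    int flavor;

    if (strcasecmp(value, "advanced") == 0)
        flavor = DEMO_REG_ADVANCED;
    else if (strcasecmp(value, "extended") == 0)
        flavor = DEMO_REG_EXTENDED;
    else if (strcasecmp(value, "basic") == 0)
        flavor = DEMO_REG_BASIC;
    else
        return NULL;            /* unrecognized name: reject */

    if (doit)
        demo_regex_flavor = flavor;
    return value;
}

/* Flags an operator would pass: the current flavor, plus ICASE if requested. */
static int
demo_cflags(bool case_insensitive)
{
    return demo_regex_flavor | (case_insensitive ? DEMO_REG_ICASE : 0);
}
```

`demo_cflags` mirrors how the operators in the diff read the variable at call time and OR in `REG_ICASE` for the case-insensitive variants, instead of hard-coding `REG_ADVANCED`.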

src/backend/utils/misc/guc.c

Lines changed: 8 additions & 2 deletions
@@ -5,7 +5,7 @@
  * command, configuration file, and command line options.
  * See src/backend/utils/misc/README for more information.
  *
- * $Header: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v 1.113 2003/01/28 18:04:02 tgl Exp $
+ * $Header: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v 1.114 2003/02/06 20:25:33 tgl Exp $
  *
  * Copyright 2000 by PostgreSQL Global Development Group
  * Written by Peter Eisentraut <peter_e@gmx.net>.
@@ -127,6 +127,7 @@ static double phony_random_seed;
 static char *client_encoding_string;
 static char *datestyle_string;
 static char *default_iso_level_string;
+static char *regex_flavor_string;
 static char *server_encoding_string;
 static char *session_authorization_string;
 static char *timezone_string;
@@ -568,7 +569,7 @@ static struct config_int
     },
 
     {
-        {"deadlock_timeout", PGC_POSTMASTER}, &DeadlockTimeout,
+        {"deadlock_timeout", PGC_SIGHUP}, &DeadlockTimeout,
         1000, 0, INT_MAX, NULL, NULL
     },

@@ -818,6 +819,11 @@ static struct config_string
         "C", locale_time_assign, NULL
     },
 
+    {
+        {"regex_flavor", PGC_USERSET}, &regex_flavor_string,
+        "advanced", assign_regex_flavor, NULL
+    },
+
     {
         {"search_path", PGC_USERSET, GUC_LIST_INPUT | GUC_LIST_QUOTE},
         &namespace_search_path,

src/backend/utils/misc/postgresql.conf.sample

Lines changed: 1 addition & 0 deletions
@@ -208,6 +208,7 @@
 #max_expr_depth = 10000 # min 10
 #max_files_per_process = 1000 # min 25
 #password_encryption = true
+#regex_flavor = advanced # advanced, extended, or basic
 #sql_inheritance = true
 #transform_null_equals = false
 #statement_timeout = 0 # 0 is disabled, in milliseconds
