@@ -4682,10 +4682,37 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
  The quantifiers <literal>{1,1}</> and <literal>{1,1}?</>
  can be used to force greediness or non-greediness, respectively,
  on a subexpression or a whole RE.
+ This is useful when you need the whole RE to have a greediness attribute
+ different from what's deduced from its elements. As an example,
+ suppose that we are trying to separate a string containing some digits
+ into the digits and the parts before and after them. We might try to
+ do that like this:
+ <screen>
+ SELECT regexp_matches('abc01234xyz', '(.*)(\d+)(.*)');
+ <lineannotation>Result: </lineannotation><computeroutput>{abc0123,4,xyz}</computeroutput>
+ </screen>
+ That didn't work: the first <literal>.*</> is greedy so
+ it <quote>eats</> as much as it can, leaving the <literal>\d+</> to
+ match at the last possible place, the last digit. We might try to fix
+ that by making it non-greedy:
+ <screen>
+ SELECT regexp_matches('abc01234xyz', '(.*?)(\d+)(.*)');
+ <lineannotation>Result: </lineannotation><computeroutput>{abc,0,""}</computeroutput>
+ </screen>
+ That didn't work either, because now the RE as a whole is non-greedy
+ and so it ends the overall match as soon as possible. We can get what
+ we want by forcing the RE as a whole to be greedy:
+ <screen>
+ SELECT regexp_matches('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
+ <lineannotation>Result: </lineannotation><computeroutput>{abc,01234,xyz}</computeroutput>
+ </screen>
+ Controlling the RE's overall greediness separately from its components'
+ greediness allows great flexibility in handling variable-length patterns.
  </para>

  <para>
- Match lengths are measured in characters, not collating elements.
+ When deciding what is a longer or shorter match,
+ match lengths are measured in characters, not collating elements.
  An empty string is considered longer than no match at all.
  For example:
  <literal>bb*</>