Commit f37ac61

Prevent mis-encoding of "trailing junk after numeric literal" errors.

Since commit 2549f06, we reject an identifier immediately following a numeric literal (without separating whitespace), because that risks ambiguity with hex/octal/binary integers. However, that patch used token patterns like "{integer}{ident_start}", which is problematic because {ident_start} matches only a single byte. If the first character after the integer is a multibyte character, this ends up with flex reporting an error message that includes a partial multibyte character. That can cause assorted bad-encoding problems downstream, both in the report to the client and in the postmaster log file.

To fix, use {identifier} not {ident_start} in the "junk" token patterns, so that they will match complete multibyte characters. This seems generally better user experience quite aside from the encoding problem: for "123abc" the error message will now say that the error appeared at or near "123abc" instead of "123a".

While at it, add some commentary about why these patterns exist and how they work.

Report and patch by Karina Litskevich; review by Pavel Borisov. Back-patch to v15 where the problem came in.

Discussion: https://postgr.es/m/CACiT8iZ_diop=0zJ7zuY3BXegJpkKK1Av-PU7xh0EDYHsa5+=g@mail.gmail.com

1 parent 777f50b commit f37ac61
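
The byte-level behavior described in the commit message can be reproduced outside flex. The following is a minimal sketch in Python, not the lexer itself: it assumes {integer} is simply a run of digits and that {ident_start} and {ident_cont} are the byte classes [A-Za-z\200-\377_] and [A-Za-z\200-\377_0-9\$]. The names junk_token and is_valid_utf8 are hypothetical helpers introduced only for this illustration.

```python
import re

# Simplified byte-level stand-ins for the flex definitions (assumptions,
# not copied from scan.l): flex matches bytes, so bytes patterns are used.
INTEGER = rb"[0-9]+"
IDENT_START = rb"[A-Za-z\x80-\xff_]"
IDENT_CONT = rb"[A-Za-z\x80-\xff_0-9$]"
IDENTIFIER = IDENT_START + IDENT_CONT + rb"*"

# Old pattern: {integer}{ident_start}; new pattern: {integer}{identifier}.
OLD_JUNK = re.compile(INTEGER + IDENT_START)
NEW_JUNK = re.compile(INTEGER + IDENTIFIER)


def junk_token(pattern, sql_text):
    """Return the bytes the 'junk' pattern would match at the start of
    sql_text (encoded as UTF-8), or None if it does not match."""
    match = pattern.match(sql_text.encode("utf-8"))
    return match.group(0) if match else None


def is_valid_utf8(data):
    """Report whether data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for text in ("123abc", "123ñ"):
        old = junk_token(OLD_JUNK, text)
        new = junk_token(NEW_JUNK, text)
        print(text, old, is_valid_utf8(old), new, is_valid_utf8(new))
```

Under these assumptions, the old pattern stops after the first byte of "ñ" (two bytes in UTF-8), so the matched token is not valid UTF-8, while the new pattern consumes the whole identifier and the token decodes cleanly.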

4 files changed: +15 -15 lines changed

src/backend/parser/scan.l

+4 -4
@@ -398,12 +398,12 @@ decimalfail {digit}+\.\.
 real            ({integer}|{decimal})[Ee][-+]?{digit}+
 realfail        ({integer}|{decimal})[Ee][-+]
 
-integer_junk    {integer}{ident_start}
-decimal_junk    {decimal}{ident_start}
-real_junk       {real}{ident_start}
+integer_junk    {integer}{identifier}
+decimal_junk    {decimal}{identifier}
+real_junk       {real}{identifier}
 
 param           \${integer}
-param_junk      \${integer}{ident_start}
+param_junk      \${integer}{identifier}
 
 other           .

src/fe_utils/psqlscan.l

+4 -4
@@ -336,12 +336,12 @@ decimalfail {digit}+\.\.
 real            ({integer}|{decimal})[Ee][-+]?{digit}+
 realfail        ({integer}|{decimal})[Ee][-+]
 
-integer_junk    {integer}{ident_start}
-decimal_junk    {decimal}{ident_start}
-real_junk       {real}{ident_start}
+integer_junk    {integer}{identifier}
+decimal_junk    {decimal}{identifier}
+real_junk       {real}{identifier}
 
 param           \${integer}
-param_junk      \${integer}{ident_start}
+param_junk      \${integer}{identifier}
 
 /* psql-specific: characters allowed in variable names */
 variable_char   [A-Za-z\200-\377_0-9]

src/interfaces/ecpg/preproc/pgc.l

+4 -4
@@ -369,12 +369,12 @@ decimalfail {digit}+\.\.
 real            ({integer}|{decimal})[Ee][-+]?{digit}+
 realfail        ({integer}|{decimal})[Ee][-+]
 
-integer_junk    {integer}{ident_start}
-decimal_junk    {decimal}{ident_start}
-real_junk       {real}{ident_start}
+integer_junk    {integer}{identifier}
+decimal_junk    {decimal}{identifier}
+real_junk       {real}{identifier}
 
 param           \${integer}
-param_junk      \${integer}{ident_start}
+param_junk      \${integer}{identifier}
 
 /* special characters for other dbms */
 /* we have to react differently in compat mode */

src/test/regress/expected/numerology.out

+3 -3
@@ -6,15 +6,15 @@
 -- Trailing junk in numeric literals
 --
 SELECT 123abc;
-ERROR: trailing junk after numeric literal at or near "123a"
+ERROR: trailing junk after numeric literal at or near "123abc"
 LINE 1: SELECT 123abc;
                ^
 SELECT 0x0o;
-ERROR: trailing junk after numeric literal at or near "0x"
+ERROR: trailing junk after numeric literal at or near "0x0o"
 LINE 1: SELECT 0x0o;
                ^
 SELECT 1_2_3;
-ERROR: trailing junk after numeric literal at or near "1_"
+ERROR: trailing junk after numeric literal at or near "1_2_3"
 LINE 1: SELECT 1_2_3;
                ^
 SELECT 0.a;
