Skip to content

Commit 756a4ed

Browse files
committed
Add simple script to check for right recursion in Bison grammars.
We should generally use left-recursion not right-recursion to parse lists. Bison hasn't got any built-in way to check for this type of inefficiency, and I didn't find anything on the net in a quick search, so I wrote a little Perl script to do it. Add to src/tools/ so we don't have to re-invent this wheel next time we wonder if we're doing anything stupid. Currently, the only place that seems to need fixing is plpgsql's stmt_else production, so the problem doesn't appear to be common enough to warrant trying to include such a test in our standard build process. If we did want to do that, we'd need a way to ignore some false positives, such as a_expr := '-' a_expr
1 parent bf82013 commit 756a4ed

File tree

1 file changed

+74
-0
lines changed

1 file changed

+74
-0
lines changed

src/tools/check_bison_recursion.pl

Lines changed: 74 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,74 @@
1+
#! /usr/bin/perl
2+
3+
#################################################################
4+
#
5+
# check_bison_recursion.pl -- check for right recursion in Bison grammars
6+
#
7+
# The standard way to parse list constructs in Bison grammars is via left
8+
# recursion, wherein a nonterminal symbol has itself as the first symbol
9+
# in one of its expansion rules. It is also possible to parse a list via
10+
# right recursion, wherein a nonterminal symbol has itself as the last
11+
# symbol of an expansion; but that's a bad way to write it because a long
12+
# enough list will result in parser stack overflow. Since Bison doesn't
13+
# have any built-in way to warn about use of right recursion, we use this
14+
# script when we want to check for the problem.
15+
#
16+
# To use: run bison with the -v switch, then feed the produced y.output
17+
# file to this script.
18+
#
19+
# Copyright (c) 2011, PostgreSQL Global Development Group
20+
#
21+
# src/tools/check_bison_recursion.pl
22+
#################################################################
23+
24+
use strict;
25+
use warnings;
26+
27+
my $debug = 0;
28+
29+
# must retain this across input lines
30+
my $cur_nonterminal;
31+
32+
# We parse the input and emit warnings on the fly.
33+
my $in_grammar = 0;
34+
35+
while (<>) {
36+
my $rule_number;
37+
my $rhs;
38+
39+
# We only care about the "Grammar" part of the input.
40+
if (m/^Grammar$/) {
41+
$in_grammar = 1;
42+
} elsif (m/^Terminal/) {
43+
$in_grammar = 0;
44+
} elsif ($in_grammar) {
45+
if (m/^\s*(\d+)\s+(\S+):\s+(.*)$/) {
46+
# first rule for nonterminal
47+
$rule_number = $1;
48+
$cur_nonterminal = $2;
49+
$rhs = $3;
50+
} elsif (m/^\s*(\d+)\s+\|\s+(.*)$/) {
51+
# additional rule for nonterminal
52+
$rule_number = $1;
53+
$rhs = $2;
54+
}
55+
}
56+
57+
# Process rule if we found one
58+
if (defined $rule_number) {
59+
# deconstruct the RHS
60+
$rhs =~ s|^/\* empty \*/$||;
61+
my @rhs = split '\s', $rhs;
62+
print "Rule $rule_number: $cur_nonterminal := @rhs\n" if $debug;
63+
# We complain if the nonterminal appears as the last RHS element
64+
# but not elsewhere, since "expr := expr + expr" is reasonable
65+
my $lastrhs = pop @rhs;
66+
if (defined $lastrhs &&
67+
$cur_nonterminal eq $lastrhs &&
68+
!grep { $cur_nonterminal eq $_ } @rhs) {
69+
print "Right recursion in rule $rule_number: $cur_nonterminal := $rhs\n";
70+
}
71+
}
72+
}
73+
74+
exit 0;

0 commit comments

Comments
 (0)