
Commit 9a72299

Remove race condition in 022_crash_temp_files.pl test.
It's possible for the query that "waits for restart" to complete a successful iteration before the postmaster has noticed its SIGKILL'd child and begun the restart cycle. (This is a bit hard to believe perhaps, but it's been seen at least twice in the buildfarm, mainly on ancient platforms that likely have quirky schedulers.)

To provide a more secure interlock, wait for the other session we're using to report that it's been forcibly shut down.

Patch by me, based on a suggestion from Andres Freund. Back-patch to v14 where this test case came in.

Discussion: https://postgr.es/m/1801850.1649047827@sss.pgh.pa.us
1 parent 8803df4 commit 9a72299


1 file changed: +29 -5 lines changed


src/test/recovery/t/022_crash_temp_files.pl

Lines changed: 29 additions & 5 deletions
@@ -16,7 +16,7 @@
 }
 else
 {
-	plan tests => 9;
+	plan tests => 11;
 }
 
 
@@ -130,11 +130,23 @@ BEGIN
 my $ret = TestLib::system_log('pg_ctl', 'kill', 'KILL', $pid);
 is($ret, 0, 'killed process with KILL');
 
-# Close psql session
+# Close that psql session
 $killme->finish;
+
+# Wait till the other session reports failure, ensuring that the postmaster
+# has noticed its dead child and begun a restart cycle.
+$killme_stdin2 .= qq[
+SELECT pg_sleep($TestLib::timeout_default);
+];
+ok( pump_until(
+		$killme2,
+		\$killme_stderr2,
+		qr/WARNING: terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost|could not send data to server/m
+	),
+	"second psql session died successfully after SIGKILL");
 $killme2->finish;
 
-# Wait till server restarts
+# Wait till server finishes restarting
 $node->poll_query_until('postgres', undef, '');
 
 # Check for temporary files
@@ -219,11 +231,23 @@ BEGIN
 $ret = TestLib::system_log('pg_ctl', 'kill', 'KILL', $pid);
 is($ret, 0, 'killed process with KILL');
 
-# Close psql session
+# Close that psql session
 $killme->finish;
+
+# Wait till the other session reports failure, ensuring that the postmaster
+# has noticed its dead child and begun a restart cycle.
+$killme_stdin2 .= qq[
+SELECT pg_sleep($TestLib::timeout_default);
+];
+ok( pump_until(
+		$killme2,
+		\$killme_stderr2,
+		qr/WARNING: terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost|could not send data to server/m
+	),
+	"second psql session died successfully after SIGKILL");
 $killme2->finish;
 
-# Wait till server restarts
+# Wait till server finishes restarting
 $node->poll_query_until('postgres', undef, '');
 
 # Check for temporary files -- should be there
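
The interlock added in both hunks boils down to "keep reading the surviving session's stderr until it shows one of the forced-shutdown messages". A minimal standalone sketch of that idea follows. It is not the test's real pump_until() helper (which drives a live psql session); pump_until_sketch and the simulated stderr chunks are hypothetical, and the regex is copied as it appears in the diff above.

```perl
use strict;
use warnings;

# Same alternation the patch matches against the second session's stderr
# (copied from the diff as rendered).
my $crash_re =
  qr/WARNING: terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost|could not send data to server/m;

# Hypothetical stand-in for the test's pump_until(): pull output chunks from
# $next_chunk (a code ref) into $$buf_ref until $re matches or input runs out.
sub pump_until_sketch
{
	my ($next_chunk, $buf_ref, $re) = @_;
	while (1)
	{
		return 1 if $$buf_ref =~ $re;
		my $chunk = $next_chunk->();
		return 0 unless defined $chunk;    # nothing more to read: give up
		$$buf_ref .= $chunk;
	}
}

# Simulated (made-up) stderr of the surviving session, arriving in pieces.
my @chunks =
  ("psql:<stdin>:1: ", "server closed the connection ", "unexpectedly\n");
my $stderr = '';
my $died = pump_until_sketch(sub { shift @chunks }, \$stderr, $crash_re);
print $died ? "session reported forced shutdown\n" : "no failure seen\n";
```

Because the match is tried against the accumulated buffer rather than each chunk, a message split across reads is still detected. Only once this returns true does it make sense to start polling for the server to accept connections again.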
