Commit e37ad5f

Remove race condition in 022_crash_temp_files.pl test.

It's possible for the query that "waits for restart" to complete a successful iteration before the postmaster has noticed its SIGKILL'd child and begun the restart cycle. (This is a bit hard to believe perhaps, but it's been seen at least twice in the buildfarm, mainly on ancient platforms that likely have quirky schedulers.) To provide a more secure interlock, wait for the other session we're using to report that it's been forcibly shut down.

Patch by me, based on a suggestion from Andres Freund. Back-patch to v14 where this test case came in.

Discussion: https://postgr.es/m/1801850.1649047827@sss.pgh.pa.us
1 parent 75edb91 commit e37ad5f

1 file changed, +30 -4 lines changed

src/test/recovery/t/022_crash_temp_files.pl

Lines changed: 30 additions & 4 deletions
@@ -125,11 +125,24 @@ BEGIN
 my $ret = PostgreSQL::Test::Utils::system_log('pg_ctl', 'kill', 'KILL', $pid);
 is($ret, 0, 'killed process with KILL');

-# Close psql session
+# Close that psql session
 $killme->finish;
+
+# Wait till the other session reports failure, ensuring that the postmaster
+# has noticed its dead child and begun a restart cycle.
+$killme_stdin2 .= qq[
+SELECT pg_sleep($PostgreSQL::Test::Utils::timeout_default);
+];
+ok( pump_until(
+		$killme2,
+		$psql_timeout,
+		\$killme_stderr2,
+		qr/WARNING: terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost|could not send data to server/m
+	),
+	"second psql session died successfully after SIGKILL");
 $killme2->finish;

-# Wait till server restarts
+# Wait till server finishes restarting
 $node->poll_query_until('postgres', undef, '');

 # Check for temporary files
@@ -214,11 +227,24 @@ BEGIN
 $ret = PostgreSQL::Test::Utils::system_log('pg_ctl', 'kill', 'KILL', $pid);
 is($ret, 0, 'killed process with KILL');

-# Close psql session
+# Close that psql session
 $killme->finish;
+
+# Wait till the other session reports failure, ensuring that the postmaster
+# has noticed its dead child and begun a restart cycle.
+$killme_stdin2 .= qq[
+SELECT pg_sleep($PostgreSQL::Test::Utils::timeout_default);
+];
+ok( pump_until(
+		$killme2,
+		$psql_timeout,
+		\$killme_stderr2,
+		qr/WARNING: terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost|could not send data to server/m
+	),
+	"second psql session died successfully after SIGKILL");
 $killme2->finish;

-# Wait till server restarts
+# Wait till server finishes restarting
 $node->poll_query_until('postgres', undef, '');

 # Check for temporary files -- should be there
