Commit 58feb1a

In the pg_rewind test suite, receive WAL fully before promoting.

If a transaction never reaches the standby, later tests find unexpected cluster state. A "tail-copy: query result matches" test failure has been the usual symptom. Among the buildfarm members having run this test suite, most have exhibited that symptom at least once. Back-patch to 9.5, where pg_rewind was introduced.

Michael Paquier, reported by Christoph Berg.
1 parent 73d2d2e commit 58feb1a

1 file changed, +8 -6 lines changed

src/bin/pg_rewind/RewindTest.pm (+8 -6)

@@ -229,19 +229,21 @@ recovery_target_timeline='latest'
 		'-o', "-k $tempdir_short --listen-addresses='' -p $port_standby",
 		'start');
 
-	# Wait until the standby has caught up with the primary, by polling
-	# pg_stat_replication.
-	my $caughtup_query =
-"SELECT pg_current_xlog_location() = replay_location FROM pg_stat_replication WHERE application_name = 'rewind_standby';";
-	poll_query_until($caughtup_query, $connstr_master)
-	  or die "Timed out while waiting for standby to catch up";
+	# The standby may have WAL to apply before it matches the primary. That
+	# is fine, because no test examines the standby before promotion.
 }
 
 sub promote_standby
 {
 	#### Now run the test-specific parts to run after standby has been started
 	# up standby
 
+	# Wait for the standby to receive and write all WAL.
+	my $wal_received_query =
+"SELECT pg_current_xlog_location() = write_location FROM pg_stat_replication WHERE application_name = 'rewind_standby';";
+	poll_query_until($wal_received_query, $connstr_master)
+	  or die "Timed out while waiting for standby to receive and write WAL";
+
 	# Now promote slave and insert some new data on master, this will put
 	# the master out-of-sync with the standby. Wait until the standby is
 	# out of recovery mode, and is ready to accept read-write connections.
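
The added hunk polls the primary until the standby's write_location equals pg_current_xlog_location(), and only then promotes. The retry pattern behind that call can be sketched roughly as follows. This is an illustration only: poll_until_true is a hypothetical stand-in for the suite's poll_query_until, it calls a code reference instead of running a query through psql, and WAL positions are simulated as plain integers rather than real xlog locations.

```perl
use strict;
use warnings;

# Hypothetical stand-in for poll_query_until (not the code in RewindTest.pm):
# call $run_query up to $max_attempts times and succeed as soon as it
# returns 't', the text form of a true boolean query result.
sub poll_until_true
{
	my ($run_query, $max_attempts, $sleep_seconds) = @_;

	for my $attempt (1 .. $max_attempts)
	{
		my $result = $run_query->();
		return 1 if defined $result && $result eq 't';
		sleep($sleep_seconds) if $sleep_seconds > 0;
	}
	return 0;    # gave up: the condition never became true
}

# Simulated state (assumption): the primary's current WAL position and the
# successive write positions reported by the standby.
my $current_location = 300;
my @write_locations  = (100, 200, 300);

my $wal_received = sub {
	my $write_location = shift @write_locations;
	return unless defined $write_location;
	return $write_location == $current_location ? 't' : 'f';
};

poll_until_true($wal_received, 10, 0)
  or die "Timed out while waiting for standby to receive and write WAL";
print "standby has written all WAL; safe to promote\n";
```

In this sketch the first two polls return 'f' and the third returns 't', so the loop ends before the attempt limit is reached. If the standby never catches up, the function returns 0 and the caller dies, mirroring the "or die" in the patch.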
