Skip to content

Commit ba019b4

Browse files
committed
Harden postgres_fdw tests against unexpected cache flushes.
postgres_fdw will close its remote session if an sinval cache reset occurs, since it's possible that that means some FDW parameters changed. We had two tests that were trying to ensure that the session remains alive by setting debug_discard_caches = 0; but that's not sufficient. Even though the tests seem stable enough in the buildfarm, they flap a lot under CI. In the first test, which is checking the ability to recover from a lost connection, we can stabilize the results by just not caring whether pg_terminate_backend() finds a victim backend. If a reset did happen, there won't be a session to terminate anymore, but the test can proceed anyway. (Arguably, we are then not testing the unintentional-disconnect case, but as long as that scenario is exercised in most runs I think it's fine; testing the reset-driven case is of value too.) In the second test, which is trying to verify the application_name displayed in pg_stat_activity by a remote session, we had a race condition in that the remote session might go away before we can fetch its pg_stat_activity entry. We can close that race and make the test more certainly test what it intends to by arranging things so that the remote session itself fetches its pg_stat_activity entry (based on PID rather than a somewhat-circular assumption about the application name). Both tests now demonstrably pass under debug_discard_caches = 1, so we can remove that hack. Back-patch into relevant back branches. Discussion: https://postgr.es/m/20230226194340.u44bkfgyz64c67i6@awork3.anarazel.de
1 parent 4d68338 commit ba019b4

File tree

2 files changed

+16
-28
lines changed

2 files changed

+16
-28
lines changed

contrib/postgres_fdw/expected/postgres_fdw.out

Lines changed: 8 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -9614,11 +9614,6 @@ WARNING: there is no transaction in progress
96149614
-- Change application_name of remote connection to special one
96159615
-- so that we can easily terminate the connection later.
96169616
ALTER SERVER loopback OPTIONS (application_name 'fdw_retry_check');
9617-
-- If debug_discard_caches is active, it results in
9618-
-- dropping remote connections after every transaction, making it
9619-
-- impossible to test termination meaningfully. So turn that off
9620-
-- for this test.
9621-
SET debug_discard_caches = 0;
96229617
-- Make sure we have a remote connection.
96239618
SELECT 1 FROM ft1 LIMIT 1;
96249619
?column?
@@ -9627,13 +9622,12 @@ SELECT 1 FROM ft1 LIMIT 1;
96279622
(1 row)
96289623

96299624
-- Terminate the remote connection and wait for the termination to complete.
9630-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
9625+
-- (If a cache flush happens, the remote connection might have already been
9626+
-- dropped; so code this step in a way that doesn't fail if no connection.)
9627+
DO $$ BEGIN
9628+
PERFORM pg_terminate_backend(pid, 180000) FROM pg_stat_activity
96319629
WHERE application_name = 'fdw_retry_check';
9632-
pg_terminate_backend
9633-
----------------------
9634-
t
9635-
(1 row)
9636-
9630+
END $$;
96379631
-- This query should detect the broken connection when starting new remote
96389632
-- transaction, reestablish new connection, and then succeed.
96399633
BEGIN;
@@ -9646,21 +9640,17 @@ SELECT 1 FROM ft1 LIMIT 1;
96469640
-- If we detect the broken connection when starting a new remote
96479641
-- subtransaction, we should fail instead of establishing a new connection.
96489642
-- Terminate the remote connection and wait for the termination to complete.
9649-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
9643+
DO $$ BEGIN
9644+
PERFORM pg_terminate_backend(pid, 180000) FROM pg_stat_activity
96509645
WHERE application_name = 'fdw_retry_check';
9651-
pg_terminate_backend
9652-
----------------------
9653-
t
9654-
(1 row)
9655-
9646+
END $$;
96569647
SAVEPOINT s;
96579648
-- The text of the error might vary across platforms, so only show SQLSTATE.
96589649
\set VERBOSITY sqlstate
96599650
SELECT 1 FROM ft1 LIMIT 1; -- should fail
96609651
ERROR: 08006
96619652
\set VERBOSITY default
96629653
COMMIT;
9663-
RESET debug_discard_caches;
96649654
-- =============================================================================
96659655
-- test connection invalidation cases and postgres_fdw_get_connections function
96669656
-- =============================================================================

contrib/postgres_fdw/sql/postgres_fdw.sql

Lines changed: 8 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -2908,18 +2908,16 @@ ROLLBACK;
29082908
-- so that we can easily terminate the connection later.
29092909
ALTER SERVER loopback OPTIONS (application_name 'fdw_retry_check');
29102910

2911-
-- If debug_discard_caches is active, it results in
2912-
-- dropping remote connections after every transaction, making it
2913-
-- impossible to test termination meaningfully. So turn that off
2914-
-- for this test.
2915-
SET debug_discard_caches = 0;
2916-
29172911
-- Make sure we have a remote connection.
29182912
SELECT 1 FROM ft1 LIMIT 1;
29192913

29202914
-- Terminate the remote connection and wait for the termination to complete.
2921-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
2915+
-- (If a cache flush happens, the remote connection might have already been
2916+
-- dropped; so code this step in a way that doesn't fail if no connection.)
2917+
DO $$ BEGIN
2918+
PERFORM pg_terminate_backend(pid, 180000) FROM pg_stat_activity
29222919
WHERE application_name = 'fdw_retry_check';
2920+
END $$;
29232921

29242922
-- This query should detect the broken connection when starting new remote
29252923
-- transaction, reestablish new connection, and then succeed.
@@ -2929,17 +2927,17 @@ SELECT 1 FROM ft1 LIMIT 1;
29292927
-- If we detect the broken connection when starting a new remote
29302928
-- subtransaction, we should fail instead of establishing a new connection.
29312929
-- Terminate the remote connection and wait for the termination to complete.
2932-
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
2930+
DO $$ BEGIN
2931+
PERFORM pg_terminate_backend(pid, 180000) FROM pg_stat_activity
29332932
WHERE application_name = 'fdw_retry_check';
2933+
END $$;
29342934
SAVEPOINT s;
29352935
-- The text of the error might vary across platforms, so only show SQLSTATE.
29362936
\set VERBOSITY sqlstate
29372937
SELECT 1 FROM ft1 LIMIT 1; -- should fail
29382938
\set VERBOSITY default
29392939
COMMIT;
29402940

2941-
RESET debug_discard_caches;
2942-
29432941
-- =============================================================================
29442942
-- test connection invalidation cases and postgres_fdw_get_connections function
29452943
-- =============================================================================

0 commit comments

Comments
 (0)