Skip to content

Commit ea0382e

Browse files
committed
Code review for recent patch to terminate online backup during shutdown:
do CancelBackup at a sane place, fix some oversights in the state transitions, allow only superusers to connect while we are waiting for backup mode to end.
1 parent b6e2fab commit ea0382e

File tree

4 files changed

+58
-38
lines changed

4 files changed

+58
-38
lines changed

doc/src/sgml/runtime.sgml

Lines changed: 14 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
<!-- $PostgreSQL: pgsql/doc/src/sgml/runtime.sgml,v 1.415 2008/04/23 13:44:58 mha Exp $ -->
1+
<!-- $PostgreSQL: pgsql/doc/src/sgml/runtime.sgml,v 1.416 2008/04/26 22:47:40 tgl Exp $ -->
22

33
<chapter Id="runtime">
44
<title>Operating System Environment</title>
@@ -1306,12 +1306,15 @@ sysctl -w vm.overcommit_memory=2
13061306
<term><systemitem>SIGTERM</systemitem><indexterm><primary>SIGTERM</></></term>
13071307
<listitem>
13081308
<para>
1309+
This is the <firstterm>Smart Shutdown</firstterm> mode.
13091310
After receiving <systemitem>SIGTERM</systemitem>, the server
1310-
waits until online backup mode is no longer active. It then
13111311
disallows new connections, but lets existing sessions end their
1312-
work normally. It shuts down only after all of the sessions
1313-
terminate normally. This is the <firstterm>Smart
1314-
Shutdown</firstterm>.
1312+
work normally. It shuts down only after all of the sessions terminate.
1313+
If the server is in online backup mode, it additionally waits
1314+
until online backup mode is no longer active. While backup mode is
1315+
active, new connections will still be allowed, but only to superusers
1316+
(this exception allows a superuser to connect to terminate
1317+
online backup mode).
13151318
</para>
13161319
</listitem>
13171320
</varlistentry>
@@ -1320,13 +1323,13 @@ sysctl -w vm.overcommit_memory=2
13201323
<term><systemitem>SIGINT</systemitem><indexterm><primary>SIGINT</></></term>
13211324
<listitem>
13221325
<para>
1326+
This is the <firstterm>Fast Shutdown</firstterm> mode.
13231327
The server disallows new connections and sends all existing
13241328
server processes <systemitem>SIGTERM</systemitem>, which will cause them
13251329
to abort their current transactions and exit promptly. It then
13261330
waits for the server processes to exit and finally shuts down.
13271331
If the server is in online backup mode, backup mode will be
1328-
terminated, rendering the backup useless. This is the
1329-
<firstterm>Fast Shutdown</firstterm>.
1332+
terminated, rendering the backup useless.
13301333
</para>
13311334
</listitem>
13321335
</varlistentry>
@@ -1335,8 +1338,8 @@ sysctl -w vm.overcommit_memory=2
13351338
<term><systemitem>SIGQUIT</systemitem><indexterm><primary>SIGQUIT</></></term>
13361339
<listitem>
13371340
<para>
1338-
This is the <firstterm>Immediate Shutdown</firstterm>, which
1339-
will cause the master <command>postgres</command> process to send a
1341+
This is the <firstterm>Immediate Shutdown</firstterm> mode.
1342+
The master <command>postgres</command> process will send a
13401343
<systemitem>SIGQUIT</systemitem> to all child processes and exit
13411344
immediately, without properly shutting itself down. The child processes
13421345
likewise exit immediately upon receiving
@@ -1377,8 +1380,8 @@ $ <userinput>kill -INT `head -1 /usr/local/pgsql/data/postmaster.pid`</userinput
13771380
</important>
13781381

13791382
<para>
1380-
To terminate a session while allowing other sessions to continue, use
1381-
<function>pg_terminate_backend()</> (<xref
1383+
To terminate an individual session while allowing other sessions to
1384+
continue, use <function>pg_terminate_backend()</> (see <xref
13821385
linkend="functions-admin-signal-table">) or send a
13831386
<systemitem>SIGTERM</> signal to the child process associated with
13841387
the session.

src/backend/postmaster/postmaster.c

Lines changed: 29 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@
3737
*
3838
*
3939
* IDENTIFICATION
40-
* $PostgreSQL: pgsql/src/backend/postmaster/postmaster.c,v 1.555 2008/04/23 13:44:59 mha Exp $
40+
* $PostgreSQL: pgsql/src/backend/postmaster/postmaster.c,v 1.556 2008/04/26 22:47:40 tgl Exp $
4141
*
4242
* NOTES
4343
*
@@ -230,6 +230,7 @@ static bool FatalError = false; /* T if recovering from backend crash */
230230
* crash recovery (which is rather like shutdown followed by startup).
231231
*
232232
* Normal child backends can only be launched when we are in PM_RUN state.
233+
* (We also allow it in PM_WAIT_BACKUP state, but only for superusers.)
233234
* In other states we handle connection requests by launching "dead_end"
234235
* child processes, which will simply send the client an error message and
235236
* quit. (We track these in the BackendList so that we can know when they
@@ -242,9 +243,9 @@ static bool FatalError = false; /* T if recovering from backend crash */
242243
* will not be very long).
243244
*
244245
* Notice that this state variable does not distinguish *why* we entered
245-
* PM_WAIT_BACKENDS or later states --- Shutdown and FatalError must be
246-
* consulted to find that out. FatalError is never true in PM_RUN state, nor
247-
* in PM_SHUTDOWN states (because we don't enter those states when trying to
246+
* states later than PM_RUN --- Shutdown and FatalError must be consulted
247+
* to find that out. FatalError is never true in PM_RUN state, nor in
248+
* PM_SHUTDOWN states (because we don't enter those states when trying to
248249
* recover from a crash). It can be true in PM_STARTUP state, because we
249250
* don't clear it until we've successfully recovered.
250251
*/
@@ -1650,6 +1651,9 @@ ProcessStartupPacket(Port *port, bool SSLdone)
16501651
(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
16511652
errmsg("sorry, too many clients already")));
16521653
break;
1654+
case CAC_WAITBACKUP:
1655+
/* OK for now, will check in InitPostgres */
1656+
break;
16531657
case CAC_OK:
16541658
break;
16551659
}
@@ -1727,11 +1731,15 @@ canAcceptConnections(void)
17271731
{
17281732
/*
17291733
* Can't start backends when in startup/shutdown/recovery state.
1730-
* In state PM_WAIT_BACKUP we must allow connections so that
1731-
* a superuser can end online backup mode.
1734+
*
1735+
* In state PM_WAIT_BACKUP only superusers can connect (this must be
1736+
* allowed so that a superuser can end online backup mode); we return
1737+
* CAC_WAITBACKUP code to indicate that this must be checked later.
17321738
*/
1733-
if ((pmState != PM_RUN) && (pmState != PM_WAIT_BACKUP))
1739+
if (pmState != PM_RUN)
17341740
{
1741+
if (pmState == PM_WAIT_BACKUP)
1742+
return CAC_WAITBACKUP; /* allow superusers only */
17351743
if (Shutdown > NoShutdown)
17361744
return CAC_SHUTDOWN; /* shutdown is pending */
17371745
if (pmState == PM_STARTUP && !FatalError)
@@ -1997,7 +2005,7 @@ pmdie(SIGNAL_ARGS)
19972005

19982006
if (StartupPID != 0)
19992007
signal_child(StartupPID, SIGTERM);
2000-
if (pmState == PM_RUN)
2008+
if (pmState == PM_RUN || pmState == PM_WAIT_BACKUP)
20012009
{
20022010
ereport(LOG,
20032011
(errmsg("aborting any active transactions")));
@@ -2017,13 +2025,6 @@ pmdie(SIGNAL_ARGS)
20172025
* PostmasterStateMachine will take the next step.
20182026
*/
20192027
PostmasterStateMachine();
2020-
2021-
/*
2022-
* Terminate backup mode to avoid recovery after a
2023-
* clean fast shutdown.
2024-
*/
2025-
CancelBackup();
2026-
20272028
break;
20282029

20292030
case SIGQUIT:
@@ -2499,7 +2500,9 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
24992500

25002501
FatalError = true;
25012502
/* We now transit into a state of waiting for children to die */
2502-
if (pmState == PM_RUN || pmState == PM_SHUTDOWN)
2503+
if (pmState == PM_RUN ||
2504+
pmState == PM_WAIT_BACKUP ||
2505+
pmState == PM_SHUTDOWN)
25032506
pmState = PM_WAIT_BACKENDS;
25042507
}
25052508

@@ -2568,15 +2571,10 @@ PostmasterStateMachine(void)
25682571
if (pmState == PM_WAIT_BACKUP)
25692572
{
25702573
/*
2571-
* PM_WAIT_BACKUP state ends when online backup mode is no longer
2572-
* active. In this state canAcceptConnections() will still allow
2573-
* client connections, which is necessary because a superuser
2574-
* has to call pg_stop_backup() to end online backup mode.
2574+
* PM_WAIT_BACKUP state ends when online backup mode is not active.
25752575
*/
25762576
if (!BackupInProgress())
2577-
{
25782577
pmState = PM_WAIT_BACKENDS;
2579-
}
25802578
}
25812579

25822580
/*
@@ -2699,6 +2697,12 @@ PostmasterStateMachine(void)
26992697
}
27002698
else
27012699
{
2700+
/*
2701+
* Terminate backup mode to avoid recovery after a
2702+
* clean fast shutdown.
2703+
*/
2704+
CancelBackup();
2705+
27022706
/* Normal exit from the postmaster is here */
27032707
ExitPostmaster(0);
27042708
}
@@ -2819,7 +2823,7 @@ BackendStartup(Port *port)
28192823
return STATUS_ERROR;
28202824
}
28212825

2822-
/* Pass down canAcceptConnections state (kluge for EXEC_BACKEND case) */
2826+
/* Pass down canAcceptConnections state */
28232827
port->canAcceptConnections = canAcceptConnections();
28242828

28252829
#ifdef EXEC_BACKEND
@@ -2880,7 +2884,8 @@ BackendStartup(Port *port)
28802884
bn->pid = pid;
28812885
bn->cancel_key = MyCancelKey;
28822886
bn->is_autovacuum = false;
2883-
bn->dead_end = (port->canAcceptConnections != CAC_OK);
2887+
bn->dead_end = (port->canAcceptConnections != CAC_OK &&
2888+
port->canAcceptConnections != CAC_WAITBACKUP);
28842889
DLAddHead(BackendList, DLNewElem(bn));
28852890
#ifdef EXEC_BACKEND
28862891
if (!bn->dead_end)

src/backend/utils/init/postinit.c

Lines changed: 12 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/utils/init/postinit.c,v 1.182 2008/03/26 21:10:39 alvherre Exp $
11+
* $PostgreSQL: pgsql/src/backend/utils/init/postinit.c,v 1.183 2008/04/26 22:47:40 tgl Exp $
1212
*
1313
*
1414
*-------------------------------------------------------------------------
@@ -26,6 +26,7 @@
2626
#include "catalog/pg_database.h"
2727
#include "catalog/pg_tablespace.h"
2828
#include "libpq/hba.h"
29+
#include "libpq/libpq-be.h"
2930
#include "mb/pg_wchar.h"
3031
#include "miscadmin.h"
3132
#include "pgstat.h"
@@ -577,6 +578,16 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
577578
if (!bootstrap)
578579
CheckMyDatabase(dbname, am_superuser);
579580

581+
/*
582+
* If we're trying to shut down, only superusers can connect.
583+
*/
584+
if (!am_superuser &&
585+
MyProcPort != NULL &&
586+
MyProcPort->canAcceptConnections == CAC_WAITBACKUP)
587+
ereport(FATAL,
588+
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
589+
errmsg("must be superuser to connect during database shutdown")));
590+
580591
/*
581592
* Check a normal user hasn't connected to a superuser reserved slot.
582593
*/

src/include/libpq/libpq-be.h

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
* Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
1212
* Portions Copyright (c) 1994, Regents of the University of California
1313
*
14-
* $PostgreSQL: pgsql/src/include/libpq/libpq-be.h,v 1.65 2008/01/01 19:45:58 momjian Exp $
14+
* $PostgreSQL: pgsql/src/include/libpq/libpq-be.h,v 1.66 2008/04/26 22:47:40 tgl Exp $
1515
*
1616
*-------------------------------------------------------------------------
1717
*/
@@ -69,7 +69,8 @@ typedef struct
6969

7070
typedef enum CAC_state
7171
{
72-
CAC_OK, CAC_STARTUP, CAC_SHUTDOWN, CAC_RECOVERY, CAC_TOOMANY
72+
CAC_OK, CAC_STARTUP, CAC_SHUTDOWN, CAC_RECOVERY, CAC_TOOMANY,
73+
CAC_WAITBACKUP
7374
} CAC_state;
7475

7576

0 commit comments

Comments
 (0)