Skip to content

Commit efc16ea

Browse files
Allow read only connections during recovery, known as Hot Standby.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record. New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far. This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required. Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit. Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
1 parent 78a0914 commit efc16ea

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

87 files changed

+6160
-423
lines changed

doc/src/sgml/backup.sgml

Lines changed: 773 additions & 3 deletions
Large diffs are not rendered by default.

doc/src/sgml/config.sgml

Lines changed: 113 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.238 2009/12/17 14:36:16 rhaas Exp $ -->
1+
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.239 2009/12/19 01:32:31 sriggs Exp $ -->
22

33
<chapter Id="runtime-config">
44
<title>Server Configuration</title>
@@ -376,6 +376,12 @@ SET ENABLE_SEQSCAN TO OFF;
376376
allows. See <xref linkend="sysvipc"> for information on how to
377377
adjust those parameters, if necessary.
378378
</para>
379+
380+
<para>
381+
When running a standby server, you must set this parameter to the
382+
same or higher value than on the master server. Otherwise, queries
383+
will not be allowed in the standby server.
384+
</para>
379385
</listitem>
380386
</varlistentry>
381387

@@ -826,6 +832,12 @@ SET ENABLE_SEQSCAN TO OFF;
826832
allows. See <xref linkend="sysvipc"> for information on how to
827833
adjust those parameters, if necessary.
828834
</para>
835+
836+
<para>
837+
When running a standby server, you must set this parameter to the
838+
same or higher value than on the master server. Otherwise, queries
839+
will not be allowed in the standby server.
840+
</para>
829841
</listitem>
830842
</varlistentry>
831843

@@ -1733,6 +1745,51 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
17331745

17341746
</variablelist>
17351747
</sect2>
1748+
1749+
<sect2 id="runtime-config-standby">
1750+
<title>Standby Servers</title>
1751+
1752+
<variablelist>
1753+
1754+
<varlistentry id="recovery-connections" xreflabel="recovery_connections">
1755+
<term><varname>recovery_connections</varname> (<type>boolean</type>)</term>
1756+
<listitem>
1757+
<para>
1758+
Parameter has two roles. During recovery, specifies whether or not
1759+
you can connect and run queries to enable <xref linkend="hot-standby">.
1760+
During normal running, specifies whether additional information is written
1761+
to WAL to allow recovery connections on a standby server that reads
1762+
WAL data generated by this server. The default value is
1763+
<literal>on</literal>. It is thought that there is little
1764+
measurable difference in performance from using this feature, so
1765+
feedback is welcome if any production impacts are noticeable.
1766+
It is likely that this parameter will be removed in later releases.
1767+
This parameter can only be set at server start.
1768+
</para>
1769+
</listitem>
1770+
</varlistentry>
1771+
1772+
<varlistentry id="max-standby-delay" xreflabel="max_standby_delay">
1773+
<term><varname>max_standby_delay</varname> (<type>string</type>)</term>
1774+
<listitem>
1775+
<para>
1776+
When server acts as a standby, this parameter specifies a wait policy
1777+
for queries that conflict with incoming data changes. Valid settings
1778+
are -1, meaning wait forever, or a wait time of 0 or more seconds.
1779+
If a conflict should occur the server will delay up to this
1780+
amount before it begins trying to resolve things less amicably, as
1781+
described in <xref linkend="hot-standby-conflict">. Typically,
1782+
this parameter makes sense only during replication, so when
1783+
performing an archive recovery to recover from data loss a
1784+
parameter setting of 0 is recommended. The default is 30 seconds.
1785+
This parameter can only be set in the <filename>postgresql.conf</>
1786+
file or on the server command line.
1787+
</para>
1788+
</listitem>
1789+
</varlistentry>
1790+
1791+
</variablelist>
1792+
</sect2>
17361793
</sect1>
17371794

17381795
<sect1 id="runtime-config-query">
@@ -4161,6 +4218,29 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
41614218
</listitem>
41624219
</varlistentry>
41634220

4221+
<varlistentry id="guc-vacuum-defer-cleanup-age" xreflabel="vacuum_defer_cleanup_age">
4222+
<term><varname>vacuum_defer_cleanup_age</varname> (<type>integer</type>)</term>
4223+
<indexterm>
4224+
<primary><varname>vacuum_defer_cleanup_age</> configuration parameter</primary>
4225+
</indexterm>
4226+
<listitem>
4227+
<para>
4228+
Specifies the number of transactions by which <command>VACUUM</> and
4229+
<acronym>HOT</> updates will defer cleanup of dead row versions. The
4230+
default is 0 transactions, meaning that dead row versions will be
4231+
removed as soon as possible. You may wish to set this to a non-zero
4232+
value when planning or maintaining a <xref linkend="hot-standby">
4233+
configuration. The recommended value is <literal>0</> unless you have
4234+
clear reason to increase it. The purpose of the parameter is to
4235+
allow the user to specify an approximate time delay before cleanup
4236+
occurs. However, it should be noted that there is no direct link with
4237+
any specific time delay and so the results will be application and
4238+
installation specific, as well as variable over time, depending upon
4239+
the transaction rate (of writes only).
4240+
</para>
4241+
</listitem>
4242+
</varlistentry>
4243+
41644244
<varlistentry id="guc-bytea-output" xreflabel="bytea_output">
41654245
<term><varname>bytea_output</varname> (<type>enum</type>)</term>
41664246
<indexterm>
@@ -4689,6 +4769,12 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
46894769
allows. See <xref linkend="sysvipc"> for information on how to
46904770
adjust those parameters, if necessary.
46914771
</para>
4772+
4773+
<para>
4774+
When running a standby server, you must set this parameter to the
4775+
same or higher value than on the master server. Otherwise, queries
4776+
will not be allowed in the standby server.
4777+
</para>
46924778
</listitem>
46934779
</varlistentry>
46944780

@@ -5546,6 +5632,32 @@ plruby.use_strict = true # generates error: unknown class name
55465632
</listitem>
55475633
</varlistentry>
55485634

5635+
<varlistentry id="guc-trace-recovery-messages" xreflabel="trace_recovery_messages">
5636+
<term><varname>trace_recovery_messages</varname> (<type>string</type>)</term>
5637+
<indexterm>
5638+
<primary><varname>trace_recovery_messages</> configuration parameter</primary>
5639+
</indexterm>
5640+
<listitem>
5641+
<para>
5642+
Controls which message levels are written to the server log
5643+
for system modules needed for recovery processing. This allows
5644+
the user to override the normal setting of log_min_messages,
5645+
but only for specific messages. This is intended for use in
5646+
debugging Hot Standby.
5647+
Valid values are <literal>DEBUG5</>, <literal>DEBUG4</>,
5648+
<literal>DEBUG3</>, <literal>DEBUG2</>, <literal>DEBUG1</>,
5649+
<literal>INFO</>, <literal>NOTICE</>, <literal>WARNING</>,
5650+
<literal>ERROR</>, <literal>LOG</>, <literal>FATAL</>, and
5651+
<literal>PANIC</>. Each level includes all the levels that
5652+
follow it. The later the level, the fewer messages are sent
5653+
to the log. The default is <literal>WARNING</>. Note that
5654+
<literal>LOG</> has a different rank here than in
5655+
<varname>client_min_messages</>.
5656+
Parameter should be set in the postgresql.conf only.
5657+
</para>
5658+
</listitem>
5659+
</varlistentry>
5660+
55495661
<varlistentry id="guc-zero-damaged-pages" xreflabel="zero_damaged_pages">
55505662
<term><varname>zero_damaged_pages</varname> (<type>boolean</type>)</term>
55515663
<indexterm>

doc/src/sgml/func.sgml

Lines changed: 33 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.493 2009/12/15 17:57:46 tgl Exp $ -->
1+
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.494 2009/12/19 01:32:31 sriggs Exp $ -->
22

33
<chapter id="functions">
44
<title>Functions and Operators</title>
@@ -13132,6 +13132,38 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
1313213132
<xref linkend="continuous-archiving">.
1313313133
</para>
1313413134

13135+
<indexterm>
13136+
<primary>pg_is_in_recovery</primary>
13137+
</indexterm>
13138+
13139+
<para>
13140+
The functions shown in <xref
13141+
linkend="functions-recovery-info-table"> provide information
13142+
about the current status of Hot Standby.
13143+
These functions may be executed during both recovery and in normal running.
13144+
</para>
13145+
13146+
<table id="functions-recovery-info-table">
13147+
<title>Recovery Information Functions</title>
13148+
<tgroup cols="3">
13149+
<thead>
13150+
<row><entry>Name</entry> <entry>Return Type</entry> <entry>Description</entry>
13151+
</row>
13152+
</thead>
13153+
13154+
<tbody>
13155+
<row>
13156+
<entry>
13157+
<literal><function>pg_is_in_recovery</function>()</literal>
13158+
</entry>
13159+
<entry><type>bool</type></entry>
13160+
<entry>True if recovery is still in progress.
13161+
</entry>
13162+
</row>
13163+
</tbody>
13164+
</tgroup>
13165+
</table>
13166+
1313513167
<para>
1313613168
The functions shown in <xref linkend="functions-admin-dbsize"> calculate
1313713169
the disk space usage of database objects.

doc/src/sgml/ref/checkpoint.sgml

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
<!-- $PostgreSQL: pgsql/doc/src/sgml/ref/checkpoint.sgml,v 1.16 2008/11/14 10:22:45 petere Exp $ -->
1+
<!-- $PostgreSQL: pgsql/doc/src/sgml/ref/checkpoint.sgml,v 1.17 2009/12/19 01:32:31 sriggs Exp $ -->
22

33
<refentry id="sql-checkpoint">
44
<refmeta>
@@ -42,6 +42,11 @@ CHECKPOINT
4242
<xref linkend="wal"> for more information about the WAL system.
4343
</para>
4444

45+
<para>
46+
If executed during recovery, the <command>CHECKPOINT</command> command
47+
will force a restartpoint rather than writing a new checkpoint.
48+
</para>
49+
4550
<para>
4651
Only superusers can call <command>CHECKPOINT</command>. The command is
4752
not intended for use during normal operation.

src/backend/access/gin/ginxlog.c

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
* Portions Copyright (c) 1994, Regents of the University of California
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/access/gin/ginxlog.c,v 1.19 2009/06/11 14:48:53 momjian Exp $
11+
* $PostgreSQL: pgsql/src/backend/access/gin/ginxlog.c,v 1.20 2009/12/19 01:32:31 sriggs Exp $
1212
*-------------------------------------------------------------------------
1313
*/
1414
#include "postgres.h"
@@ -621,6 +621,10 @@ gin_redo(XLogRecPtr lsn, XLogRecord *record)
621621
{
622622
uint8 info = record->xl_info & ~XLR_INFO_MASK;
623623

624+
/*
625+
* GIN indexes do not require any conflict processing.
626+
*/
627+
624628
RestoreBkpBlocks(lsn, record, false);
625629

626630
topCtx = MemoryContextSwitchTo(opCtx);

src/backend/access/gist/gistxlog.c

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
* Portions Copyright (c) 1994, Regents of the University of California
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/access/gist/gistxlog.c,v 1.32 2009/01/20 18:59:36 heikki Exp $
11+
* $PostgreSQL: pgsql/src/backend/access/gist/gistxlog.c,v 1.33 2009/12/19 01:32:32 sriggs Exp $
1212
*-------------------------------------------------------------------------
1313
*/
1414
#include "postgres.h"
@@ -396,6 +396,12 @@ gist_redo(XLogRecPtr lsn, XLogRecord *record)
396396
uint8 info = record->xl_info & ~XLR_INFO_MASK;
397397
MemoryContext oldCxt;
398398

399+
/*
400+
* GIST indexes do not require any conflict processing. NB: If we ever
401+
* implement a similar optimization we have in b-tree, and remove killed
402+
* tuples outside VACUUM, we'll need to handle that here.
403+
*/
404+
399405
RestoreBkpBlocks(lsn, record, false);
400406

401407
oldCxt = MemoryContextSwitchTo(opCtx);

0 commit comments

Comments
 (0)