Commit 61081e7

Add pg_rewind, for re-synchronizing a master server after failback.
Earlier versions of this tool were available (and still are) on github. Thanks to Michael Paquier, Alvaro Herrera, Peter Eisentraut, Amit Kapila, and Satoshi Nagayasu for review.
1 parent 87cec51 commit 61081e7

29 files changed: 4141 additions and 2 deletions.

doc/src/sgml/high-availability.sgml

Lines changed: 3 additions & 1 deletion
@@ -1272,7 +1272,9 @@ primary_slot_name = 'node_a_slot'
 and might stay down. To return to normal operation, a standby server
 must be recreated,
 either on the former primary system when it comes up, or on a third,
-possibly new, system. Once complete, the primary and standby can be
+possibly new, system. The <xref linkend="app-pgrewind"> utility can be
+used to speed up this process on large clusters.
+Once complete, the primary and standby can be
 considered to have switched roles. Some people choose to use a third
 server to provide backup for the new primary until the new standby
 server is recreated,
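
As a rough illustration of how the utility referenced in this hunk might be invoked, the sketch below assembles a pg_rewind command line from the options documented in pg_rewind.sgml in this commit (-D, --source-pgdata, --source-server, --dry-run, --progress, --debug). The function name build_pg_rewind_argv and the example path and connection string are hypothetical, and nothing is executed; this is a sketch under those assumptions, not part of the commit.

```python
from typing import List, Optional


def build_pg_rewind_argv(
    target_pgdata: str,
    source_pgdata: Optional[str] = None,
    source_server: Optional[str] = None,
    dry_run: bool = False,
    progress: bool = False,
    debug: bool = False,
) -> List[str]:
    """Return an argv list for pg_rewind (hypothetical helper; runs nothing)."""
    # The synopsis requires exactly one of the two source options.
    if (source_pgdata is None) == (source_server is None):
        raise ValueError("give exactly one of source_pgdata or source_server")

    argv = ["pg_rewind", "-D", target_pgdata]
    if source_pgdata is not None:
        argv.append(f"--source-pgdata={source_pgdata}")
    else:
        argv.append(f"--source-server={source_server}")
    if dry_run:
        argv.append("--dry-run")   # do everything except modify the target
    if progress:
        argv.append("--progress")  # approximate progress report
    if debug:
        argv.append("--debug")     # verbose developer output
    return argv


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(build_pg_rewind_argv("/data/old_master",
                               source_server="host=new_master port=5432",
                               dry_run=True))
```

Starting with a dry run, as in the usage line, is one cautious way to see what would be copied before the target directory is touched.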

doc/src/sgml/ref/allfiles.sgml

Lines changed: 1 addition & 0 deletions
@@ -190,6 +190,7 @@ Complete list of usable sgml source files in this directory.
 <!ENTITY pgRecvlogical SYSTEM "pg_recvlogical.sgml">
 <!ENTITY pgResetxlog SYSTEM "pg_resetxlog.sgml">
 <!ENTITY pgRestore SYSTEM "pg_restore.sgml">
+<!ENTITY pgRewind SYSTEM "pg_rewind.sgml">
 <!ENTITY postgres SYSTEM "postgres-ref.sgml">
 <!ENTITY postmaster SYSTEM "postmaster.sgml">
 <!ENTITY psqlRef SYSTEM "psql-ref.sgml">

doc/src/sgml/ref/pg_rewind.sgml

Lines changed: 237 additions & 0 deletions
@@ -0,0 +1,237 @@
+<!--
+doc/src/sgml/ref/pg_rewind.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="app-pgrewind">
+<indexterm zone="app-pgrewind">
+<primary>pg_rewind</primary>
+</indexterm>
+
+<refmeta>
+<refentrytitle><application>pg_rewind</application></refentrytitle>
+<manvolnum>1</manvolnum>
+<refmiscinfo>Application</refmiscinfo>
+</refmeta>
+
+<refnamediv>
+<refname>pg_rewind</refname>
+<refpurpose>synchronize a <productname>PostgreSQL</productname> data directory with another data directory that was forked from the first one</refpurpose>
+</refnamediv>
+
+<refsynopsisdiv>
+<cmdsynopsis>
+<command>pg_rewind</command>
+<arg rep="repeat"><replaceable>option</replaceable></arg>
+<group choice="plain">
+<group choice="req">
+<arg choice="plain"><option>-D </option></arg>
+<arg choice="plain"><option>--target-pgdata</option></arg>
+</group>
+<replaceable> directory</replaceable>
+<group choice="req">
+<arg choice="plain"><option>--source-pgdata=<replaceable>directory</replaceable></option></arg>
+<arg choice="plain"><option>--source-server=<replaceable>connstr</replaceable></option></arg>
+</group>
+</group>
+</cmdsynopsis>
+</refsynopsisdiv>
+
+<refsect1>
+<title>Description</title>
+
+<para>
+<application>pg_rewind</> is a tool for synchronizing a PostgreSQL cluster
+with another copy of the same cluster, after the clusters' timelines have
+diverged. A typical scenario is to bring an old master server back online
+after failover, as a standby that follows the new master.
+</para>
+
+<para>
+The result is equivalent to replacing the target data directory with the
+source one. All files are copied, including configuration files. The
+advantage of <application>pg_rewind</> over taking a new base backup, or
+tools like <application>rsync</>, is that <application>pg_rewind</> does
+not require reading through all unchanged files in the cluster. That makes
+it a lot faster when the database is large and only a small portion of it
+differs between the clusters.
+</para>
+
+<para>
+<application>pg_rewind</> examines the timeline histories of the source
+and target clusters to determine the point where they diverged, and
+expects to find WAL in the target cluster's <filename>pg_xlog</> directory
+reaching all the way back to the point of divergence. In the typical
+failover scenario where the target cluster was shut down soon after the
+divergence, that is not a problem, but if the target cluster had run for a
+long time after the divergence, the old WAL files might not be present
+anymore. In that case, they can be manually copied from the WAL archive to
+the <filename>pg_xlog</> directory. Fetching missing files from a WAL
+archive automatically is currently not supported.
+</para>
+
+<para>
+When the target server is started up for the first time after running
+<application>pg_rewind</>, it will go into recovery mode and replay all
+WAL generated in the source server after the point of divergence.
+If some of the WAL was no longer available in the source server when
+<application>pg_rewind</> was run, and therefore could not be copied by the
+<application>pg_rewind</> session, it needs to be made available when the
+target server is started up. That can be done by creating a
+<filename>recovery.conf</> file in the target data directory with a
+suitable <varname>restore_command</>.
+</para>
+</refsect1>
+
+<refsect1>
+<title>Options</title>
+
+<para>
+<application>pg_rewind</application> accepts the following command-line
+arguments:
+
+<variablelist>
+<varlistentry>
+<term><option>-D</option></term>
+<term><option>--target-pgdata</option></term>
+<listitem>
+<para>
+This option specifies the target data directory that is synchronized
+with the source. The target server must be shut down cleanly before
+running <application>pg_rewind</application>.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--source-pgdata</option></term>
+<listitem>
+<para>
+Specifies the path to the data directory of the source server, to
+synchronize the target with. When <option>--source-pgdata</> is
+used, the source server must be cleanly shut down.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--source-server</option></term>
+<listitem>
+<para>
+Specifies a libpq connection string to connect to the source
+<productname>PostgreSQL</> server to synchronize the target with.
+The server must be up and running, and must not be in recovery mode.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>-n</option></term>
+<term><option>--dry-run</option></term>
+<listitem>
+<para>
+Do everything except actually modifying the target directory.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>-P</option></term>
+<term><option>--progress</option></term>
+<listitem>
+<para>
+Enables progress reporting. Turning this on will deliver an approximate
+progress report while copying data over from the source cluster.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--debug</option></term>
+<listitem>
+<para>
+Print verbose debugging output that is mostly useful for developers
+debugging <application>pg_rewind</>.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>-V</option></term>
+<term><option>--version</option></term>
+<listitem><para>Display version information, then exit.</para></listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>-?</option></term>
+<term><option>--help</option></term>
+<listitem><para>Show help, then exit.</para></listitem>
+</varlistentry>
+
+</variablelist>
+</para>
+</refsect1>
+
+<refsect1>
+<title>Environment</title>
+
+<para>
+When the <option>--source-server</> option is used,
+<application>pg_rewind</application> also uses the environment variables
+supported by <application>libpq</> (see <xref linkend="libpq-envars">).
+</para>
+</refsect1>
+
+<refsect1>
+<title>Notes</title>
+
+<para>
+<application>pg_rewind</> requires that the <varname>wal_log_hints</>
+option is enabled in <filename>postgresql.conf</>, or that data checksums
+were enabled when the cluster was initialized with <application>initdb</>.
+<varname>full_page_writes</> must also be enabled.
+</para>
+
+<refsect2>
+<title>How it works</title>
+
+<para>
+The basic idea is to copy everything from the new cluster to the old
+cluster, except for the blocks that we know to be the same.
+</para>
+
+<procedure>
+<step>
+<para>
+Scan the WAL of the old cluster, starting from the last checkpoint
+before the point where the new cluster's timeline history forked off
+from the old cluster. For each WAL record, make a note of the data
+blocks that were touched. This yields a list of all the data blocks
+that were changed in the old cluster, after the new cluster forked off.
+</para>
+</step>
+<step>
+<para>
+Copy all those changed blocks from the new cluster to the old cluster.
+</para>
+</step>
+<step>
+<para>
+Copy all other files, such as clog and configuration files, from the new
+cluster to the old cluster, that is, everything except the relation files.
+</para>
+</step>
+<step>
+<para>
+Apply the WAL from the new cluster, starting from the checkpoint
+created at failover. (Strictly speaking, <application>pg_rewind</>
+doesn't apply the WAL, it just creates a backup label file indicating
+that when <productname>PostgreSQL</> is started, it will start replay
+from that checkpoint and apply all the required WAL.)
+</para>
+</step>
+</procedure>
+</refsect2>
+</refsect1>
+
+</refentry>
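
The first two steps of the "How it works" procedure in the file above can be modeled with a small toy simulation. This is only a sketch: the names WalRecord, changed_blocks and rewind_blocks are hypothetical, blocks are modeled as (file name, block number) keys in a dictionary, and both clusters are assumed to contain the same set of blocks. It does not reflect how pg_rewind itself is implemented.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

# Toy model (assumption): a block is identified by (relation file, block number).
Block = Tuple[str, int]


@dataclass
class WalRecord:
    lsn: int              # position of the record in the old cluster's WAL
    touched: List[Block]  # data blocks this record modified


def changed_blocks(old_wal: List[WalRecord], checkpoint_lsn: int) -> Set[Block]:
    """Step 1: collect blocks touched in the old cluster's WAL, starting at
    the last checkpoint before the timelines diverged."""
    return {
        block
        for record in old_wal
        if record.lsn >= checkpoint_lsn
        for block in record.touched
    }


def rewind_blocks(
    old: Dict[Block, bytes],
    new: Dict[Block, bytes],
    old_wal: List[WalRecord],
    checkpoint_lsn: int,
) -> Dict[Block, bytes]:
    """Step 2: overwrite only the changed blocks with the new cluster's copy.
    Blocks never touched after the checkpoint are left alone and never read."""
    result = dict(old)
    for block in changed_blocks(old_wal, checkpoint_lsn):
        result[block] = new[block]
    return result
```

In this model the cost of the copy depends on how many blocks the old cluster touched after the fork rather than on the size of the cluster, which is the advantage the Description section claims over a new base backup. Changes made on the new cluster after the fork are not copied here; per the last step of the procedure, they are picked up when the target replays the new cluster's WAL at startup.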

doc/src/sgml/reference.sgml

Lines changed: 1 addition & 0 deletions
@@ -260,6 +260,7 @@
 &pgControldata;
 &pgCtl;
 &pgResetxlog;
+&pgRewind;
 &postgres;
 &postmaster;


src/bin/Makefile

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@ SUBDIRS = \
 	pg_ctl \
 	pg_dump \
 	pg_resetxlog \
+	pg_rewind \
 	psql \
 	scripts


src/bin/pg_rewind/.gitignore

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# Files generated during build
+/xlogreader.c
+/pg_rewind
+
+# Generated by test suite
+/tmp_check/
+/regress_log/

src/bin/pg_rewind/Makefile

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/bin/pg_rewind
+#
+# Portions Copyright (c) 2013-2015, PostgreSQL Global Development Group
+#
+# src/bin/pg_rewind/Makefile
+#
+#-------------------------------------------------------------------------
+
+PGFILEDESC = "pg_rewind - repurpose an old master server as standby"
+PGAPPICON = win32
+
+subdir = src/bin/pg_rewind
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+PG_CPPFLAGS = -I$(libpq_srcdir)
+PG_LIBS = $(libpq_pgport)
+
+override CPPFLAGS := -I$(libpq_srcdir) -DFRONTEND $(CPPFLAGS)
+
+OBJS = pg_rewind.o parsexlog.o xlogreader.o datapagemap.o timeline.o \
+	fetch.o file_ops.o copy_fetch.o libpq_fetch.o filemap.o logging.o \
+	$(WIN32RES)
+
+EXTRA_CLEAN = $(RMGRDESCSOURCES) xlogreader.c
+
+all: pg_rewind
+
+pg_rewind: $(OBJS) | submake-libpq submake-libpgport
+	$(CC) $(CFLAGS) $^ $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
+
+xlogreader.c: % : $(top_srcdir)/src/backend/access/transam/%
+	rm -f $@ && $(LN_S) $< .
+
+install: all installdirs
+	$(INSTALL_PROGRAM) pg_rewind$(X) '$(DESTDIR)$(bindir)/pg_rewind$(X)'
+
+installdirs:
+	$(MKDIR_P) '$(DESTDIR)$(bindir)'
+
+uninstall:
+	rm -f '$(DESTDIR)$(bindir)/pg_rewind$(X)'
+
+clean distclean maintainer-clean:
+	rm -f pg_rewind$(X) $(OBJS) xlogreader.c
+	rm -rf tmp_check
+
+check: all
+	$(prove_check) :: local
+	$(prove_check) :: remote
