Commit add6c31

Make the streaming replication protocol messages architecture-independent.
We used to send structs wrapped in CopyData messages, which works as long as the client and server agree on things like endianness, timestamp format and alignment. That's good enough for running a standby server, which has to run on the same platform anyway, but it's useful for tools like pg_receivexlog to work across platforms. This breaks protocol compatibility of streaming replication, but we never promised that to be compatible across versions, anyway.
1 parent ed5699d commit add6c31
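
To make the portability problem in the commit message concrete, here is a minimal Python sketch. It is not code from this commit, and the field values are invented for illustration. It packs the same three 64-bit header fields in little-endian and big-endian layouts, as two different hosts would when sending raw structs, and then in an explicitly fixed network byte order.

```python
import struct

# Hypothetical header values, for illustration only.
start_lsn = 0x0000000116B3D478
end_lsn = 0x0000000116B3D600
send_time_us = 400_000_000_000_000  # microseconds since 2000-01-01

# The same fields as a little-endian and a big-endian host would lay them
# out when writing a raw struct to the socket.
little = struct.pack("<QQq", start_lsn, end_lsn, send_time_us)
big = struct.pack(">QQq", start_lsn, end_lsn, send_time_us)

# A receiver that assumes the wrong byte order decodes a different number
# without any error being raised.
misread = struct.unpack("<QQq", big)[0]

# Declaring the wire format as network byte order ("!" in the struct module)
# gives both sides the same bytes regardless of host architecture.
wire = struct.pack("!QQq", start_lsn, end_lsn, send_time_us)
decoded = struct.unpack("!QQq", wire)
```

Here `wire` is byte-for-byte identical to `big`, because network byte order is big-endian, and `decoded` matches the original values on any host.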

7 files changed

+391
-374
lines changed

doc/src/sgml/protocol.sgml

Lines changed: 70 additions & 63 deletions
@@ -1359,14 +1359,18 @@ The commands accepted in walsender mode are:
 has already been recycled. On success, server responds with a
 CopyBothResponse message, and then starts to stream WAL to the frontend.
 WAL will continue to be streamed until the connection is broken;
-no further commands will be accepted.
+no further commands will be accepted. If the WAL sender process is
+terminated normally (during postmaster shutdown), it will send a
+CommandComplete message before exiting. This might not happen during an
+abnormal shutdown, of course.
 </para>

 <para>
 WAL data is sent as a series of CopyData messages. (This allows
 other information to be intermixed; in particular the server can send
 an ErrorResponse message if it encounters a failure after beginning
-to stream.) The payload in each CopyData message follows this format:
+to stream.) The payload of each CopyData message from server to the
+client contains a message of one of the following formats:
 </para>

 <para>
@@ -1390,34 +1394,32 @@ The commands accepted in walsender mode are:
 </varlistentry>
 <varlistentry>
 <term>
-Byte8
+Int64
 </term>
 <listitem>
 <para>
-The starting point of the WAL data in this message, given in
-XLogRecPtr format.
+The starting point of the WAL data in this message.
 </para>
 </listitem>
 </varlistentry>
 <varlistentry>
 <term>
-Byte8
+Int64
 </term>
 <listitem>
 <para>
-The current end of WAL on the server, given in
-XLogRecPtr format.
+The current end of WAL on the server.
 </para>
 </listitem>
 </varlistentry>
 <varlistentry>
 <term>
-Byte8
+Int64
 </term>
 <listitem>
 <para>
-The server's system clock at the time of transmission,
-given in TimestampTz format.
+The server's system clock at the time of transmission, as
+microseconds since midnight on 2000-01-01.
 </para>
 </listitem>
 </varlistentry>
@@ -1429,42 +1431,19 @@ The commands accepted in walsender mode are:
 <para>
 A section of the WAL data stream.
 </para>
+<para>
+A single WAL record is never split across two XLogData messages.
+When a WAL record crosses a WAL page boundary, and is therefore
+already split using continuation records, it can be split at the page
+boundary. In other words, the first main WAL record and its
+continuation records can be sent in different XLogData messages.
+</para>
 </listitem>
 </varlistentry>
 </variablelist>
 </para>
 </listitem>
 </varlistentry>
-</variablelist>
-</para>
-<para>
-A single WAL record is never split across two CopyData messages.
-When a WAL record crosses a WAL page boundary, and is therefore
-already split using continuation records, it can be split at the page
-boundary. In other words, the first main WAL record and its
-continuation records can be sent in different CopyData messages.
-</para>
-<para>
-Note that all fields within the WAL data and the above-described header
-will be in the sending server's native format. Endianness, and the
-format for the timestamp, are unpredictable unless the receiver has
-verified that the sender's system identifier matches its own
-<filename>pg_control</> contents.
-</para>
-<para>
-If the WAL sender process is terminated normally (during postmaster
-shutdown), it will send a CommandComplete message before exiting.
-This might not happen during an abnormal shutdown, of course.
-</para>
-
-<para>
-The receiving process can send replies back to the sender at any time,
-using one of the following message formats (also in the payload of a
-CopyData message):
-</para>
-
-<para>
-<variablelist>
 <varlistentry>
 <term>
 Primary keepalive message (B)
@@ -1484,23 +1463,33 @@ The commands accepted in walsender mode are:
 </varlistentry>
 <varlistentry>
 <term>
-Byte8
+Int64
 </term>
 <listitem>
 <para>
-The current end of WAL on the server, given in
-XLogRecPtr format.
+The current end of WAL on the server.
 </para>
 </listitem>
 </varlistentry>
 <varlistentry>
 <term>
-Byte8
+Int64
 </term>
 <listitem>
 <para>
-The server's system clock at the time of transmission,
-given in TimestampTz format.
+The server's system clock at the time of transmission, as
+microseconds since midnight on 2000-01-01.
+</para>
+</listitem>
+</varlistentry>
+<varlistentry>
+<term>
+Byte1
+</term>
+<listitem>
+<para>
+1 means that the client should reply to this message as soon as
+possible, to avoid a timeout disconnect. 0 otherwise.
 </para>
 </listitem>
 </varlistentry>
@@ -1511,6 +1500,12 @@ The commands accepted in walsender mode are:
 </variablelist>
 </para>

+<para>
+The receiving process can send replies back to the sender at any time,
+using one of the following message formats (also in the payload of a
+CopyData message):
+</para>
+
 <para>
 <variablelist>
 <varlistentry>
@@ -1532,45 +1527,56 @@ The commands accepted in walsender mode are:
 </varlistentry>
 <varlistentry>
 <term>
-Byte8
+Int64
 </term>
 <listitem>
 <para>
 The location of the last WAL byte + 1 received and written to disk
-in the standby, in XLogRecPtr format.
+in the standby.
 </para>
 </listitem>
 </varlistentry>
 <varlistentry>
 <term>
-Byte8
+Int64
 </term>
 <listitem>
 <para>
 The location of the last WAL byte + 1 flushed to disk in
-the standby, in XLogRecPtr format.
+the standby.
+</para>
+</listitem>
+</varlistentry>
+<varlistentry>
+<term>
+Int64
+</term>
+<listitem>
+<para>
+The location of the last WAL byte + 1 applied in the standby.
 </para>
 </listitem>
 </varlistentry>
 <varlistentry>
 <term>
-Byte8
+Int64
 </term>
 <listitem>
 <para>
-The location of the last WAL byte + 1 applied in the standby, in
-XLogRecPtr format.
+The client's system clock at the time of transmission, as
+microseconds since midnight on 2000-01-01.
 </para>
 </listitem>
 </varlistentry>
 <varlistentry>
 <term>
-Byte8
+Byte1
 </term>
 <listitem>
 <para>
-The server's system clock at the time of transmission,
-given in TimestampTz format.
+If 1, the client requests the server to reply to this message
+immediately. This can be used to ping the server, to test if
+the connection is still healthy.
 </para>
 </listitem>
 </varlistentry>
@@ -1602,28 +1608,29 @@ The commands accepted in walsender mode are:
 </varlistentry>
 <varlistentry>
 <term>
-Byte8
+Int64
 </term>
 <listitem>
 <para>
-The server's system clock at the time of transmission,
-given in TimestampTz format.
+The client's system clock at the time of transmission, as
+microseconds since midnight on 2000-01-01.
 </para>
 </listitem>
 </varlistentry>
 <varlistentry>
 <term>
-Byte4
+Int32
 </term>
 <listitem>
 <para>
-The standby's current xmin.
+The standby's current xmin. This may be 0, if the standby does not
+support feedback, or is not yet in Hot Standby state.
 </para>
 </listitem>
 </varlistentry>
 <varlistentry>
 <term>
-Byte4
+Int32
 </term>
 <listitem>
 <para>
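
The message layouts documented in this diff can be exercised with a short client-side sketch. This is an illustration in Python, not code from the commit. It rests on assumptions that this excerpt does not show: the leading type bytes `w` (XLogData), `k` (primary keepalive) and `r` (standby status update) are taken from the PostgreSQL protocol documentation, the epoch is assumed to be midnight UTC, and IntN fields are assumed to be in network byte order as elsewhere in the protocol. All function names are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

# Timestamps count microseconds from 2000-01-01 (assumed here to be UTC).
PG_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)


def pg_time_to_datetime(us):
    """Convert protocol microseconds to an aware datetime."""
    return PG_EPOCH + timedelta(microseconds=us)


def datetime_to_pg_time(dt):
    """Convert an aware datetime to protocol microseconds."""
    return (dt - PG_EPOCH) // timedelta(microseconds=1)


def parse_server_message(payload):
    """Decode the payload of one CopyData message sent by the server."""
    kind = payload[:1]
    if kind == b"w":  # XLogData: Int64 start, Int64 end, Int64 clock, data
        start, end, sent = struct.unpack_from("!QQq", payload, 1)
        return {"type": "xlogdata", "start": start, "end": end,
                "sent": pg_time_to_datetime(sent), "data": payload[25:]}
    if kind == b"k":  # keepalive: Int64 end, Int64 clock, Byte1 reply flag
        end, sent, reply = struct.unpack_from("!QqB", payload, 1)
        return {"type": "keepalive", "end": end,
                "sent": pg_time_to_datetime(sent),
                "reply_requested": reply == 1}
    raise ValueError("unexpected message type: %r" % kind)


def build_standby_status(write, flush, apply, now, reply_requested=False):
    """Encode a standby status update: three Int64 WAL positions,
    the client's clock as Int64, and a Byte1 reply-request flag."""
    return struct.pack("!cQQQqB", b"r", write, flush, apply,
                       datetime_to_pg_time(now), int(reply_requested))
```

A client loop would typically call `parse_server_message` on each CopyData payload and answer a keepalive whose `reply_requested` is true with `build_standby_status(...)`, again wrapped in a CopyData message.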
