Commit c2dc193

Revert recovery prefetching feature.

This set of commits has some bugs with known fixes, but at this late
stage in the release cycle it seems best to revert and resubmit next
time, along with some new automated test coverage for this whole area.

Commits reverted:

dc88460: Doc: Review for "Optionally prefetch referenced data in recovery."
1d25757: Optionally prefetch referenced data in recovery.
f003d9f: Add circular WAL decoding buffer.
323cbe7: Remove read_page callback from XLogReader.

Remove the new GUC group WAL_RECOVERY recently added by a55a984, as the
corresponding section of config.sgml is now reverted.

Discussion: https://postgr.es/m/CAOuzzgrn7iKnFRsB4MHp3UisEQAGgZMbk_ViTN4HV4-Ksq8zCg%40mail.gmail.com

1 parent 63db0ac commit c2dc193

35 files changed, +815 -3080 lines changed

doc/src/sgml/config.sgml

-83
@@ -3588,89 +3588,6 @@ include_dir 'conf.d'
 </variablelist>
 </sect2>
 
-<sect2 id="runtime-config-wal-recovery">
-
-<title>Recovery</title>
-
-<indexterm>
-<primary>configuration</primary>
-<secondary>of recovery</secondary>
-<tertiary>general settings</tertiary>
-</indexterm>
-
-<para>
-This section describes the settings that apply to recovery in general,
-affecting crash recovery, streaming replication and archive-based
-replication.
-</para>
-
-
-<variablelist>
-<varlistentry id="guc-recovery-prefetch" xreflabel="recovery_prefetch">
-<term><varname>recovery_prefetch</varname> (<type>boolean</type>)
-<indexterm>
-<primary><varname>recovery_prefetch</varname> configuration parameter</primary>
-</indexterm>
-</term>
-<listitem>
-<para>
-Whether to try to prefetch blocks that are referenced in the WAL that
-are not yet in the buffer pool, during recovery. Prefetching blocks
-that will soon be needed can reduce I/O wait times in some workloads.
-See also the <xref linkend="guc-wal-decode-buffer-size"/> and
-<xref linkend="guc-maintenance-io-concurrency"/> settings, which limit
-prefetching activity.
-This setting is disabled by default.
-</para>
-<para>
-This feature currently depends on an effective
-<function>posix_fadvise</function> function, which some
-operating systems lack.
-</para>
-</listitem>
-</varlistentry>
-
-<varlistentry id="guc-recovery-prefetch-fpw" xreflabel="recovery_prefetch_fpw">
-<term><varname>recovery_prefetch_fpw</varname> (<type>boolean</type>)
-<indexterm>
-<primary><varname>recovery_prefetch_fpw</varname> configuration parameter</primary>
-</indexterm>
-</term>
-<listitem>
-<para>
-Whether to prefetch blocks that were logged with full page images,
-during recovery. Often this doesn't help, since such blocks will not
-be read the first time they are needed and might remain in the buffer
-pool after that. However, on file systems with a block size larger
-than
-<productname>PostgreSQL</productname>'s, prefetching can avoid a
-costly read-before-write when blocks are later written.
-The default is off.
-</para>
-</listitem>
-</varlistentry>
-
-<varlistentry id="guc-wal-decode-buffer-size" xreflabel="wal_decode_buffer_size">
-<term><varname>wal_decode_buffer_size</varname> (<type>integer</type>)
-<indexterm>
-<primary><varname>wal_decode_buffer_size</varname> configuration parameter</primary>
-</indexterm>
-</term>
-<listitem>
-<para>
-A limit on how far ahead the server can look in the WAL, to find
-blocks to prefetch. Setting it too high might be counterproductive,
-if it means that data falls out of the
-kernel cache before it is needed. If this value is specified without
-units, it is taken as bytes.
-The default is 512kB.
-</para>
-</listitem>
-</varlistentry>
-
-</variablelist>
-</sect2>
-
 <sect2 id="runtime-config-wal-archive-recovery">
 
 <title>Archive Recovery</title>

doc/src/sgml/monitoring.sgml

+2 -84
@@ -337,13 +337,6 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
 </entry>
 </row>
 
-<row>
-<entry><structname>pg_stat_prefetch_recovery</structname><indexterm><primary>pg_stat_prefetch_recovery</primary></indexterm></entry>
-<entry>Only one row, showing statistics about blocks prefetched during recovery.
-See <xref linkend="pg-stat-prefetch-recovery-view"/> for details.
-</entry>
-</row>
-
 <row>
 <entry><structname>pg_stat_subscription</structname><indexterm><primary>pg_stat_subscription</primary></indexterm></entry>
 <entry>At least one row per subscription, showing information about
@@ -2948,78 +2941,6 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 copy of the subscribed tables.
 </para>
 
-<table id="pg-stat-prefetch-recovery-view" xreflabel="pg_stat_prefetch_recovery">
-<title><structname>pg_stat_prefetch_recovery</structname> View</title>
-<tgroup cols="3">
-<thead>
-<row>
-<entry>Column</entry>
-<entry>Type</entry>
-<entry>Description</entry>
-</row>
-</thead>
-
-<tbody>
-<row>
-<entry><structfield>prefetch</structfield></entry>
-<entry><type>bigint</type></entry>
-<entry>Number of blocks prefetched because they were not in the buffer pool</entry>
-</row>
-<row>
-<entry><structfield>skip_hit</structfield></entry>
-<entry><type>bigint</type></entry>
-<entry>Number of blocks not prefetched because they were already in the buffer pool</entry>
-</row>
-<row>
-<entry><structfield>skip_new</structfield></entry>
-<entry><type>bigint</type></entry>
-<entry>Number of blocks not prefetched because they were new (usually relation extension)</entry>
-</row>
-<row>
-<entry><structfield>skip_fpw</structfield></entry>
-<entry><type>bigint</type></entry>
-<entry>Number of blocks not prefetched because a full page image was included in the WAL and <xref linkend="guc-recovery-prefetch-fpw"/> was set to <literal>off</literal></entry>
-</row>
-<row>
-<entry><structfield>skip_seq</structfield></entry>
-<entry><type>bigint</type></entry>
-<entry>Number of blocks not prefetched because of repeated access</entry>
-</row>
-<row>
-<entry><structfield>distance</structfield></entry>
-<entry><type>integer</type></entry>
-<entry>How far ahead of recovery the prefetcher is currently reading, in bytes</entry>
-</row>
-<row>
-<entry><structfield>queue_depth</structfield></entry>
-<entry><type>integer</type></entry>
-<entry>How many prefetches have been initiated but are not yet known to have completed</entry>
-</row>
-<row>
-<entry><structfield>avg_distance</structfield></entry>
-<entry><type>float4</type></entry>
-<entry>How far ahead of recovery the prefetcher is on average, while recovery is not idle</entry>
-</row>
-<row>
-<entry><structfield>avg_queue_depth</structfield></entry>
-<entry><type>float4</type></entry>
-<entry>Average number of prefetches in flight while recovery is not idle</entry>
-</row>
-</tbody>
-</tgroup>
-</table>
-
-<para>
-The <structname>pg_stat_prefetch_recovery</structname> view will contain only
-one row. It is filled with nulls if recovery is not running or WAL
-prefetching is not enabled. See <xref linkend="guc-recovery-prefetch"/>
-for more information. The counters in this view are reset whenever the
-<xref linkend="guc-recovery-prefetch"/>,
-<xref linkend="guc-recovery-prefetch-fpw"/> or
-<xref linkend="guc-maintenance-io-concurrency"/> setting is changed and
-the server configuration is reloaded.
-</para>
-
 <table id="pg-stat-subscription" xreflabel="pg_stat_subscription">
 <title><structname>pg_stat_subscription</structname> View</title>
 <tgroup cols="1">
@@ -5152,11 +5073,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 all the counters shown in
 the <structname>pg_stat_bgwriter</structname>
 view, <literal>archiver</literal> to reset all the counters shown in
-the <structname>pg_stat_archiver</structname> view,
-<literal>wal</literal> to reset all the counters shown in the
-<structname>pg_stat_wal</structname> view or
-<literal>prefetch_recovery</literal> to reset all the counters shown
-in the <structname>pg_stat_prefetch_recovery</structname> view.
+the <structname>pg_stat_archiver</structname> view or <literal>wal</literal>
+to reset all the counters shown in the <structname>pg_stat_wal</structname> view.
 </para>
 <para>
 This function is restricted to superusers by default, but other users

doc/src/sgml/wal.sgml

-15
@@ -803,21 +803,6 @@
 counted as <literal>wal_write</literal> and <literal>wal_sync</literal>
 in <structname>pg_stat_wal</structname>, respectively.
 </para>
-
-<para>
-The <xref linkend="guc-recovery-prefetch"/> parameter can
-be used to improve I/O performance during recovery by instructing
-<productname>PostgreSQL</productname> to initiate reads
-of disk blocks that will soon be needed but are not currently in
-<productname>PostgreSQL</productname>'s buffer pool.
-The <xref linkend="guc-maintenance-io-concurrency"/> and
-<xref linkend="guc-wal-decode-buffer-size"/> settings limit prefetching
-concurrency and distance, respectively. The
-prefetching mechanism is most likely to be effective on systems
-with <varname>full_page_writes</varname> set to
-<varname>off</varname> (where that is safe), and where the working
-set is larger than RAM. By default, prefetching in recovery is disabled.
-</para>
 </sect1>
 
 <sect1 id="wal-internals">

src/backend/access/transam/Makefile

-1
@@ -31,7 +31,6 @@ OBJS = \
 	xlogarchive.o \
 	xlogfuncs.o \
 	xloginsert.o \
-	xlogprefetch.o \
 	xlogreader.o \
 	xlogutils.o
 

src/backend/access/transam/generic_xlog.c

+3 -3
@@ -482,10 +482,10 @@ generic_redo(XLogReaderState *record)
 	uint8 block_id;
 
 	/* Protect limited size of buffers[] array */
-	Assert(XLogRecMaxBlockId(record) < MAX_GENERIC_XLOG_PAGES);
+	Assert(record->max_block_id < MAX_GENERIC_XLOG_PAGES);
 
 	/* Iterate over blocks */
-	for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
+	for (block_id = 0; block_id <= record->max_block_id; block_id++)
 	{
 		XLogRedoAction action;
 
@@ -525,7 +525,7 @@ generic_redo(XLogReaderState *record)
 	}
 
 	/* Changes are done: unlock and release all buffers */
-	for (block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
+	for (block_id = 0; block_id <= record->max_block_id; block_id++)
 	{
 		if (BufferIsValid(buffers[block_id]))
 			UnlockReleaseBuffer(buffers[block_id]);

src/backend/access/transam/twophase.c

+6 -8
@@ -1330,21 +1330,19 @@ XlogReadTwoPhaseData(XLogRecPtr lsn, char **buf, int *len)
 	char *errormsg;
 	TimeLineID save_currtli = ThisTimeLineID;
 
-	xlogreader = XLogReaderAllocate(wal_segment_size, NULL, wal_segment_close);
-
+	xlogreader = XLogReaderAllocate(wal_segment_size, NULL,
+									XL_ROUTINE(.page_read = &read_local_xlog_page,
+											   .segment_open = &wal_segment_open,
+											   .segment_close = &wal_segment_close),
+									NULL);
 	if (!xlogreader)
 		ereport(ERROR,
 				(errcode(ERRCODE_OUT_OF_MEMORY),
 				 errmsg("out of memory"),
 				 errdetail("Failed while allocating a WAL reading processor.")));
 
 	XLogBeginRead(xlogreader, lsn);
-	while (XLogReadRecord(xlogreader, &record, &errormsg) ==
-		   XLREAD_NEED_DATA)
-	{
-		if (!read_local_xlog_page(xlogreader))
-			break;
-	}
+	record = XLogReadRecord(xlogreader, &errormsg);
 
 	/*
 	 * Restore immediately the timeline where it was previously, as
