Commit 4d41799

Add eager and lazy freezing strategies to VACUUM.

Eager freezing strategy avoids large build-ups of all-visible pages. It makes VACUUM trigger page-level freezing whenever doing so will enable the page to become all-frozen in the visibility map. This is useful for tables that experience continual growth, particularly strict append-only tables such as pgbench's history table. Eager freezing significantly improves performance stability by spreading out the cost of freezing over time, rather than doing most freezing during aggressive VACUUMs. It complements the insert autovacuum mechanism added by commit b07642d.

VACUUM determines its freezing strategy based on the value of the new vacuum_freeze_strategy_threshold GUC (or reloption) with logged tables. Tables that exceed the size threshold use the eager freezing strategy.

Unlogged tables and temp tables always use eager freezing strategy, since the added cost is negligible there. Non-permanent relations won't incur any extra overhead in WAL written (for the obvious reason), nor in pages dirtied (since any extra freezing will only take place on pages whose PD_ALL_VISIBLE bit needed to be set either way).

VACUUM uses lazy freezing strategy for logged tables that fall under the GUC size threshold. Page-level freezing triggers based on the criteria established in commit 1de58df, which added basic page-level freezing.

Eager freezing is strictly more aggressive than lazy freezing. Settings like vacuum_freeze_min_age still get applied in just the same way in every VACUUM, independent of the strategy in use. The only mechanical difference between eager and lazy freezing strategies is that only the former applies its own additional criteria to trigger freezing pages. Note that even lazy freezing strategy will trigger freezing whenever a page happens to have required that an FPI be written during pruning, provided that the page will thereby become all-frozen in the visibility map afterwards (due to the FPI optimization from commit 1de58df).

The vacuum_freeze_strategy_threshold default setting is 4GB. This is a relatively low setting that prioritizes performance stability. It will be reviewed at the end of the Postgres 16 beta period.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com

1 parent 642e882 commit 4d41799

12 files changed: 197 additions, 14 deletions

doc/src/sgml/config.sgml

Lines changed: 30 additions & 0 deletions
@@ -9272,6 +9272,36 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-vacuum-freeze-strategy-threshold" xreflabel="vacuum_freeze_strategy_threshold">
+      <term><varname>vacuum_freeze_strategy_threshold</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>vacuum_freeze_strategy_threshold</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies the cutoff storage size that
+        <command>VACUUM</command> should use to determine its freezing
+        strategy. This is applied by comparing it to the size of the
+        target table's <glossterm linkend="glossary-fork">main
+        fork</glossterm> at the beginning of each <command>VACUUM</command>.
+        Eager freezing strategy is used by <command>VACUUM</command>
+        when the table's main fork size exceeds this value.
+        <command>VACUUM</command> <emphasis>always</emphasis> uses
+        eager freezing strategy when processing <glossterm
+        linkend="glossary-unlogged">unlogged</glossterm> tables,
+        regardless of this setting. Otherwise <command>VACUUM</command>
+        uses lazy freezing strategy. For more information see <xref
+        linkend="vacuum-for-wraparound"/>.
+       </para>
+       <para>
+        If this value is specified without units, it is taken as
+        megabytes. The default is four gigabytes
+        (<literal>4GB</literal>).
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-vacuum-failsafe-age" xreflabel="vacuum_failsafe_age">
       <term><varname>vacuum_failsafe_age</varname> (<type>integer</type>)
       <indexterm>
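
The rule documented in this hunk can be sketched as a small standalone C function. This is an illustration only, not code from the patch: `TableInfo` and `use_eager_freezing` are hypothetical names, and the comparison is done in bytes for readability, whereas the patch itself compares page counts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the table properties the documentation mentions. */
typedef struct TableInfo
{
    uint64_t main_fork_bytes;   /* main fork size at the start of VACUUM */
    bool     is_permanent;      /* false for unlogged and temp tables */
} TableInfo;

/*
 * Returns true when the eager freezing strategy would be chosen.
 * threshold_mb plays the role of vacuum_freeze_strategy_threshold,
 * which is taken as megabytes when given without units.
 */
static bool
use_eager_freezing(const TableInfo *tab, uint32_t threshold_mb)
{
    uint64_t threshold_bytes = (uint64_t) threshold_mb * 1024 * 1024;

    /* Non-permanent tables freeze eagerly regardless of the setting. */
    if (!tab->is_permanent)
        return true;

    /* Logged tables freeze eagerly only when they exceed the cutoff. */
    return tab->main_fork_bytes > threshold_bytes;
}
```

With the default of 4096 (4GB), a logged table of exactly 4GB would still use the lazy strategy under this sketch, because the documented test is "exceeds", not "reaches".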

doc/src/sgml/maintenance.sgml

Lines changed: 38 additions & 12 deletions
@@ -478,13 +478,30 @@
     </note>
 
    <para>
-    <xref linkend="guc-vacuum-freeze-min-age"/>
-    controls how old an XID value has to be before rows bearing that XID will be
-    frozen. Increasing this setting may avoid unnecessary work if the
-    rows that would otherwise be frozen will soon be modified again,
-    but decreasing this setting increases
-    the number of transactions that can elapse before the table must be
-    vacuumed again.
+    <xref linkend="guc-vacuum-freeze-strategy-threshold"/> controls
+    <command>VACUUM</command>'s freezing strategy. The
+    <firstterm>eager freezing strategy</firstterm> makes
+    <command>VACUUM</command> freeze all rows on a page whenever each
+    and every row on the page is considered visible to all current
+    transactions (immediately after dead row versions are removed).
+    Freezing pages early and in batch often spreads out the overhead
+    of freezing over time. <command>VACUUM</command> consistently
+    avoids allowing unfrozen all-visible pages to build up, improving
+    system level performance stability. The <firstterm>lazy freezing
+    strategy</firstterm> makes <command>VACUUM</command> determine
+    whether pages should be frozen on the basis of the age of the
+    oldest XID on the page. Freezing pages lazily sometimes avoids
+    the overhead of freezing that turns out to have been unnecessary
+    because the rows were modified soon after freezing took place.
+   </para>
+
+   <para>
+    <xref linkend="guc-vacuum-freeze-min-age"/> controls how old an
+    XID value has to be before pages with rows bearing that XID are
+    frozen. This setting is an additional trigger criteria for
+    freezing a page's tuples. It is used by both freezing strategies,
+    though it typically has little impact when <command>VACUUM</command>
+    uses the eager freezing strategy.
    </para>
 
    <para>
@@ -506,12 +523,21 @@
     always use its aggressive strategy.
    </para>
 
+   <para>
+    Controlling the overhead of freezing existing all-visible pages
+    during aggressive vacuuming is the goal of the eager freezing
+    strategy. Increasing <varname>vacuum_freeze_strategy_threshold</varname>
+    may avoid unnecessary work, but it increases the risk of an
+    eventual aggressive vacuum that performs an excessive amount of
+    <quote>catch up</quote> freezing all at once.
+   </para>
+
    <para>
     The maximum time that a table can go unvacuumed is two billion
     transactions minus the <varname>vacuum_freeze_min_age</varname> value at
     the time of the last aggressive vacuum. If it were to go
-    unvacuumed for longer than
-    that, data loss could result. To ensure that this does not happen,
+    unvacuumed for longer than that, the system could temporarily refuse to
+    allocate new transaction IDs. To ensure that this never happens,
     autovacuum is invoked on any table that might contain unfrozen rows with
     XIDs older than the age specified by the configuration parameter <xref
     linkend="guc-autovacuum-freeze-max-age"/>. (This will happen even if
@@ -551,7 +577,7 @@
    </para>
 
    <para>
-    The sole disadvantage of increasing <varname>autovacuum_freeze_max_age</varname>
+    One disadvantage of increasing <varname>autovacuum_freeze_max_age</varname>
     (and <varname>vacuum_freeze_table_age</varname> along with it) is that
     the <filename>pg_xact</filename> and <filename>pg_commit_ts</filename>
     subdirectories of the database cluster will take more space, because it
@@ -837,8 +863,8 @@ vacuum insert threshold = vacuum base insert threshold + vacuum insert scale fac
     For tables which receive <command>INSERT</command> operations but no or
     almost no <command>UPDATE</command>/<command>DELETE</command> operations,
     it may be beneficial to lower the table's
-    <xref linkend="reloption-autovacuum-freeze-min-age"/> as this may allow
-    tuples to be frozen by earlier vacuums. The number of obsolete tuples and
+    <xref linkend="reloption-autovacuum-freeze-strategy-threshold"/>
+    to allow freezing to take place proactively. The number of obsolete tuples and
     the number of inserted tuples are obtained from the cumulative statistics system;
     it is a semi-accurate count updated by each <command>UPDATE</command>,
     <command>DELETE</command> and <command>INSERT</command> operation. (It is

doc/src/sgml/ref/create_table.sgml

Lines changed: 14 additions & 0 deletions
@@ -1781,6 +1781,20 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
     </listitem>
    </varlistentry>
 
+   <varlistentry id="reloption-autovacuum-freeze-strategy-threshold" xreflabel="autovacuum_freeze_strategy_threshold">
+    <term><literal>autovacuum_freeze_strategy_threshold</literal>, <literal>toast.autovacuum_freeze_strategy_threshold</literal> (<type>integer</type>)
+    <indexterm>
+     <primary><varname>autovacuum_freeze_strategy_threshold</varname> storage parameter</primary>
+    </indexterm>
+    </term>
+    <listitem>
+     <para>
+      Per-table value for <xref linkend="guc-vacuum-freeze-strategy-threshold"/>
+      parameter.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry id="reloption-log-autovacuum-min-duration" xreflabel="log_autovacuum_min_duration">
     <term><literal>log_autovacuum_min_duration</literal>, <literal>toast.log_autovacuum_min_duration</literal> (<type>integer</type>)
     <indexterm>

src/backend/access/common/reloptions.c

Lines changed: 10 additions & 0 deletions
@@ -312,6 +312,14 @@ static relopt_int intRelOpts[] =
 			ShareUpdateExclusiveLock
 		}, -1, 0, 2000000000
 	},
+	{
+		{
+			"autovacuum_freeze_strategy_threshold",
+			"Table size at which VACUUM freezes using eager strategy, in megabytes.",
+			RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
+			ShareUpdateExclusiveLock
+		}, -1, 0, MAX_KILOBYTES
+	},
 	{
 		{
 			"log_autovacuum_min_duration",
@@ -1863,6 +1871,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, multixact_freeze_max_age)},
 		{"autovacuum_multixact_freeze_table_age", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, multixact_freeze_table_age)},
+		{"autovacuum_freeze_strategy_threshold", RELOPT_TYPE_INT,
+		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, freeze_strategy_threshold)},
 		{"log_autovacuum_min_duration", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, log_min_duration)},
 		{"toast_tuple_target", RELOPT_TYPE_INT,

src/backend/access/heap/heapam.c

Lines changed: 1 addition & 0 deletions
@@ -7057,6 +7057,7 @@ heap_freeze_tuple(HeapTupleHeader tuple,
 	cutoffs.OldestMxact = MultiXactCutoff;
 	cutoffs.FreezeLimit = FreezeLimit;
 	cutoffs.MultiXactCutoff = MultiXactCutoff;
+	cutoffs.freeze_strategy_threshold_pages = 0;
 
 	pagefrz.freeze_required = true;
 	pagefrz.FreezePageRelfrozenXid = FreezeLimit;

src/backend/access/heap/vacuumlazy.c

Lines changed: 42 additions & 1 deletion
@@ -153,6 +153,8 @@ typedef struct LVRelState
 	bool		aggressive;
 	/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
 	bool		skipwithvm;
+	/* Eagerly freeze pages that are eligible to become all-frozen? */
+	bool		eager_freeze_strategy;
 	/* Wraparound failsafe has been triggered? */
 	bool		failsafe_active;
 	/* Consider index vacuuming bypass optimization? */
@@ -243,6 +245,7 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
+static void lazy_scan_strategy(LVRelState *vacrel);
 static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
 								  BlockNumber next_block,
 								  bool *next_unskippable_allvis,
@@ -472,6 +475,10 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 
 	vacrel->skipwithvm = skipwithvm;
 
+	/*
+	 * Now determine VACUUM's freezing strategy
+	 */
+	lazy_scan_strategy(vacrel);
 	if (verbose)
 	{
 		if (vacrel->aggressive)
@@ -1267,6 +1274,38 @@ lazy_scan_heap(LVRelState *vacrel)
 		lazy_cleanup_all_indexes(vacrel);
 }
 
+/*
+ *	lazy_scan_strategy() -- Determine freezing strategy.
+ *
+ * Our lazy freezing strategy is useful when putting off the work of freezing
+ * totally avoids freezing that turns out to have been wasted effort later on.
+ * Our eager freezing strategy is useful with larger tables that experience
+ * continual growth, where freezing pages proactively is needed just to avoid
+ * falling behind on freezing (eagerness is also likely to be cheaper in the
+ * short/medium term for such tables, but the long term picture matters most).
+ */
+static void
+lazy_scan_strategy(LVRelState *vacrel)
+{
+	BlockNumber rel_pages = vacrel->rel_pages;
+
+	/*
+	 * Decide freezing strategy.
+	 *
+	 * The eager freezing strategy is used whenever rel_pages exceeds a
+	 * threshold controlled by the freeze_strategy_threshold GUC/reloption.
+	 *
+	 * Also freeze eagerly with an unlogged or temp table, where the total
+	 * cost of freezing pages is mostly just the cycles needed to prepare a
+	 * set of freeze plans. Executing the freeze plans adds very little cost.
+	 * Dirtying extra pages isn't a concern, either; VACUUM will definitely
+	 * set PD_ALL_VISIBLE on affected pages, regardless of freezing strategy.
+	 */
+	vacrel->eager_freeze_strategy =
+		(rel_pages > vacrel->cutoffs.freeze_strategy_threshold_pages ||
+		 !RelationIsPermanent(vacrel->rel));
+}
+
 /*
  *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
  *
@@ -1795,10 +1834,12 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * one XID/MXID from before FreezeLimit/MultiXactCutoff is present. Also
 	 * freeze when pruning generated an FPI, if doing so means that we set the
 	 * page all-frozen afterwards (might not happen until final heap pass).
+	 * When ongoing VACUUM opted to use the eager freezing strategy we freeze
+	 * any page that will thereby become all-frozen in the visibility map.
 	 */
 	if (pagefrz.freeze_required || tuples_frozen == 0 ||
 		(prunestate->all_visible && prunestate->all_frozen &&
-		 fpi_before != pgWalUsage.wal_fpi))
+		 (fpi_before != pgWalUsage.wal_fpi || vacrel->eager_freeze_strategy)))
 	{
 		/*
 		 * We're freezing the page. Our final NewRelfrozenXid doesn't need to
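
The changed condition in `lazy_scan_prune` can be isolated as a pure predicate, which makes the difference between the two strategies easy to check. The sketch below is not part of the patch: `PageFreezeInputs` and `should_freeze_page` are hypothetical names, and `prune_emitted_fpi` stands in for the `fpi_before != pgWalUsage.wal_fpi` comparison.

```c
#include <stdbool.h>

/* Hypothetical bundle of the inputs used by the condition in the hunk above. */
typedef struct PageFreezeInputs
{
    bool freeze_required;        /* an XID/MXID from before the cutoffs is present */
    int  tuples_frozen;          /* number of freeze plans prepared for the page */
    bool all_visible;            /* page is all-visible after pruning */
    bool all_frozen;             /* page would be all-frozen once plans are applied */
    bool prune_emitted_fpi;      /* pruning wrote a full-page image */
    bool eager_freeze_strategy;  /* strategy chosen for this VACUUM */
} PageFreezeInputs;

/* Mirrors the boolean structure of the patched if-condition. */
static bool
should_freeze_page(const PageFreezeInputs *in)
{
    /* Mandatory freezing, or no freeze plans at all: take the branch. */
    if (in->freeze_required || in->tuples_frozen == 0)
        return true;

    /* Optional freezing only pays off if the page ends up all-frozen. */
    return in->all_visible && in->all_frozen &&
        (in->prune_emitted_fpi || in->eager_freeze_strategy);
}
```

Under this sketch, the only inputs for which the eager strategy changes the outcome are pages that would become all-frozen, have freeze plans pending, need no mandatory freezing, and did not generate an FPI during pruning.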

src/backend/commands/vacuum.c

Lines changed: 24 additions & 1 deletion
@@ -68,6 +68,7 @@ int vacuum_freeze_min_age;
 int			vacuum_freeze_table_age;
 int			vacuum_multixact_freeze_min_age;
 int			vacuum_multixact_freeze_table_age;
+int			vacuum_freeze_strategy_threshold;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
 
@@ -264,13 +265,15 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		params.freeze_table_age = 0;
 		params.multixact_freeze_min_age = 0;
 		params.multixact_freeze_table_age = 0;
+		params.freeze_strategy_threshold = 0;
 	}
 	else
 	{
 		params.freeze_min_age = -1;
 		params.freeze_table_age = -1;
 		params.multixact_freeze_min_age = -1;
 		params.multixact_freeze_table_age = -1;
+		params.freeze_strategy_threshold = -1;
 	}
 
 	/* user-invoked vacuum is never "for wraparound" */
@@ -962,7 +965,9 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
 				multixact_freeze_min_age,
 				freeze_table_age,
 				multixact_freeze_table_age,
-				effective_multixact_freeze_max_age;
+				effective_multixact_freeze_max_age,
+				freeze_strategy_threshold;
+	uint64		threshold_strategy_pages;
 	TransactionId nextXID,
 				safeOldestXmin,
 				aggressiveXIDCutoff;
@@ -975,6 +980,7 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
 	multixact_freeze_min_age = params->multixact_freeze_min_age;
 	freeze_table_age = params->freeze_table_age;
 	multixact_freeze_table_age = params->multixact_freeze_table_age;
+	freeze_strategy_threshold = params->freeze_strategy_threshold;
 
 	/* Set pg_class fields in cutoffs */
 	cutoffs->relfrozenxid = rel->rd_rel->relfrozenxid;
@@ -1089,6 +1095,23 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
 	if (MultiXactIdPrecedes(cutoffs->OldestMxact, cutoffs->MultiXactCutoff))
 		cutoffs->MultiXactCutoff = cutoffs->OldestMxact;
 
+	/*
+	 * Determine the freeze_strategy_threshold to use: as specified by the
+	 * caller, or vacuum_freeze_strategy_threshold
+	 */
+	if (freeze_strategy_threshold < 0)
+		freeze_strategy_threshold = vacuum_freeze_strategy_threshold;
+	Assert(freeze_strategy_threshold >= 0);
+
+	/*
+	 * Convert MB-based freeze_strategy_threshold to page-based value used by
+	 * our vacuumlazy.c caller, while being careful to avoid overflow
+	 */
+	threshold_strategy_pages =
+		((uint64) freeze_strategy_threshold * 1024 * 1024) / BLCKSZ;
+	threshold_strategy_pages = Min(threshold_strategy_pages, MaxBlockNumber);
+	cutoffs->freeze_strategy_threshold_pages = threshold_strategy_pages;
+
 	/*
 	 * Finally, figure out if caller needs to do an aggressive VACUUM or not.
 	 *
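
The megabyte-to-page conversion in the last hunk can be tried outside the server with a few lines of C. This is a sketch under stated assumptions, not the patch's code: `threshold_mb_to_pages` is a hypothetical helper, `SKETCH_BLCKSZ` assumes the default 8kB block size (BLCKSZ is a build-time setting), and `SKETCH_MAX_BLOCK_NUMBER` hard-codes the value of PostgreSQL's `MaxBlockNumber`.

```c
#include <stdint.h>

#define SKETCH_BLCKSZ           8192u        /* assumption: default block size */
#define SKETCH_MAX_BLOCK_NUMBER 0xFFFFFFFEu  /* value of MaxBlockNumber */

/*
 * Convert a non-negative threshold in megabytes to a page count.
 * Widening to 64 bits before multiplying avoids overflowing a 32-bit int,
 * and the result is clamped so that it always fits in a block number.
 */
static uint32_t
threshold_mb_to_pages(int threshold_mb)
{
    uint64_t pages = ((uint64_t) threshold_mb * 1024 * 1024) / SKETCH_BLCKSZ;

    if (pages > SKETCH_MAX_BLOCK_NUMBER)
        pages = SKETCH_MAX_BLOCK_NUMBER;

    return (uint32_t) pages;
}
```

With these assumptions the 4096 MB default corresponds to 524288 pages. The caller must pass a value that is zero or greater, which is what the `Assert` in the hunk guarantees for the real code.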

src/backend/postmaster/autovacuum.c

Lines changed: 10 additions & 0 deletions
@@ -151,6 +151,7 @@ static int default_freeze_min_age;
 static int	default_freeze_table_age;
 static int	default_multixact_freeze_min_age;
 static int	default_multixact_freeze_table_age;
+static int	default_freeze_strategy_threshold;
 
 /* Memory context for long-lived data */
 static MemoryContext AutovacMemCxt;
@@ -2010,13 +2011,15 @@ do_autovacuum(void)
 		default_freeze_table_age = 0;
 		default_multixact_freeze_min_age = 0;
 		default_multixact_freeze_table_age = 0;
+		default_freeze_strategy_threshold = 0;
 	}
 	else
 	{
 		default_freeze_min_age = vacuum_freeze_min_age;
 		default_freeze_table_age = vacuum_freeze_table_age;
 		default_multixact_freeze_min_age = vacuum_multixact_freeze_min_age;
 		default_multixact_freeze_table_age = vacuum_multixact_freeze_table_age;
+		default_freeze_strategy_threshold = vacuum_freeze_strategy_threshold;
 	}
 
 	ReleaseSysCache(tuple);
@@ -2801,6 +2804,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
+		int			freeze_strategy_threshold;
 		int			vac_cost_limit;
 		double		vac_cost_delay;
 		int			log_min_duration;
@@ -2850,6 +2854,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			? avopts->multixact_freeze_table_age
 			: default_multixact_freeze_table_age;
 
+		freeze_strategy_threshold = (avopts &&
+									 avopts->freeze_strategy_threshold >= 0)
+			? avopts->freeze_strategy_threshold
+			: default_freeze_strategy_threshold;
+
 		tab = palloc(sizeof(autovac_table));
 		tab->at_relid = relid;
 		tab->at_sharedrel = classForm->relisshared;
@@ -2877,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
+		tab->at_params.freeze_strategy_threshold = freeze_strategy_threshold;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
 		tab->at_vacuum_cost_limit = vac_cost_limit;
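
Taken together with the vacuum.c hunks, the threshold is resolved in two steps: autovacuum prefers a per-table reloption over its per-database default, and `vacuum_get_cutoffs` later substitutes the GUC for any value that is still negative. A minimal sketch of that fallback chain follows; the function names are hypothetical, and -1 is used as the "not set" marker as in the diff.

```c
/*
 * Step 1 (autovacuum): use the reloption when it is set (>= 0),
 * otherwise fall back to the default computed in do_autovacuum().
 */
static int
autovacuum_pick_threshold(int reloption_mb, int default_mb)
{
    return (reloption_mb >= 0) ? reloption_mb : default_mb;
}

/*
 * Step 2 (vacuum_get_cutoffs): a negative parameter means the caller
 * did not specify a value, so the GUC setting applies instead.
 */
static int
effective_threshold(int param_mb, int guc_mb)
{
    return (param_mb < 0) ? guc_mb : param_mb;
}
```

A threshold of 0 is a legitimate explicit value in both steps: it is what the diff assigns on the paths that zero out the other freeze parameters, and it makes any non-empty table exceed the cutoff.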

src/backend/utils/misc/guc_tables.c

Lines changed: 14 additions & 0 deletions
@@ -2535,6 +2535,20 @@ struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"vacuum_freeze_strategy_threshold", PGC_USERSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Table size at which VACUUM freezes using eager strategy, in megabytes."),
+			gettext_noop("This is applied by comparing it to the size of a table's main fork at "
+						 "the beginning of each VACUUM. Eager freezing strategy is used when size "
+						 "exceeds the threshold, or when table is a temporary or unlogged table. "
+						 "Otherwise lazy freezing strategy is used."),
+			GUC_UNIT_MB
+		},
+		&vacuum_freeze_strategy_threshold,
+		4096, 0, MAX_KILOBYTES,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"vacuum_defer_cleanup_age", PGC_SIGHUP, REPLICATION_PRIMARY,
 			gettext_noop("Number of transactions by which VACUUM and HOT cleanup should be deferred, if any."),
