Skip to content

Commit 9663d12

Browse files
Don't consider newly inserted tuples in nbtree VACUUM.
Remove the entire idea of "stale stats" within nbtree VACUUM (stop caring about stats involving the number of inserted tuples). Also remove the vacuum_cleanup_index_scale_factor GUC/param on the master branch (though just disable them on postgres 13). The vacuum_cleanup_index_scale_factor/stats interface made the nbtree AM partially responsible for deciding when pg_class.reltuples stats needed to be updated. This seems contrary to the spirit of the index AM API, though -- it is not actually necessary for an index AM's bulk delete and cleanup callbacks to provide accurate stats when it happens to be inconvenient. The core code owns that. (Index AMs have the authority to perform or not perform certain kinds of deferred cleanup based on their own considerations, such as page deletion and recycling, but that has little to do with pg_class.reltuples/num_index_tuples.) This issue was fairly harmless until the introduction of the autovacuum_vacuum_insert_threshold feature by commit b07642d, which had an undesirable interaction with the vacuum_cleanup_index_scale_factor mechanism: it made insert-driven autovacuums perform full index scans, even though there is no real benefit to doing so. This has been tied to a regression with an append-only insert benchmark [1]. Also have remaining cases that perform a full scan of an index during a cleanup-only nbtree VACUUM indicate that the final tuple count is only an estimate. This prevents vacuumlazy.c from setting the index's pg_class.reltuples in those cases (it will now only update pg_class when vacuumlazy.c had TIDs for nbtree to bulk delete). This arguably fixes an oversight in deduplication-related bugfix commit 48e1291. [1] https://smalldatum.blogspot.com/2021/01/insert-benchmark-postgres-is-still.html Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAD21AoA4WHthN5uU6+WScZ7+J_RcEjmcuH94qcoUPuB42ShXzg@mail.gmail.com Backpatch: 13-, where autovacuum_vacuum_insert_threshold was added.
1 parent 9a4e4af commit 9663d12

File tree

6 files changed

+60
-167
lines changed

6 files changed

+60
-167
lines changed

doc/src/sgml/config.sgml

Lines changed: 0 additions & 41 deletions
Original file line numberDiff line numberDiff line change
@@ -8349,47 +8349,6 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
83498349
</listitem>
83508350
</varlistentry>
83518351

8352-
<varlistentry id="guc-vacuum-cleanup-index-scale-factor" xreflabel="vacuum_cleanup_index_scale_factor">
8353-
<term><varname>vacuum_cleanup_index_scale_factor</varname> (<type>floating point</type>)
8354-
<indexterm>
8355-
<primary><varname>vacuum_cleanup_index_scale_factor</varname></primary>
8356-
<secondary>configuration parameter</secondary>
8357-
</indexterm>
8358-
</term>
8359-
<listitem>
8360-
<para>
8361-
Specifies the fraction of the total number of heap tuples counted in
8362-
the previous statistics collection that can be inserted without
8363-
incurring an index scan at the <command>VACUUM</command> cleanup stage.
8364-
This setting currently applies to B-tree indexes only.
8365-
</para>
8366-
8367-
<para>
8368-
If no tuples were deleted from the heap, B-tree indexes are still
8369-
scanned at the <command>VACUUM</command> cleanup stage when at least one
8370-
of the following conditions is met: the index statistics are stale, or
8371-
the index contains deleted pages that can be recycled during cleanup.
8372-
Index statistics are considered to be stale if the number of newly
8373-
inserted tuples exceeds the <varname>vacuum_cleanup_index_scale_factor</varname>
8374-
fraction of the total number of heap tuples detected by the previous
8375-
statistics collection. The total number of heap tuples is stored in
8376-
the index meta-page. Note that the meta-page does not include this data
8377-
until <command>VACUUM</command> finds no dead tuples, so B-tree index
8378-
scan at the cleanup stage can only be skipped if the second and
8379-
subsequent <command>VACUUM</command> cycles detect no dead tuples.
8380-
</para>
8381-
8382-
<para>
8383-
The value can range from <literal>0</literal> to
8384-
<literal>10000000000</literal>.
8385-
When <varname>vacuum_cleanup_index_scale_factor</varname> is set to
8386-
<literal>0</literal>, index scans are never skipped during
8387-
<command>VACUUM</command> cleanup. The default value is <literal>0.1</literal>.
8388-
</para>
8389-
8390-
</listitem>
8391-
</varlistentry>
8392-
83938352
<varlistentry id="guc-bytea-output" xreflabel="bytea_output">
83948353
<term><varname>bytea_output</varname> (<type>enum</type>)
83958354
<indexterm>

doc/src/sgml/ref/create_index.sgml

Lines changed: 0 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -434,20 +434,6 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
434434
</note>
435435
</listitem>
436436
</varlistentry>
437-
438-
<varlistentry id="index-reloption-vacuum-cleanup-index-scale-factor" xreflabel="vacuum_cleanup_index_scale_factor">
439-
<term><literal>vacuum_cleanup_index_scale_factor</literal> (<type>floating point</type>)
440-
<indexterm>
441-
<primary><varname>vacuum_cleanup_index_scale_factor</varname></primary>
442-
<secondary>storage parameter</secondary>
443-
</indexterm>
444-
</term>
445-
<listitem>
446-
<para>
447-
Per-index value for <xref linkend="guc-vacuum-cleanup-index-scale-factor"/>.
448-
</para>
449-
</listitem>
450-
</varlistentry>
451437
</variablelist>
452438

453439
<para>

src/backend/access/nbtree/nbtpage.c

Lines changed: 10 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -168,6 +168,9 @@ _bt_getmeta(Relation rel, Buffer metabuf)
168168
*
169169
* This routine checks if provided cleanup-related information is matching
170170
* to those written in the metapage. On mismatch, metapage is overwritten.
171+
*
172+
* Postgres 13 ignores btm_last_cleanup_num_heap_tuples value here
173+
* following backbranch disabling of vacuum_cleanup_index_scale_factor.
171174
*/
172175
void
173176
_bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
@@ -176,22 +179,15 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
176179
Buffer metabuf;
177180
Page metapg;
178181
BTMetaPageData *metad;
179-
bool needsRewrite = false;
180-
XLogRecPtr recptr;
181182

182183
/* read the metapage and check if it needs rewrite */
183184
metabuf = _bt_getbuf(rel, BTREE_METAPAGE, BT_READ);
184185
metapg = BufferGetPage(metabuf);
185186
metad = BTPageGetMeta(metapg);
186187

187-
/* outdated version of metapage always needs rewrite */
188-
if (metad->btm_version < BTREE_NOVAC_VERSION)
189-
needsRewrite = true;
190-
else if (metad->btm_oldest_btpo_xact != oldestBtpoXact ||
191-
metad->btm_last_cleanup_num_heap_tuples != numHeapTuples)
192-
needsRewrite = true;
193-
194-
if (!needsRewrite)
188+
/* Don't miss chance to upgrade index/metapage when BTREE_MIN_VERSION */
189+
if (metad->btm_version >= BTREE_NOVAC_VERSION &&
190+
metad->btm_oldest_btpo_xact == oldestBtpoXact)
195191
{
196192
_bt_relbuf(rel, metabuf);
197193
return;
@@ -209,13 +205,14 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
209205

210206
/* update cleanup-related information */
211207
metad->btm_oldest_btpo_xact = oldestBtpoXact;
212-
metad->btm_last_cleanup_num_heap_tuples = numHeapTuples;
208+
metad->btm_last_cleanup_num_heap_tuples = -1;
213209
MarkBufferDirty(metabuf);
214210

215211
/* write wal record if needed */
216212
if (RelationNeedsWAL(rel))
217213
{
218214
xl_btree_metadata md;
215+
XLogRecPtr recptr;
219216

220217
XLogBeginInsert();
221218
XLogRegisterBuffer(0, metabuf, REGBUF_WILL_INIT | REGBUF_STANDARD);
@@ -227,7 +224,7 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
227224
md.fastroot = metad->btm_fastroot;
228225
md.fastlevel = metad->btm_fastlevel;
229226
md.oldest_btpo_xact = oldestBtpoXact;
230-
md.last_cleanup_num_heap_tuples = numHeapTuples;
227+
md.last_cleanup_num_heap_tuples = -1; /* Disabled */
231228
md.allequalimage = metad->btm_allequalimage;
232229

233230
XLogRegisterBufData(0, (char *) &md, sizeof(xl_btree_metadata));
@@ -238,6 +235,7 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
238235
}
239236

240237
END_CRIT_SECTION();
238+
241239
_bt_relbuf(rel, metabuf);
242240
}
243241

src/backend/access/nbtree/nbtree.c

Lines changed: 50 additions & 52 deletions
Original file line numberDiff line numberDiff line change
@@ -794,67 +794,54 @@ _bt_parallel_advance_array_keys(IndexScanDesc scan)
794794
* When we return false, VACUUM can even skip the cleanup-only call to
795795
* btvacuumscan (i.e. there will be no btvacuumscan call for this index at
796796
* all). Otherwise, a cleanup-only btvacuumscan call is required.
797+
*
798+
* Postgres 13 ignores btm_last_cleanup_num_heap_tuples value here following
799+
* backbranch disabling of vacuum_cleanup_index_scale_factor.
797800
*/
798801
static bool
799802
_bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
800803
{
801804
Buffer metabuf;
802805
Page metapg;
803806
BTMetaPageData *metad;
804-
bool result = false;
807+
uint32 btm_version;
808+
TransactionId prev_btm_oldest_btpo_xact;
805809

810+
/*
811+
* Copy details from metapage to local variables quickly.
812+
*
813+
* Note that we deliberately avoid using cached version of metapage here.
814+
*/
806815
metabuf = _bt_getbuf(info->index, BTREE_METAPAGE, BT_READ);
807816
metapg = BufferGetPage(metabuf);
808817
metad = BTPageGetMeta(metapg);
818+
btm_version = metad->btm_version;
809819

810-
if (metad->btm_version < BTREE_NOVAC_VERSION)
820+
if (btm_version < BTREE_NOVAC_VERSION)
811821
{
812822
/*
813-
* Do cleanup if metapage needs upgrade, because we don't have
814-
* cleanup-related meta-information yet.
823+
* Metapage needs to be dynamically upgraded to store fields that are
824+
* only present when btm_version >= BTREE_NOVAC_VERSION
815825
*/
816-
result = true;
826+
_bt_relbuf(info->index, metabuf);
827+
return true;
817828
}
818-
else if (TransactionIdIsValid(metad->btm_oldest_btpo_xact) &&
819-
TransactionIdPrecedes(metad->btm_oldest_btpo_xact,
820-
RecentGlobalXmin))
829+
830+
prev_btm_oldest_btpo_xact = metad->btm_oldest_btpo_xact;
831+
_bt_relbuf(info->index, metabuf);
832+
833+
if (TransactionIdIsValid(prev_btm_oldest_btpo_xact) &&
834+
TransactionIdPrecedes(prev_btm_oldest_btpo_xact, RecentGlobalXmin))
821835
{
822836
/*
823837
* If any oldest btpo.xact from a previously deleted page in the index
824838
* is older than RecentGlobalXmin, then at least one deleted page can
825839
* be recycled -- don't skip cleanup.
826840
*/
827-
result = true;
828-
}
829-
else
830-
{
831-
BTOptions *relopts;
832-
float8 cleanup_scale_factor;
833-
float8 prev_num_heap_tuples;
834-
835-
/*
836-
* If table receives enough insertions and no cleanup was performed,
837-
* then index would appear have stale statistics. If scale factor is
838-
* set, we avoid that by performing cleanup if the number of inserted
839-
* tuples exceeds vacuum_cleanup_index_scale_factor fraction of
840-
* original tuples count.
841-
*/
842-
relopts = (BTOptions *) info->index->rd_options;
843-
cleanup_scale_factor = (relopts &&
844-
relopts->vacuum_cleanup_index_scale_factor >= 0)
845-
? relopts->vacuum_cleanup_index_scale_factor
846-
: vacuum_cleanup_index_scale_factor;
847-
prev_num_heap_tuples = metad->btm_last_cleanup_num_heap_tuples;
848-
849-
if (cleanup_scale_factor <= 0 ||
850-
prev_num_heap_tuples <= 0 ||
851-
(info->num_heap_tuples - prev_num_heap_tuples) /
852-
prev_num_heap_tuples >= cleanup_scale_factor)
853-
result = true;
841+
return true;
854842
}
855843

856-
_bt_relbuf(info->index, metabuf);
857-
return result;
844+
return false;
858845
}
859846

860847
/*
@@ -907,31 +894,37 @@ btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
907894
* still need to do a pass over the index, to recycle any newly-recyclable
908895
* pages or to obtain index statistics. _bt_vacuum_needs_cleanup
909896
* determines if either are needed.
910-
*
911-
* Since we aren't going to actually delete any leaf items, there's no
912-
* need to go through all the vacuum-cycle-ID pushups.
913897
*/
914898
if (stats == NULL)
915899
{
916900
/* Check if we need a cleanup */
917901
if (!_bt_vacuum_needs_cleanup(info))
918902
return NULL;
919903

904+
/*
905+
* Since we aren't going to actually delete any leaf items, there's no
906+
* need to go through all the vacuum-cycle-ID pushups here.
907+
*
908+
* Posting list tuples are a source of inaccuracy for cleanup-only
909+
* scans. btvacuumscan() will assume that the number of index tuples
910+
* from each page can be used as num_index_tuples, even though
911+
* num_index_tuples is supposed to represent the number of TIDs in the
912+
* index. This naive approach can underestimate the number of tuples
913+
* in the index significantly.
914+
*
915+
* We handle the problem by making num_index_tuples an estimate in
916+
* cleanup-only case.
917+
*/
920918
stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
921919
btvacuumscan(info, stats, NULL, NULL, 0);
920+
stats->estimated_count = true;
922921
}
923922

924923
/*
925924
* It's quite possible for us to be fooled by concurrent page splits into
926925
* double-counting some index tuples, so disbelieve any total that exceeds
927926
* the underlying heap's count ... if we know that accurately. Otherwise
928927
* this might just make matters worse.
929-
*
930-
* Posting list tuples are another source of inaccuracy. Cleanup-only
931-
* btvacuumscan calls assume that the number of index tuples can be used
932-
* as num_index_tuples, even though num_index_tuples is supposed to
933-
* represent the number of TIDs in the index. This naive approach can
934-
* underestimate the number of tuples in the index.
935928
*/
936929
if (!info->estimated_count)
937930
{
@@ -971,7 +964,6 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
971964
* Reset counts that will be incremented during the scan; needed in case
972965
* of multiple scans during a single VACUUM command
973966
*/
974-
stats->estimated_count = false;
975967
stats->num_index_tuples = 0;
976968
stats->pages_deleted = 0;
977969

@@ -1059,8 +1051,12 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
10591051
IndexFreeSpaceMapVacuum(rel);
10601052

10611053
/*
1062-
* Maintain the oldest btpo.xact and a count of the current number of heap
1063-
* tuples in the metapage (for the benefit of _bt_vacuum_needs_cleanup).
1054+
* Maintain the oldest btpo.xact using _bt_update_meta_cleanup_info, for
1055+
* the benefit of _bt_vacuum_needs_cleanup.
1056+
*
1057+
* Note: We deliberately don't store the count of heap tuples here
1058+
* anymore. The numHeapTuples argument to _bt_update_meta_cleanup_info()
1059+
* is left in place on Postgres 13.
10641060
*
10651061
* The page with the oldest btpo.xact is typically a page deleted by this
10661062
* VACUUM operation, since pages deleted by a previous VACUUM operation
@@ -1070,8 +1066,7 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
10701066
* statistics, despite not counting as deleted pages for the purposes of
10711067
* determining the oldest btpo.xact.)
10721068
*/
1073-
_bt_update_meta_cleanup_info(rel, vstate.oldestBtpoXact,
1074-
info->num_heap_tuples);
1069+
_bt_update_meta_cleanup_info(rel, vstate.oldestBtpoXact, -1);
10751070

10761071
/* update statistics */
10771072
stats->num_pages = num_pages;
@@ -1399,7 +1394,10 @@ btvacuumpage(BTVacState *vstate, BlockNumber scanblkno)
13991394
* We don't count the number of live TIDs during cleanup-only calls to
14001395
* btvacuumscan (i.e. when callback is not set). We count the number
14011396
* of index tuples directly instead. This avoids the expense of
1402-
* directly examining all of the tuples on each page.
1397+
* directly examining all of the tuples on each page. VACUUM will
1398+
* treat num_index_tuples as an estimate in cleanup-only case, so it
1399+
* doesn't matter that this underestimates num_index_tuples
1400+
* significantly in some cases.
14031401
*/
14041402
if (minoff > maxoff)
14051403
attempt_pagedel = (blkno == scanblkno);

src/test/regress/expected/btree_index.out

Lines changed: 0 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -308,35 +308,6 @@ alter table btree_tall_tbl alter COLUMN t set storage plain;
308308
create index btree_tall_idx on btree_tall_tbl (t, id) with (fillfactor = 10);
309309
insert into btree_tall_tbl select g, repeat('x', 250)
310310
from generate_series(1, 130) g;
311-
--
312-
-- Test vacuum_cleanup_index_scale_factor
313-
--
314-
-- Simple create
315-
create table btree_test(a int);
316-
create index btree_idx1 on btree_test(a) with (vacuum_cleanup_index_scale_factor = 40.0);
317-
select reloptions from pg_class WHERE oid = 'btree_idx1'::regclass;
318-
reloptions
319-
------------------------------------------
320-
{vacuum_cleanup_index_scale_factor=40.0}
321-
(1 row)
322-
323-
-- Fail while setting improper values
324-
create index btree_idx_err on btree_test(a) with (vacuum_cleanup_index_scale_factor = -10.0);
325-
ERROR: value -10.0 out of bounds for option "vacuum_cleanup_index_scale_factor"
326-
DETAIL: Valid values are between "0.000000" and "10000000000.000000".
327-
create index btree_idx_err on btree_test(a) with (vacuum_cleanup_index_scale_factor = 100.0);
328-
create index btree_idx_err on btree_test(a) with (vacuum_cleanup_index_scale_factor = 'string');
329-
ERROR: invalid value for floating point option "vacuum_cleanup_index_scale_factor": string
330-
create index btree_idx_err on btree_test(a) with (vacuum_cleanup_index_scale_factor = true);
331-
ERROR: invalid value for floating point option "vacuum_cleanup_index_scale_factor": true
332-
-- Simple ALTER INDEX
333-
alter index btree_idx1 set (vacuum_cleanup_index_scale_factor = 70.0);
334-
select reloptions from pg_class WHERE oid = 'btree_idx1'::regclass;
335-
reloptions
336-
------------------------------------------
337-
{vacuum_cleanup_index_scale_factor=70.0}
338-
(1 row)
339-
340311
--
341312
-- Test for multilevel page deletion
342313
--

src/test/regress/sql/btree_index.sql

Lines changed: 0 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -150,25 +150,6 @@ create index btree_tall_idx on btree_tall_tbl (t, id) with (fillfactor = 10);
150150
insert into btree_tall_tbl select g, repeat('x', 250)
151151
from generate_series(1, 130) g;
152152

153-
--
154-
-- Test vacuum_cleanup_index_scale_factor
155-
--
156-
157-
-- Simple create
158-
create table btree_test(a int);
159-
create index btree_idx1 on btree_test(a) with (vacuum_cleanup_index_scale_factor = 40.0);
160-
select reloptions from pg_class WHERE oid = 'btree_idx1'::regclass;
161-
162-
-- Fail while setting improper values
163-
create index btree_idx_err on btree_test(a) with (vacuum_cleanup_index_scale_factor = -10.0);
164-
create index btree_idx_err on btree_test(a) with (vacuum_cleanup_index_scale_factor = 100.0);
165-
create index btree_idx_err on btree_test(a) with (vacuum_cleanup_index_scale_factor = 'string');
166-
create index btree_idx_err on btree_test(a) with (vacuum_cleanup_index_scale_factor = true);
167-
168-
-- Simple ALTER INDEX
169-
alter index btree_idx1 set (vacuum_cleanup_index_scale_factor = 70.0);
170-
select reloptions from pg_class WHERE oid = 'btree_idx1'::regclass;
171-
172153
--
173154
-- Test for multilevel page deletion
174155
--

0 commit comments

Comments
 (0)