Skip to content

Commit 675a368

Browse files
committed
doc: add section about heap-only tuples (HOT)
Reported-by: Jonathan S. Katz Discussion: https://postgr.es/m/c59ffbd5-96ac-a5a5-a401-14f627ca1405@postgresql.org Backpatch-through: 11
1 parent 9039e34 commit 675a368

File tree

9 files changed

+86
-11
lines changed

9 files changed

+86
-11
lines changed

doc/src/sgml/acronyms.sgml

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -299,9 +299,7 @@
299299
<term><acronym>HOT</acronym></term>
300300
<listitem>
301301
<para>
302-
<ulink
303-
url="https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/README.HOT;hb=HEAD">Heap-Only
304-
Tuples</ulink>
302+
<link linkend="storage-hot">Heap-Only Tuples</link>
305303
</para>
306304
</listitem>
307305
</varlistentry>

doc/src/sgml/btree.sgml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -639,7 +639,8 @@ options(<replaceable>relopts</replaceable> <type>local_relopts *</type>) returns
639639
accumulate and adversely affect query latency and throughput. This
640640
typically occurs with <command>UPDATE</command>-heavy workloads
641641
where most individual updates cannot apply the
642-
<acronym>HOT</acronym> optimization. Changing the value of only
642+
<link linkend="storage-hot"><acronym>HOT</acronym> optimization.</link>
643+
Changing the value of only
643644
one column covered by one index during an <command>UPDATE</command>
644645
<emphasis>always</emphasis> necessitates a new set of index tuples
645646
&mdash; one for <emphasis>each and every</emphasis> index on the

doc/src/sgml/catalogs.sgml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4317,7 +4317,7 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
43174317
<para>
43184318
If true, queries must not use the index until the <structfield>xmin</structfield>
43194319
of this <structname>pg_index</structname> row is below their <symbol>TransactionXmin</symbol>
4320-
event horizon, because the table may contain broken HOT chains with
4320+
event horizon, because the table may contain broken <link linkend="storage-hot">HOT chains</link> with
43214321
incompatible rows that they can see
43224322
</para></entry>
43234323
</row>

doc/src/sgml/config.sgml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4379,7 +4379,8 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
43794379
<listitem>
43804380
<para>
43814381
Specifies the number of transactions by which <command>VACUUM</command> and
4382-
<acronym>HOT</acronym> updates will defer cleanup of dead row versions. The
4382+
<link linkend="storage-hot"><acronym>HOT</acronym> updates</link>
4383+
will defer cleanup of dead row versions. The
43834384
default is zero transactions, meaning that dead row versions can be
43844385
removed as soon as possible, that is, as soon as they are no longer
43854386
visible to any open transaction. You may wish to set this to a

doc/src/sgml/indexam.sgml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -45,7 +45,8 @@
4545
extant versions of the same logical row; to an index, each tuple is
4646
an independent object that needs its own index entry. Thus, an
4747
update of a row always creates all-new index entries for the row, even if
48-
the key values did not change. (HOT tuples are an exception to this
48+
the key values did not change. (<link linkend="storage-hot">HOT
49+
tuples</link> are an exception to this
4950
statement; but indexes do not deal with those, either.) Index entries for
5051
dead tuples are reclaimed (by vacuuming) when the dead tuples themselves
5152
are reclaimed.

doc/src/sgml/indices.sgml

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -103,7 +103,9 @@ CREATE INDEX test1_id_index ON test1 (id);
103103

104104
<para>
105105
After an index is created, the system has to keep it synchronized with the
106-
table. This adds overhead to data manipulation operations.
106+
table. This adds overhead to data manipulation operations. Indexes can
107+
also prevent the creation of <link linkend="storage-hot">heap-only
108+
tuples</link>.
107109
Therefore indexes that are seldom or never used in queries
108110
should be removed.
109111
</para>
@@ -749,7 +751,7 @@ CREATE INDEX people_names ON people ((first_name || ' ' || last_name));
749751
<para>
750752
Index expressions are relatively expensive to maintain, because the
751753
derived expression(s) must be computed for each row insertion
752-
and non-HOT update. However, the index expressions are
754+
and <link linkend="storage-hot">non-HOT update.</link> However, the index expressions are
753755
<emphasis>not</emphasis> recomputed during an indexed search, since they are
754756
already stored in the index. In both examples above, the system
755757
sees the query as just <literal>WHERE indexedcolumn = 'constant'</literal>

doc/src/sgml/monitoring.sgml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4153,7 +4153,7 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
41534153
<structfield>n_tup_upd</structfield> <type>bigint</type>
41544154
</para>
41554155
<para>
4156-
Number of rows updated (includes HOT updated rows)
4156+
Number of rows updated (includes <link linkend="storage-hot">HOT updated rows</link>)
41574157
</para></entry>
41584158
</row>
41594159

doc/src/sgml/ref/create_table.sgml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1394,7 +1394,9 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
13941394
to the indicated percentage; the remaining space on each page is
13951395
reserved for updating rows on that page. This gives <command>UPDATE</command>
13961396
a chance to place the updated copy of a row on the same page as the
1397-
original, which is more efficient than placing it on a different page.
1397+
original, which is more efficient than placing it on a different
1398+
page, and makes <link linkend="storage-hot">heap-only tuple
1399+
updates</link> more likely.
13981400
For a table whose entries are never updated, complete packing is the
13991401
best choice, but in heavily updated tables smaller fillfactors are
14001402
appropriate. This parameter cannot be set for TOAST tables.

doc/src/sgml/storage.sgml

Lines changed: 70 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1074,4 +1074,74 @@ data. Empty in ordinary tables.</entry>
10741074
</sect2>
10751075
</sect1>
10761076

1077+
<sect1 id="storage-hot">
1078+
1079+
<title>Heap-Only Tuples (<acronym>HOT</acronym>)</title>
1080+
1081+
<para>
1082+
To allow for high concurrency, <productname>PostgreSQL</productname>
1083+
uses <link linkend="mvcc-intro">multiversion concurrency
1084+
control</link> (<acronym>MVCC</acronym>) to store rows. However,
1085+
<acronym>MVCC</acronym> has some downsides for update queries.
1086+
Specifically, updates require new versions of rows to be added to
1087+
tables. This can also require new index entries for each updated row,
1088+
and removal of old versions of rows and their index entries can be
1089+
expensive.
1090+
</para>
1091+
1092+
<para>
1093+
To help reduce the overhead of updates,
1094+
<productname>PostgreSQL</productname> has an optimization called
1095+
heap-only tuples (<acronym>HOT</acronym>). This optimization is
1096+
possible when:
1097+
1098+
<itemizedlist>
1099+
<listitem>
1100+
<para>
1101+
The update does not modify any columns referenced by the table's
1102+
indexes, including expression and partial indexes.
1103+
</para>
1104+
</listitem>
1105+
<listitem>
1106+
<para>
1107+
There is sufficient free space on the page containing the old row
1108+
for the updated row.
1109+
</para>
1110+
</listitem>
1111+
</itemizedlist>
1112+
1113+
In such cases, heap-only tuples provide two optimizations:
1114+
1115+
<itemizedlist>
1116+
<listitem>
1117+
<para>
1118+
New index entries are not needed to represent updated rows.
1119+
</para>
1120+
</listitem>
1121+
<listitem>
1122+
<para>
1123+
Old versions of updated rows can be completely removed during normal
1124+
operation, including <command>SELECT</command>s, instead of requiring
1125+
periodic vacuum operations. (This is possible because indexes
1126+
do not reference their <link linkend="storage-page-layout">page
1127+
item identifiers</link>.)
1128+
</para>
1129+
</listitem>
1130+
</itemizedlist>
1131+
</para>
1132+
1133+
<para>
1134+
In summary, heap-only tuple updates can only be created
1135+
if columns used by indexes are not updated. You can
1136+
increase the likelihood of sufficient page space for
1137+
<acronym>HOT</acronym> updates by decreasing a table's <link
1138+
linkend="sql-createtable"><literal>fillfactor</literal></link>.
1139+
If you don't, <acronym>HOT</acronym> updates will still happen because
1140+
new rows will naturally migrate to new pages and existing pages with
1141+
sufficient free space for new row versions. The system view <link
1142+
linkend="monitoring-pg-stat-all-tables-view">pg_stat_all_tables</link>
1143+
allows monitoring of the occurrence of HOT and non-HOT updates.
1144+
</para>
1145+
</sect1>
1146+
10771147
</chapter>

0 commit comments

Comments
 (0)