Commit ec14081

Improve multixact emergency autovacuum logic.

Previously autovacuum was not necessarily triggered if space in the members SLRU got tight. The first problem was that the signalling was tied to values in the offsets SLRU, but members can advance much faster. That's especially a problem if old sessions had been around that previously prevented the multixact horizon from advancing. Secondly, the skipping logic doesn't work if the database was restarted after autovacuum was triggered: that knowledge is not preserved across a restart. This is especially a problem because it's a common panic reaction to restart the database if it gets slow due to anti-wraparound vacuums.

Fix the first problem by separating the logic for members from offsets. Trigger autovacuum whenever a multixact crosses a segment boundary, as the current member offset increases in irregular steps, so we can't use simple modulo logic as for offsets.

Add a stopgap for the second problem, by signalling autovacuum whenever ERRORing out because of boundaries.

Discussion: 20150608163707.GD20772@alap3.anarazel.de

Backpatch into 9.3, where it became more likely that multixacts wrap around.
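The difference between the two signalling rules can be sketched in isolation. This is a minimal standalone illustration, not the code in multixact.c: the helper names are hypothetical, and the constants 1636 members per page and 32 pages per segment are assumed values for a default build with 8 kB blocks, not something stated in this commit.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t MultiXactId;
typedef uint32_t MultiXactOffset;

/* Assumed default-build values; the real ones come from the PostgreSQL headers. */
#define SKETCH_MEMBERS_PER_PAGE   1636
#define SKETCH_PAGES_PER_SEGMENT  32

/* Hypothetical helper: which members segment does this offset fall into? */
static uint32_t
sketch_member_segment(MultiXactOffset offset)
{
    return (offset / SKETCH_MEMBERS_PER_PAGE) / SKETCH_PAGES_PER_SEGMENT;
}

/*
 * Offsets rule: multixact IDs advance by exactly one, so a plain modulo
 * fires once per 64K IDs.
 */
static bool
sketch_signal_for_offsets(MultiXactId result)
{
    return (result % 65536) == 0;
}

/*
 * Members rule: the offset advances by nmembers, an irregular step, so a
 * modulo test could be skipped over.  Instead compare the segment before
 * and after the allocation.
 */
static bool
sketch_signal_for_members(MultiXactOffset next_offset, int nmembers)
{
    return sketch_member_segment(next_offset) !=
        sketch_member_segment(next_offset + (MultiXactOffset) nmembers);
}
```

Under these assumed constants a segment holds 1636 * 32 = 52352 members, which matches the "roughly 50k members" figure in the new code comment.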
1 parent 67876a1 commit ec14081

1 file changed, +48 -17 lines changed
src/backend/access/transam/multixact.c

Lines changed: 48 additions & 17 deletions
@@ -982,10 +982,7 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
      * Note these are pretty much the same protections in GetNewTransactionId.
      *----------
      */
-    if (!MultiXactIdPrecedes(result, MultiXactState->multiVacLimit) ||
-        !MultiXactState->oldestOffsetKnown ||
-        (MultiXactState->nextOffset - MultiXactState->oldestOffset
-         > MULTIXACT_MEMBER_SAFE_THRESHOLD))
+    if (!MultiXactIdPrecedes(result, MultiXactState->multiVacLimit))
     {
         /*
          * For safety's sake, we release MultiXactGenLock while sending
@@ -1001,19 +998,17 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
 
         LWLockRelease(MultiXactGenLock);
 
-        /*
-         * To avoid swamping the postmaster with signals, we issue the autovac
-         * request only once per 64K multis generated. This still gives
-         * plenty of chances before we get into real trouble.
-         */
-        if (IsUnderPostmaster && (result % 65536) == 0)
-            SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
-
         if (IsUnderPostmaster &&
             !MultiXactIdPrecedes(result, multiStopLimit))
         {
             char *oldest_datname = get_database_name(oldest_datoid);
 
+            /*
+             * Immediately kick autovacuum into action as we're already
+             * in ERROR territory.
+             */
+            SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
+
             /* complain even if that DB has disappeared */
             if (oldest_datname)
                 ereport(ERROR,
@@ -1030,7 +1025,16 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
                          errhint("Execute a database-wide VACUUM in that database.\n"
                                  "You might also need to commit or roll back old prepared transactions.")));
         }
-        else if (!MultiXactIdPrecedes(result, multiWarnLimit))
+
+        /*
+         * To avoid swamping the postmaster with signals, we issue the autovac
+         * request only once per 64K multis generated. This still gives
+         * plenty of chances before we get into real trouble.
+         */
+        if (IsUnderPostmaster && (result % 65536) == 0)
+            SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
+
+        if (!MultiXactIdPrecedes(result, multiWarnLimit))
         {
             char *oldest_datname = get_database_name(oldest_datoid);
 
@@ -1101,6 +1105,10 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
     if (MultiXactState->offsetStopLimitKnown &&
         MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit, nextOffset,
                                  nmembers))
+    {
+        /* see comment in the corresponding offsets wraparound case */
+        SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
+
         ereport(ERROR,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("multixact \"members\" limit exceeded"),
@@ -1111,10 +1119,33 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
                                    MultiXactState->offsetStopLimit - nextOffset - 1),
                  errhint("Execute a database-wide VACUUM in database with OID %u with reduced vacuum_multixact_freeze_min_age and vacuum_multixact_freeze_table_age settings.",
                          MultiXactState->oldestMultiXactDB)));
-    else if (MultiXactState->offsetStopLimitKnown &&
-             MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit,
-                                      nextOffset,
-                                      nmembers + MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT * OFFSET_WARN_SEGMENTS))
+    }
+
+    /*
+     * Check whether we should kick autovacuum into action, to prevent members
+     * wraparound. NB we use a much larger window to trigger autovacuum than
+     * just the warning limit. The warning is just a measure of last resort -
+     * this is in line with GetNewTransactionId's behaviour.
+     */
+    if (!MultiXactState->oldestOffsetKnown ||
+        (MultiXactState->nextOffset - MultiXactState->oldestOffset
+         > MULTIXACT_MEMBER_SAFE_THRESHOLD))
+    {
+        /*
+         * To avoid swamping the postmaster with signals, we issue the autovac
+         * request only when crossing a segment boundary. With default
+         * compilation settings that's rougly after 50k members. This still
+         * gives plenty of chances before we get into real trouble.
+         */
+        if ((MXOffsetToMemberPage(nextOffset) / SLRU_PAGES_PER_SEGMENT) !=
+            (MXOffsetToMemberPage(nextOffset + nmembers) / SLRU_PAGES_PER_SEGMENT))
+            SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
+    }
+
+    if (MultiXactState->offsetStopLimitKnown &&
+        MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit,
+                                 nextOffset,
+                                 nmembers + MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT * OFFSET_WARN_SEGMENTS))
         ereport(WARNING,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("database with OID %u must be vacuumed before %d more multixact members are used",
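
The relocated members check relies on unsigned subtraction to measure how much of the members space is in use, even after the offset counter has wrapped. The following is a minimal standalone sketch of that test, not the code from multixact.c: the function name is hypothetical, and the threshold of half the 32-bit offset space is an assumption made for illustration, since the value of MULTIXACT_MEMBER_SAFE_THRESHOLD does not appear in this diff.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t MultiXactOffset;

/* Assumption for this sketch: half of the 32-bit offset space. */
#define SKETCH_MEMBER_SAFE_THRESHOLD (UINT32_MAX / 2)

/*
 * Hypothetical helper mirroring the shape of the new condition: autovacuum
 * is wanted if the oldest offset is unknown, or if the span of members
 * currently in use exceeds the safe threshold.
 */
static bool
sketch_members_need_autovac(bool oldest_offset_known,
                            MultiXactOffset next_offset,
                            MultiXactOffset oldest_offset)
{
    if (!oldest_offset_known)
        return true;

    /* Unsigned arithmetic is modulo 2^32, so this stays correct across wrap. */
    return (MultiXactOffset) (next_offset - oldest_offset) >
        SKETCH_MEMBER_SAFE_THRESHOLD;
}
```

In the committed code this condition only decides whether the segment-boundary test is evaluated at all; the signal itself is still sent at most once per segment crossed.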
