
Commit 667912a

Improve multixact emergency autovacuum logic.
Previously autovacuum was not necessarily triggered when space in the members SLRU got tight. The first problem was that the signalling was tied to values in the offsets SLRU, but members can advance much faster. That's especially a problem if old sessions had been around that previously prevented the multixact horizon from advancing. Secondly, the skipping logic doesn't work if the database was restarted after autovacuum was triggered, because that knowledge is not preserved across restarts. This is especially a problem because restarting the database is a common panic reaction when it gets slow due to anti-wraparound vacuums.

Fix the first problem by separating the logic for members from offsets. Trigger autovacuum whenever a multixact crosses a segment boundary, as the current member offset increases in irregular steps, so we can't use simple modulo logic as for offsets. Add a stopgap for the second problem by signalling autovacuum whenever ERRORing out because of boundaries.

Discussion: 20150608163707.GD20772@alap3.anarazel.de
Backpatch into 9.3, where it became more likely that multixacts wrap around.
1 parent 90231cd commit 667912a
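
The difference between the two trigger rules described in the commit message can be illustrated with a small standalone sketch. It is not the PostgreSQL code: `MEMBERS_PER_PAGE` and `PAGES_PER_SEGMENT` are assumed values for a default 8 kB page build (which would give roughly 50k members per segment), and `member_segment`, `crosses_member_segment`, and `modulo_trigger` are hypothetical helper names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for a default build; not taken from the diff itself. */
#define MEMBERS_PER_PAGE   1636u
#define PAGES_PER_SEGMENT  32u

typedef uint32_t MemberOffset;

/* Hypothetical helper: SLRU segment number that holds a member offset. */
static uint32_t
member_segment(MemberOffset off)
{
    return (off / MEMBERS_PER_PAGE) / PAGES_PER_SEGMENT;
}

/*
 * Members rule: signal only if allocating nmembers slots starting at
 * next_offset ends in a different segment than it started in.  The offset
 * advances by a variable amount per multixact, so a fixed modulo test
 * could be stepped over without ever matching exactly.
 */
static bool
crosses_member_segment(MemberOffset next_offset, uint32_t nmembers)
{
    return member_segment(next_offset) !=
           member_segment(next_offset + nmembers);
}

/*
 * Offsets rule: multixact IDs advance by exactly one, so a plain modulo
 * test fires once per 64K IDs.
 */
static bool
modulo_trigger(uint32_t multi)
{
    return (multi % 65536u) == 0;
}
```

With the assumed constants, a segment holds 1636 * 32 = 52352 members, so an allocation that starts at offset 52351 and adds a single member would count as a crossing, while one that starts at 0 and adds five members would not.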

1 file changed, +48 -17 lines changed

src/backend/access/transam/multixact.c

Lines changed: 48 additions & 17 deletions
@@ -977,10 +977,7 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
      * Note these are pretty much the same protections in GetNewTransactionId.
      *----------
      */
-    if (!MultiXactIdPrecedes(result, MultiXactState->multiVacLimit) ||
-        !MultiXactState->oldestOffsetKnown ||
-        (MultiXactState->nextOffset - MultiXactState->oldestOffset
-         > MULTIXACT_MEMBER_SAFE_THRESHOLD))
+    if (!MultiXactIdPrecedes(result, MultiXactState->multiVacLimit))
     {
         /*
          * For safety's sake, we release MultiXactGenLock while sending
@@ -996,19 +993,17 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
 
         LWLockRelease(MultiXactGenLock);
 
-        /*
-         * To avoid swamping the postmaster with signals, we issue the autovac
-         * request only once per 64K multis generated. This still gives
-         * plenty of chances before we get into real trouble.
-         */
-        if (IsUnderPostmaster && (result % 65536) == 0)
-            SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
-
         if (IsUnderPostmaster &&
             !MultiXactIdPrecedes(result, multiStopLimit))
         {
             char *oldest_datname = get_database_name(oldest_datoid);
 
+            /*
+             * Immediately kick autovacuum into action as we're already
+             * in ERROR territory.
+             */
+            SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
+
             /* complain even if that DB has disappeared */
             if (oldest_datname)
                 ereport(ERROR,
@@ -1025,7 +1020,16 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
                          errhint("Execute a database-wide VACUUM in that database.\n"
                                  "You might also need to commit or roll back old prepared transactions.")));
         }
-        else if (!MultiXactIdPrecedes(result, multiWarnLimit))
+
+        /*
+         * To avoid swamping the postmaster with signals, we issue the autovac
+         * request only once per 64K multis generated. This still gives
+         * plenty of chances before we get into real trouble.
+         */
+        if (IsUnderPostmaster && (result % 65536) == 0)
+            SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
+
+        if (!MultiXactIdPrecedes(result, multiWarnLimit))
         {
             char *oldest_datname = get_database_name(oldest_datoid);
 
@@ -1096,6 +1100,10 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
     if (MultiXactState->offsetStopLimitKnown &&
         MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit, nextOffset,
                                  nmembers))
+    {
+        /* see comment in the corresponding offsets wraparound case */
+        SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
+
         ereport(ERROR,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("multixact \"members\" limit exceeded"),
@@ -1106,10 +1114,33 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
                                MultiXactState->offsetStopLimit - nextOffset - 1),
                  errhint("Execute a database-wide VACUUM in database with OID %u with reduced vacuum_multixact_freeze_min_age and vacuum_multixact_freeze_table_age settings.",
                          MultiXactState->oldestMultiXactDB)));
-    else if (MultiXactState->offsetStopLimitKnown &&
-             MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit,
-                                      nextOffset,
-                                      nmembers + MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT * OFFSET_WARN_SEGMENTS))
+    }
+
+    /*
+     * Check whether we should kick autovacuum into action, to prevent members
+     * wraparound. NB we use a much larger window to trigger autovacuum than
+     * just the warning limit. The warning is just a measure of last resort -
+     * this is in line with GetNewTransactionId's behaviour.
+     */
+    if (!MultiXactState->oldestOffsetKnown ||
+        (MultiXactState->nextOffset - MultiXactState->oldestOffset
+         > MULTIXACT_MEMBER_SAFE_THRESHOLD))
+    {
+        /*
+         * To avoid swamping the postmaster with signals, we issue the autovac
+         * request only when crossing a segment boundary. With default
+         * compilation settings that's roughly after 50k members. This still
+         * gives plenty of chances before we get into real trouble.
+         */
+        if ((MXOffsetToMemberPage(nextOffset) / SLRU_PAGES_PER_SEGMENT) !=
+            (MXOffsetToMemberPage(nextOffset + nmembers) / SLRU_PAGES_PER_SEGMENT))
+            SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
+    }
+
+    if (MultiXactState->offsetStopLimitKnown &&
+        MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit,
+                                 nextOffset,
+                                 nmembers + MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT * OFFSET_WARN_SEGMENTS))
         ereport(WARNING,
                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                  errmsg("database with OID %u must be vacuumed before %d more multixact members are used",
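
The members check added in the last hunk compares the distance between two 32-bit offsets against a threshold. A minimal sketch of that comparison follows. It assumes the offsets are unsigned 32-bit counters that wrap around, and `MEMBER_SAFE_THRESHOLD` is an assumed stand-in for `MULTIXACT_MEMBER_SAFE_THRESHOLD` (half of the offset space), whose real definition is not part of this diff. `members_need_vacuum` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for the real threshold: half of the 32-bit offset space. */
#define MEMBER_SAFE_THRESHOLD (UINT32_MAX / 2)

/*
 * Hypothetical helper mirroring the shape of the new condition: request a
 * vacuum if the oldest offset is unknown, or if the used span of the
 * members space exceeds the threshold.  The subtraction is done in
 * unsigned 32-bit arithmetic, so it stays correct after next has wrapped
 * past zero while oldest has not.
 */
static bool
members_need_vacuum(bool oldest_known, uint32_t next, uint32_t oldest)
{
    if (!oldest_known)
        return true;
    return (uint32_t) (next - oldest) > MEMBER_SAFE_THRESHOLD;
}
```

In the diff this condition only decides whether signalling is considered at all; the segment-boundary test inside it then limits how often the postmaster is actually signalled.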
