Commit 43c9790

Try to handle torn reads of pg_control in frontend.
Some of our src/bin tools read the control file without any kind of interlocking against concurrent writes from the server. At least ext4 and ntfs can expose partially modified contents when you do that.

For now, we'll try to tolerate this by retrying up to 10 times if the checksum doesn't match, until we get two reads in a row with the same bad checksum. This is not guaranteed to reach the right conclusion, but it seems very likely to.

Thanks to Tom Lane for this suggestion.

Various ideas for interlocking or atomicity were considered too complicated, unportable or expensive given the lack of field reports, but remain open for future reconsideration.

Back-patch as far as 12. It doesn't seem like a good idea to put a heuristic change for a very rare problem into the final release of 11.

Reviewed-by: Anton A. Melnikov <aamelnikov@inbox.ru>
Reviewed-by: David Steele <david@pgmasters.net>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de
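
The retry heuristic described in the message can be sketched outside PostgreSQL roughly as follows. This is a minimal illustration, not the committed code: `read_until_stable`, `read_result`, `read_once_fn`, `scripted_source` and `scripted_read` are hypothetical names, plain `uint32_t` values stand in for `pg_crc32c`, and the sleep between attempts is left as a comment so the sketch needs only standard C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RETRIES 10          /* same bound the commit message mentions */

/* Result of one read attempt: the checksum computed over the bytes just
 * read, and the checksum stored inside those bytes. */
typedef struct
{
    uint32_t computed;
    uint32_t stored;
} read_result;

/* Hypothetical reader callback; not a PostgreSQL API. */
typedef read_result (*read_once_fn) (void *ctx);

/* Re-read until the checksum matches, the same bad checksum is seen twice
 * in a row, or the retry budget is exhausted.  Returns true on a match. */
static bool
read_until_stable(read_once_fn read_once, void *ctx, int *attempts)
{
    uint32_t last_crc = 0;
    int      retries = 0;

    assert(read_once != NULL && attempts != NULL);
    *attempts = 0;

    for (;;)
    {
        read_result r = read_once(ctx);

        (*attempts)++;
        if (r.computed == r.stored)
            return true;

        /* A repeated bad checksum is unlikely to be a torn read. */
        if ((retries > 0 && r.computed == last_crc) || retries >= MAX_RETRIES)
            return false;

        retries++;
        last_crc = r.computed;
        /* A real caller would sleep briefly here before re-reading. */
    }
}

/* Test double (assumption for illustration): replays a fixed list of
 * computed checksums, repeating the last one once the list runs out. */
typedef struct
{
    const uint32_t *computed;
    size_t          count;
    size_t          next;
    uint32_t        stored;
} scripted_source;

static read_result
scripted_read(void *ctx)
{
    scripted_source *src = ctx;
    size_t           i = src->next < src->count ? src->next : src->count - 1;
    read_result      r = {src->computed[i], src->stored};

    src->next++;
    return r;
}
```

With a source that returns two different bad checksums and then a good one, the loop succeeds on the third read. A source that keeps returning the same bad checksum is rejected after the second read. A source whose bad checksum changes on every read is rejected after the initial read plus ten retries.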
1 parent 637e86e commit 43c9790

1 file changed: +30 -0 lines changed


src/common/controldata_utils.c

Lines changed: 30 additions & 0 deletions
@@ -55,12 +55,22 @@ get_controlfile(const char *DataDir, bool *crc_ok_p)
 	char		ControlFilePath[MAXPGPATH];
 	pg_crc32c	crc;
 	int			r;
+#ifdef FRONTEND
+	pg_crc32c	last_crc;
+	int			retries = 0;
+#endif
 
 	AssertArg(crc_ok_p);
 
 	ControlFile = palloc(sizeof(ControlFileData));
 	snprintf(ControlFilePath, MAXPGPATH, "%s/global/pg_control", DataDir);
 
+#ifdef FRONTEND
+	INIT_CRC32C(last_crc);
+
+retry:
+#endif
+
 #ifndef FRONTEND
 	if ((fd = OpenTransientFile(ControlFilePath, O_RDONLY | PG_BINARY)) == -1)
 		ereport(ERROR,
@@ -128,6 +138,26 @@ get_controlfile(const char *DataDir, bool *crc_ok_p)
 
 	*crc_ok_p = EQ_CRC32C(crc, ControlFile->crc);
 
+#ifdef FRONTEND
+
+	/*
+	 * If the server was writing at the same time, it is possible that we read
+	 * partially updated contents on some systems.  If the CRC doesn't match,
+	 * retry a limited number of times until we compute the same bad CRC twice
+	 * in a row with a short sleep in between.  Then the failure is unlikely
+	 * to be due to a concurrent write.
+	 */
+	if (!*crc_ok_p &&
+		(retries == 0 || !EQ_CRC32C(crc, last_crc)) &&
+		retries < 10)
+	{
+		retries++;
+		last_crc = crc;
+		pg_usleep(10000);
+		goto retry;
+	}
+#endif
+
 	/* Make sure the control file is valid byte order. */
 	if (ControlFile->pg_control_version % 65536 == 0 &&
 		ControlFile->pg_control_version / 65536 != 0)
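
The trailing context lines show the byte-order check that runs after the new retry block. A rough sketch of why that arithmetic detects a byte-swapped file is given below. It assumes the version is a 32-bit value small enough to fit in the low 16 bits; `looks_byte_swapped` and `bswap32` are hypothetical helper names, and 1300 is only an illustrative version number, not a value taken from this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t
bswap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8) |
           ((v & 0xFF000000u) >> 24);
}

/* Same arithmetic as the context lines in the diff: the low 16 bits are
 * zero while the high 16 bits are not, which is what a small version
 * number looks like after its bytes have been reversed. */
static bool
looks_byte_swapped(uint32_t version)
{
    return version % 65536 == 0 && version / 65536 != 0;
}
```

For example, 1300 is 0x00000514 and passes the check unchanged, while its byte-swapped form 0x14050000 has all-zero low 16 bits and is flagged.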
