Commit 63a5822

Try to handle torn reads of pg_control in frontend.

Some of our src/bin tools read the control file without any kind of
interlocking against concurrent writes from the server. At least ext4
and ntfs can expose partially modified contents when you do that.

For now, we'll try to tolerate this by retrying up to 10 times if the
checksum doesn't match, until we get two reads in a row with the same
bad checksum. This is not guaranteed to reach the right conclusion, but
it seems very likely to. Thanks to Tom Lane for this suggestion.

Various ideas for interlocking or atomicity were considered too
complicated, unportable or expensive given the lack of field reports,
but remain open for future reconsideration.

Back-patch as far as 12. It doesn't seem like a good idea to put a
heuristic change for a very rare problem into the final release of 11.

Reviewed-by: Anton A. Melnikov <aamelnikov@inbox.ru>
Reviewed-by: David Steele <david@pgmasters.net>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de
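
The retry rule described in the message can be tried out in isolation. The following is a minimal sketch, not the PostgreSQL code: `read_with_retry`, `read_once_fn`, `scripted_file`, `scripted_read` and `churning_read` are hypothetical names introduced here, plain `uint32_t` comparison stands in for the CRC macros, and the 10 ms sleep between attempts is left out so the sketch stays portable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RETRIES 10

/*
 * Hypothetical hook: performs one full read, returns the CRC computed over
 * what was read, and sets *crc_ok to whether it matched the stored CRC.
 */
typedef uint32_t (*read_once_fn)(void *state, bool *crc_ok);

/* Returns the final CRC verdict; *attempts receives the number of reads. */
bool
read_with_retry(read_once_fn read_once, void *state, int *attempts)
{
    uint32_t crc;
    uint32_t last_crc = 0;
    int retries = 0;
    bool crc_ok = false;

    assert(read_once != NULL && attempts != NULL);
    *attempts = 0;

    for (;;)
    {
        crc = read_once(state, &crc_ok);
        (*attempts)++;

        /* Retry only while the CRC is bad, has not repeated, and budget remains. */
        if (!crc_ok && (retries == 0 || crc != last_crc) && retries < MAX_RETRIES)
        {
            retries++;
            last_crc = crc;
            /* The commit sleeps 10 ms here (pg_usleep(10000)); omitted in this sketch. */
            continue;
        }
        return crc_ok;
    }
}

/*
 * Fake reader for demonstration: replays a scripted list of results,
 * repeating the last entry once the script is exhausted.
 */
typedef struct
{
    const uint32_t *crcs;
    const bool *oks;
    int n;
    int pos;
} scripted_file;

uint32_t
scripted_read(void *state, bool *crc_ok)
{
    scripted_file *f = state;
    int i = (f->pos < f->n) ? f->pos : f->n - 1;

    f->pos++;
    *crc_ok = f->oks[i];
    return f->crcs[i];
}

/* Fake reader whose bad CRC changes on every read (a file under constant churn). */
uint32_t
churning_read(void *state, bool *crc_ok)
{
    uint32_t *counter = state;

    *crc_ok = false;
    return (*counter)++;
}
```

With these fake readers, one torn read followed by a good one succeeds after two reads; a file whose bad CRC never changes is reported as bad after two reads, which is the "same bad checksum twice in a row" case; and a file whose bad CRC differs on every read is given up on after eleven reads (the initial read plus ten retries).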
1 parent 4817da5 commit 63a5822

1 file changed: +30 -0 lines changed

src/common/controldata_utils.c

@@ -56,12 +56,22 @@ get_controlfile(const char *DataDir, bool *crc_ok_p)
     char ControlFilePath[MAXPGPATH];
     pg_crc32c crc;
     int r;
+#ifdef FRONTEND
+    pg_crc32c last_crc;
+    int retries = 0;
+#endif
 
     Assert(crc_ok_p);
 
     ControlFile = palloc_object(ControlFileData);
     snprintf(ControlFilePath, MAXPGPATH, "%s/global/pg_control", DataDir);
 
+#ifdef FRONTEND
+    INIT_CRC32C(last_crc);
+
+retry:
+#endif
+
 #ifndef FRONTEND
     if ((fd = OpenTransientFile(ControlFilePath, O_RDONLY | PG_BINARY)) == -1)
         ereport(ERROR,
@@ -117,6 +127,26 @@ get_controlfile(const char *DataDir, bool *crc_ok_p)
 
     *crc_ok_p = EQ_CRC32C(crc, ControlFile->crc);
 
+#ifdef FRONTEND
+
+    /*
+     * If the server was writing at the same time, it is possible that we read
+     * partially updated contents on some systems. If the CRC doesn't match,
+     * retry a limited number of times until we compute the same bad CRC twice
+     * in a row with a short sleep in between. Then the failure is unlikely
+     * to be due to a concurrent write.
+     */
+    if (!*crc_ok_p &&
+        (retries == 0 || !EQ_CRC32C(crc, last_crc)) &&
+        retries < 10)
+    {
+        retries++;
+        last_crc = crc;
+        pg_usleep(10000);
+        goto retry;
+    }
+#endif
+
     /* Make sure the control file is valid byte order. */
     if (ControlFile->pg_control_version % 65536 == 0 &&
         ControlFile->pg_control_version / 65536 != 0)
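
The trailing context lines test the version field for a byte-order mismatch. A minimal sketch of why that arithmetic works, assuming the version is a small 32-bit number that fits in the low 16 bits (1300 is used below purely as an illustrative value): once such a value is byte-swapped, its low 16 bits are zero and its high 16 bits are not. The names `looks_byte_swapped` and `swap32` are hypothetical helpers, not PostgreSQL functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same arithmetic as the check in the diff context, applied to a bare value. */
bool
looks_byte_swapped(uint32_t version)
{
    return version % 65536 == 0 && version / 65536 != 0;
}

/* Reverse the byte order of a 32-bit value, as a wrong-endian reader would see it. */
uint32_t
swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8) |
           ((v & 0xFF000000u) >> 24);
}
```

For the illustrative value 1300 (0x00000514), the swapped form is 0x14050000: divisible by 65536 and nonzero above bit 16, so the check fires, while the unswapped value passes.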
