Skip to content

Commit 8620a7f

Browse files
committed
Code review for server's handling of "tablespace map" files.
While looking at Robert Foggia's report, I noticed a passel of other issues in the same area: * The scheme for backslash-quoting newlines in pathnames is just wrong; it will misbehave if the last ordinary character in a pathname is a backslash. I'm not sure why we're bothering to allow newlines in tablespace paths, but if we're going to do it we should do it without introducing other problems. Hence, backslashes themselves have to be backslashed too. * The author hadn't read the sscanf man page very carefully, because this code would drop any leading whitespace from the path. (I doubt that a tablespace path with leading whitespace could happen in practice; but if we're bothering to allow newlines in the path, it sure seems like leading whitespace is little less implausible.) Using sscanf for the task of finding the first space is overkill anyway. * While I'm not 100% sure what the rationale for escaping both \r and \n is, if the idea is to allow Windows newlines in the file then this code failed, because it'd throw an error if it saw \r followed by \n. * There's no cross-check for an incomplete final line in the map file, which would be a likely apparent symptom of the improper-escaping bug. On the generation end, aside from the escaping issue we have: * If needtblspcmapfile is true then do_pg_start_backup will pass back escaped strings in tablespaceinfo->path values, which no caller wants or is prepared to deal with. I'm not sure if there's a live bug from that, but it looks like there might be (given the dubious assumption that anyone actually has newlines in their tablespace paths). * It's not being very paranoid about the possibility of random stuff in the pg_tblspc directory. IMO we should ignore anything without an OID-like name. The escaping rule change doesn't seem back-patchable: it'll require doubling of backslashes in the tablespace_map file, which is basically a basebackup format change. The odds of that causing trouble are considerably more than the odds of the existing bug causing trouble. The rest of this seems somewhat unlikely to cause problems too, so no back-patch.
1 parent a50e4fd commit 8620a7f

File tree

5 files changed

+82
-64
lines changed

5 files changed

+82
-64
lines changed

src/backend/access/transam/xlog.c

Lines changed: 68 additions & 55 deletions
Original file line numberDiff line numberDiff line change
@@ -10702,19 +10702,17 @@ issue_xlog_fsync(int fd, XLogSegNo segno)
1070210702
* active at the same time, and they don't conflict with an exclusive backup
1070310703
* either.
1070410704
*
10705-
* tablespaces is required only when this function is called while
10706-
* the streaming base backup requested by pg_basebackup is running.
10707-
* NULL should be specified otherwise.
10705+
* labelfile and tblspcmapfile must be passed as NULL when starting an
10706+
* exclusive backup, and as initially-empty StringInfos for a non-exclusive
10707+
* backup.
10708+
*
10709+
* If "tablespaces" isn't NULL, it receives a list of tablespaceinfo structs
10710+
* describing the cluster's tablespaces.
1070810711
*
1070910712
* tblspcmapfile is required mainly for tar format in windows as native windows
1071010713
* utilities are not able to create symlinks while extracting files from tar.
1071110714
* However for consistency, the same is used for all platforms.
1071210715
*
10713-
* needtblspcmapfile is true for the cases (exclusive backup and for
10714-
* non-exclusive backup only when tar format is used for taking backup)
10715-
* when backup needs to generate tablespace_map file, it is used to
10716-
* embed escape character before newline character in tablespace path.
10717-
*
1071810716
* Returns the minimum WAL location that must be present to restore from this
1071910717
* backup, and the corresponding timeline ID in *starttli_p.
1072010718
*
@@ -10727,7 +10725,7 @@ issue_xlog_fsync(int fd, XLogSegNo segno)
1072710725
XLogRecPtr
1072810726
do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
1072910727
StringInfo labelfile, List **tablespaces,
10730-
StringInfo tblspcmapfile, bool needtblspcmapfile)
10728+
StringInfo tblspcmapfile)
1073110729
{
1073210730
bool exclusive = (labelfile == NULL);
1073310731
bool backup_started_in_recovery = false;
@@ -10940,9 +10938,10 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
1094010938
XLogFileName(xlogfilename, starttli, _logSegNo, wal_segment_size);
1094110939

1094210940
/*
10943-
* Construct tablespace_map file
10941+
* Construct tablespace_map file. If caller isn't interested in this,
10942+
* we make a local StringInfo.
1094410943
*/
10945-
if (exclusive)
10944+
if (tblspcmapfile == NULL)
1094610945
tblspcmapfile = makeStringInfo();
1094710946

1094810947
datadirpathlen = strlen(DataDir);
@@ -10955,11 +10954,11 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
1095510954
char linkpath[MAXPGPATH];
1095610955
char *relpath = NULL;
1095710956
int rllen;
10958-
StringInfoData buflinkpath;
10959-
char *s = linkpath;
10957+
StringInfoData escapedpath;
10958+
char *s;
1096010959

10961-
/* Skip special stuff */
10962-
if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
10960+
/* Skip anything that doesn't look like a tablespace */
10961+
if (strspn(de->d_name, "0123456789") != strlen(de->d_name))
1096310962
continue;
1096410963

1096510964
snprintf(fullpath, sizeof(fullpath), "pg_tblspc/%s", de->d_name);
@@ -10983,18 +10982,15 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
1098310982
linkpath[rllen] = '\0';
1098410983

1098510984
/*
10986-
* Add the escape character '\\' before newline in a string to
10987-
* ensure that we can distinguish between the newline in the
10988-
* tablespace path and end of line while reading tablespace_map
10989-
* file during archive recovery.
10985+
* Build a backslash-escaped version of the link path to include
10986+
* in the tablespace map file.
1099010987
*/
10991-
initStringInfo(&buflinkpath);
10992-
10993-
while (*s)
10988+
initStringInfo(&escapedpath);
10989+
for (s = linkpath; *s; s++)
1099410990
{
10995-
if ((*s == '\n' || *s == '\r') && needtblspcmapfile)
10996-
appendStringInfoChar(&buflinkpath, '\\');
10997-
appendStringInfoChar(&buflinkpath, *s++);
10991+
if (*s == '\n' || *s == '\r' || *s == '\\')
10992+
appendStringInfoChar(&escapedpath, '\\');
10993+
appendStringInfoChar(&escapedpath, *s);
1099810994
}
1099910995

1100010996
/*
@@ -11009,16 +11005,17 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
1100911005

1101011006
ti = palloc(sizeof(tablespaceinfo));
1101111007
ti->oid = pstrdup(de->d_name);
11012-
ti->path = pstrdup(buflinkpath.data);
11008+
ti->path = pstrdup(linkpath);
1101311009
ti->rpath = relpath ? pstrdup(relpath) : NULL;
1101411010
ti->size = -1;
1101511011

1101611012
if (tablespaces)
1101711013
*tablespaces = lappend(*tablespaces, ti);
1101811014

11019-
appendStringInfo(tblspcmapfile, "%s %s\n", ti->oid, ti->path);
11015+
appendStringInfo(tblspcmapfile, "%s %s\n",
11016+
ti->oid, escapedpath.data);
1102011017

11021-
pfree(buflinkpath.data);
11018+
pfree(escapedpath.data);
1102211019
#else
1102311020

1102411021
/*
@@ -11034,9 +11031,10 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
1103411031
FreeDir(tblspcdir);
1103511032

1103611033
/*
11037-
* Construct backup label file
11034+
* Construct backup label file. If caller isn't interested in this,
11035+
* we make a local StringInfo.
1103811036
*/
11039-
if (exclusive)
11037+
if (labelfile == NULL)
1104011038
labelfile = makeStringInfo();
1104111039

1104211040
/* Use the log timezone here, not the session timezone */
@@ -11898,22 +11896,20 @@ read_backup_label(XLogRecPtr *checkPointLoc, bool *backupEndRequired,
1189811896
* recovering from a backup dump file, and we therefore need to create symlinks
1189911897
* as per the information present in tablespace_map file.
1190011898
*
11901-
* Returns true if a tablespace_map file was found (and fills the link
11902-
* information for all the tablespace links present in file); returns false
11903-
* if not.
11899+
* Returns true if a tablespace_map file was found (and fills *tablespaces
11900+
* with a tablespaceinfo struct for each tablespace listed in the file);
11901+
* returns false if not.
1190411902
*/
1190511903
static bool
1190611904
read_tablespace_map(List **tablespaces)
1190711905
{
1190811906
tablespaceinfo *ti;
1190911907
FILE *lfp;
11910-
char tbsoid[MAXPGPATH];
11911-
char *tbslinkpath;
1191211908
char str[MAXPGPATH];
1191311909
int ch,
11914-
prev_ch = -1,
11915-
i = 0,
11910+
i,
1191611911
n;
11912+
bool was_backslash;
1191711913

1191811914
/*
1191911915
* See if tablespace_map file is present
@@ -11932,38 +11928,55 @@ read_tablespace_map(List **tablespaces)
1193211928
/*
1193311929
* Read and parse the link name and path lines from tablespace_map file
1193411930
* (this code is pretty crude, but we are not expecting any variability in
11935-
* the file format). While taking backup we embed escape character '\\'
11936-
* before newline in tablespace path, so that during reading of
11937-
* tablespace_map file, we could distinguish newline in tablespace path
11938-
* and end of line. Now while reading tablespace_map file, remove the
11939-
* escape character that has been added in tablespace path during backup.
11931+
* the file format). De-escape any backslashes that were inserted.
1194011932
*/
11933+
i = 0;
11934+
was_backslash = false;
1194111935
while ((ch = fgetc(lfp)) != EOF)
1194211936
{
11943-
if ((ch == '\n' || ch == '\r') && prev_ch != '\\')
11937+
if (!was_backslash && (ch == '\n' || ch == '\r'))
1194411938
{
11939+
if (i == 0)
11940+
continue; /* \r immediately followed by \n */
11941+
11942+
/*
11943+
* The de-escaped line should contain an OID followed by exactly
11944+
* one space followed by a path. The path might start with
11945+
* spaces, so don't be too liberal about parsing.
11946+
*/
1194511947
str[i] = '\0';
11946-
if (sscanf(str, "%s %n", tbsoid, &n) != 1)
11948+
n = 0;
11949+
while (str[n] && str[n] != ' ')
11950+
n++;
11951+
if (n < 1 || n >= i - 1)
1194711952
ereport(FATAL,
1194811953
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
1194911954
errmsg("invalid data in file \"%s\"", TABLESPACE_MAP)));
11950-
tbslinkpath = str + n;
11951-
i = 0;
11952-
11953-
ti = palloc(sizeof(tablespaceinfo));
11954-
ti->oid = pstrdup(tbsoid);
11955-
ti->path = pstrdup(tbslinkpath);
11955+
str[n++] = '\0';
1195611956

11957+
ti = palloc0(sizeof(tablespaceinfo));
11958+
ti->oid = pstrdup(str);
11959+
ti->path = pstrdup(str + n);
1195711960
*tablespaces = lappend(*tablespaces, ti);
11961+
11962+
i = 0;
1195811963
continue;
1195911964
}
11960-
else if ((ch == '\n' || ch == '\r') && prev_ch == '\\')
11961-
str[i - 1] = ch;
11962-
else if (i < sizeof(str) - 1)
11963-
str[i++] = ch;
11964-
prev_ch = ch;
11965+
else if (!was_backslash && ch == '\\')
11966+
was_backslash = true;
11967+
else
11968+
{
11969+
if (i < sizeof(str) - 1)
11970+
str[i++] = ch;
11971+
was_backslash = false;
11972+
}
1196511973
}
1196611974

11975+
if (i != 0 || was_backslash) /* last line not terminated? */
11976+
ereport(FATAL,
11977+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
11978+
errmsg("invalid data in file \"%s\"", TABLESPACE_MAP)));
11979+
1196711980
if (ferror(lfp) || FreeFile(lfp))
1196811981
ereport(FATAL,
1196911982
(errcode_for_file_access(),

src/backend/access/transam/xlogfuncs.c

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -76,7 +76,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
7676
if (exclusive)
7777
{
7878
startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL,
79-
NULL, NULL, true);
79+
NULL, NULL);
8080
}
8181
else
8282
{
@@ -94,7 +94,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
9494
register_persistent_abort_backup_handler();
9595

9696
startpoint = do_pg_start_backup(backupidstr, fast, NULL, label_file,
97-
NULL, tblspc_map_file, true);
97+
NULL, tblspc_map_file);
9898
}
9999

100100
PG_RETURN_LSN(startpoint);

src/backend/replication/basebackup.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -299,7 +299,7 @@ perform_base_backup(basebackup_options *opt)
299299
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT);
300300
startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
301301
labelfile, &tablespaces,
302-
tblspc_map_file, opt->sendtblspcmapfile);
302+
tblspc_map_file);
303303

304304
/*
305305
* Once do_pg_start_backup has been called, ensure that any failure causes

src/include/access/xlog.h

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -384,8 +384,7 @@ typedef enum SessionBackupState
384384

385385
extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
386386
TimeLineID *starttli_p, StringInfo labelfile,
387-
List **tablespaces, StringInfo tblspcmapfile,
388-
bool needtblspcmapfile);
387+
List **tablespaces, StringInfo tblspcmapfile);
389388
extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
390389
TimeLineID *stoptli_p);
391390
extern void do_pg_abort_backup(int code, Datum arg);

src/include/replication/basebackup.h

Lines changed: 10 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -20,12 +20,18 @@
2020
#define MAX_RATE_LOWER 32
2121
#define MAX_RATE_UPPER 1048576
2222

23+
/*
24+
* Information about a tablespace
25+
*
26+
* In some usages, "path" can be NULL to denote the PGDATA directory itself.
27+
*/
2328
typedef struct
2429
{
25-
char *oid;
26-
char *path;
27-
char *rpath; /* relative path within PGDATA, or NULL */
28-
int64 size;
30+
char *oid; /* tablespace's OID, as a decimal string */
31+
char *path; /* full path to tablespace's directory */
32+
char *rpath; /* relative path if it's within PGDATA, else
33+
* NULL */
34+
int64 size; /* total size as sent; -1 if not known */
2935
} tablespaceinfo;
3036

3137
extern void SendBaseBackup(BaseBackupCmd *cmd);

0 commit comments

Comments
 (0)