Commit a494f10

Fix data loss in wal_level=minimal crash recovery of CREATE TABLESPACE.

If the system crashed between CREATE TABLESPACE and the next checkpoint, the result could be some files in the tablespace unexpectedly containing no rows. Affected files would be those for which the system did not write WAL; see the wal_skip_threshold documentation. Before v13, a different set of conditions governed the writing of WAL; see v12's <sect2 id="populate-pitr">. (The v12 conditions were broader in some ways and narrower in others.) Users may want to audit non-default tablespaces for unexpected short files. The bug could have truncated an index without affecting the associated table, and reindexing the index would fix that particular problem.

This fixes the bug by making create_tablespace_directories() more like TablespaceCreateDbspace(). create_tablespace_directories() was recursively removing tablespace contents, reasoning that WAL redo would recreate everything removed that way. That assumption holds for other wal_level values. Under wal_level=minimal, the old approach could delete files for which no other copy existed. Back-patch to 9.6 (all supported versions).

Reviewed by Robert Haas and Prabhat Sahu. Reported by Robert Haas.

Discussion: https://postgr.es/m/CA+TgmoaLO9ncuwvr2nN-J4VEP5XyAcy=zKiHxQzBbFRxxGxm0w@mail.gmail.com

1 parent 187b5fe commit a494f10

1 file changed: +19 -23 lines changed
src/backend/commands/tablespace.c

Lines changed: 19 additions & 23 deletions
@@ -617,40 +617,36 @@ create_tablespace_directories(const char *location, const Oid tablespaceoid)
 								location)));
 	}
 
-	if (InRecovery)
-	{
-		/*
-		 * Our theory for replaying a CREATE is to forcibly drop the target
-		 * subdirectory if present, and then recreate it. This may be more
-		 * work than needed, but it is simple to implement.
-		 */
-		if (stat(location_with_version_dir, &st) == 0 && S_ISDIR(st.st_mode))
-		{
-			if (!rmtree(location_with_version_dir, true))
-				/* If this failed, MakePGDirectory() below is going to error. */
-				ereport(WARNING,
-						(errmsg("some useless files may be left behind in old database directory \"%s\"",
-								location_with_version_dir)));
-		}
-	}
-
 	/*
 	 * The creation of the version directory prevents more than one tablespace
-	 * in a single location.
+	 * in a single location. This imitates TablespaceCreateDbspace(), but it
+	 * ignores concurrency and missing parent directories. The chmod() would
+	 * have failed in the absence of a parent. pg_tablespace_spcname_index
+	 * prevents concurrency.
 	 */
-	if (MakePGDirectory(location_with_version_dir) < 0)
+	if (stat(location_with_version_dir, &st) < 0)
 	{
-		if (errno == EEXIST)
+		if (errno != ENOENT)
 			ereport(ERROR,
-					(errcode(ERRCODE_OBJECT_IN_USE),
-					 errmsg("directory \"%s\" already in use as a tablespace",
+					(errcode_for_file_access(),
+					 errmsg("could not stat directory \"%s\": %m",
 							location_with_version_dir)));
-		else
+		else if (MakePGDirectory(location_with_version_dir) < 0)
 			ereport(ERROR,
 					(errcode_for_file_access(),
 					 errmsg("could not create directory \"%s\": %m",
 							location_with_version_dir)));
 	}
+	else if (!S_ISDIR(st.st_mode))
+		ereport(ERROR,
+				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+				 errmsg("\"%s\" exists but is not a directory",
+						location_with_version_dir)));
+	else if (!InRecovery)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_IN_USE),
+				 errmsg("directory \"%s\" already in use as a tablespace",
+						location_with_version_dir)));
 
 	/*
 	 * In recovery, remove old symlink, in case it points to the wrong place.
