Commit 208e41d

Update our documentation concerning where to create data directories.
Although initdb has long discouraged use of a filesystem mount-point directory as a PG data directory, this point was covered nowhere in the user-facing documentation. Also, with the popularity of pg_upgrade, we really need to recommend that the PG user own not only the data directory but its parent directory too. (Without a writable parent directory, operations such as "mv data data.old" fail immediately. pg_upgrade itself doesn't do that, but wrapper scripts for it often do.) Hence, adjust the "Creating a Database Cluster" section to address these points. I also took the liberty of wordsmithing the discussion of NFS a bit. These considerations aren't by any means new, so back-patch to all supported branches.
1 parent f527c0a commit 208e41d

1 file changed: +57 -22 lines changed
doc/src/sgml/runtime.sgml

Lines changed: 57 additions & 22 deletions
@@ -49,7 +49,7 @@
 <para>
 Before you can do anything, you must initialize a database storage
 area on disk. We call this a <firstterm>database cluster</firstterm>.
-(<acronym>SQL</acronym> uses the term catalog cluster.) A
+(The <acronym>SQL</acronym> standard uses the term catalog cluster.) A
 database cluster is a collection of databases that is managed by a
 single instance of a running database server. After initialization, a
 database cluster will contain a database named <literal>postgres</literal>,
@@ -65,7 +65,7 @@
 </para>
 
 <para>
-In file system terms, a database cluster will be a single directory
+In file system terms, a database cluster is a single directory
 under which all data will be stored. We call this the <firstterm>data
 directory</firstterm> or <firstterm>data area</firstterm>. It is
 completely up to you where you choose to store your data. There is no
@@ -109,23 +109,28 @@
 
 <para>
 <command>initdb</command> will attempt to create the directory you
-specify if it does not already exist. It is likely that it will not
-have the permission to do so (if you followed our advice and created
-an unprivileged account). In that case you should create the
-directory yourself (as root) and change the owner to be the
-<productname>PostgreSQL</productname> user. Here is how this might
-be done:
+specify if it does not already exist. Of course, this will fail if
+<command>initdb</command> does not have permissions to write in the
+parent directory. It's generally recommendable that the
+<productname>PostgreSQL</productname> user own not just the data
+directory but its parent directory as well, so that this should not
+be a problem. If the desired parent directory doesn't exist either,
+you will need to create it first, using root privileges if the
+grandparent directory isn't writable. So the process might look
+like this:
 <screen>
-root# <userinput>mkdir /usr/local/pgsql/data</userinput>
-root# <userinput>chown postgres /usr/local/pgsql/data</userinput>
+root# <userinput>mkdir /usr/local/pgsql</userinput>
+root# <userinput>chown postgres /usr/local/pgsql</userinput>
 root# <userinput>su postgres</userinput>
 postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
 </screen>
 </para>
 
 <para>
 <command>initdb</command> will refuse to run if the data directory
-looks like it has already been initialized.</para>
+exists and already contains files; this is to prevent accidentally
+overwriting an existing installation.
+</para>
 
 <para>
 Because the data directory contains all the data stored in the
@@ -175,8 +180,30 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
 locale setting. For details see <xref linkend="multibyte">.
 </para>
 
+<sect2 id="creating-cluster-mount-points">
+<title>Use of Secondary File Systems</title>
+
+<indexterm zone="creating-cluster-mount-points">
+<primary>file system mount points</primary>
+</indexterm>
+
+<para>
+Many installations create their database clusters on file systems
+(volumes) other than the machine's <quote>root</> volume. If you
+choose to do this, it is not advisable to try to use the secondary
+volume's topmost directory (mount point) as the data directory.
+Best practice is to create a directory within the mount-point
+directory that is owned by the <productname>PostgreSQL</productname>
+user, and then create the data directory within that. This avoids
+permissions problems, particularly for operations such
+as <application>pg_upgrade</>, and it also ensures clean failures if
+the secondary volume is taken offline.
+</para>
+
+</sect2>
+
 <sect2 id="creating-cluster-nfs">
-<title>Network File Systems</title>
+<title>Use of Network File Systems</title>
 
 <indexterm zone="creating-cluster-nfs">
 <primary>Network File Systems</primary>
@@ -185,22 +212,30 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
 <indexterm><primary>Network Attached Storage (<acronym>NAS</>)</><see>Network File Systems</></>
 
 <para>
-Many installations create database clusters on network file systems.
-Sometimes this is done directly via <acronym>NFS</>, or by using a
+Many installations create their database clusters on network file
+systems. Sometimes this is done via <acronym>NFS</>, or by using a
 Network Attached Storage (<acronym>NAS</>) device that uses
 <acronym>NFS</> internally. <productname>PostgreSQL</> does nothing
 special for <acronym>NFS</> file systems, meaning it assumes
-<acronym>NFS</> behaves exactly like locally-connected drives
-(<acronym>DAS</>, Direct Attached Storage). If client and server
-<acronym>NFS</> implementations have non-standard semantics, this can
+<acronym>NFS</> behaves exactly like locally-connected drives.
+If the client or server <acronym>NFS</> implementation does not
+provide standard file system semantics, this can
 cause reliability problems (see <ulink
 url="http://www.time-travellers.org/shane/papers/NFS_considered_harmful.html"></ulink>).
 Specifically, delayed (asynchronous) writes to the <acronym>NFS</>
-server can cause reliability problems; if possible, mount
-<acronym>NFS</> file systems synchronously (without caching) to avoid
-this. Also, soft-mounting <acronym>NFS</> is not recommended.
-(Storage Area Networks (<acronym>SAN</>) use a low-level
-communication protocol rather than <acronym>NFS</>.)
+server can cause data corruption problems. If possible, mount the
+<acronym>NFS</> file system synchronously (without caching) to avoid
+this hazard. Also, soft-mounting the <acronym>NFS</> file system is
+not recommended.
+</para>
+
+<para>
+Storage Area Networks (<acronym>SAN</>) typically use communication
+protocols other than <acronym>NFS</>, and may or may not be subject
+to hazards of this sort. It's advisable to consult the vendor's
+documentation concerning data consistency guarantees.
+<productname>PostgreSQL</productname> cannot be more reliable than
+the file system it's using.
 </para>
 
 </sect2>
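
The ownership advice in this change can be checked before running initdb. The following is a minimal shell sketch and not part of the commit. The function name check_data_dir_parent is hypothetical. It only tests whether the current user can write to the parent of a proposed data directory, which is what creating the directory (or a later "mv data data.old") requires. It does not detect mount points or NFS mounts.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not a PostgreSQL tool): report whether
# the current user could create or rename a data directory at the given path.
# Returns 0 if the parent is usable, 1 if it is not writable, 2 if it is missing.
check_data_dir_parent() {
    data_dir=$1
    parent=$(dirname "$data_dir")

    if [ ! -d "$parent" ]; then
        echo "missing parent: $parent"
        return 2
    fi

    # Creating or renaming an entry needs write and search permission
    # on the directory that contains it.
    if [ ! -w "$parent" ] || [ ! -x "$parent" ]; then
        echo "parent not writable: $parent"
        return 1
    fi

    echo "ok: $parent"
    return 0
}
```

Run as the PostgreSQL user, "check_data_dir_parent /usr/local/pgsql/data" would report whether /usr/local/pgsql exists and is writable by that user. Note that the superuser typically passes the writability test regardless of directory permissions, so the check is only meaningful for an unprivileged account.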