<para>
Before you can do anything, you must initialize a database storage
area on disk. We call this a <firstterm>database cluster</firstterm>.
- (<acronym>SQL</acronym> uses the term catalog cluster.) A
+ (The <acronym>SQL</acronym> standard uses the term catalog cluster.) A
database cluster is a collection of databases that is managed by a
single instance of a running database server. After initialization, a
database cluster will contain a database named <literal>postgres</literal>,
</para>

<para>
- In file system terms, a database cluster will be a single directory
+ In file system terms, a database cluster is a single directory
under which all data will be stored. We call this the <firstterm>data
directory</firstterm> or <firstterm>data area</firstterm>. It is
completely up to you where you choose to store your data. There is no

<para>
<command>initdb</command> will attempt to create the directory you
- specify if it does not already exist. It is likely that it will not
- have the permission to do so (if you followed our advice and created
- an unprivileged account). In that case you should create the
- directory yourself (as root) and change the owner to be the
- <productname>PostgreSQL</productname> user. Here is how this might
- be done:
+ specify if it does not already exist. Of course, this will fail if
+ <command>initdb</command> does not have permissions to write in the
+ parent directory. It's generally recommendable that the
+ <productname>PostgreSQL</productname> user own not just the data
+ directory but its parent directory as well, so that this should not
+ be a problem. If the desired parent directory doesn't exist either,
+ you will need to create it first, using root privileges if the
+ grandparent directory isn't writable. So the process might look
+ like this:
<screen>
- root# <userinput>mkdir /usr/local/pgsql/data</userinput>
- root# <userinput>chown postgres /usr/local/pgsql/data</userinput>
+ root# <userinput>mkdir /usr/local/pgsql</userinput>
+ root# <userinput>chown postgres /usr/local/pgsql</userinput>
root# <userinput>su postgres</userinput>
postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
</screen>
</para>

<para>
<command>initdb</command> will refuse to run if the data directory
- looks like it has already been initialized.</para>
+ exists and already contains files; this is to prevent accidentally
+ overwriting an existing installation.
+ </para>

<para>
Because the data directory contains all the data stored in the
@@ -178,8 +183,30 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
locale setting. For details see <xref linkend="multibyte">.
</para>

+ <sect2 id="creating-cluster-mount-points">
+ <title>Use of Secondary File Systems</title>
+
+ <indexterm zone="creating-cluster-mount-points">
+ <primary>file system mount points</primary>
+ </indexterm>
+
+ <para>
+ Many installations create their database clusters on file systems
+ (volumes) other than the machine's <quote>root</> volume. If you
+ choose to do this, it is not advisable to try to use the secondary
+ volume's topmost directory (mount point) as the data directory.
+ Best practice is to create a directory within the mount-point
+ directory that is owned by the <productname>PostgreSQL</productname>
+ user, and then create the data directory within that. This avoids
+ permissions problems, particularly for operations such
+ as <application>pg_upgrade</>, and it also ensures clean failures if
+ the secondary volume is taken offline.
+ </para>
+
+ </sect2>
+
<sect2 id="creating-cluster-nfs">
- <title>Network File Systems</title>
+ <title>Use of Network File Systems</title>

<indexterm zone="creating-cluster-nfs">
<primary>Network File Systems</primary>
@@ -188,22 +215,30 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
<indexterm><primary>Network Attached Storage (<acronym>NAS</>)</><see>Network File Systems</></>

<para>
- Many installations create database clusters on network file systems.
- Sometimes this is done directly via <acronym>NFS</>, or by using a
+ Many installations create their database clusters on network file
+ systems. Sometimes this is done via <acronym>NFS</>, or by using a
Network Attached Storage (<acronym>NAS</>) device that uses
<acronym>NFS</> internally. <productname>PostgreSQL</> does nothing
special for <acronym>NFS</> file systems, meaning it assumes
- <acronym>NFS</> behaves exactly like locally-connected drives
- (<acronym>DAS</>, Direct Attached Storage). If client and server
- <acronym>NFS</> implementations have non-standard semantics, this can
+ <acronym>NFS</> behaves exactly like locally-connected drives.
+ If the client or server <acronym>NFS</> implementation does not
+ provide standard file system semantics, this can
cause reliability problems (see <ulink
url="http://www.time-travellers.org/shane/papers/NFS_considered_harmful.html"></ulink>).
Specifically, delayed (asynchronous) writes to the <acronym>NFS</>
- server can cause reliability problems; if possible, mount
- <acronym>NFS</> file systems synchronously (without caching) to avoid
- this. Also, soft-mounting <acronym>NFS</> is not recommended.
- (Storage Area Networks (<acronym>SAN</>) use a low-level
- communication protocol rather than <acronym>NFS</>.)
+ server can cause data corruption problems. If possible, mount the
+ <acronym>NFS</> file system synchronously (without caching) to avoid
+ this hazard. Also, soft-mounting the <acronym>NFS</> file system is
+ not recommended.
+ </para>
+
+ <para>
+ Storage Area Networks (<acronym>SAN</>) typically use communication
+ protocols other than <acronym>NFS</>, and may or may not be subject
+ to hazards of this sort. It's advisable to consult the vendor's
+ documentation concerning data consistency guarantees.
+ <productname>PostgreSQL</productname> cannot be more reliable than
+ the file system it's using.
</para>

</sect2>