Commit b997e36

be more permissive with standbys (zalando#842)
* be more permissive with standbys
* reflect feedback and updated docs
1 parent 7b94060 commit b997e36

4 files changed

+110
-52
lines changed

docs/administrator.md

Lines changed: 5 additions & 5 deletions
@@ -11,11 +11,11 @@ switchover (planned failover) of the master to the Pod with new minor version.
 The switch should usually take less than 5 seconds, still clients have to
 reconnect.
 
-Major version upgrades are supported via [cloning](user.md#clone-directly). The
-new cluster manifest must have a higher `version` string than the source cluster
-and will be created from a basebackup. Depending of the cluster size, downtime
-in this case can be significant as writes to the database should be stopped and
-all WAL files should be archived first before cloning is started.
+Major version upgrades are supported via [cloning](user.md#how-to-clone-an-existing-postgresql-cluster).
+The new cluster manifest must have a higher `version` string than the source
+cluster and will be created from a basebackup. Depending of the cluster size,
+downtime in this case can be significant as writes to the database should be
+stopped and all WAL files should be archived first before cloning is started.
 
 Note, that simply changing the version string in the `postgresql` manifest does
 not work at present and leads to errors. Neither Patroni nor Postgres Operator

docs/reference/operator_parameters.md

Lines changed: 4 additions & 2 deletions
@@ -110,8 +110,10 @@ Those are top-level keys, containing both leaf keys and groups.
 
 * **min_instances**
   operator will run at least the number of instances for any given Postgres
-  cluster equal to the value of this parameter. When `-1` is specified, no
-  limits are applied. The default is `-1`.
+  cluster equal to the value of this parameter. Standby clusters can still run
+  with `numberOfInstances: 1` as this is the [recommended setup](../user.md#setting-up-a-standby-cluster).
+  When `-1` is specified for `min_instances`, no limits are applied. The default
+  is `-1`.
 
 * **resync_period**
   period between consecutive sync requests. The default is `30m`.

docs/user.md

Lines changed: 95 additions & 41 deletions
@@ -254,29 +254,22 @@ spec:
 
 ## How to clone an existing PostgreSQL cluster
 
-You can spin up a new cluster as a clone of the existing one, using a clone
+You can spin up a new cluster as a clone of the existing one, using a `clone`
 section in the spec. There are two options here:
 
-* Clone directly from a source cluster using `pg_basebackup`
-* Clone from an S3 bucket
+* Clone from an S3 bucket (recommended)
+* Clone directly from a source cluster
 
-### Clone directly
-
-```yaml
-spec:
-  clone:
-    cluster: "acid-batman"
-```
-
-Here `cluster` is a name of a source cluster that is going to be cloned. The
-cluster to clone is assumed to be running and the clone procedure invokes
-`pg_basebackup` from it. The operator will setup the cluster to be cloned to
-connect to the service of the source cluster by name (if the cluster is called
-test, then the connection string will look like host=test port=5432), which
-means that you can clone only from clusters within the same namespace.
+Note, that cloning can also be used for [major version upgrades](administrator.md#minor-and-major-version-upgrade)
+of PostgreSQL.
 
 ### Clone from S3
 
+Cloning from S3 has the advantage that there is no impact on your production
+database. A new Postgres cluster is created by restoring the data of another
+source cluster. If you create it in the same Kubernetes environment, use a
+different name.
+
 ```yaml
 spec:
   clone:
@@ -287,7 +280,8 @@ spec:
 
 Here `cluster` is a name of a source cluster that is going to be cloned. A new
 cluster will be cloned from S3, using the latest backup before the `timestamp`.
-In this case, `uid` field is also mandatory - operator will use it to find a
+Note, that a time zone is required for `timestamp` in the format of +00:00 which
+is UTC. The `uid` field is also mandatory. The operator will use it to find a
 correct key inside an S3 bucket. You can find this field in the metadata of the
 source cluster:
 
@@ -299,9 +293,6 @@ metadata:
   uid: efd12e58-5786-11e8-b5a7-06148230260c
 ```
 
-Note that timezone is required for `timestamp`. Otherwise, offset is relative
-to UTC, see [RFC 3339 section 5.6) 3339 section 5.6](https://www.ietf.org/rfc/rfc3339.txt).
-
 For non AWS S3 following settings can be set to support cloning from other S3
 implementations:
 
@@ -317,35 +308,98 @@ spec:
     s3_force_path_style: true
 ```
 
+### Clone directly
+
+Another way to get a fresh copy of your source DB cluster is via basebackup. To
+use this feature simply leave out the timestamp field from the clone section.
+The operator will connect to the service of the source cluster by name. If the
+cluster is called test, then the connection string will look like host=test
+port=5432), which means that you can clone only from clusters within the same
+namespace.
+
+```yaml
+spec:
+  clone:
+    cluster: "acid-batman"
+```
+
+Be aware that on a busy source database this can result in an elevated load!
+
 ## Setting up a standby cluster
 
-Standby clusters are like normal cluster but they are streaming from a remote
-cluster. As the first version of this feature, the only scenario covered by
-operator is to stream from a WAL archive of the master. Following the more
-popular infrastructure of using Amazon's S3 buckets, it is mentioned as
-`s3_wal_path` here. To start a cluster as standby add the following `standby`
-section in the YAML file:
+Standby cluster is a [Patroni feature](https://github.com/zalando/patroni/blob/master/docs/replica_bootstrap.rst#standby-cluster)
+that first clones a database, and keeps replicating changes afterwards. As the
+replication is happening by the means of archived WAL files (stored on S3 or
+the equivalent of other cloud providers), the standby cluster can exist in a
+different location than its source database. Unlike cloning, the PostgreSQL
+version between source and target cluster has to be the same.
+
+To start a cluster as standby, add the following `standby` section in the YAML
+file and specify the S3 bucket path. An empty path will result in an error and
+no statefulset will be created.
 
 ```yaml
 spec:
   standby:
     s3_wal_path: "s3 bucket path to the master"
 ```
 
-Things to note:
-
-- An empty string in the `s3_wal_path` field of the standby cluster will result
-  in an error and no statefulset will be created.
-- Only one pod can be deployed for stand-by cluster.
-- To manually promote the standby_cluster, use `patronictl` and remove config
-  entry.
-- There is no way to transform a non-standby cluster to a standby cluster
-  through the operator. Adding the standby section to the manifest of a running
-  Postgres cluster will have no effect. However, it can be done through Patroni
-  by adding the [standby_cluster](https://github.com/zalando/patroni/blob/bd2c54581abb42a7d3a3da551edf0b8732eefd27/docs/replica_bootstrap.rst#standby-cluster)
-  section using `patronictl edit-config`. Note that the transformed standby
-  cluster will not be doing any streaming. It will be in standby mode and allow
-  read-only transactions only.
+At the moment, the operator only allows to stream from the WAL archive of the
+master. Thus, it is recommended to deploy standby clusters with only [one pod](../manifests/standby-manifest.yaml#L10).
+You can raise the instance count when detaching. Note, that the same pod role
+labels like for normal clusters are used: The standby leader is labeled as
+`master`.
+
+### Providing credentials of source cluster
+
+A standby cluster is replicating the data (including users and passwords) from
+the source database and is read-only. The system and application users (like
+standby, postgres etc.) all have a password that does not match the credentials
+stored in secrets which are created by the operator. One solution is to create
+secrets beforehand and paste in the credentials of the source cluster.
+Otherwise, you will see errors in the Postgres logs saying users cannot log in
+and the operator logs will complain about not being able to sync resources.
+This, however, can safely be ignored as it will be sorted out once the cluster
+is detached from the source (and it’s still harmless if you don’t plan to).
+
+You can also edit the secrets afterwards. Find them by:
+
+```bash
+kubectl get secrets --all-namespaces | grep <postgres-cluster-name>
+```
+
+### Promote the standby
+
+One big advantage of standby clusters is that they can be promoted to a proper
+database cluster. This means it will stop replicating changes from the source,
+and start accept writes itself. This mechanism makes it possible to move
+databases from one place to another with minimal downtime. Currently, the
+operator does not support promoting a standby cluster. It has to be done
+manually using `patronictl edit-config` inside the postgres container of the
+standby leader pod. Remove the following lines from the YAML structure and the
+leader promotion happens immediately. Before doing so, make sure that the
+standby is not behind the source database.
+
+```yaml
+standby_cluster:
+  create_replica_methods:
+  - bootstrap_standby_with_wale
+  - basebackup_fast_xlog
+  restore_command: envdir "/home/postgres/etc/wal-e.d/env-standby" /scripts/restore_command.sh
+    "%f" "%p"
+```
+
+Finally, remove the `standby` section from the postgres cluster manifest.
+
+### Turn a normal cluster into a standby
+
+There is no way to transform a non-standby cluster to a standby cluster through
+the operator. Adding the `standby` section to the manifest of a running
+Postgres cluster will have no effect. But, as explained in the previous
+paragraph it can be done manually through `patronictl edit-config`. This time,
+by adding the `standby_cluster` section to the Patroni configuration. However,
+the transformed standby cluster will not be doing any streaming. It will be in
+standby mode and allow read-only transactions only.
 
 ## Sidecar Support
 
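
The updated clone documentation in docs/user.md states that `timestamp` needs a time zone offset such as +00:00. As a rough illustration only, and not the operator's own validation (which this commit does not show), the sketch below uses Go's standard `time.Parse` with the `time.RFC3339` layout, which rejects a timestamp that carries no offset. The helper name `hasTimeZone` and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// hasTimeZone is a hypothetical helper: it reports whether ts parses as an
// RFC 3339 timestamp, which requires an explicit offset (e.g. +00:00) or "Z".
func hasTimeZone(ts string) bool {
	_, err := time.Parse(time.RFC3339, ts)
	return err == nil
}

func main() {
	samples := []string{
		"2017-12-19T12:40:33+00:00", // explicit UTC offset
		"2017-12-19T12:40:33",       // no offset given
	}
	for _, ts := range samples {
		fmt.Printf("%s valid=%v\n", ts, hasTimeZone(ts))
	}
}
```

A check like this could run before a manifest is applied, so that a missing offset is caught early rather than during the restore.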

pkg/cluster/k8sres.go

Lines changed: 6 additions & 4 deletions
@@ -1048,11 +1048,13 @@ func (c *Cluster) getNumberOfInstances(spec *acidv1.PostgresSpec) int32 {
 	cur := spec.NumberOfInstances
 	newcur := cur
 
-	/* Limit the max number of pods to one, if this is standby-cluster */
 	if spec.StandbyCluster != nil {
-		c.logger.Info("Standby cluster can have maximum of 1 pod")
-		min = 1
-		max = 1
+		if newcur == 1 {
+			min = newcur
+			max = newcur
+		} else {
+			c.logger.Warningf("operator only supports standby clusters with 1 pod")
+		}
 	}
 	if max >= 0 && newcur > max {
 		newcur = max
0 commit comments

Comments
 (0)