[WIP] Connection pooler #799
Conversation
// Options for connection pooler
type ConnectionPool struct {
    NumberOfInstances *int32 `json:"instancesNumber,omitempty"`
My opinion is this is too many options.
pooling:
  numberOfPods: 2  # default to op config
  enabled: true    # default to false from op config
That's why most of them are optional.
I originally wanted to call it "pooling", but that's too broad and it's not clear what kind of "pool" we are talking about. NumberOfPods is good.
I would slim down some Postgres manifest options in favor of a slim user configuration. If one is not happy with the number of fields we offer, it is easy to deploy a pgbouncer similar to our config manually. I also think the type can probably be hidden behind the supplied Docker image, given we have good env vars specced.
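To make that concrete, here is a minimal sketch of such a slimmed-down spec section, with hypothetical field names (everything beyond the enable flag and the replica count would fall back to the operator config or the supplied Docker image):

// Hypothetical slimmed-down pooler section of the Postgres manifest;
// unset fields fall back to the operator configuration.
type ConnectionPool struct {
    // enable or disable the pooler deployment (defaults to false)
    Enabled bool `json:"enabled,omitempty"`
    // number of pooler pods; nil means "use the operator default"
    NumberOfInstances *int32 `json:"numberOfInstances,omitempty"`
}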
Add initial support for a connection pooler. The idea is to make it generic enough to be able to switch the corresponding docker image to change from pgbouncer to e.g. odyssey. The operator needs to create a deployment with the pooler and a service to access it.
Set up a proper owner reference to the StatefulSet, and delete with foreground policy to not leave orphans.
With conversion for config, and start tests.
Force-pushed from d1a756a to 6c37520.
Add synchronization logic. For now get rid of the podTemplate and type fields. Add CRD validation & configuration part, put retry on top of the lookup function installation.
Add pool configuration into CRD & charts. Add preliminary documentation. Rename NumberOfInstances to Replicas like in Deployment. Mention a couple of potential improvement points for the connection pool specification.
Force-pushed from 6dad833 to 2384e1e.
It requires more accurate lookup function synchronization and a couple of fixes on the way (e.g. get rid of using the schema from a user secret). For the lookup function, since it's idempotent, sync it when we're not sure whether it was installed (e.g. when the operator was shut down and syncs everything at startup) and then remember that it was installed.
Small typo-like fixes and proper installation of the lookup function in all the databases.
Rename the application for connection pool (ideally make it configurable in the future). Take into account nils for MaxInt32.
if env.Name == "PGUSER" {
    ref := env.ValueFrom.SecretKeyRef.LocalObjectReference

    if ref.Name != c.credentialSecretName(config.User) {
I think one check is missing here: only apply the change if the user in the spec is empty. Same for schema.
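Roughly what the suggested guard could look like, as a sketch on top of the snippet above (spec.ConnectionPool.User here is an illustrative name for the manifest value):

// Only compare against the operator-default credentials when the manifest
// does not pin its own user; an explicitly configured user wins.
if env.Name == "PGUSER" && spec.ConnectionPool.User == "" {
    ref := env.ValueFrom.SecretKeyRef.LocalObjectReference

    if ref.Name != c.credentialSecretName(config.User) {
        // ... mark the deployment as needing a sync, as in the PR
    }
}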
// Check if we need to synchronize connection pool deployment due to new
// defaults, that are different from what we see in the DeploymentSpec
func (c *Cluster) needSyncConnPoolDefaults(
maxDBConnections syncing is missing here (?)
It's mostly on purpose. The options mentioned there are reasonable to change and relatively easy to track. Not sure if there would ever be a need to change maxDBConnections (e.g. pgbouncer in all my tests was always keeping it substantially lower than this limit).
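For reference, a condensed sketch of what the defaults check compares under that reasoning (the helper and type names here are illustrative, not necessarily the exact ones from the PR):

// Sketch: only options that are cheap to track and reasonable to change at
// runtime are compared; maxDBConnections is intentionally left out.
func (c *Cluster) needSyncConnPoolDefaults(
    config *ConnPoolConfig, deployment *appsv1.Deployment) (bool, []string) {

    reasons := []string{}

    if *deployment.Spec.Replicas != *config.NumberOfInstances {
        reasons = append(reasons, "number of instances differs")
    }

    return len(reasons) > 0, reasons
}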
* Set minimum number of pool instances to 2
* Improve logging of sync reasons
* Improve logging of a new pool role
Verify the default values only if the schema doesn't override them.
Since connection poolers are usually CPU bound.
pkg/cluster/database.go
Outdated
$$ LANGUAGE plpgsql SECURITY DEFINER;

REVOKE ALL ON FUNCTION {{.pool_schema}}.user_lookup(text)
    FROM public, {{.pool_schema}};
-    FROM public, {{.pool_schema}};
+    FROM public, {{.pool_user}};
If you don't choose pooler as your user, it will not get created, which leads to a "role does not exist" error here.
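To make the suggestion concrete, the tail of the lookup-function SQL template would then read roughly like this, shown as a Go template string (the constant name is made up, and the GRANT is only the usual companion statement, not necessarily verbatim from the PR):

// Revoke from PUBLIC and from the pooler role, then grant execution back to
// the pooler role only, so it is the sole role allowed to call the function.
const connPoolLookupGrants = `
REVOKE ALL ON FUNCTION {{.pool_schema}}.user_lookup(text)
    FROM public, {{.pool_user}};

GRANT EXECUTE ON FUNCTION {{.pool_schema}}.user_lookup(text)
    TO {{.pool_user}};
`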
// in this case also do not forget to install lookup function as for
// creating cluster
if !oldNeedConnPool || !c.ConnectionPool.LookupFunction {
Doesn't this need to be &&? On each sync I see the LookupFunction logs ("Installing..."). oldNeedConnPool should be false on sync, but ConnectionPool.LookupFunction is set to true within the lookup function. Or is it intended?
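For illustration, this is what the && variant being asked about would mean (a sketch; installLookupFunction, schema and user are stand-ins for the PR's actual helper and variables):

// With &&: install only when the pooler is newly needed *and* there is no
// record of the lookup function having been installed yet. With the current
// ||, a sync where oldNeedConnPool is false re-triggers the installation.
if !oldNeedConnPool && !c.ConnectionPool.LookupFunction {
    if err := c.installLookupFunction(schema, user); err != nil {
        return fmt.Errorf("could not install lookup function: %v", err)
    }
}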
if *numberOfInstances < constants.ConnPoolMinInstances {
    msg := "Adjusted number of connection pool instances from %d to %d"
    c.logger.Warningf(msg, numberOfInstances, constants.ConnPoolMinInstances)
Something like coalesce(numberOfInstances, c.OpConfig.ConnectionPool.NumberOfInstances) would prevent logs from looking like: Adjusted number of connection pool instances from 824638671720 to 2
Nope, it's just that the pointer needs to be dereferenced.
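In other words, only the arguments to the format string need to change; a sketch, with the (presumed) clamping to the minimum included:

if *numberOfInstances < constants.ConnPoolMinInstances {
    msg := "Adjusted number of connection pool instances from %d to %d"
    // dereference the pointer, otherwise %d prints its address
    c.logger.Warningf(msg, *numberOfInstances, constants.ConnPoolMinInstances)
    *numberOfInstances = constants.ConnPoolMinInstances
}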
pkg/cluster/database.go
Outdated
}
defer func() {
    // in case if everything went fine this can generate a warning about
    // trying to close an empty connection.
Seen this, too. Can't you simply remove the closeDBConn block at the end of this function?
EDIT: Ah, you're connecting to multiple DBs. Okay. And can't c.pgDb != nil be used here, too?
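A possible shape for the guarded cleanup being discussed, assuming the open handle lives in c.pgDb and the close helper returns an error (a sketch, not the PR's exact code):

defer func() {
    // only close when a connection is still open; this avoids the warning
    // about closing an empty connection once all per-database connections
    // have already been closed in the loop above
    if c.pgDb != nil {
        if err := c.closeDbConn(); err != nil {
            c.logger.Errorf("could not close database connection: %v", err)
        }
    }
}()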
pkg/cluster/sync.go
Outdated
if newConnPool != nil {
    specSchema = newConnPool.Schema
    specUser = newConnPool.Schema
-    specUser = newConnPool.Schema
+    specUser = newConnPool.User
}

if spec.NumberOfInstances == nil &&
    *deployment.Spec.Replicas != *config.NumberOfInstances {
This should also check if deployment.Spec.Replicas is smaller than ConnPoolMinInstances. Otherwise, it could be set to 1 here and then raised to 2 again.
Nope, it needs to be done on the config level (e.g. when it's created).
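For illustration, clamping once where the effective value is computed, as hinted at above (a sketch built from the names used in this thread):

// Resolve the effective replica count: manifest value if set, otherwise the
// operator default, and never below the enforced minimum.
numberOfInstances := config.NumberOfInstances
if spec.NumberOfInstances != nil {
    numberOfInstances = spec.NumberOfInstances
}
if *numberOfInstances < constants.ConnPoolMinInstances {
    adjusted := int32(constants.ConnPoolMinInstances)
    numberOfInstances = &adjusted
}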
ConnPoolContainer            = 0
ConnPoolMaxDBConnections     = 60
ConnPoolMaxClientConnections = 10000
ConnPoolMinInstances         = 2
The number for min instances should be the same as the internal default for the connection pool's numberOfInstances. So raise it to two everywhere?
Set default numberOfInstances to 2. Add verifications for config. Fix schema/user typos. Avoid closing empty connections.
pkg/cluster/k8sres.go
Outdated
ObjectMeta: metav1.ObjectMeta{
    Name:      c.connPoolName(),
    Namespace: c.Namespace,
    Labels:    c.labelsSet(true),
Should this be connPoolLabelsSelector().MatchLabels instead, like in the pod template of this deployment?
pkg/cluster/k8sres.go
Outdated
ObjectMeta: metav1.ObjectMeta{
    Name:      c.connPoolName(),
    Namespace: c.Namespace,
    Labels:    c.labelsSet(true),
Same with the deployment. Use connPoolLabelsSelector().MatchLabels instead?
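That is, both objects would take their labels from the selector, something like the following (a sketch, assuming connPoolLabelsSelector() returns a *metav1.LabelSelector as the comments imply):

ObjectMeta: metav1.ObjectMeta{
    Name:      c.connPoolName(),
    Namespace: c.Namespace,
    // reuse the selector labels so the deployment, its pod template and
    // the service stay consistent
    Labels: c.connPoolLabelsSelector().MatchLabels,
},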
👍

👍
The idea is to support connection pooler creation via the operator. There are no strict dependencies; one can specify which type of pooler is required, but at the beginning probably only pgbouncer will be implemented.
Suggested changes to the manifest assume a new section:
Schema & user describe into which schema the lookup function will be installed for all available databases, and which user the pooler will use for connections. A set of reasonable default values must be provided, with the possibility to simply specify a full pod template for the deployment.
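A sketch of how that section could map onto the spec type from this PR, with the manifest keys as json tags (field names are partly taken from the diffs above and partly illustrative; PodTemplateSpec is from k8s.io/api/core/v1):

// Sketch of a connection pool section in the Postgres manifest; any unset
// field falls back to the operator configuration.
type ConnectionPool struct {
    // number of pooler pods to run
    NumberOfInstances *int32 `json:"numberOfInstances,omitempty"`
    // schema the lookup function is installed into, in every database
    Schema string `json:"schema,omitempty"`
    // role the pooler uses to connect to Postgres
    User string `json:"user,omitempty"`
    // pooler implementation, e.g. "pgbouncer"
    Type string `json:"type,omitempty"`
    // optional full pod template overriding the generated one
    PodTemplate *v1.PodTemplateSpec `json:"podTemplate,omitempty"`
}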