Commit a540338

Merge branch 'PGPROEE9_6_MULTIMASTER' of https://gitlab.postgrespro.ru/pgpro-dev/postgrespro into PGPROEE9_6_MULTIMASTER
2 parents 288462c + d391af4 commit a540338

13 files changed

+1489
-119
lines changed

contrib/mmts/Cluster.pm

Lines changed: 2 additions & 2 deletions
@@ -88,7 +88,7 @@ sub configure
 port = $pgport
 max_prepared_transactions = 10
 max_connections = 10
-max_worker_processes = 40
+max_worker_processes = 100
 wal_level = logical
 max_wal_senders = 5
 wal_sender_timeout = 0
@@ -101,7 +101,7 @@ sub configure
 multimaster.workers = 1
 multimaster.node_id = $id
 multimaster.conn_strings = '$connstr'
-multimaster.heartbeat_recv_timeout = 1050
+multimaster.heartbeat_recv_timeout = 2050
 multimaster.heartbeat_send_timeout = 250
 multimaster.max_nodes = $nnodes
 multimaster.ignore_tables_without_pk = true

contrib/mmts/doc/configuration.md

Lines changed: 4 additions & 4 deletions
@@ -23,6 +23,10 @@ Default: 10000000
 
 ```multimaster.cluster_name``` Name of the cluster. If you set this variable, `multimaster` checks that the cluster name is the same for all the cluster nodes.
 
+```multimaster.queue_size``` Multimaster queue size. default = 256*1024*1024
+
+```multimaster.trans_spill_threshold``` Maximal size (Mb) of transaction after which transaction is written to the disk. Default = 100, /* 100Mb */
+
 
 
 ## Questionable
@@ -33,10 +37,6 @@ Default: 10000000
 
 ```multimaster.max_2pc_ratio``` Maximal ratio (in percents) between prepare time at different nodes: if T is time of preparing transaction at some node, then transaction can be aborted if prepared responce was not received in T*MtmMax2PCRatio/100. default = 200, /* 2 times */
 
-```multimaster.queue_size``` Multimaster queue size. default = 256*1024*1024,
-
-```multimaster.trans_spill_threshold``` Maximal size (Mb) of transaction after which transaction is written to the disk. Default = 1000, /* 1Gb */ (istm reorderbuffer also can do that, isn't it?)
-
 ```multimaster.vacuum_delay``` Minimal age of records which can be vacuumed (seconds). default = 1.
 
 ```multimaster.worker``` Number of multimaster executor workers. Default = 8. (use dynamic workers with some timeout to die?)

contrib/mmts/doc/functions.md

Lines changed: 58 additions & 38 deletions
@@ -2,51 +2,71 @@
 
 ## Cluster information functions
 
-* `mtm.get_nodes_state()` — show status of nodes in cluster. Returns tuple of following values:
-* id, integer
-* disabled, bool
-* disconnected, bool
-* catchUp, bool
-* slotLag, bigint
-* avgTransDelay, bigint
-* lastStatusChange, timestamp
-* oldestSnapshot, bigint
-* SenderPid integer
-* SenderStartTime timestamp
-* ReceiverPid integer
-* ReceiverStartTime timestamp
-* connStr text
-* connectivityMask bigint
-
-* `mtm.get_cluster_state()` -- show whole cluster status
-* status, text
-* disabledNodeMask, bigint
-* disconnectedNodeMask, bigint
-* catchUpNodeMask, bigint
-* liveNodes, integer
-* allNodes, integer
-* nActiveQueries, integer
-* nPendingQueries, integer
-* queueSize, bigint
-* transCount, bigint
-* timeShift, bigint
-* recoverySlot, integer
-* xidHashSize, bigint
-* gidHashSize, bigint
-* oldestXid, bigint
-* configChanges, integer
+* `mtm.get_nodes_state()` — Shows the status of nodes in the cluster. Returns a tuple of the following values:
+* id, integer - Node ID.
+* enabled, bool - Shows whether the node is excluded from the cluster. The node can only be disabled if responses to heartbeats are not received within the `heartbeat_recv_timeout` time interval. When the node starts responding to heartbeats, `multimaster` can automatically restore the node and switch it back to the enabled state. Automatic recovery is only possible if the replication slot is still active. Otherwise, you can restore the node manually.
+* connected, bool - Shows whether the node is connected to the WAL sender.
+* slot_active, bool - Shows whether the node has an active replication slot. For a disabled node, the slot remains active until the `max_recovery_lag` value is reached.
+* stopped, bool - Shows whether replication to this node was stopped by the `mtm.stop_node()` function. A stopped node acts as a disabled one, but cannot be automatically recovered. Call `mtm.recover_node()` to re-enable such a node.
+* catchUp - During the node recovery, shows whether the data is recovered up to the `min_recovery_lag` value.
+* slotLag - Size of the WAL data that the replication slot holds for a disabled/stopped node. The slot is dropped when `slotLag` reaches the `max_recovery_lag` value.
+* avgTransDelay - An average commit delay caused by this node, in microseconds.
+* lastStatusChange - Last time when the node changed its status (enabled/disabled).
+* oldestSnapshot - The oldest global snapshot existing on this node.
+* SenderPid - Process ID of the WAL sender.
+* SenderStartTime - WAL sender start time.
+* ReceiverPid - Process ID of the WAL receiver.
+* ReceiverStartTime - WAL receiver start time.
+* connStr - Connection string to this node.
+* connectivityMask - Bitmask representing connectivity to neighbor nodes. Each bit represents a connection to node.
+* nHeartbeats - Number of heartbeat responses received from this node.
+
+* `mtm.collect_cluster_state()` - Collects the data returned by the `mtm.get_cluster_state()` function from all available nodes. For this function to work, in addition to replication connections, pg_hba.conf must allow ordinary connections to the node with the specified connection string.
+
+* `mtm.get_cluster_state()` - Shows the status of the multimaster extension. Returns a tuple of the following values:
+* status - Node status. Possible values are: "Initialization", "Offline", "Connected", "Online", "Recovery", "Recovered", "InMinor", "OutOfService". The <literal>inMinor</literal> status indicates that the corresponding node got disconnected from the majority of the cluster nodes. Even though the node is active, it will not accept write transactions until it is reconnected to the cluster.
+* disabledNodeMask - Bitmask of disabled nodes.
+* disconnectedNodeMask - Bitmask of disconnected nodes.
+* catchUpNodeMask - Bitmask of nodes that completed the recovery.
+* liveNodes - Number of enabled nodes.
+* allNodes - Number of nodes in the cluster. The majority of alive nodes is calculated based on this parameter.
+* nActiveQueries - Number of queries being currently processed on this node.
+* nPendingQueries - Number of queries waiting for execution on this node.
+* queueSize - Size of the pending query queue, in bytes.
+* transCount - The total number of replicated transactions processed by this node.
+* timeShift - Global snapshot shift caused by unsynchronized clocks on nodes, in microseconds.
+* recoverySlot - The node from which a failed node gets data updates during automatic recovery.
+* xidHashSize - Size of xid2state hash.
+* gidHashSize - Size of gid2state hash.
+* oldestXid - The oldest transaction ID on this node.
+* configChanges - Number of state changes (enabled/disabled) since the last reboot.
+* stalledNodeMask - Bitmask of nodes for which replication slots were dropped.
+* stoppedNodeMask - Bitmask of nodes that were stopped by `mtm.stop_node()`.
+* lastStatusChange - Timestamp of the last state change.
 
 
 ## Node management functions
 
-* `mtm.add_node(conn_str text)` -- add node to the cluster.
-* `mtm.drop_node(node integer, drop_slot bool default false)` -- exclude node from the cluster.
-* `mtm.poll_node(nodeId integer, noWait boolean default FALSE)` -- wait for node to become online.
-* `mtm.recover_node(node integer)` -- create replication slot for the node which was previously dropped together with it's slot.
+* `mtm.add_node(conn_str text)` -- Adds a new node to the cluster.
+* `conn_str` - Connection string for the new node. For example, for the database `mydb`, user `myuser`, and the new node `node4`, the connection string is `"dbname=mydb user=myuser host=node4"`. Type: `text`
+
+
+* `mtm.stop_node(node integer, drop_slot bool default false)` -- Excludes a node from the cluster.
+* `node` - ID of the node to be dropped that you specified in the `multimaster.node_id` variable. Type: `integer`
+* `drop_slot` - Optional. Defines whether the replication slot should be dropped together with the node. Set this option to true if you do not plan to restore the node in the future. Type: `boolean` Default: `false`
+
+
+* `mtm.recover_node(node integer)` -- Creates a replication slot for the node that was previously dropped together with its slot.
+* `node` - ID of the node to be restored.
+
+
+* `mtm.poll_node(nodeId integer, noWait boolean default FALSE)` -- Waits for the node to become online.
+
 
 ## Data management functions
 
-* `mtm.make_table_local(relation regclass)` -- stop replication for a given table
+* `mtm.make_table_local(relation regclass)` -- Stops replication for the specified table.
+* `relation` - The table you would like to exclude from the replication scheme. Type: `regclass`
 
 ## Debug functions
 
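
A usage sketch for the two status functions documented in this file, assuming the column names of the `mtm.node_state` and `mtm.cluster_state` types as this commit defines them in `multimaster--1.0.sql`. The mixed-case columns are declared with quoted identifiers, so they need double quotes in queries. The filter condition and the choice of columns are illustrative only.

```sql
-- List nodes that are currently disabled or were stopped explicitly.
SELECT id, enabled, connected, slot_active, stopped, "slotLag"
FROM mtm.get_nodes_state()
WHERE NOT enabled OR stopped;

-- Summarize how the local node sees the cluster.
SELECT status, "liveNodes", "allNodes", "nPendingQueries", "queueSize"
FROM mtm.get_cluster_state();
```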

contrib/mmts/multimaster--1.0.sql

Lines changed: 4 additions & 5 deletions
@@ -35,14 +35,13 @@ CREATE FUNCTION mtm.get_last_csn() RETURNS bigint
 AS 'MODULE_PATHNAME','mtm_get_last_csn'
 LANGUAGE C;
 
-
-CREATE TYPE mtm.node_state AS ("id" integer, "disabled" bool, "disconnected" bool, "catchUp" bool, "slotLag" bigint, "avgTransDelay" bigint, "lastStatusChange" timestamp, "oldestSnapshot" bigint, "SenderPid" integer, "SenderStartTime" timestamp, "ReceiverPid" integer, "ReceiverStartTime" timestamp, "connStr" text, "connectivityMask" bigint, "stalled" bool, "stopped" bool, "nHeartbeats" bigint);
+CREATE TYPE mtm.node_state AS ("id" integer, "enabled" bool, "connected" bool, "slot_active" bool, "stopped" bool, "catchUp" bool, "slotLag" bigint, "avgTransDelay" bigint, "lastStatusChange" timestamp, "oldestSnapshot" bigint, "SenderPid" integer, "SenderStartTime" timestamp, "ReceiverPid" integer, "ReceiverStartTime" timestamp, "connStr" text, "connectivityMask" bigint, "nHeartbeats" bigint);
 
 CREATE FUNCTION mtm.get_nodes_state() RETURNS SETOF mtm.node_state
 AS 'MODULE_PATHNAME','mtm_get_nodes_state'
 LANGUAGE C;
 
-CREATE TYPE mtm.cluster_state AS ("status" text, "disabledNodeMask" bigint, "disconnectedNodeMask" bigint, "catchUpNodeMask" bigint, "liveNodes" integer, "allNodes" integer, "nActiveQueries" integer, "nPendingQueries" integer, "queueSize" bigint, "transCount" bigint, "timeShift" bigint, "recoverySlot" integer,
+CREATE TYPE mtm.cluster_state AS ("id" integer, "status" text, "disabledNodeMask" bigint, "disconnectedNodeMask" bigint, "catchUpNodeMask" bigint, "liveNodes" integer, "allNodes" integer, "nActiveQueries" integer, "nPendingQueries" integer, "queueSize" bigint, "transCount" bigint, "timeShift" bigint, "recoverySlot" integer,
 "xidHashSize" bigint, "gidHashSize" bigint, "oldestXid" bigint, "configChanges" integer, "stalledNodeMask" bigint, "stoppedNodeMask" bigint, "lastStatusChange" timestamp);
 
 CREATE TYPE mtm.trans_state AS ("status" text, "gid" text, "xid" bigint, "coordinator" integer, "gxid" bigint, "csn" timestamp, "snapshot" timestamp, "local" boolean, "prepared" boolean, "active" boolean, "twophase" boolean, "votingCompleted" boolean, "participants" bigint, "voted" bigint, "configChanges" integer);
@@ -59,8 +58,8 @@ CREATE FUNCTION mtm.get_cluster_state() RETURNS mtm.cluster_state
 AS 'MODULE_PATHNAME','mtm_get_cluster_state'
 LANGUAGE C;
 
-CREATE FUNCTION mtm.get_cluster_info() RETURNS SETOF mtm.cluster_state
-AS 'MODULE_PATHNAME','mtm_get_cluster_info'
+CREATE FUNCTION mtm.collect_cluster_info() RETURNS SETOF mtm.cluster_state
+AS 'MODULE_PATHNAME','mtm_collect_cluster_info'
 LANGUAGE C;
 
 CREATE FUNCTION mtm.make_table_local(relation regclass) RETURNS void
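
Note that this script names the renamed function `mtm.collect_cluster_info()`, while `functions.md` in the same commit documents it as `mtm.collect_cluster_state()`. A minimal sketch of a call, assuming the SQL definition is the name that actually gets installed and that the function returns one `mtm.cluster_state` row for each node it can reach:

```sql
-- "id" is the column this commit adds at the front of mtm.cluster_state,
-- which makes it possible to tell the per-node rows apart.
SELECT id, status, "liveNodes", "allNodes"
FROM mtm.collect_cluster_info()
ORDER BY id;
```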
