@@ -7,15 +7,16 @@ Distributed transaction management tools for PostgreSQL.
 --------------------
 Communication scheme
 --------------------
-
-          .- Backend -.
-         /             \
-        /               \
-    DTM ---- Backend ---- Coordinator
-        \               /
-         \             /
-          `- Backend -´
-
+                  ┏━━━━━━━━━┓
+         ┌────────┨ Backend ┠──────────┐
+         │        ┗━━━━━━━━━┛          │
+    ┏━━━━┷━━━━┓   ┏━━━━━━━━━┓   ┏━━━━━━┷━━━━━━┓
+    ┃ Arbiter ┠───┨ Backend ┠───┨ Coordinator ┃
+    ┗━━━━┯━━━━┛   ┗━━━━━━━━━┛   ┗━━━━━━┯━━━━━━┛
+         │        ┏━━━━━━━━━┓          │
+         └──┬─────┨ Backend ┠───────┬──┘
+            ┆     ┗━━━━━━━━━┛       ┆
+       libarbiter           libpq + xtm procs
 
 -----------------------
 Coordinator-Backend API
@@ -24,128 +25,80 @@ Coordinator-Backend API
 This API includes a set of postgres procedures that
 the coordinator can call with a "select" statement.
 
-    -- Informs the DTM about a global transaction
-    -- identified by the corresponding pairs of node:xid values.
-    dtm_begin_transaction(nodes integer[], xids integer[]) RETURNS void
-
-    -- Causes the backend to get a snapshot from the DTM
-    -- and merge it with the local snapshot.
-    dtm_get_snapshot() RETURNS void
-
-----------
-libdtm api
-----------
-
-    // Sets up the host and port for DTM connection.
-    // The defaults are "127.0.0.1" and 5431.
-    void TuneToDtm(char *host, int port);
-
-    void DtmInitSnapshot(Snapshot snapshot);
-
-    // Starts a new global transaction of nParticipants size. Returns the
-    // transaction id, fills the 'snapshot' and 'gxmin' on success. 'gxmin' is the
-    // smallest xmin among all snapshots known to DTM. Returns INVALID_XID
-    // otherwise.
-    TransactionId DtmGlobalStartTransaction(int nParticipants, Snapshot snapshot, TransactionId *gxmin);
-
-    // Asks the DTM for a fresh snapshot. Fills the 'snapshot' and 'gxmin' on
-    // success. 'gxmin' is the smallest xmin among all snapshots known to DTM.
-    void DtmGlobalGetSnapshot(TransactionId xid, Snapshot snapshot, TransactionId *gxmin);
-
-    // Commits transaction only once all participants have called this function,
-    // does not change CLOG otherwise. Set 'wait' to 'true' if you want this call
-    // to return only after the transaction is considered finished by the DTM.
-    // Returns the status on success, or -1 otherwise.
-    XidStatus DtmGlobalSetTransStatus(TransactionId xid, XidStatus status, bool wait);
-
-    // Gets the status of the transaction identified by 'xid'. Returns the status
-    // on success, or -1 otherwise. If 'wait' is true, then it does not return
-    // until the transaction is finished.
-    XidStatus DtmGlobalGetTransStatus(TransactionId xid, bool wait);
-
-    // Reserves at least 'nXids' successive xids for local transactions. The xids
-    // reserved are not less than 'xid' in value. Returns the actual number of xids
-    // reserved, and sets the 'first' xid accordingly. The number of xids reserved
-    // is guaranteed to be at least nXids.
-    // In other words, *first ≥ xid and result ≥ nXids.
-    // Also sets the 'active' snapshot, which is used as a container for the list
-    // of active global transactions.
-    int DtmGlobalReserve(TransactionId xid, int nXids, TransactionId *first, Snapshot active);
+FIXME: update the API description
 
---------------------
-Backend-DTM Protocol
---------------------
+------------------------
+Backend-Arbiter Protocol
+------------------------
 
-The queries from backend to DTM should be formatted according to this syntax.
+The underlying protocol (libsockhub) also transmits the message length, so
+there is no need for 'argc'. Every command or reply is a series of int64
+numbers.
 
-    <char cmd><hex16 argc><hex16 argv[0]><hex16 argv[1]>...
+The format of all commands:
+    [cmd, argv[0], argv[1], ...]
 
-<cmd> is a character representing a command.
-<argc> is the number of arguments.
-<argv[i]> are the arguments.
+'cmd' is a command.
+'argv[i]' are the arguments.
 
 The commands:
 
 'r': reserve(minxid, minsize)
     Claims a sequence of at least minsize xids, each ≥ minxid, for local usage. This will
-    prevent DTM from using those values for global transactions. The
-    'snapshot' represents the list of currently active global transactions.
+    prevent the arbiter from using those values for global transactions.
 
-    The DTM replies with:
-        '+'<hex16 min><hex16 max><snapshot> if reserved a range [min, max]
-        '-' on failure
+    The arbiter replies with:
+        [RES_OK, min, max] if reserved a range [min, max]
+        [RES_FAILED] on failure
 
 'b': begin(size)
     Starts a global transaction and assigns an 'xid' to it. 'size' is used
-    for vote results calculation. The DTM also creates and returns the
+    for vote results calculation. The arbiter also creates and returns the
     snapshot.
 
-    The DTM replies with:
-        '+'<hex16 xid><snapshot> if transaction started successfully
-        '-' on failure
+    The arbiter replies with:
+        [RES_OK, xid, *snapshot] if transaction started successfully
+        [RES_FAILED] on failure
 
     See the 'snapshot' command description for the snapshot format.
 
 's': status(xid, wait)
-    Asks the DTM about the status of the global transaction identified
+    Asks the arbiter about the status of the global transaction identified
     by the given 'xid'.
 
-    If 'wait' is true, DTM will not reply until it considers the
+    If 'wait' is 1, the arbiter will not reply until it considers the
     transaction finished (all nodes have voted, or one is dead).
 
-    The DTM replies with:
-        "+0" if not started
-        "+c" if committed
-        "+a" if aborted
-        "+?" if in progress
-        '-' if failed
+    The arbiter replies with:
+        [RES_TRANSACTION_UNKNOWN] if not started
+        [RES_TRANSACTION_COMMITTED] if committed
+        [RES_TRANSACTION_ABORTED] if aborted
+        [RES_TRANSACTION_INPROGRESS] if in progress
+        [RES_FAILED] if failed
 
 'y': for(xid, wait)
-    Tells the DTM to vote for commit of the global transaction identified
-    by the given 'xid'.
+    Tells the arbiter that this node votes for commit of the global
+    transaction identified by the given 'xid'.
 
     The reply and 'wait' logic is the same as for the 'status' command.
 
 'n': against(xid, wait)
-    Tells the DTM to vote against commit of the global transaction
-    identified by the given 'xid'.
+    Tells the arbiter that this node votes against commit of the global
+    transaction identified by the given 'xid'.
 
     The reply and 'wait' logic is the same as for the 'status' command.
 
 'h': snapshot(xid)
-    Tells the DTM to generate a snapshot for the global transaction
-    identified by the given 'xid'. The DTM will create a snapshot for every
-    participant, so when each of them asks for the snapshot it will reply
-    with the same snapshot. The DTM generates a fresh version if the same
-    client asks for a snapshot again for the same transaction.
+    Tells the arbiter to generate a snapshot for the global transaction
+    identified by the given 'xid'. The arbiter will create a snapshot for
+    every participant, so when each of them asks for the snapshot it will
+    reply with the same snapshot. The arbiter generates a fresh version if
+    the same client asks for a snapshot again for the same transaction.
 
     Joins the global transaction identified by the given 'xid', if not
     joined already.
 
-    The DTM replies with '+' followed by a snapshot in the form:
-
-        <hex16 gxmin><hex16 xmin><hex16 xmax><hex16 xcnt><hex16 xip[0]>...
-
-    Where 'gxmin' is the smallest xmin among all available snapshots.
+    The arbiter replies with [gxmin, xmin, xmax, xcnt, xip[0], xip[1]...],
+    where 'gxmin' is the smallest xmin among all available snapshots.
 
-    In case of a failure, the DTM replies with '-'.
+    In case of a failure, the arbiter replies with [RES_FAILED].
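
For illustration only, here is a minimal Python sketch of how a client might
pack a command and unpack a reply in the new int64-based format described in
the diff above. Several things here are assumptions that the README does not
state: the byte order (little-endian is assumed), the idea that 'cmd' travels
as the character code of the command letter, and the numeric values of RES_OK
and RES_FAILED. The function names are hypothetical and are not part of
libarbiter. Length framing is left to libsockhub and is not modeled.

```python
import struct

# Hypothetical values: the README names these constants but gives no numbers.
RES_FAILED = -1
RES_OK = 0


def encode_command(cmd, *args):
    """Pack [cmd, argv[0], argv[1], ...] as consecutive int64 values.

    Assumes little-endian order and that 'cmd' is sent as its character code.
    """
    words = [ord(cmd)] + list(args)
    return struct.pack("<%dq" % len(words), *words)


def decode_words(payload):
    """Unpack a message body into a list of int64 values."""
    if len(payload) % 8 != 0:
        raise ValueError("payload is not a whole number of int64 values")
    return list(struct.unpack("<%dq" % (len(payload) // 8), payload))


def parse_snapshot(words):
    """Split [gxmin, xmin, xmax, xcnt, xip[0], ...] into named fields.

    Returns None for the single-word failure reply [RES_FAILED].
    """
    if len(words) == 1 and words[0] == RES_FAILED:
        return None
    if len(words) < 4:
        raise ValueError("snapshot reply is too short")
    gxmin, xmin, xmax, xcnt = words[:4]
    xip = words[4:]
    if len(xip) != xcnt:
        raise ValueError("xcnt does not match the number of xip entries")
    return {"gxmin": gxmin, "xmin": xmin, "xmax": xmax, "xip": xip}


def parse_begin_reply(words):
    """Split [RES_OK, xid, *snapshot]; returns None on [RES_FAILED]."""
    if not words or words[0] != RES_OK:
        return None
    return words[1], parse_snapshot(words[2:])
```

With these helpers, encode_command('h', 42) would produce the body of a
snapshot(42) request, and parse_snapshot(decode_words(reply)) would turn the
reply body into named fields.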