Skip to content

postgrespro/postgres_cluster

Repository files navigation

===
xtm
===

Distributed transaction management tools for PostgreSQL.

--------------------
Communication scheme
--------------------
              ┏━━━━━━━━━┓
     ┌────────┨ Backend ┠──────────┐
     │        ┗━━━━━━━━━┛          │
┏━━━━┷━━━━┓   ┏━━━━━━━━━┓   ┏━━━━━━┷━━━━━━┓
┃ Arbiter ┠───┨ Backend ┠───┨ Coordinator ┃
┗━━━━┯━━━━┛   ┗━━━━━━━━━┛   ┗━━━━━━┯━━━━━━┛
     │        ┏━━━━━━━━━┓          │
     └──┬─────┨ Backend ┠───────┬──┘
        ┆     ┗━━━━━━━━━┛       ┆
 libdtm + libsockhub      libpq + xtm procs

-----------------------
Coordinator-Backend API
-----------------------

This API includes a set of postgres procedures that
the coordinator can call with "select" statement.

FIXME: actualize the API

------------------------
Backend-Arbiter Protocol
------------------------

The underlying protocol (libsockhub) also transmits the message length, so
there is no need in 'argc'. Every command or reply is a series of int64
numbers.

The format of all commands:
	[cmd, argv[0], argv[1], ...]

'cmd' is a command.
'argv[i]' are the arguments.

The commands:

'r': reserve(minxid, minsize)
	Claims a sequence ≥ minsize of xids ≥ minxid for local usage. This will
	prevent the arbiter from using those values for global transactions.

	The arbiter replies with:
		[RES_OK, min, max] if reserved a range [min, max]
		[RES_FAILED] on failure

'b': begin(size)
	Starts a global transaction and assign a 'xid' to it. 'size' is used
	for vote results calculation. The arbiter also creates and returns the
	snapshot.

	The arbiter replies with:
		[RES_OK, xid, *snapshot] if transaction started successfully
		[RES_FAILED] on failure

	See the 'snapshot' command description for the snapshot format.

's': status(xid, wait)
	Asks the arbiter about the status of the global transaction identified
	by the given 'xid'.

	If 'wait' is 1, the arbiter will not reply until it considers the
	transaction finished (all nodes voted, or one dead).

	The arbiter replies with:
		[RES_TRANSACTION_UNKNOWN] if not started
		[RES_TRANSACTION_COMMITTED] if committed
		[RES_TRANSACTION_ABORTED] if aborted
		[RES_TRANSACTION_INPROGRESS] if in progress
		[RES_FAILED] if failed

'y': for(xid, wait)
	Tells the arbiter that this node votes for commit of the global
	transaction identified by the given 'xid'.

	The reply and 'wait' logic is the same as for the 'status' command.

'n': against(xid, wait)
	Tells the arbiter that this node votes againts commit of the global
	transaction identified by the given 'xid'.

	The reply and 'wait' logic is the same as for the 'status' command.

'h': snapshot(xid)
	Tells the arbiter to generate a snapshot for the global transaction
	identified by the given 'xid'. The arbiter will create a snapshot for
	every participant, so when each of them asks for the snapshot it will
	reply with the same snapshot. The arbiter generates a fresh version if
	the same client asks for a snapshot again for the same transaction.

	Joins the global transaction identified by the given 'xid', if not
	joined already.

	The arbiter replies with [RES_OK, gxmin, xmin, xmax, xcnt, xip[0], xip[1]...],
	where 'gxmin' is the smallest xmin among all available snapshots.

	In case of a failure, the arbiter replies with [RES_FAILED].

About

Various experiments with PostgreSQL clustering

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 38