Clients can use transactions (TX or DTX) with the current HA module, but the following requirements must be met:
Atomic: Transaction results must be atomic on primary and backups.
Consistent: Backups and primary must agree on whether the transaction committed.
Isolated: Concurrent transactions must not interfere with each other.
Durable: Transactional state is written to the store.
Recovery: If a cluster crashes while a DTX is in progress, it can be restarted and participate in DTX recovery with a transaction coordinator (currently via JMS).
Both TX and DTX transactions require a DTX-like prepare phase to synchronize the commit or rollback across multiple brokers. This design note applies to both.
For DTX the transaction is identified by the xid. For TX the HA module generates a unique identifier.
Introduce 2 new interfaces on the Primary to intercept transactional operations:
    TransactionObserverFactory {
        TransactionObserver create(...);
    }

    TransactionObserver {
        publish(queue, msg);
        accept(accumulatedAck, unacked)  # NOTE: translate queue positions to replication ids
        prepare()
        commit()
        rollback()
    }
The primary will register a TransactionObserverFactory with the broker to hook into transactions. On the primary, transactional events are processed as normal and are additionally passed to the TransactionObserver, which replicates the events to the backups.
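The observer hook described above can be pictured with a small Python model. This is only a sketch, not the broker's API: the `replicate` callback, the `tx_id` attribute, the `xid` keyword argument, and the use of `uuid4` for TX identifiers are all assumptions made for illustration. The translation of queue positions to replication ids is left out.

```python
import uuid


class TransactionObserver:
    """Illustrative observer that forwards each transactional event to a sink."""

    def __init__(self, tx_id, replicate):
        self.tx_id = tx_id
        # Hypothetical sink: callable(tx_id, event_name, payload).
        self._replicate = replicate

    def publish(self, queue, msg):
        self._replicate(self.tx_id, "publish", (queue, msg))

    def accept(self, accumulated_ack, unacked):
        # A real observer would translate queue positions to replication ids here.
        self._replicate(self.tx_id, "accept", (accumulated_ack, unacked))

    def prepare(self):
        self._replicate(self.tx_id, "prepare", None)

    def commit(self):
        self._replicate(self.tx_id, "commit", None)

    def rollback(self):
        self._replicate(self.tx_id, "rollback", None)


class TransactionObserverFactory:
    """Illustrative factory that the primary would register with the broker."""

    def __init__(self, replicate):
        self._replicate = replicate

    def create(self, xid=None):
        # DTX supplies an xid; for TX a unique identifier is generated (assumed: UUID).
        tx_id = xid if xid is not None else str(uuid.uuid4())
        return TransactionObserver(tx_id, self._replicate)
```

In this model the broker would call `create` once per transaction and invoke the observer alongside its normal transaction processing.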
The primary creates a specially-named queue for each transaction (the tx-queue).
TransactionObserver operations:
The tx-queue is replicated like a normal queue with some extensions for transactions.
TransactionReplicator (extends QueueReplicator)
QueueReplicator builds up a TxBuffer or DtxBuffer of transaction events.
TxReplicatingSubscription (extends ReplicatingSubscription)
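As a rough illustration of how a backup might accumulate transaction events before the outcome is known, here is a minimal Python sketch of a buffer. It is not the broker's `TxBuffer`: the class name `SketchTxBuffer`, its method names, and the representation of queues as dictionaries keyed by replication id are assumptions chosen for brevity.

```python
class SketchTxBuffer:
    """Holds replicated transaction events until commit or rollback."""

    def __init__(self):
        self._ops = []  # ordered list of ("enq" | "deq", queue, replication_id, message)

    def enqueue(self, queue, replication_id, message):
        self._ops.append(("enq", queue, replication_id, message))

    def dequeue(self, queue, replication_id):
        self._ops.append(("deq", queue, replication_id, None))

    def commit(self, queues):
        # queues: dict mapping queue name -> dict of replication_id -> message.
        # All buffered operations are applied together, in the order received.
        for op, queue, replication_id, message in self._ops:
            if op == "enq":
                queues.setdefault(queue, {})[replication_id] = message
            else:
                queues.get(queue, {}).pop(replication_id, None)
        self._ops.clear()

    def rollback(self):
        # Discard everything; the local queues were never touched.
        self._ops.clear()
```

Nothing reaches the local queues until `commit`, so in this model a rollback or an aborted transaction leaves the backup's queues unchanged.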
Using a tx-queue has the following benefits:
Primary receives dtx.prepare:
TODO: this means we need asynchronous completion of the prepare control, so that it completes only when all backups have responded (or timed out).
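One way to picture the asynchronous completion mentioned in the TODO is a small tracker that fires a callback once every backup has answered or a timer expires. This is a sketch under stated assumptions: the class `PrepareCompletion`, its methods, and the policy of treating a failed local prepare or a timeout as a failed prepare are all hypothetical, and the model is single-threaded (a real broker would need locking).

```python
class PrepareCompletion:
    """Completes a prepare exactly once: on all responses, a failure, or a timeout."""

    def __init__(self, backups, on_complete):
        self._pending = set(backups)
        self._on_complete = on_complete  # callable(ok: bool), invoked exactly once
        self._done = False

    def respond(self, backup, ok):
        # Ignore late or unexpected responses.
        if self._done or backup not in self._pending:
            return
        self._pending.discard(backup)
        if not ok:
            self._finish(False)  # assumed policy: one failed prepare fails the tx
        elif not self._pending:
            self._finish(True)

    def timed_out(self):
        # Called by a timer; assumed policy: a timeout fails the prepare.
        self._finish(False)

    def _finish(self, ok):
        if not self._done:
            self._done = True
            self._on_complete(ok)
```

The primary would create one tracker per prepare and complete the prepare control from `on_complete` instead of blocking the connection thread.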
Backups receiving the prepare event do a local prepare. The outcome is communicated to the TxReplicatingSubscription on the primary as follows:
Primary receives dtx.commit/rollback
Primary receives tx.commit/rollback
When tx commits, each broker (backup & primary) pushes tx messages to the local queue.
On the primary, ReplicatingSubscriptions will see the tx messages on the local queue. The primary must avoid re-sending them to backups that already have the messages from the transaction.
Each ReplicatingSubscription has a “skip set” of messages already on the backup; tx messages are added to the skip set before the primary commits them to the local queue.
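The skip-set idea can be sketched in a few lines of Python. The names `SkippingSubscription`, `add_skip`, `on_enqueue`, and `commit_to_local_queue` are hypothetical, and messages are modelled as `(replication_id, body)` pairs purely for illustration.

```python
class SkippingSubscription:
    """Toy replicating subscription that skips messages the backup already has."""

    def __init__(self, send):
        self._send = send   # hypothetical callable(replication_id, body)
        self._skip = set()  # replication ids already present on the backup

    def add_skip(self, replication_ids):
        self._skip.update(replication_ids)

    def on_enqueue(self, replication_id, body):
        if replication_id in self._skip:
            # The backup received this message through the tx-queue; do not resend.
            self._skip.remove(replication_id)
            return False
        self._send(replication_id, body)
        return True


def commit_to_local_queue(tx_messages, local_queue, subscriptions):
    """Add tx messages to every skip set first, then push them to the local queue."""
    ids = [replication_id for replication_id, _ in tx_messages]
    for sub in subscriptions:
        sub.add_skip(ids)
    for replication_id, body in tx_messages:
        local_queue.append((replication_id, body))
        for sub in subscriptions:
            sub.on_enqueue(replication_id, body)
```

Populating the skip sets before the enqueue matters in this model: if the order were reversed, a subscription could observe a tx message before knowing it should be skipped.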
Backups abort all open tx if the primary fails.
Keep tx atomic when backups catch up while a tx is in progress. A ready backup should never contain a proper subset of messages in a transaction.
Scenario:
TODO: Not happy with the following solution.
Solution: Primary delays ready status until the full tx is replicated.
NOTE: needs thoughtful locking to correctly handle
TODO: A blocking TX eliminates the benefit of the queue guard for expected backups.
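The "delay ready" rule above can be modelled as a simple gate. The sketch below is an assumption-laden illustration, not the HA module's state machine: the class `BackupReadiness` and its methods are hypothetical, and it ignores the locking concerns raised in the NOTE.

```python
class BackupReadiness:
    """Tracks whether a backup may be reported as ready."""

    def __init__(self):
        self.caught_up = False  # backup has caught up on ordinary queue contents
        self._partial_tx = set()  # transactions only partially replicated so far

    def tx_started(self, tx_id):
        self._partial_tx.add(tx_id)

    def tx_fully_replicated(self, tx_id):
        self._partial_tx.discard(tx_id)

    @property
    def ready(self):
        # Ready only when caught up and holding no partial transaction.
        return self.caught_up and not self._partial_tx
```

With this gate a backup that finishes catching up mid-transaction is not reported ready until that transaction has been replicated in full, so a ready backup never holds a proper subset of the transaction's messages.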
TODO
New tests:
test_failover_send_receive
Existing tests:
qpid/tests/src/py/qpid_tests/broker_0_10/dtx.py, tx.py run against HA broker.