If the Controller is to be fault-tolerant, it must be deployed with three or more replicas (in line with the Raft majority protocol).
The Controller can also perform Broker failover with a single instance, but if that single Controller fails, switching capability is lost; the existing cluster can still send and receive messages normally.
There are two ways to deploy the Controller. One is to embed it in the NameServer, enabled via the enableControllerInNamesrv configuration (it can be enabled selectively; it does not have to be enabled on every NameServer). In this mode, the NameServer itself remains stateless: if a NameServer with an embedded Controller crashes, only switching capability is affected, while route discovery and other functions continue to work. The other is to deploy the Controller independently as a separate component.
When deploying embedded in the NameServer, you only need to set enableControllerInNamesrv=true in the NameServer configuration file and fill in the Controller configuration.
```
enableControllerInNamesrv = true
controllerDLegerGroup = group1
controllerDLegerPeers = n0-127.0.0.1:9877;n1-127.0.0.1:9878;n2-127.0.0.1:9879
controllerDLegerSelfId = n0
controllerStorePath = /home/admin/DledgerController
enableElectUncleanMaster = false
notifyBrokerRoleChanged = true
```
Parameter explanations:

| Parameter | Meaning |
|---|---|
| enableControllerInNamesrv | Whether to start the Controller inside the NameServer |
| controllerDLegerGroup | Name of the DLedger Raft group; must be the same for all nodes in one group |
| controllerDLegerPeers | Address and port information of the nodes in the DLedger group; must be consistent across the group |
| controllerDLegerSelfId | ID of this node; must be one of the IDs in controllerDLegerPeers and unique within the group |
| controllerStorePath | Storage path of the Controller's logs; the Controller is stateful and relies on these logs to recover after a restart or crash |
| enableElectUncleanMaster | Whether a Master may be elected from outside the SyncStateSet; if true, messages may be lost (default false) |
| notifyBrokerRoleChanged | Whether to notify the Broker replica group when its role changes |

Some other parameters can be found in the ControllerConfig code.
After setting the parameters, start the NameServer, specifying the configuration file.
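For example, assuming the configuration above is saved as conf/namesrv.conf (the file name is illustrative), the NameServer can be started with:

```
sh bin/mqnamesrv -c conf/namesrv.conf
```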
To deploy independently, run the following script:

```
sh bin/mqcontroller -c controller.conf
```
The mqcontroller script is located at distribution/bin/mqcontroller, and the configuration parameters are the same as in embedded mode.
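A minimal controller.conf sketch for one node (n0), reusing the three-node layout from the embedded example; paths and values are illustrative:

```
# minimal controller.conf for node n0 (illustrative; same parameters as embedded mode)
controllerDLegerGroup = group1
controllerDLegerPeers = n0-127.0.0.1:9877;n1-127.0.0.1:9878;n2-127.0.0.1:9879
controllerDLegerSelfId = n0
controllerStorePath = /home/admin/DledgerController
```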
The Broker is started the same way as before, with the following parameter added:

```
controllerAddr = 127.0.0.1:9877;127.0.0.1:9878;127.0.0.1:9879
```
In Controller mode, the Broker configuration must set enableControllerMode=true and fill in controllerAddr.
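Putting the two settings together, a sketch of the relevant broker.conf entries (other Broker settings are unchanged; the addresses match the Controller example above):

```
# Controller-related broker.conf entries (a sketch)
enableControllerMode = true
controllerAddr = 127.0.0.1:9877;127.0.0.1:9878;127.0.0.1:9879
```

The Broker is then started as usual, for example with sh bin/mqbroker -c broker.conf.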
Parameters such as inSyncReplicas and minInSyncReplicas have overlapping but different meanings across normal Master-Slave deployment, SlaveActingMaster mode, and the automatic master-slave switching architecture. The specific differences are as follows:
| Mode | inSyncReplicas | minInSyncReplicas | enableAutoInSyncReplicas | allAckInSyncStateSet | haMaxGapNotInSync | haMaxTimeSlaveNotCatchup |
|---|---|---|---|---|---|---|
| Normal Master-Slave deployment | Number of replicas that must ACK in synchronous replication; invalid in asynchronous replication | Invalid | Invalid | Invalid | Invalid | Invalid |
| SlaveActingMaster enabled (slaveActingMaster=true) | Number of replicas that must ACK in synchronous replication when no auto-degradation has occurred | Minimum number of replicas that must ACK after auto-degradation | Whether to enable auto-degradation; once degraded, the number of replicas that must ACK drops to minInSyncReplicas | Invalid | Basis for the degradation decision: the gap between the Slave's and the Master's CommitLog heights, in bytes | Invalid |
| Automatic master-slave switching architecture (enableControllerMode=true) | Number of replicas that must ACK in synchronous replication when allAckInSyncStateSet is disabled; invalid when allAckInSyncStateSet is enabled | Minimum size to which the SyncStateSet may shrink; if the SyncStateSet holds fewer than minInSyncReplicas replicas, sends are rejected immediately with an insufficient-replica error | Invalid | If true, a message must be replicated to every replica in the SyncStateSet before success is returned to the client, which guarantees that the message is not lost | Invalid | Basis for shrinking the SyncStateSet: a Slave that has not caught up with the Master within this time is removed from the SyncStateSet; see RIP-44 for details |
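A hedged example of how these parameters combine under enableControllerMode=true (the values are illustrative, not defaults):

```
# illustrative replication settings for a Broker in Controller mode
enableControllerMode = true
# every replica in the SyncStateSet must ACK before a send succeeds,
# so inSyncReplicas is ignored and messages cannot be lost
allAckInSyncStateSet = true
# the SyncStateSet may shrink, but if it drops below this size,
# sends fail immediately with an insufficient-replica error
minInSyncReplicas = 2
# a Slave lagging behind the Master longer than this (ms) is removed
# from the SyncStateSet; see RIP-44
haMaxTimeSlaveNotCatchup = 15000
```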
To summarize:
This mode makes no changes to any client-level APIs, so there are no client compatibility issues.
The NameServer itself has not been modified, so there are no NameServer compatibility issues. If enableControllerInNamesrv is enabled and the Controller parameters are configured correctly, the Controller function is activated.
If a Broker sets enableControllerMode=false, it runs as before. If it sets enableControllerMode=true, a Controller must be deployed and the parameters configured correctly for the Broker to run properly.
The specific behavior is shown in the following table:
| | Old NameServer | Old NameServer + independently deployed Controller | New NameServer with Controller enabled | New NameServer with Controller disabled |
|---|---|---|---|---|
| Old Broker | Runs normally, cannot fail over | Runs normally, cannot fail over | Runs normally, cannot fail over | Runs normally, cannot fail over |
| New Broker with Controller mode enabled | Cannot go online | Runs normally, can fail over | Runs normally, can fail over | Cannot go online |
| New Broker with Controller mode disabled | Runs normally, cannot fail over | Runs normally, cannot fail over | Runs normally, cannot fail over | Runs normally, cannot fail over |
As the compatibility matrix above shows, the NameServer can be upgraded as usual without compatibility issues. If the NameServer is not to be upgraded, the Controller component can be deployed independently to gain switching capability. For Broker upgrades, there are two cases:
Upgrading a Master-Slave deployment to the Controller switching architecture
An in-place upgrade with data is possible. For each group of Brokers, stop both the Master and the Slave and make sure their CommitLogs are aligned (either disable writes to the group for a period of time before the upgrade, or ensure consistency by copying), then replace the package and restart.
If the Master's and Slave's CommitLogs are not aligned, the Master must be brought online before the Slave; otherwise messages may be lost due to data truncation.
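A hedged sketch of the per-group procedure (the commands are standard RocketMQ scripts; file names and paths are illustrative):

```
# stop the Broker on both the Master and the Slave machine of one group,
# after writes to the group have been disabled and replication has caught up
sh bin/mqshutdown broker

# replace the installation package with the new version and add
# enableControllerMode=true plus controllerAddr to each broker .conf file

# bring the former Master online first (important when the CommitLogs are
# not aligned), then the former Slave
nohup sh bin/mqbroker -c conf/broker.conf &
```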
Upgrading from DLedger mode to the Controller switching architecture
Because the message data format in DLedger mode differs from that in Master-Slave mode, no in-place upgrade with data is provided. When multiple groups of Brokers are deployed, writes to one group can be disabled for a period of time (as long as it is confirmed that all existing messages on that group have been consumed), after which the Controller and the new Brokers are deployed and upgraded. New messages are then written to the new Brokers while consumers drain the remaining messages from the existing Brokers; once consumption has caught up, the existing Brokers can be decommissioned.