| <?xml version="1.0"?> |
| <!-- |
| |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| |
| --> |
| |
| <chapter xmlns="http://docbook.org/ns/docbook" version="5.0" xml:id="Java-Broker-High-Availability"> |
| <title>High Availability</title> |
| |
| <section xml:id="Java-Broker-High-Availability-GeneralIntroduction"> |
| <title>General Introduction</title> |
| <para>The term High Availability (HA) usually refers to having a number of instances of a |
| service such as a Message Broker available so that should a service unexpectedly fail, or |
| requires to be shutdown for maintenance, users may quickly connect to another instance and |
| continue their work with minimal interruption. HA is one way to make a overall system more |
| resilient by eliminating a single point of failure from a system.</para> |
| <para>HA offerings are usually categorised as <emphasis role="bold">Active/Active</emphasis> or |
| <emphasis role="bold">Active/Passive</emphasis>. An Active/Active system is one where all |
| nodes within the group are usually available for use by clients all of the time. In an |
| Active/Passive system, one only node within the group is available for use by clients at any |
| one time, whilst the others are in some kind of standby state, awaiting to quickly step-in in |
| the event the active node becomes unavailable. </para> |
| </section> |
| |
| <section xml:id="Java-Broker-High-Availability-OverviewOfHA"> |
| <title>High Availability Overview</title> |
| <para>The Broker provides a HA implementation offering an <emphasis role="bold">Active/Passive</emphasis> mode of operation. |
| When using HA, many instances of the Broker work together to form an high availability group of two or more nodes.</para> |
| <para>The remainder of this section now talks about the specifics of how HA is achieved in terms |
| of the <link linkend="Java-Broker-Concepts">concepts</link> introduced earlier in this |
| book.</para> |
| <para>The <link linkend="Java-Broker-Concepts-Virtualhosts">Virtualhost</link> is the unit of |
| replication. This means that any <emphasis>durable</emphasis> queues, exchanges, and bindings |
| belonging to that virtualhost, any <emphasis>persistent</emphasis> messages contained within |
| the queues and any attribute settings applied to the virtualhost itself are automatically |
| replicated to all nodes within the group.<footnote> |
| <para>Transient messages and messages on non-durable queues are not replicated.</para> |
| </footnote></para> |
| <para>It is the <link linkend="Java-Broker-Concepts-Virtualhost-Nodes">Virtualhost Nodes</link> |
| (from different Broker instances) that join together to form a group. The virtualhost nodes |
| collectively to coordinate the group: they organise replication between the master and |
| replicas and conduct elections to determine who becomes the new master in the event of the old |
| failing.</para> |
| <para>When a virtualhost node is in the <emphasis>master</emphasis> role, the virtualhost |
| beneath it is available for messaging work. Any write operations sent to the virtualhost are |
| automatically replicated to all other nodes in group.</para> |
| <para>When a virtualhost node is in the <emphasis>replica</emphasis> role, the virtualhost |
| beneath it is always unavailable for message work. Any attempted connections to a virtualhost |
| in this state are automatically turned away, allowing a messaging client to discover where the |
| master currently resides. When in replica role, the node sole responsibility is to consume a |
| replication stream in order that it remains up to date with the master.</para> |
| <para>Messaging clients discover the active virtualhost.This can be achieved using a static |
| technique (for instance, a failover url (a feature of a Apache Qpid JMS Client for AMQP 0-9-1/0-10)), or a dynamic one |
| utilising some kind of proxy or virtual IP (VIP).</para> |
| <para>The figure that follows illustrates a group formed of three virtualhost nodes from three |
| separate Broker instances. A client is connected to the virtualhost node that is in the master |
| role. The two virtualhost nodes <literal>weather1</literal> and <literal>weather3</literal> |
| are replicas and are receiving a stream of updates.</para> |
| <figure xml:id="Java-Broker-High-Availability-OverviewOfHA-Figure"> |
| <title>3-node group deployed across three Brokers.</title> |
| <mediaobject> |
| <imageobject> |
| <imagedata fileref="images/HA-Overview.png" format="PNG" scalefit="1"/> |
| </imageobject> |
| <textobject> |
| <phrase>Diagram showing a 3 node group deployed across three Brokers</phrase> |
| </textobject> |
| </mediaobject> |
| </figure> |
| <para>Currently, the only virtualhost/virtualhost node type offering HA is BDB HA. Internally, |
| this leverages the HA capabilities of the Berkeley DB JE edition. BDB JE is an <link linkend="Java-Broker-Miscellaneous-Installing-Oracle-BDB-JE">optional dependency</link> of |
| the Broker.</para> |
| <note> |
| <para>The HA solution from the Apache Qpid Broker for Java is incompatible with the HA solution offered by the CPP |
| Broker. It is not possible to co-locate Broker for Java and CPP Brokers within the same group.</para> |
| </note> |
| </section> |
| <section xml:id="Java-Broker-High-Availability-CreatingGroup"> |
| <title>Creating a group</title> |
| <para>This section describes how to create a group. At a high level, creating a group involves |
| first creating the first node standalone, then creating subsequent nodes referencing the first |
| node so the nodes can introduce themselves and gradually the group is built up.</para> |
| <para>A group is created through either <link linkend="Java-Broker-Management-Channel-Web-Console">Web Management</link> or the <link linkend="Java-Broker-Management-Channel-REST-API">REST API</link>. These instructions |
| presume you are using Web Management. To illustrate the example it builds the group |
| illustrated in figure <xref linkend="Java-Broker-High-Availability-OverviewOfHA-Figure"/></para> |
| <para><orderedlist> |
| <listitem> |
| <para>Install a Broker on each machine that will be used to host the group. As messaging |
| clients will need to be able to connect to and authentication to all Brokers, it usually |
| makes sense to choose a common authentication mechanism e.g. Simple LDAP Authentication, |
| External with SSL client authentication or Kerberos.</para> |
| </listitem> |
| <listitem> |
| <para>Select one Broker instance to host the first node instance. This choice is an |
| arbitrary one. The node is special only whilst creating group. Once creation is |
| complete, all nodes will be considered equal.</para> |
| </listitem> |
| <listitem> |
| <para>Click the <literal>Add</literal> button on the Virtualhost Panel on the Broker |
| tab.</para> |
| <para> |
| <orderedlist> |
| <listitem> |
| <para>Give the Virtualhost node a unique name e.g. <literal>weather1</literal>. The |
| name must be unique within the group and unique to that Broker. It is best if the |
| node names are chosen from a different nomenclature than the machine names |
| themselves.</para> |
| </listitem> |
| <listitem> |
| <para>Choose <literal>BDB_HA</literal> and select <literal>New group</literal> |
| </para> |
| </listitem> |
| <listitem> |
| <para>Give the group a name e.g. <literal>weather</literal>. The group name must be |
| unique and will be the name also given to the virtualhost, so this is the name the |
| messaging clients will use in their connection url.</para> |
| </listitem> |
| <listitem> |
| <para>Give the address of this node. This is an address on this node's host that |
| will be used for replication purposes. The hostname <emphasis>must</emphasis> be |
| resolvable by all the other nodes in the group. This is separate from the address |
| used by messaging clients to connect to the Broker. It is usually best to choose a |
| symbolic name, rather than an IP address.</para> |
| </listitem> |
| <listitem> |
| <para>Now add the node addresses of all the other nodes that will form the group. In |
| our example we are building a three node group so we give the node addresses of |
| <literal>chaac:5000</literal> and <literal>indra:5000</literal>.</para> |
| </listitem> |
| <listitem> |
| <para>Click Add to create the node. The virtualhost node will be created with the |
| virtualhost. As there is only one node at this stage, the role will be |
| master.</para> |
| </listitem> |
| </orderedlist> |
| <figure> |
| <title>Creating 1st node in a group</title> |
| <mediaobject> |
| <imageobject> |
| <imagedata fileref="images/HA-Create-1.png" format="PNG" scalefit="1"/> |
| </imageobject> |
| <textobject> |
| <phrase>Creating 1st node in a group</phrase> |
| </textobject> |
| </mediaobject> |
| </figure> |
| </para> |
| </listitem> |
| <listitem> |
| <para>Now move to the second Broker to be the group. Click the <literal>Add</literal> |
| button on the Virtualhost Panel on the Broker tab of the second Broker.</para> |
| <para> |
| <orderedlist> |
| <listitem> |
| <para>Give the Virtualhost node a unique name e.g. |
| <literal>weather2</literal>.</para> |
| </listitem> |
| <listitem> |
| <para>Choose <literal>BDB_HA</literal> and choose <literal>Existing group</literal> |
| </para> |
| </listitem> |
| <listitem> |
| <para>Give the details of the <emphasis>existing node</emphasis>. Following our |
| example, specify <literal>weather</literal>, <literal>weather1</literal> and |
| <literal>thor:5000</literal></para> |
| </listitem> |
| <listitem> |
| <para>Give the address of this node.</para> |
| </listitem> |
| <listitem> |
| <para>Click Add to create the node. The node will use the existing details to |
| contact it and introduce itself into the group. At this stage, the group will have |
| two nodes, with the second node in the replica role.</para> |
| </listitem> |
| <listitem> |
| <para>Repeat these steps until you have added all the nodes to the group.</para> |
| </listitem> |
| </orderedlist> |
| <figure> |
| <title>Adding subsequent nodes to the group</title> |
| <mediaobject> |
| <imageobject> |
| <imagedata fileref="images/HA-Create-2.png" format="PNG" scalefit="1"/> |
| </imageobject> |
| <textobject> |
| <phrase>Adding subsequent nodes to the group</phrase> |
| </textobject> |
| </mediaobject> |
| </figure> |
| </para> |
| </listitem> |
| |
| </orderedlist></para> |
| <para>The group is now formed and is ready for us. Looking at the virtualhost node of any of the |
| nodes shows a complete view of the whole group. <figure> |
| <title>View of group from one node</title> |
| <mediaobject> |
| <imageobject> |
| <imagedata fileref="images/HA-Create-3.png" format="PNG" scalefit="1"/> |
| </imageobject> |
| <textobject> |
| <phrase>View of group from one node</phrase> |
| </textobject> |
| </mediaobject> |
| </figure></para> |
| </section> |
| |
| <section xml:id="Java-Broker-High-Availability-Behaviour"> |
| <title>Behaviour of the Group</title> |
| <para>This section first describes the behaviour of the group in its default configuration. It |
| then goes on to talk about the various controls that are available to override it. It |
| describes the controls available that affect the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://en.wikipedia.org/wiki/ACID#Durability">durability</link> of transactions and |
| the data consistency between the master and replicas and thus make trade offs between |
| performance and reliability.</para> |
| |
| <section xml:id="Java-Broker-High-Availability-Behaviour-Default-Behaviour"> |
| <title>Default Behaviour</title> |
| <para>Let's first look at the behaviour of a group in default configuration.</para> |
| <para>In the default configuration, for any messaging work to be done, there must be at least |
| <emphasis>quorum</emphasis> nodes present. This means for example, in a three node group, |
| this means there must be at least two nodes available.</para> |
| <para>When a messaging client sends a transaction, it can be assured that, before the control |
| returns back to his application after the commit call that the following is true:</para> |
| <para><itemizedlist> |
| <listitem> |
| <para>At the master, the transaction is <emphasis>written to disk and OS level caches |
| are flushed</emphasis> meaning the data is on the storage device.</para> |
| </listitem> |
| <listitem> |
| <para>At least quorum minus 1 replicas, <emphasis>acknowledge the receipt of |
| transaction</emphasis>. The replicas will write the data to the storage device |
| sometime later.</para> |
| </listitem> |
| </itemizedlist></para> |
| <para>If there were to be a master failure immediately after the transaction was committed, |
| the transaction would be held by at least quorum minus one replicas. For example, if we had |
| a group of three, then we would be assured that at least one replica held the |
| transaction.</para> |
| |
| <para>In the event of a master failure, if quorum nodes remain, those nodes hold an election. |
| The nodes will elect master the node with the most recent transaction. If two or more nodes |
| have the most recent transaction the group makes an arbitrary choice. If quorum number of |
| nodes does not remain, the nodes cannot elect a new master and will wait until nodes rejoin. |
| You will see later that manual controls are available allow service to be restored from |
| fewer than quorum nodes and to influence which node gets elected in the event of a |
| tie.</para> |
| |
| <para>Whenever a group has fewer than quorum nodes present, the virtualhost will be |
| unavailable and messaging connections will be refused. If quorum disappears at the very |
| moment a messaging client sends a transaction that transaction will fail.</para> |
| |
| <para>You will have noticed the difference in the synchronization policies applied the master |
| and the replicas. The replicas send the acknowledgement back before the data is written to |
| disk. The master synchronously writes the transaction to storage. This is an example of a |
| trade off between durability and performance. We will see more about how to control this |
| trade off later.</para> |
| </section> |
| <section xml:id="Java-Broker-High-Availability-Behaviour-SynchronizationPolicy"> |
| <title>Synchronization Policy</title> |
| <para>The <emphasis>synchronization policy</emphasis> dictates what a node must do when it |
| receives a transaction before it acknowledges that transaction to the rest of the |
| group.</para> |
| <para>The following options are available: <itemizedlist> |
| <listitem> |
| <para><emphasis>SYNC</emphasis>. The node must write the transaction to disk and flush |
| any OS level buffers before sending the acknowledgement. SYNC is offers the highest |
| durability but offers the least performance.</para> |
| </listitem> |
| <listitem> |
| <para><emphasis>WRITE_NO_SYNC</emphasis>. The node must write the transaction to disk |
| before sending the acknowledgement. OS level buffers will be flush as some point |
| later. This typically provides an assurance against failure of the application but not |
| the operating system or hardware.</para> |
| </listitem> |
| <listitem> |
| <para><emphasis>NO_SYNC</emphasis>. The node immediately sends the acknowledgement. The |
| transaction will be written and OS level buffers flushed as some point later. NO_SYNC |
| offers the highest performance but the lowest durability level. This synchronization |
| policy is sometimes known as <emphasis>commit to the network</emphasis>.</para> |
| </listitem> |
| </itemizedlist></para> |
| <para>It is possible to assign a one policy to the master and a different policy to the |
| replicas. These are configured as <link linkend="Java-Broker-Management-Managing-Virtualhost-Attributes">attributes on the |
| virtualhost</link>. By default the master uses <emphasis>SYNC</emphasis> and replicas use |
| <emphasis>NO_SYNC</emphasis>.</para> |
| </section> |
| <section xml:id="Java-Broker-High-Availability-Behaviour-NodePriority"> |
| <title>Node Priority</title> |
| <para>Node priority can be used to influence the behaviour of the election algorithm. It is |
| useful in the case were you want to favour some nodes over others. For instance, if you wish |
| to favour nodes located in a particular data centre over those in a remote site. </para> |
| <para>The following options are available: <itemizedlist> |
| <listitem> |
| <para><emphasis>Highest</emphasis>. Nodes with this priority will be more favoured. In |
| the event of two or more nodes having the most recent transaction, the node with this |
| priority will be elected master. If two or more nodes have this priority the algorithm |
| will make an arbitrary choice.</para> |
| </listitem> |
| <listitem> |
| <para><emphasis>High</emphasis>. Nodes with this priority will be favoured but not as |
| much so as those with Highest.</para> |
| </listitem> |
| <listitem> |
| <para><emphasis>Normal</emphasis>. This is default election priority.</para> |
| </listitem> |
| <listitem> |
| <para><emphasis>Never</emphasis>. The node will never be elected <emphasis>even if the |
| node has the most recent transaction</emphasis>. The node will still keep up to date |
| with the replication stream and will still vote itself, but can just never be |
| elected.</para> |
| </listitem> |
| </itemizedlist> |
| </para> |
| <para>Node priority is configured as an <link linkend="Java-Broker-Management-Managing-Virtualhost-Nodes-Attributes">attribute on the |
| virtualhost node</link> and can be changed at runtime and is effective immediately.</para> |
| <important> |
| <para>Use of the Never priority can lead to transaction loss. For example, consider a group |
| of three where replica-2 is marked as Never. If a transaction were to arrive and it be |
| acknowledged only by Master and Replica-2, the transaction would succeed. Replica 1 is |
| running behind for some reason (perhaps a full-GC). If a Master failure were to occur at |
| that moment, the replicas would elect Replica-1 even though Replica-2 had the most recent |
| transaction.</para> |
| <para>Transaction loss is reported by message <link linkend="Java-Broker-Appendix-Operation-Logging-Message-HA-1014">HA-1014</link>.</para> |
| </important> |
| </section> |
| <section xml:id="Java-Broker-High-Availability-Behaviour-MinimumNumberOfNodes"> |
| <title>Required Minimum Number Of Nodes</title> |
| <para>This controls the required minimum number of nodes to complete a transaction and to |
| elect a new master. By default, the required number of nodes is set to |
| <emphasis>Default</emphasis> (which signifies quorum).</para> |
| <para>It is possible to reduce the required minimum number of nodes. The rationale for doing |
| this is normally to temporarily restore service from fewer than quorum nodes following an |
| extraordinary failure.</para> |
| <para>For example, consider a group of three. If one node were to fail, as quorum still |
| remained, the system would continue work without any intervention. If the failing node were |
| the master, a new master would be elected.</para> |
| <para>What if a further node were to fail? Quorum no longer remains, and the remaining node |
| would just wait. It cannot elect itself master. What if we wanted to restore service from |
| just this one node?</para> |
| <para>In this case, Required Number of Nodes can be reduced to 1 on the remain node, allowing |
| the node to elect itself and service to be restored from the singleton. Required minimum |
| number of nodes is configured as an <link linkend="Java-Broker-Management-Managing-Virtualhost-Nodes-Attributes">attribute on the |
| virtualhost node</link> and can be changed at runtime and is effective immediately.</para> |
| <important> |
| <para>The attribute must be used cautiously. Careless use will lead to lost transactions and |
| can lead to a <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://en.wikipedia.org/wiki/Split-brain_(computing)">split-brain</link> in the event of a network partition. If used to temporarily restore |
| service from fewer than quorum nodes, it is <emphasis>imperative</emphasis> to revert it |
| to the Default value as the failed nodes are restored.</para> |
| <para>Transaction loss is reported by message <link linkend="Java-Broker-Appendix-Operation-Logging-Message-HA-1014">HA-1014</link>.</para> |
| </important> |
| </section> |
| <section xml:id="Java-Broker-High-Availability-Behaviour-DesignatedPrimary"> |
| <title>Designated Primary</title> |
| <para>This attribute applies to the groups of two only.</para> |
| <para> In a group of two, if a node were to fail then in default configuration work will cease |
| as quorum no longer exists. A single node cannot elect itself master. </para> |
| <para>The designated primary flag allows a node in a two node group to elect itself master and |
| to operate sole. Designated Primary is configured as an <link linkend="Java-Broker-Management-Managing-Virtualhost-Nodes-Attributes">attribute on the |
| virtualhost node</link> and can be changed at runtime and is effective immediately.</para> |
| <para>For example, consider a group of two where the master fails. Service will be interrupted |
| as the remaining node cannot elect itself master. To allow it to become master, apply the |
| designated primary flag to it. It will elect itself master and work can continue, albeit |
| from one node.</para> |
| <important> |
| <para>It is imperative not to allow designated primary to be set on both nodes at once. To |
| do so will mean, in the event of a network partition, a <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://en.wikipedia.org/wiki/Split-brain_(computing)">split-brain</link> will |
| occur.</para> |
| <para>Transaction loss is reported by message <link linkend="Java-Broker-Appendix-Operation-Logging-Message-HA-1014">HA-1014</link>.</para> |
| </important> |
| </section> |
| </section> |
| <section xml:id="Java-Broker-High-Availability-NodeOperations"> |
| <title>Node Operations</title> |
| <section xml:id="Java-Broker-High-Availability-NodeOperations-Lifecycle"> |
| <title>Lifecycle</title> |
| <para>Virtualhost nodes can be stopped, started and deleted.</para> |
| <itemizedlist> |
| <listitem> |
| <para><emphasis>Stop</emphasis></para> |
| <para>Stopping a master node will cause the node to temporarily leave the group. Any |
| messaging clients will be disconnected and any in-flight transaction rollbacked. The |
| remaining nodes will elect a new master if quorum number of nodes still remains.</para> |
| <para>Stopping a replica node will cause the node to temporarily leave the group too. |
| Providing quorum still exists, the current master will continue without interruption. If |
| by leaving the group, quorum no longer exists, all the nodes will begin waiting, |
| disconnecting any messaging clients, and the virtualhost will become unavailable.</para> |
| <para>A stopped virtualhost node is still considered to be a member of the group.</para> |
| </listitem> |
| <listitem> |
| <para><emphasis>Start</emphasis></para> |
| <para>Starting a virtualhost node allows it to rejoin the group.</para> |
| <para>If the group already has a master, the node will catch up from the master and then |
| become a replica once it has done so.</para> |
| <para>If the group did not have quorum and so had no master, but the rejoining of this |
| node means quorum now exists, an election will take place. The node with the most up to |
| date transaction will become master unless influenced by the priority rules described |
| above.</para> |
| <note> |
| <para>The length of time taken to catch up will depend on how long the node has been |
| stopped. The worst case is where the node has been stopped for more than one hour. In |
| this case, the master will perform an automated <literal>network restore</literal>. |
| This involves streaming all the data held by the master over to the replica. This |
| could take considerable time.</para> |
| </note> |
| </listitem> |
| <listitem> |
| <para><emphasis>Delete</emphasis></para> |
| <para>A virtualhost node can be deleted. Deleting a node permanently removes the node from |
| the group. The data stored locally is removed but this does not affect the data held by |
| the remainder of the group.</para> |
| <note> |
| <para>The names of deleted virtualhost node cannot be reused within a group.</para> |
| </note> |
| </listitem> |
| </itemizedlist> |
| <para>It is also possible to add nodes to an existing group using the procedure described |
| above.</para> |
| </section> |
| <section xml:id="Java-Broker-High-Availability-NodeOperations-TransferMaster"> |
| <title>Transfer Master</title> |
| <para>This operation allows the mastership to be moved from node to node. This is useful for |
| restoring a business as usual state after a failure.</para> |
| <para>When using this function, the following occurs. <orderedlist> |
| <listitem> |
| <para>The system first gives time for the chosen new master to become reasonable up to |
| date. </para> |
| </listitem> |
| <listitem> |
| <para>It then suspends transactions on the old master and allows the chosen node to |
| become up to date.</para> |
| </listitem> |
| <listitem> |
| <para>The suspended transactions are aborted and any messaging clients connected to the |
| old master are disconnected.</para> |
| </listitem> |
| <listitem> |
| <para>The chosen master becomes the new master. The old master becomes a replica.</para> |
| </listitem> |
| <listitem> |
| <para>Messaging clients reconnect the new master.</para> |
| </listitem> |
| </orderedlist></para> |
| </section> |
| </section> |
| |
| <section xml:id="Java-Broker-High-Availability-ClientFailover"> |
| <title>Client failover</title> |
| <para>As mentioned above, the clients need to be able to find the location of the active |
| virtualhost within the group.</para> |
| <para>Clients can do this using a static technique, for example , utilising the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="${qpidJmsClient08Book}JMS-Client-0-8-Connection-URL.html">failover feature of the Qpid connection url</link> |
| where the client has a list of all the nodes, and tries each node in sequence until it |
| discovers the node with the active virtualhost.</para> |
| <para>Another possibility is a dynamic technique utilising a proxy or Virtual IP (VIP). These |
| require other software and/or hardware and are outside the scope of this document.</para> |
| </section> |
| |
| <section xml:id="Java-Broker-High-Availability-DiskSpace"> |
| <title>Disk space requirements</title> |
| <para>In the case where node in a group are down, the master must keep the data they are missing |
| for them to allow them to return to the replica role quickly.</para> |
| <para>By default, the master will retain up to 1hour of missed transactions. In a busy |
| production system, the disk space occupied could be considerable.</para> |
| <para>This setting is controlled by virtualhost context variable |
| <literal>je.rep.repStreamTimeout</literal>.</para> |
| </section> |
| |
| <section xml:id="Java-Broker-High-Availability-Network-Requirements"> |
| <title>Network Requirements</title> |
| <para>The HA Cluster performance depends on the network bandwidth, its use by existing traffic, |
| and quality of service.</para> |
| <para>In order to achieve the best performance it is recommended to use a separate network |
| infrastructure for the Qpid HA Nodes which might include installation of dedicated network |
| hardware on Broker hosts, assigning a higher priority to replication ports, installing a group |
| in a separate network not impacted by any other traffic.</para> |
| </section> |
| |
| <section xml:id="Java-Broker-High-Availability-Security"> |
| <title>Security</title> |
| <para>The replication stream between the master and the replicas is insecure and can be |
| intercepted by anyone having access to the replication network.</para> |
| <para>In order to reduce the security risks the entire HA group is recommended to run in a |
| separate network protected from general access and/or utilise SSH-tunnels/IPsec.</para> |
| </section> |
| |
| <section xml:id="Java-Broker-High-Availability-Backup"> |
| <title>Backups</title> |
| <para>It is recommend to use the hot backup script to periodically backup every node in the |
| group. <xref linkend="Java-Broker-Backup-And-Recovery-Virtualhost-Node-BDB-HA"/>.</para> |
| </section> |
| |
| |
| <section xml:id="Java-Broker-High-Availability-Reset-Group-Infomational"> |
| <title>Reset Group Information</title> |
| <para>BDB JE internally stores details of the group within its database. There are some |
| circumstances when resetting this information is useful.<itemizedlist> |
| <listitem> |
| <para>Copying data between environments (e.g. production to UAT)</para> |
| </listitem> |
| <listitem> |
| <para>Some disaster recovery situations where a group must be recreated on new |
| hardware</para> |
| </listitem> |
| </itemizedlist></para> |
| <para>This is not an normal operation and is not usually required</para> |
| <para>The following command replaces the group table contained within the JE logs files with the |
| provided information. </para> |
| <example> |
| <title>Resetting of replication group with <classname>DbResetRepGroup</classname></title> |
| <cmdsynopsis> |
| <command>java</command> |
| <arg choice="plain">-cp je-${bdb-version}.jar</arg> |
| <arg choice="plain">com.sleepycat.je.rep.util.DbResetRepGroup</arg> |
| <arg choice="plain">-h <replaceable>path/to/jelogfiles</replaceable></arg> |
| <sbr/> |
| <arg choice="plain">-groupName <replaceable>newgroupname</replaceable></arg> |
| <arg choice="plain">-nodeName <replaceable>newnodename</replaceable></arg> |
| <arg choice="plain">-nodeHostPort <replaceable>newhostname:5000</replaceable></arg> |
| </cmdsynopsis> |
| </example> |
| <para>The modified log files can then by copied into |
| <literal>\${QPID_WORK}/<nodename>/config</literal> directory of a target Broker. Then |
| start the Broker, and add a BDB HA Virtualhost node specify the same group name, node name and |
| node address. You will then have a group with a single node, ready to start re-adding |
| additional nodes as described above. </para> |
| </section> |
| |
| </chapter> |