| <?xml version="1.0" encoding="utf-8"?> |
| <!-- |
| |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| |
| --> |
| |
| <section id="chap-Messaging_User_Guide-Active_Passive_Cluster"> |
| |
| <title>Active-passive Messaging Clusters</title> |
| |
| <section> |
| <title>Overview</title> |
| <para> |
| |
| The High Availability (HA) module provides |
| <firstterm>active-passive</firstterm>, <firstterm>hot-standby</firstterm> |
| messaging clusters for fault-tolerant message delivery. |
| </para> |
| <para> |
| In an active-passive cluster only one broker, known as the |
| <firstterm>primary</firstterm>, is active and serving clients at a time. The other |
| brokers are standing by as <firstterm>backups</firstterm>. Changes on the primary |
| are replicated to all the backups so they are always up-to-date or "hot". Backup |
| brokers reject client connection attempts, to enforce the requirement that clients |
| only connect to the primary. |
| </para> |
| <para> |
| If the primary fails, one of the backups is promoted to take over as the new |
| primary. Clients fail-over to the new primary automatically. If there are multiple |
| backups, the other backups also fail-over to become backups of the new primary. |
| </para> |
| <para> |
| This approach relies on an external <firstterm>cluster resource manager</firstterm> |
| to detect failures, choose the new primary and handle network partitions. <ulink |
| url="https://fedorahosted.org/cluster/wiki/RGManager">Rgmanager</ulink> is supported |
| initially, but others may be supported in the future. |
| </para> |
| <section> |
| <title>Avoiding message loss</title> |
| <para> |
| In order to avoid message loss, the primary broker <emphasis>delays |
| acknowledgment</emphasis> of messages received from clients until the message has |
| been replicated to and acknowledged by all of the back-up brokers. This means that |
| all <emphasis>acknowledged</emphasis> messages are safely stored on all the backup |
| brokers. |
| </para> |
| <para> |
| Clients keep <emphasis>unacknowledged</emphasis> messages in a buffer |
| <footnote> |
| <para> |
| You can control the maximum number of messages in the buffer by setting the |
| client's <literal>capacity</literal>. For details of how to set the capacity |
| in client code see "Using the Qpid Messaging API" in |
| <citetitle>Programming in Apache Qpid</citetitle>. |
| </para> |
| </footnote> |
| until they are acknowledged by the primary. If the primary fails, clients will |
| fail-over to the new primary and <emphasis>re-send</emphasis> all their |
| unacknowledged messages. |
| <footnote> |
| <para> |
| Clients must use "at-least-once" reliability to enable re-send of unacknowledged |
| messages. This is the default behavior; no options need to be set to enable it. For |
| details of client addressing options see "Using the Qpid Messaging API" |
| in <citetitle>Programming in Apache Qpid</citetitle>. |
| </para> |
| </footnote> |
| </para> |
| <para> |
| So if the primary crashes, all the <emphasis>acknowledged</emphasis> |
| messages will be available on the backup that takes over as the new |
| primary. The <emphasis>unacknowledged</emphasis> messages will be |
| re-sent by the clients. Thus no messages are lost. |
| </para> |
| <para> |
| Note that this means it is possible for messages to be |
| <emphasis>duplicated</emphasis>. In the event of a failure it is possible for a |
| message to be received by the backup that becomes the new primary |
| <emphasis>and</emphasis> re-sent by the client. The application must take steps |
| to identify and eliminate duplicates. |
| </para> |
| <para> |
| When a new primary is promoted after a fail-over it is initially in |
| "recovering" mode. In this mode, it delays acknowledgment of messages |
| on behalf of all the backups that were connected to the previous |
| primary. This protects those messages against a failure of the new |
| primary until the backups have a chance to connect and catch up. |
| </para> |
| <para> |
| Not all messages need to be replicated to the back-up brokers. If a |
| message is consumed and acknowledged by a regular client before it has |
| been replicated to a backup, then it doesn't need to be replicated. |
| </para> |
| <variablelist> |
| <title>Status of a HA broker</title> |
| <varlistentry> |
| <term>Joining</term> |
| <listitem> |
| <para> |
| Initial status of a new broker that has not yet connected to the primary. |
| </para> |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Catch-up</term> |
| <listitem> |
| <para> |
| A backup broker that is connected to the primary and catching up |
| on queues and messages. |
| </para> |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Ready</term> |
| <listitem> |
| <para> |
| A backup broker that is fully caught-up and ready to take over as |
| primary. |
| </para> |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Recovering</term> |
| <listitem> |
| <para> |
| The newly-promoted primary, waiting for backups to connect and catch up. |
| </para> |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Active</term> |
| <listitem> |
| <para> |
| The active primary broker with all backups connected and caught-up. |
| </para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </section> |
| <section> |
| <title>Limitations</title> |
| <para> |
| There are some known limitations in the current implementation. These |
| will be fixed in future versions. |
| </para> |
| <itemizedlist> |
| <listitem> |
| <para> |
| Transactional changes to queue state are not replicated atomically. If |
| the primary crashes during a transaction, it is possible that the |
| backup could contain only part of the changes introduced by a |
| transaction. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Configuration changes (creating or deleting queues, exchanges and |
| bindings) are replicated asynchronously. Management tools used to |
| make changes will consider the change complete when it is complete |
| on the primary, even though it may not yet be replicated to all the backups. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Federated links <emphasis>from</emphasis> the primary will be lost |
| in a fail-over; they will not be re-connected to the new |
| primary. Federation links <emphasis>to</emphasis> the primary will |
| fail over. |
| </para> |
| </listitem> |
| </itemizedlist> |
| </section> |
| </section> |
| |
| <section> |
| <title>Virtual IP Addresses</title> |
| <para> |
| Some resource managers (including <command>rgmanager</command>) support |
| <firstterm>virtual IP addresses</firstterm>. A virtual IP address is an IP |
| address that can be relocated to any of the nodes in a cluster. The |
| resource manager associates this address with the primary node in the |
| cluster, and relocates it to the new primary when there is a failure. This |
| simplifies configuration as you can publish a single IP address rather |
| than a list. |
| </para> |
| <para> |
| A virtual IP address can be used by clients and backup brokers to connect |
| to the primary. The following sections will explain how to configure |
| virtual IP addresses for clients or brokers. |
| </para> |
| </section> |
| |
| <section> |
| <title>Configuring the Brokers</title> |
| <para> |
| The broker must load the <filename>ha</filename> module, which is loaded by |
| default. The following broker options are available for the HA module. |
| </para> |
| <table frame="all" id="ha-broker-options"> |
| <title>Broker Options for High Availability Messaging Cluster</title> |
| <tgroup align="left" cols="2" colsep="1" rowsep="1"> |
| <colspec colname="c1"/> |
| <colspec colname="c2"/> |
| <thead> |
| <row> |
| <entry align="center" nameend="c2" namest="c1"> |
| Options for High Availability Messaging Cluster |
| </entry> |
| </row> |
| </thead> |
| <tbody> |
| <row> |
| <entry> |
| <literal>ha-cluster <replaceable>yes|no</replaceable></literal> |
| </entry> |
| <entry> |
| Set to "yes" to have the broker join a cluster. |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <literal>ha-queue-replication <replaceable>yes|no</replaceable></literal> |
| </entry> |
| <entry> |
| Enable replication of specific queues without joining a cluster, see <xref linkend="ha-queue-replication"/>. |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <literal>ha-brokers-url <replaceable>URL</replaceable></literal> |
| </entry> |
| <entry> |
| <para> |
| The URL |
| <footnote id="ha-url-grammar"> |
| <para> |
| The full format of the URL is given by this grammar: |
| <programlisting> |
| url = ["amqp:"][ user ["/" password] "@" ] addr ("," addr)* |
| addr = tcp_addr / rdma_addr / ssl_addr / ... |
| tcp_addr = ["tcp:"] host [":" port] |
| rdma_addr = "rdma:" host [":" port] |
| ssl_addr = "ssl:" host [":" port] |
| </programlisting> |
| </para> |
| </footnote> |
| used by cluster brokers to connect to each other. The URL should |
| contain a comma separated list of the broker addresses, rather than a |
| virtual IP address. |
| </para> |
| </entry> |
| </row> |
| <row> |
| <entry><literal>ha-public-url <replaceable>URL</replaceable></literal> </entry> |
| <entry> |
| <para> |
| The URL <footnoteref linkend="ha-url-grammar"/> is advertised to |
| clients as the "known-hosts" for fail-over. It can be a list or |
| a single virtual IP address. A virtual IP address is recommended. |
| </para> |
| <para> |
| Using this option you can put client and broker traffic on |
| separate networks, which is recommended. |
| </para> |
| <para> |
| Note: When HA clustering is enabled the broker option |
| <literal>known-hosts-url</literal> is ignored and over-ridden by |
| the <literal>ha-public-url</literal> setting. |
| </para> |
| </entry> |
| </row> |
| <row> |
| <entry><literal>ha-replicate </literal><replaceable>VALUE</replaceable></entry> |
| <entry> |
| <para> |
| Specifies whether queues and exchanges are replicated by default. |
| <replaceable>VALUE</replaceable> is one of: <literal>none</literal>, |
| <literal>configuration</literal>, <literal>all</literal>. |
| For details see <xref linkend="ha-creating-replicated"/>. |
| </para> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <para><literal>ha-username <replaceable>USER</replaceable></literal></para> |
| <para><literal>ha-password <replaceable>PASS</replaceable></literal></para> |
| <para><literal>ha-mechanism <replaceable>MECH</replaceable></literal></para> |
| </entry> |
| <entry> |
| Authentication settings used by HA brokers to connect to each other. |
| If you are using authorization |
| (<xref linkend="sect-Messaging_User_Guide-Security-Authorization"/>) |
| then this user must have all permissions. |
| </entry> |
| </row> |
| <row> |
| <entry><literal>ha-backup-timeout <replaceable>SECONDS</replaceable></literal> </entry> |
| <entry> |
| <para> |
| Maximum time that a recovering primary will wait for an expected |
| backup to connect and become ready. |
| </para> |
| </entry> |
| </row> |
| <row> |
| <entry><literal>link-maintenance-interval <replaceable>SECONDS</replaceable></literal></entry> |
| <entry> |
| <para> |
| Interval for the broker to check link health and re-connect links if need |
| be. If you want brokers to fail over quickly you can set this to a |
| fraction of a second, for example: 0.1. |
| </para> |
| </entry> |
| </row> |
| <row> |
| <entry><literal>link-heartbeat-interval <replaceable>SECONDS</replaceable></literal></entry> |
| <entry> |
| <para> |
| Heartbeat interval for replication links. The link will be assumed broken |
| if there is no heartbeat for twice the interval. |
| </para> |
| </entry> |
| </row> |
| </tbody> |
| </tgroup> |
| </table> |
| <para> |
| To configure a HA cluster you must set at least <literal>ha-cluster</literal> and |
| <literal>ha-brokers-url</literal>. |
| </para> |
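| <para> |
| As a minimal sketch, assuming a three-node cluster whose brokers are reachable |
| under the hypothetical host names <literal>node1.example.com</literal>, |
| <literal>node2.example.com</literal> and <literal>node3.example.com</literal> on the |
| default AMQP port, the <filename>qpidd.conf</filename> file on each node might |
| contain the following. The host names are illustrative only; substitute the |
| addresses of your own brokers. |
| </para> |
| <programlisting> |
| # Join the HA cluster (required). |
| ha-cluster=yes |
| # Addresses the brokers use to reach each other (required). Host names are assumed. |
| ha-brokers-url=node1.example.com,node2.example.com,node3.example.com |
| </programlisting> |
| <para> |
| The other options in the table are optional and can be added to the same file |
| in the same <literal>name=value</literal> form. |
| </para> |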
| </section> |
| |
| <section> |
| <title>The Cluster Resource Manager</title> |
| <para> |
| Broker fail-over is managed by a <firstterm>cluster resource |
| manager</firstterm>. An integration with <ulink |
| url="https://fedorahosted.org/cluster/wiki/RGManager">rgmanager</ulink> is |
| provided, but it is possible to integrate with other resource managers. |
| </para> |
| <para> |
| The resource manager is responsible for starting the <command>qpidd</command> broker |
| on each node in the cluster. The resource manager then <firstterm>promotes</firstterm> |
| one of the brokers to be the primary. The other brokers connect to the primary as |
| backups, using the URL provided in the <literal>ha-brokers-url</literal> configuration |
| option. |
| </para> |
| <para> |
| Once connected, the backup brokers synchronize their state with the |
| primary. When a backup is synchronized, or "hot", it is ready to take |
| over if the primary fails. Backup brokers continually receive updates |
| from the primary in order to stay synchronized. |
| </para> |
| <para> |
| If the primary fails, backup brokers go into fail-over mode. The resource |
| manager must detect the failure and promote one of the backups to be the |
| new primary. The other backups connect to the new primary and synchronize |
| their state with it. |
| </para> |
| <para> |
| The resource manager is also responsible for protecting the cluster from |
| <firstterm>split-brain</firstterm> conditions resulting from a network partition. A |
| network partition divides a cluster into two sub-groups which cannot see each other. |
| Usually a <firstterm>quorum</firstterm> voting algorithm is used that disables nodes |
| in the inquorate sub-group. |
| </para> |
| </section> |
| |
| <section> |
| <title>Configuring <command>rgmanager</command> as resource manager</title> |
| <para> |
| This section assumes that you are already familiar with setting up and configuring |
| clustered services using <command>cman</command> and |
| <command>rgmanager</command>. It will show you how to configure an active-passive, |
| hot-standby <command>qpidd</command> HA cluster with <command>rgmanager</command>. |
| </para> |
| <para> |
| You must provide a <literal>cluster.conf</literal> file to configure |
| <command>cman</command> and <command>rgmanager</command>. Here is |
| an example <literal>cluster.conf</literal> file for a cluster of 3 nodes named |
| node1, node2 and node3. We will go through the configuration step-by-step. |
| </para> |
| <programlisting> |
| <![CDATA[ |
| <?xml version="1.0"?> |
| <!-- |
| This is an example of a cluster.conf file to run qpidd HA under rgmanager. |
| This example assumes a 3 node cluster, with nodes named node1, node2 and node3. |
| |
| NOTE: fencing is not shown, you must configure fencing appropriately for your cluster. |
| --> |
| |
| <cluster name="qpid-test" config_version="18"> |
| <!-- The cluster has 3 nodes. Each has a unique nodeid and one vote |
| for quorum. --> |
| <clusternodes> |
| <clusternode name="node1.example.com" nodeid="1"/> |
| <clusternode name="node2.example.com" nodeid="2"/> |
| <clusternode name="node3.example.com" nodeid="3"/> |
| </clusternodes> |
| <!-- Resource Manager configuration. --> |
| <rm> |
| <!-- |
| There is a failoverdomain for each node containing just that node. |
| This lets us stipulate that the qpidd service should always run on each node. |
| --> |
| <failoverdomains> |
| <failoverdomain name="node1-domain" restricted="1"> |
| <failoverdomainnode name="node1.example.com"/> |
| </failoverdomain> |
| <failoverdomain name="node2-domain" restricted="1"> |
| <failoverdomainnode name="node2.example.com"/> |
| </failoverdomain> |
| <failoverdomain name="node3-domain" restricted="1"> |
| <failoverdomainnode name="node3.example.com"/> |
| </failoverdomain> |
| </failoverdomains> |
| |
| <resources> |
| <!-- This script starts a qpidd broker acting as a backup. --> |
| <script file="/etc/init.d/qpidd" name="qpidd"/> |
| |
| <!-- This script promotes the qpidd broker on this node to primary. --> |
| <script file="/etc/init.d/qpidd-primary" name="qpidd-primary"/> |
| |
| <!-- This is a virtual IP address for broker replication traffic. --> |
| <ip address="20.0.10.200" monitor_link="1"/> |
| |
| <!-- This is a virtual IP address on a separate network for client traffic. --> |
| <ip address="20.0.20.200" monitor_link="1"/> |
| </resources> |
| |
| <!-- There is a qpidd service on each node, it should be restarted if it fails. --> |
| <service name="node1-qpidd-service" domain="node1-domain" recovery="restart"> |
| <script ref="qpidd"/> |
| </service> |
| <service name="node2-qpidd-service" domain="node2-domain" recovery="restart"> |
| <script ref="qpidd"/> |
| </service> |
| <service name="node3-qpidd-service" domain="node3-domain" recovery="restart"> |
| <script ref="qpidd"/> |
| </service> |
| |
| <!-- There should always be a single qpidd-primary service, it can run on any node. --> |
| <service name="qpidd-primary-service" autostart="1" exclusive="0" recovery="relocate"> |
| <script ref="qpidd-primary"/> |
| <!-- The primary has the IP addresses for brokers and clients to connect. --> |
| <ip ref="20.0.10.200"/> |
| <ip ref="20.0.20.200"/> |
| </service> |
| </rm> |
| </cluster> |
| ]]> |
| </programlisting> |
| |
| <para> |
| There is a <literal>failoverdomain</literal> for each node containing just that |
| one node. This lets us stipulate that the qpidd service should always run on all |
| nodes. |
| </para> |
| <para> |
| The <literal>resources</literal> section defines the <command>qpidd</command> |
| script used to start the <command>qpidd</command> service. It also defines the |
| <command>qpidd-primary</command> script, which does not |
| actually start a new service; rather, it promotes the existing |
| <command>qpidd</command> broker to primary status. |
| </para> |
| <para> |
| The <literal>resources</literal> section also defines a pair of virtual IP |
| addresses on different sub-nets. One will be used for broker-to-broker |
| communication, the other for client-to-broker. |
| </para> |
| <para> |
| To take advantage of the virtual IP addresses, <filename>qpidd.conf</filename> |
| should contain these lines: |
| </para> |
| <programlisting> |
| ha-cluster=yes |
| ha-brokers-url=20.0.10.200 |
| ha-public-url=20.0.20.200 |
| </programlisting> |
| <para> |
| This configuration specifies that backup brokers will use 20.0.10.200 |
| to connect to the primary and will advertise 20.0.20.200 to clients. |
| Clients should connect to 20.0.20.200. |
| </para> |
| <para> |
| The <literal>service</literal> section defines 3 <literal>qpidd</literal> |
| services, one for each node. Each service is in a restricted fail-over |
| domain containing just that node, and has the <literal>restart</literal> |
| recovery policy. The effect of this is that rgmanager will run |
| <command>qpidd</command> on each node, restarting if it fails. |
| </para> |
| <para> |
| There is a single <literal>qpidd-primary-service</literal> using the |
| <command>qpidd-primary</command> script which is not restricted to a |
| domain and has the <literal>relocate</literal> recovery policy. This means |
| rgmanager will start <command>qpidd-primary</command> on one of the nodes |
| when the cluster starts and will relocate it to another node if the |
| original node fails. Running the <literal>qpidd-primary</literal> script |
| does not start a new broker process, it promotes the existing broker to |
| become the primary. |
| </para> |
| </section> |
| |
| <section> |
| <title>Broker Administration Tools</title> |
| <para> |
| Normally, clients are not allowed to connect to a backup broker. However, |
| management tools are allowed to connect to backup brokers. If you use |
| these tools you <emphasis>must not</emphasis> add or remove messages from |
| replicated queues, nor create or delete replicated queues or exchanges as |
| this will disrupt the replication process and may cause message loss. |
| </para> |
| <para> |
| <command>qpid-ha</command> allows you to view and change HA configuration settings. |
| </para> |
| <para> |
| The tools <command>qpid-config</command>, <command>qpid-route</command> and |
| <command>qpid-stat</command> will connect to a backup if you pass the flag <command>--ha-admin</command> on the |
| command line. |
| </para> |
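| <para> |
| For example, assuming a backup broker running on the hypothetical host |
| <literal>node2.example.com</literal>, a command along the following lines could be |
| used to list the queues on that backup. The <literal>-q</literal> (queues) and |
| <literal>-b</literal> (broker address) options are assumed to behave as in a |
| standard <command>qpid-stat</command> installation; check |
| <command>qpid-stat --help</command> for your version. |
| </para> |
| <programlisting> |
| qpid-stat -q --ha-admin -b node2.example.com |
| </programlisting> |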
| </section> |
| |
| <section id="ha-creating-replicated"> |
| <title>Controlling replication of queues and exchanges</title> |
| <para> |
| By default, queues and exchanges are not replicated automatically. You can change |
| the default behavior by setting the <literal>ha-replicate</literal> configuration |
| option. It has one of the following values: |
| <itemizedlist> |
| <listitem> |
| <para> |
| <firstterm>all</firstterm>: Replicate everything automatically: queues, |
| exchanges, bindings and messages. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <firstterm>configuration</firstterm>: Replicate the existence of queues, |
| exchanges and bindings but don't replicate messages. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <firstterm>none</firstterm>: Don't replicate anything; this is the default. |
| </para> |
| </listitem> |
| </itemizedlist> |
| </para> |
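| <para> |
| As an illustration, a broker that should replicate queues, exchanges, bindings |
| and messages by default could set the option in <filename>qpidd.conf</filename> |
| as sketched below. The value <literal>all</literal> is chosen only for the sake of |
| the example; pick the value that suits your deployment. |
| </para> |
| <programlisting> |
| # Replicate everything unless a queue or exchange says otherwise. |
| ha-replicate=all |
| </programlisting> |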
| <para> |
| You can over-ride the default for a particular queue or exchange by passing the |
| argument <literal>qpid.replicate</literal> when creating the queue or exchange. It |
| takes the same values as <literal>ha-replicate</literal>. |
| </para> |
| <para> |
| Bindings are automatically replicated if the queue and exchange being bound both |
| have replication <literal>all</literal> or <literal>configuration</literal>; they |
| are not replicated otherwise. |
| </para> |
| <para> |
| You can create replicated queues and exchanges with the |
| <command>qpid-config</command> management tool like this: |
| </para> |
| <programlisting> |
| qpid-config add queue myqueue --replicate all |
| </programlisting> |
| <para> |
| To create replicated queues and exchanges via the client API, add a |
| <literal>node</literal> entry to the address like this: |
| </para> |
| <programlisting> |
| "myqueue;{create:always,node:{x-declare:{arguments:{'qpid.replicate':all}}}}" |
| </programlisting> |
| <para> |
| There are some built-in exchanges created automatically by the broker; these |
| exchanges are never replicated. The built-in exchanges are the default (nameless) |
| exchange, the AMQP standard exchanges (<literal>amq.direct, amq.topic, amq.fanout</literal> and |
| <literal>amq.match</literal>) and the management exchanges (<literal>qpid.management, qmf.default.direct</literal> and |
| <literal>qmf.default.topic</literal>). |
| </para> |
| <para> |
| Note that if you bind a replicated queue to one of these exchanges, the |
| binding will <emphasis>not</emphasis> be replicated, so the queue will not |
| have the binding after a fail-over. |
| </para> |
| </section> |
| |
| <section> |
| <title>Client Connection and Fail-over</title> |
| <para> |
| Clients can only connect to the primary broker. Backup brokers |
| automatically reject any connection attempt by a client. |
| </para> |
| <para> |
| Clients are configured with the URL for the cluster (details below for |
| each type of client). There are two possibilities: |
| <itemizedlist> |
| <listitem> |
| <para> |
| The URL contains multiple addresses, one for each broker in the cluster. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| The URL contains a single <firstterm>virtual IP address</firstterm> |
| that is assigned to the primary broker by the resource manager. |
| <footnote><para>Only if the resource manager supports virtual IP |
| addresses</para></footnote> |
| </para> |
| </listitem> |
| </itemizedlist> |
| In the first case, clients will repeatedly re-try each address in the URL |
| until they successfully connect to the primary. In the second case the |
| resource manager will assign the virtual IP address to the primary broker, |
| so clients only need to re-try on a single address. |
| </para> |
| <para> |
| When the primary broker fails, clients re-try all known cluster addresses |
| until they connect to the new primary. The client re-sends any messages |
| that were previously sent but not acknowledged by the broker at the time |
| of the failure. Similarly messages that have been sent by the broker, but |
| not acknowledged by the client, are re-queued. |
| </para> |
| <para> |
| TCP can be slow to detect connection failures. A client can configure a |
| connection to use a <firstterm>heartbeat</firstterm> to detect connection |
| failure, and can specify a time interval for the heartbeat. If heartbeats |
| are in use, failures will be detected no later than twice the heartbeat |
| interval. The following sections explain how to enable heartbeat in each |
| client. |
| </para> |
| <para> |
| See "Cluster Failover" in <citetitle>Programming in Apache |
| Qpid</citetitle> for details on how to keep the client aware of cluster |
| membership. |
| </para> |
| <para> |
| Suppose your cluster has 3 nodes: <literal>node1</literal>, |
| <literal>node2</literal> and <literal>node3</literal> all using the |
| default AMQP port, and you are not using a virtual IP address. To connect |
| a client you need to specify the address(es) and set the |
| <literal>reconnect</literal> property to <literal>true</literal>. The |
| following sub-sections show how to connect each type of client. |
| </para> |
| <section> |
| <title>C++ clients</title> |
| <para> |
| With the C++ client, you specify multiple cluster addresses in a single URL. |
| <footnote> |
| <para> |
| The full grammar for the URL is: |
| </para> |
| <programlisting> |
| url = ["amqp:"][ user ["/" password] "@" ] addr ("," addr)* |
| addr = tcp_addr / rdma_addr / ssl_addr / ... |
| tcp_addr = ["tcp:"] host [":" port] |
| rdma_addr = "rdma:" host [":" port] |
| ssl_addr = "ssl:" host [":" port] |
| </programlisting> |
| </footnote> |
| You also need to specify the connection option |
| <literal>reconnect</literal> to be true. For example: |
| </para> |
| <programlisting> |
| qpid::messaging::Connection c("node1,node2,node3","{reconnect:true}"); |
| </programlisting> |
| <para> |
| Heartbeats are disabled by default. You can enable them by specifying a |
| heartbeat interval (in seconds) for the connection via the |
| <literal>heartbeat</literal> option. For example: |
| <programlisting> |
| qpid::messaging::Connection c("node1,node2,node3","{reconnect:true,heartbeat:10}"); |
| </programlisting> |
| </para> |
| </section> |
| <section> |
| <title>Python clients</title> |
| <para> |
| With the Python client, you specify <literal>reconnect=True</literal> |
| and a list of <replaceable>host:port</replaceable> addresses as |
| <literal>reconnect_urls</literal> when calling |
| <literal>Connection.establish</literal> or |
| <literal>Connection.open</literal>: |
| </para> |
| <programlisting> |
| connection = qpid.messaging.Connection.establish("node1", reconnect=True, reconnect_urls=["node1", "node2", "node3"]) |
| </programlisting> |
| <para> |
| Heartbeats are disabled by default. You can |
| enable them by specifying a heartbeat interval (in seconds) for the |
| connection via the 'heartbeat' option. For example: |
| </para> |
| <programlisting> |
| connection = qpid.messaging.Connection.establish("node1", reconnect=True, reconnect_urls=["node1", "node2", "node3"], heartbeat=10) |
| </programlisting> |
| </section> |
| <section> |
| <title>Java JMS Clients</title> |
| <para> |
| In Java JMS clients, client fail-over is handled automatically if it is |
| enabled in the connection. You can configure a connection to use |
| fail-over using the <command>failover</command> property: |
| </para> |
| |
| <screen> |
| connectionfactory.qpidConnectionfactory = amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672'&amp;failover='failover_exchange' |
| </screen> |
| <para> |
| This property can take three values: |
| </para> |
| <variablelist> |
| <title>Fail-over Modes</title> |
| <varlistentry> |
| <term>failover_exchange</term> |
| <listitem> |
| <para> |
| If the connection fails, fail over to any other broker in the cluster. |
| </para> |
| |
| </listitem> |
| |
| </varlistentry> |
| <varlistentry> |
| <term>roundrobin</term> |
| <listitem> |
| <para> |
| If the connection fails, fail over to one of the brokers specified in the <command>brokerlist</command>. |
| </para> |
| |
| </listitem> |
| |
| </varlistentry> |
| <varlistentry> |
| <term>singlebroker</term> |
| <listitem> |
| <para> |
| Fail-over is not supported; the connection is to a single broker only. |
| </para> |
| |
| </listitem> |
| |
| </varlistentry> |
| |
| </variablelist> |
| <para> |
| In a Connection URL, heartbeat is set using the <command>idle_timeout</command> property, which is an integer corresponding to the heartbeat period in seconds. For instance, the following line from a JNDI properties file sets the heartbeat time out to 3 seconds: |
| </para> |
| |
| <screen> |
| connectionfactory.qpidConnectionfactory = amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672',idle_timeout=3 |
| </screen> |
| </section> |
| </section> |
| |
| <section> |
| <title>Security</title> |
| <para> |
| You can secure your cluster using the authentication and authorization features |
| described in <xref linkend="chap-Messaging_User_Guide-Security"/>. |
| </para> |
| <para> |
| Backup brokers connect to the primary broker and subscribe for management |
| events and queue contents. You can specify the identity used to connect |
| to the primary with the following options: |
| </para> |
| <table frame="all" id="ha-broker-security-options"> |
| <title>Security options for High Availability Messaging Cluster</title> |
| <tgroup align="left" cols="2" colsep="1" rowsep="1"> |
| <colspec colname="c1" colwidth="1*"/> |
| <colspec colname="c2" colwidth="3*"/> |
| <thead> |
| <row> |
| <entry align="center" nameend="c2" namest="c1"> |
| Security options for High Availability Messaging Cluster |
| </entry> |
| </row> |
| </thead> |
| <tbody> |
| <row> |
| <entry> |
| <para><literal>ha-username <replaceable>USER</replaceable></literal></para> |
| <para><literal>ha-password <replaceable>PASS</replaceable></literal></para> |
| <para><literal>ha-mechanism <replaceable>MECH</replaceable></literal></para> |
| </entry> |
| <entry> |
| Authentication settings used by HA brokers to connect to each other. |
| If you are using authorization |
| (<xref linkend="sect-Messaging_User_Guide-Security-Authorization"/>) |
| then this user must have all permissions. |
| </entry> |
| </row> |
| </tbody> |
| </tgroup> |
| </table> |
| <para> |
| This identity is also used to authorize actions taken on the backup broker to replicate |
| from the primary, for example to create queues or exchanges. |
| </para> |
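| <para> |
| A sketch of the corresponding <filename>qpidd.conf</filename> entries is shown |
| below. The user name <literal>qpid-ha</literal> and the password are placeholders, |
| and <literal>PLAIN</literal> is only an assumed SASL mechanism; substitute an |
| identity and mechanism that your brokers are actually configured to accept. |
| </para> |
| <programlisting> |
| # Identity the HA brokers use to connect to each other (placeholder values). |
| ha-username=qpid-ha |
| ha-password=secret |
| ha-mechanism=PLAIN |
| </programlisting> |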
| </section> |
| |
| <section> |
| <title>Integrating with other Cluster Resource Managers</title> |
| <para> |
| To integrate with a different resource manager you must configure it to: |
| <itemizedlist> |
| <listitem><para>Start a qpidd process on each node of the cluster.</para></listitem> |
| <listitem><para>Restart qpidd if it crashes.</para></listitem> |
| <listitem><para>Promote exactly one of the brokers to primary.</para></listitem> |
| <listitem><para>Detect a failure and promote a new primary.</para></listitem> |
| </itemizedlist> |
| </para> |
| <para> |
| The <command>qpid-ha</command> command allows you to check if a broker is primary, |
| and to promote a backup to primary. |
| </para> |
| <para> |
| To test if a broker is the primary: |
| <programlisting> |
| qpid-ha -b <replaceable>broker-address</replaceable> status --expect=primary |
| </programlisting> |
| This command will return 0 if the broker at <replaceable>broker-address</replaceable> |
| is the primary, non-0 otherwise. |
| </para> |
| <para> |
| To promote a broker to primary: |
| <programlisting> |
| qpid-ha -b <replaceable>broker-address</replaceable> promote |
| </programlisting> |
| </para> |
| <para> |
| <command>qpid-ha --help</command> gives information on other commands and options available. |
| You can also use <command>qpid-ha</command> to manually examine and promote brokers. This |
| can be useful for testing failover scenarios without having to set up a full resource manager, |
| or to simulate a cluster on a single node. For deployment, a resource manager is required. |
| </para> |
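| <para> |
| The two commands above can be combined into a small promotion hook for a |
| resource manager. The following shell script is only a sketch: the broker |
| address in <literal>BROKER</literal> is an assumption, and a real integration |
| would follow the conventions of the resource manager in use. |
| </para> |
| <programlisting> |
| #!/bin/sh |
| # Hypothetical promotion hook; BROKER is an assumed address for the local broker. |
| BROKER=localhost:5672 |
| |
| # "status --expect=primary" returns 0 only if the broker is already the primary. |
| if qpid-ha -b "$BROKER" status --expect=primary |
| then |
|     echo "$BROKER is already the primary" |
| else |
|     # Otherwise ask the broker to become the primary. |
|     qpid-ha -b "$BROKER" promote |
| fi |
| </programlisting> |
| <para> |
| Checking the status first keeps the hook safe to run repeatedly, since it does |
| not attempt a second promotion of a broker that is already the primary. |
| </para> |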
| </section> |
| <section id="ha-queue-replication"> |
| <title>Replicating specific queues</title> |
| <para> |
| In addition to the automatic replication performed in a cluster, you can |
| set up replication for specific queues between arbitrary brokers, even if |
| the brokers are not members of a cluster. The command: |
| </para> |
| <programlisting> |
| qpid-ha replicate <replaceable>QUEUE</replaceable> <replaceable>REMOTE-BROKER</replaceable> |
| </programlisting> |
| <para> |
| sets up replication of <replaceable>QUEUE</replaceable> on <replaceable>REMOTE-BROKER</replaceable> to <replaceable>QUEUE</replaceable> on the current broker. |
| </para> |
| <para> |
| Set the configuration option |
| <literal>ha-queue-replication=yes</literal> on both brokers to enable this |
| feature on non-cluster brokers. It is automatically enabled for brokers |
| that are part of a cluster. |
| </para> |
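| <para> |
| For example, assuming two stand-alone brokers on the hypothetical hosts |
| <literal>source.example.com</literal> and <literal>backup.example.com</literal>, both |
| started with <literal>ha-queue-replication=yes</literal>, a command along these |
| lines would replicate a queue named <literal>myqueue</literal> from the first |
| broker to the second. The <literal>-b</literal> option selects the current |
| broker, that is, the one that receives the replica. |
| </para> |
| <programlisting> |
| qpid-ha -b backup.example.com replicate myqueue source.example.com |
| </programlisting> |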
| <para> |
| Note that this feature does not provide automatic fail-over; for that you |
| need to run a cluster. |
| </para> |
| </section> |
| </section> |
| |
| <!-- LocalWords: scalability rgmanager multicast RGManager mailto LVQ qpidd IP dequeued Transactional username |
| --> |