blob: 72fab6450456d5026731b94928b1115d5d5133f0 [file] [log] [blame]
<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "cloudstack.ent">
%BOOK_ENTITIES;
]>
<!-- Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<section id="db-ha">
<title>Database High Availability</title>
<para>To help ensure high availability of the databases that store the internal data for
&PRODUCT;, you can set up database high availability. This covers both the main &PRODUCT;
database and the Usage database. High availability is achieved using the MySQL connector
parameters and two-way high availability. Tested with MySQL 5.1. </para>
<section id="db-ha-howto">
<title>How to Set Up Database High Availability</title>
<para>Database high availability in &PRODUCT; is provided using the MySQL high availability
capabilities. The steps to set up high availability can be found in the MySQL documentation
(links are provided below). It is suggested that you set up two-way high availability, which
involves two database nodes. In this case, for example, you might have node1 and node2. In
Asynchronous high availability configuration, not more than two database nodes are supported. </para>
<mediaobject>
<imageobject>
<imagedata fileref="./images/event-replica.png"/>
</imageobject>
<textobject>
<phrase>event-replica.png: high availability</phrase>
</textobject>
</mediaobject>
<para>References:</para>
<itemizedlist>
<listitem>
<para><ulink url="http://dev.mysql.com/doc/refman/5.0/en/replication-howto.html"
>http://dev.mysql.com/doc/refman/5.0/en/replication-howto.html</ulink></para>
</listitem>
<listitem>
<para><ulink
url="https://wikis.oracle.com/display/CommSuite/MySQL+High+Availability+and+Replication+Information+For+Calendar+Server"
>https://wikis.oracle.com/display/CommSuite/MySQL+High+Availability+and+Replication+Information+For+Calendar+Server</ulink></para>
</listitem>
</itemizedlist>
</section>
<section id="dbha-consider">
<title>Database High Availability Considerations</title>
<itemizedlist>
<listitem>
<para>To clean up bin log files automatically, perform the following configuration in the
<filename>my.cnf</filename> file of each MySQL server:</para>
<para><property>expire_logs_days=10</property> : Number of days to keep the log
files.</para>
<para>
<property>max_binlog_size=100M</property>: The maximum size of each log file.</para>
</listitem>
<listitem>
<para>To change the bin log files location, use the
<property>log-bin=/var/lib/mysql/binlog/bin-log</property> property in the
<filename>my.cnf</filename> file of each MySQL server.</para>
</listitem>
<listitem>
<para>If two Management Servers happen to connect to two different database HA (split
brain), modify the following properties in db.properties of the slave: </para>
<itemizedlist>
<listitem>
<para><property>auto_increment_increment = 10</property>: Instruct the MySQL node to
auto increment values by 10 instead of default value 1.</para>
</listitem>
<listitem>
<para><property>auto_increment_offset = 2</property>: Instruct the MySQL node what is
the starting point of the auto increment column value to be start with. The second
property is relevant only when split brain occurs on a fresh setup.</para>
</listitem>
</itemizedlist>
</listitem>
</itemizedlist>
</section>
<section id="db-ha-configure">
<title>Configuring Database High Availability</title>
<para>To control the database high availability behavior, use the following configuration
settings in the file /etc/cloudstack/management/db.properties.</para>
<para><emphasis role="bold">Required Settings</emphasis></para>
<para>Be sure you have set the following in db.properties:</para>
<itemizedlist>
<listitem>
<para>db.ha.enabled: set to true if you want to use the high availability feature.</para>
<para>Example: db.ha.enabled=true</para>
</listitem>
<listitem>
<para>db.cloud.slaves: set to a comma-delimited set of slave hosts for the cloud database.
This is the list of nodes set up with high availability. The master node is not in the
list, since it is already mentioned elsewhere in the properties file.</para>
<para>Example: db.cloud.slaves=node2,node3,node4</para>
</listitem>
<listitem>
<para>db.usage.slaves: set to a comma-delimited set of slave hosts for the usage database.
This is the list of nodes set up with high availability. The master node is not in the
list, since it is already mentioned elsewhere in the properties file.</para>
<para>Example: db.usage.slaves=node2,node3,node4</para>
</listitem>
</itemizedlist>
<para><emphasis role="bold">Optional Settings</emphasis></para>
<para>The following settings must be present in db.properties, but you are not required to
change the default values unless you wish to do so for tuning purposes:</para>
<itemizedlist>
<listitem>
<para>db.cloud.secondsBeforeRetryMaster: The number of seconds the MySQL connector should
wait before trying again to connect to the master after the master went down. Default is 1
hour. The retry might happen sooner if db.cloud.queriesBeforeRetryMaster is reached
first.</para>
<para>Example: db.cloud.secondsBeforeRetryMaster=3600</para>
</listitem>
<listitem>
<para>db.cloud.queriesBeforeRetryMaster: The minimum number of queries to be sent to the
database before trying again to connect to the master after the master went down. Default
is 5000. The retry might happen sooner if db.cloud.secondsBeforeRetryMaster is reached
first.</para>
<para>Example: db.cloud.queriesBeforeRetryMaster=5000</para>
</listitem>
<listitem>
<para>db.cloud.initialTimeout: Initial time the MySQL connector should wait before trying
again to connect to the master. Default is 3600.</para>
<para>Example: db.cloud.initialTimeout=3600</para>
</listitem>
</itemizedlist>
</section>
<section id="aync-conf">
<title>Asynchronous Configuration for Database High Availability</title>
<para>The MySQL configuration to support DB HA in MySQL server is goes into the
<filename>/etc/my.cnf</filename> file and configuration slightly varies between master and
slave.</para>
<section id="master-conf">
<title>Master Configuration</title>
<programlisting>[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
# Settings user and group are ignored when systemd is used.
# If you need to run mysqld under a different user or group,
# customize your systemd unit file for mysqld according to the
# instructions in http://fedoraproject.org/wiki/Systemd
server-id=1
default-storage-engine=InnoDB
character-set-server=utf8
transaction-isolation=READ-COMMITTED
log-bin=mysql-bin
innodb_flush_log_at_trx_commit=1
sync_binlog=1
binlog-format=ROW
#Bin logs cleanup configuration
expiry_logs_days=10
max_binlog_size=100M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid</programlisting>
</section>
<section id="slave-conf">
<title>Slave Configuration</title>
<programlisting>[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
# Settings user and group are ignored when systemd is used.
# If you need to run mysqld under a different user or group,
# customize your systemd unit file for mysqld according to the
# instructions in http://fedoraproject.org/wiki/Systemd
server-id=2
default-storage-engine = InnoDB
character-set-server = utf8
transaction-isolation = READ-COMMITTED
log-bin=mysql-bin
innodb_flush_log_at_trx_commit=1
sync_binlog=1
binlog-format=ROW
#Parameters to solve split brain problem
<emphasis role="bold">auto_increment_increment=10
auto_increment_offset=2</emphasis>
#Bin logs cleanup configuration
expiry_logs_days=10
max_binlog_size=100M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid</programlisting>
</section>
</section>
<section id="db-ha-limitations">
<title>Limitations on Database High Availability</title>
<para>The following limitations exist in the current implementation of this feature.</para>
<itemizedlist>
<listitem>
<para>Slave hosts can not be monitored through &PRODUCT;. You will need to have a separate
means of monitoring.</para>
</listitem>
<listitem>
<para>Events from the database side are not integrated with the &PRODUCT; Management Server
events system.</para>
</listitem>
<listitem>
<para>MySQL 5.1 supports only Asynchronous high availability; therefore, there is a chance
of data inconsistency while server is down. </para>
</listitem>
</itemizedlist>
</section>
</section>