= CDCR Configuration
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
The Source and Target configurations differ when the data centers are in separate clusters. "Cluster" here means separate ZooKeeper ensembles controlling disjoint Solr instances. Whether these data centers are physically separated is immaterial for this discussion.
[WARNING]
.CDCR is deprecated
====
This feature (in its current form) is deprecated and will be removed in 9.0.
See <<cross-data-center-replication-cdcr.adoc#,Cross Data Center Replication>> for more details.
====
As described in the section <<cdcr-architecture.adoc#,CDCR Architecture>>, two approaches are supported: uni-directional updates and bi-directional updates.
All CDCR configuration is done in the `solrconfig.xml` file. Because this is a per-collection configuration file, CDCR must be configured separately for each collection.
== Uni-Directional Updates
=== Source Configuration
Here is a sample Source configuration, a section of `solrconfig.xml`. The presence of the `replica` list causes CDCR to use this cluster as the Source; it should not be present in the Target collections. Details about each setting follow the two examples. This Source example does not set the buffer state, so the default (`enabled`) applies; see <<The Buffer Element>> for how to disable buffering at startup:
[source,xml]
----
<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <lst name="replica">
    <str name="zkHost">10.240.18.211:2181,10.240.18.212:2181</str>
    <!--
    If you have chrooted your Solr information at the target you must include the chroot, for example:
    <str name="zkHost">10.240.18.211:2181,10.240.18.212:2181/solr</str>
    -->
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>
  <lst name="replicator">
    <str name="threadPoolSize">8</str>
    <str name="schedule">1000</str>
    <str name="batchSize">128</str>
  </lst>
  <lst name="updateLogSynchronizer">
    <str name="schedule">1000</str>
  </lst>
</requestHandler>

<!-- Modify the <updateLog> section of your existing <updateHandler>
     in your config as below -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog class="solr.CdcrUpdateLog">
    <str name="dir">${solr.ulog.dir:}</str>
    <!--Any parameters from the original <updateLog> section -->
  </updateLog>
  <!-- Other configuration options such as autoCommit should still be present -->
</updateHandler>
----
=== Target Configuration
Here is a typical Target configuration.
The Target instance must configure an update processor chain that is specific to CDCR. The update processor chain must include the `CdcrUpdateProcessorFactory`. The task of this processor is to ensure that the version numbers attached to update requests coming from a CDCR Source SolrCloud are reused and not overwritten by the Target. A properly configured Target looks similar to this:
[source,xml]
----
<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <!-- recommended for Target clusters -->
  <lst name="buffer">
    <str name="defaultState">disabled</str>
  </lst>
</requestHandler>

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">cdcr-processor-chain</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="cdcr-processor-chain">
  <processor class="solr.CdcrUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<!-- Modify the <updateLog> section of your existing <updateHandler> in your
     config as below -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog class="solr.CdcrUpdateLog">
    <str name="dir">${solr.ulog.dir:}</str>
    <!--Any parameters from the original <updateLog> section -->
  </updateLog>
  <!-- Other configuration options such as autoCommit should still be present -->
</updateHandler>
----
== Bi-Directional Updates
The configurations of Cluster 1 and Cluster 2 are identical except for the `zkHost` string: each cluster's `solrconfig.xml` specifies the ZooKeeper address of the other cluster.
TIP: Both Cluster 1 and Cluster 2 can act as Source and Target at any given point of time but a cluster cannot be both Source and Target at the same time.
=== Cluster 1 Configuration
Here is a sample Cluster 1 configuration, a section of `solrconfig.xml`. The Cluster 2 `zkHost` string is specified in the `CdcrRequestHandler` declaration:
[source,xml]
----
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">cdcr-processor-chain</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="cdcr-processor-chain">
  <processor class="solr.CdcrUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <lst name="replica">
    <str name="zkHost">10.240.19.241:2181,10.240.19.242:2181</str>
    <!--
    If you have chrooted your Solr information at the target you must include the chroot, for example:
    <str name="zkHost">10.240.19.241:2181,10.240.19.242:2181/solr</str>
    -->
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>
  <lst name="replicator">
    <str name="threadPoolSize">8</str>
    <str name="schedule">1000</str>
    <str name="batchSize">128</str>
  </lst>
  <lst name="updateLogSynchronizer">
    <str name="schedule">1000</str>
  </lst>
</requestHandler>

<!-- Modify the <updateLog> section of your existing <updateHandler>
     in your config as below -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog class="solr.CdcrUpdateLog">
    <str name="dir">${solr.ulog.dir:}</str>
    <!--Any parameters from the original <updateLog> section -->
  </updateLog>
</updateHandler>
----
=== Cluster 2 Configuration
The configuration of Cluster 2 is identical to that of Cluster 1, except that the Cluster 1 `zkHost` string is specified in the `CdcrRequestHandler` definition:
[source,xml]
----
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">cdcr-processor-chain</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="cdcr-processor-chain">
  <processor class="solr.CdcrUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <lst name="replica">
    <str name="zkHost">10.250.18.211:2181,10.250.18.212:2181</str>
    <!--
    If you have chrooted your Solr information at the target you must include the chroot, for example:
    <str name="zkHost">10.250.18.211:2181,10.250.18.212:2181/solr</str>
    -->
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>
  <lst name="replicator">
    <str name="threadPoolSize">8</str>
    <str name="schedule">1000</str>
    <str name="batchSize">128</str>
  </lst>
  <lst name="updateLogSynchronizer">
    <str name="schedule">1000</str>
  </lst>
</requestHandler>

<!-- Modify the <updateLog> section of your existing <updateHandler>
     in your config as below -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog class="solr.CdcrUpdateLog">
    <str name="dir">${solr.ulog.dir:}</str>
    <!--Any parameters from the original <updateLog> section -->
  </updateLog>
</updateHandler>
----
== CDCR Configuration Parameters
The configuration details, defaults and options are as follows:
=== The Replica Element
CDCR can be configured to forward update requests to one or more Target collections. A Target collection is defined with a “replica” list as follows:
`zkHost`::
The host address for ZooKeeper of the Target SolrCloud. Usually this is a comma-separated list of addresses to each node in the Target ZooKeeper ensemble. This parameter is required.
`source`::
The name of the collection on the Source SolrCloud to be replicated. This parameter is required.
`target`::
The name of the collection on the Target SolrCloud to which updates will be forwarded. This parameter is required.
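
Because updates can be forwarded to more than one Target collection, the request handler can plausibly carry one “replica” list per Target. The following is a minimal sketch under that assumption; the ZooKeeper host names (`zk1.dc2.example.com` and so on) are hypothetical placeholders, and the collection names are reused from the samples above:

[source,xml]
----
<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <!-- First Target cluster (hypothetical ZooKeeper addresses) -->
  <lst name="replica">
    <str name="zkHost">zk1.dc2.example.com:2181,zk2.dc2.example.com:2181</str>
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>
  <!-- Second Target cluster (hypothetical ZooKeeper addresses) -->
  <lst name="replica">
    <str name="zkHost">zk1.dc3.example.com:2181,zk2.dc3.example.com:2181</str>
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>
</requestHandler>
----

Each list carries all three required parameters, so every Target is fully described on its own.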
=== The Replicator Element
The CDC Replicator is the component in charge of forwarding updates to the replicas. The replicator will monitor the update logs of the Source collection and will forward any new updates to the Target collection.
The replicator uses a fixed thread pool to forward updates to multiple replicas in parallel. If more than one replica is configured, one thread will forward a batch of updates from one replica at a time in a round-robin fashion. The replicator can be configured with a “replicator” list as follows:
`threadPoolSize`::
The number of threads to use for forwarding updates. One thread per replica is recommended. The default is `2`.
`schedule`::
The delay in milliseconds for monitoring the update log(s). The default is `10`.
`batchSize`::
The number of updates to send in one batch. The optimal size depends on the size of the documents. Large batches of large documents can increase your memory usage significantly. The default is `128`.
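
For comparison with the samples above, which use larger values, here is a sketch of a “replicator” list that spells out the documented defaults. It assumes a Source with two configured replicas, so the default of two threads matches the one-thread-per-replica recommendation. The list belongs inside the `/cdcr` request handler:

[source,xml]
----
<lst name="replicator">
  <!-- One thread per configured replica is recommended; 2 is the default -->
  <str name="threadPoolSize">2</str>
  <!-- Check the update log(s) every 10 ms (the default) -->
  <str name="schedule">10</str>
  <!-- Send up to 128 updates per batch (the default); lower this for large documents -->
  <str name="batchSize">128</str>
</lst>
----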
=== The updateLogSynchronizer Element
Expert: Non-leader nodes need to synchronize their update logs with their leader node from time to time in order to clean up obsolete transaction log files. By default, this synchronization is performed every minute. The schedule can be modified with an “updateLogSynchronizer” list as follows:
`schedule`::
The delay in milliseconds for synchronizing the update logs. The default is `60000`.

TIP: If the updateLogSynchronizer element is omitted from the Source cluster, transaction logs may accumulate on non-leaders.
=== The Buffer Element
When buffering is enabled, the update logs store all updates indefinitely. It is best to disable buffering on both the Source and Target clusters during normal operation, because the Update Logs grow without limit while buffering is enabled. Enabling buffering is intended for special maintenance periods. Buffering can be disabled at startup with a “buffer” list and the parameter “defaultState” as follows:
`defaultState`::
The state of the buffer at startup. The default is `enabled`.
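
The Target sample above already disables the buffer at startup. As a sketch, the same “buffer” list could be added to a Source handler next to its “replica” list; the `zkHost` and collection values below are copied from the uni-directional Source sample, and the replicator and updateLogSynchronizer lists are left out for brevity:

[source,xml]
----
<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <lst name="replica">
    <str name="zkHost">10.240.18.211:2181,10.240.18.212:2181</str>
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>
  <!-- Start with buffering disabled instead of the default (enabled) -->
  <lst name="buffer">
    <str name="defaultState">disabled</str>
  </lst>
</requestHandler>
----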
[TIP]
.Buffering should be enabled only for maintenance windows
====
Buffering is designed to augment maintenance windows. The following points should be kept in mind:
* When buffering is enabled, the Update Logs will grow without limit; they will never be purged.
* During normal operation, the Update Logs will automatically accrue on the Source data center if the Target data center is unavailable; it is not necessary to enable buffering for CDCR to handle routine network disruptions.
** For this reason, monitoring disk usage on the Source data center is recommended as an additional check that the Target data center is receiving updates.
* For uni-directional updates, buffering should _not_ be enabled on the Target data center as Update Logs would accrue without limit.
* If buffering is enabled and then disabled, the Update Logs will be removed when their contents have been sent to the Target data center. This process may take some time and is triggered by additional updates to the Source cluster.
** Update Log cleanup is not triggered until a new update is sent to the Source data center.
====
== Initial Startup
=== Uni-Directional Approach
This is a general approach for initializing CDCR in a production environment. It's based upon an approach taken by the initial working installation of CDCR and generously contributed to illustrate a "real world" scenario.
* CDCR is used to keep a remote disaster-recovery instance available for production backup.
* This example has 26 clouds with 200 million assets per cloud (15 GB indexes). Total document count is over 4.8 billion.
** Source and Target clouds were synced in 2-3 hour maintenance windows to establish the base index for the Targets.
As usual, it is good to start small. Sync a single cloud and monitor for a period of time before doing the others. You may need to adjust your settings several times before finding the right balance.
* Before starting, stop or pause the indexers. This is best done during a small maintenance window.
* Stop the SolrCloud instances at the Source.
* Upload the modified `solrconfig.xml` to ZooKeeper on both Source and Target as appropriate, see the examples above.
* Sync the index directories from the Source collection to the Target collection, copying each one to its corresponding shard node. `rsync` works well for this.
+
For example, if there are two shards on collection1 with 2 replicas for each shard, copy the corresponding index directories from:
+
[width="75%",cols="45,10,45"]
|===
|shard1replica1Source |to |shard1replica1Target
|shard1replica2Source |to |shard1replica2Target
|shard2replica1Source |to |shard2replica1Target
|shard2replica2Source |to |shard2replica2Target
|===
* Start ZooKeeper on the Target (DR).
* Start SolrCloud on the Target (DR).
* Start ZooKeeper on the Source.
* Start SolrCloud on the Source. As a general rule, the Target (DR) should be started before the Source.
* Activate CDCR on the Source instance using the CDCR API:
+
[source,text]
http://host:port/solr/<collection_name>/cdcr?action=START
+
There is no need to run the `/cdcr?action=START` command on the Target.
* Disable the buffer on the Target and Source:
+
[source,text]
http://host:port/solr/<collection_name>/cdcr?action=DISABLEBUFFER
+
* Re-enable indexing.
=== Bi-Directional Approach
[TIP]
====
When using the bi-directional approach, it is highly recommended to enable CDCR on both cluster-collections before any indexing has taken place.
====
Using the same example as in the uni-directional approach, let's walk through the necessary steps:
* Before you begin, stop or pause any indexing processes. This is best done during a small maintenance window.
* Stop the SolrCloud instances in both Cluster 1 and Cluster 2.
* Upload the modified `solrconfig.xml` to ZooKeeper on both Cluster 1 and Cluster 2 as appropriate, see the examples above in the section <<Bi-Directional Updates>>.
* If documents were indexed prior to this exercise, sync the index directories from the Cluster 1 collection to the Cluster 2 collection to the corresponding shard nodes or vice versa. The `rsync` utility works well for this if it's available on your server. Check to be sure the updated index is copied across.
+
For example, if there are 2 shards on collection 'cluster1' (the updated collection) with 2 replicas for each shard, copy the corresponding index directories from:
+
[width="75%",cols="45,10,45"]
|===
|shard1replica1cluster1 |to |shard1replica1cluster2
|shard1replica2cluster1 |to |shard1replica2cluster2
|shard2replica1cluster1 |to |shard2replica1cluster2
|shard2replica2cluster1 |to |shard2replica2cluster2
|===
* Start ZooKeeper on Cluster 1.
* Start ZooKeeper on Cluster 2.
* Start SolrCloud on Cluster 1.
* Start SolrCloud on Cluster 2.
* If not present, create respective collections in both Cluster 1 and Cluster 2.
* Activate CDCR on both Cluster 1 and Cluster 2 using the CDCR API:
+
[source,text]
http://host:port/solr/<collection_name>/cdcr?action=START
+
* Disable the buffer on Cluster 1 and Cluster 2:
+
[source,text]
http://host:port/solr/<collection_name>/cdcr?action=DISABLEBUFFER
+
* Re-enable indexing.
== ZooKeeper Settings
With CDCR, the Target ZooKeepers will have connections from the Target clouds and the Source clouds. You may need to increase the `maxClientCnxns` setting in `zoo.cfg`.
[source,text]
----
## Allow up to 800 concurrent connections from a single client (IP address)
## maxClientCnxns=0 means no limit
maxClientCnxns=800
----