= Cross Data Center Replication (CDCR)
:page-children: cdcr-architecture, cdcr-config, cdcr-operations, cdcr-api
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
Cross Data Center Replication (CDCR) allows you to create multiple SolrCloud data centers and keep them in sync.
[WARNING]
.CDCR is deprecated
====
This feature (in its current form) is deprecated and will be removed in 9.0.
Anyone currently using CDCR should consider migrating away from it.
There are several open issues which make CDCR complex to maintain and generally unstable.
The simplest alternative to CDCR is to manually maintain two Solr implementations in separate data centers and manage them as completely separate installations.
While this may require more initial setup, it will be more stable in the long run.
There are other alternatives, and the Solr community is working to identify the best recommended replacement in time for 9.0.
====
== What is CDCR?
The <<solrcloud.adoc#,SolrCloud>> architecture is designed to support <<near-real-time-searching.adoc#,Near Real Time (NRT)>> searches on a Solr collection that usually consists of multiple nodes in a single data center. CDCR augments this model by forwarding updates from a Solr collection in one data center to a parallel Solr collection in another data center where the network latencies are greater than the SolrCloud model was designed to accommodate.
For more information about CDCR, see the following sections:
* *<<cdcr-architecture.adoc#,CDCR Architecture>>*: A detailed overview of how CDCR works.
* *<<cdcr-config.adoc#,CDCR Configuration>>*: How to set up and initialize CDCR for your cluster.
* *<<cdcr-operations.adoc#,CDCR Operations>>*: Information on monitoring CDCR and upgrading your cluster when using CDCR.
* *<<cdcr-api.adoc#,CDCR API>>*: Reference for the CDCR API.
// Are there any terms here that are new? If not, I think we should remove this.
== CDCR Glossary
For the purposes of discussing CDCR, the following terminology is used. If you already know SolrCloud, many of these terms will be familiar to you.
[glossary]
Node:: A JVM instance running Solr; a server.
Cluster:: A set of Solr nodes managed as a single unit by a ZooKeeper ensemble hosting one or more Collections.
Data Center:: A group of networked servers hosting a Solr cluster. For CDCR, the terms _Cluster_ and _Data Center_ are interchangeable as we assume that each Solr cluster is hosted in a different group of networked servers.
Shard:: A sub-index of a single logical collection. This may be spread across multiple nodes of the cluster. Each shard can have 1-N replicas.
Leader:: Each shard has one replica identified as its leader. All the writes for documents belonging to a shard are routed through the leader.
Replica:: A copy of a shard for use in failover or load balancing. Replicas comprising a shard can either be leaders or non-leaders.
Follower:: A convenience term for a replica that is _not_ the leader of a shard.
Collection:: A logical index, consisting of one or more shards. A cluster can have multiple collections.
Update:: An operation that changes the collection's index in any way. This could be adding a new document, deleting documents or changing a document.
Update Log(s):: An append-only log of write operations maintained by each node.
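
To make the relationship between updates, update logs, and forwarding more concrete, the following is a minimal toy sketch in Python. It is not Solr code: the names `UpdateLog`, `ShardLeader`, and `forward`, and the integer checkpoint used to track what has been forwarded, are all hypothetical and exist only for illustration. It assumes a single shard leader in each of two data centers.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateLog:
    """Toy append-only log of write operations (hypothetical, not a Solr class)."""
    entries: list = field(default_factory=list)

    def append(self, op: str, doc_id: str) -> None:
        # Entries are only ever added at the end, never modified or removed.
        self.entries.append((op, doc_id))

    def read_from(self, position: int) -> list:
        # Return every entry recorded at or after the given position.
        return self.entries[position:]


@dataclass
class ShardLeader:
    """Toy shard leader: records each update in its log, then applies it."""
    name: str
    index: set = field(default_factory=set)
    log: UpdateLog = field(default_factory=UpdateLog)

    def apply(self, op: str, doc_id: str) -> None:
        self.log.append(op, doc_id)
        if op == "add":
            self.index.add(doc_id)
        elif op == "delete":
            self.index.discard(doc_id)


def forward(source: ShardLeader, target: ShardLeader, checkpoint: int) -> int:
    """Replay source updates logged since `checkpoint` on the target.

    Returns the new checkpoint (an assumption of this sketch).
    """
    pending = source.log.read_from(checkpoint)
    for op, doc_id in pending:
        target.apply(op, doc_id)
    return checkpoint + len(pending)


source = ShardLeader("dc1-leader")
target = ShardLeader("dc2-leader")
source.apply("add", "doc1")
source.apply("add", "doc2")
source.apply("delete", "doc1")
checkpoint = forward(source, target, 0)
```

After the call to `forward`, the target holds the same documents as the source, and calling `forward` again with the returned checkpoint replays nothing. Real CDCR behavior is described in the CDCR Architecture section; this sketch only illustrates the glossary terms.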