blob: 114d508da019247f43c76820c2a9963a61c07359 [file] [log] [blame]
= Cross Data Center Replication (CDCR)
:page-children: cdcr-architecture, cdcr-config, cdcr-operations, cdcr-api
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
Cross Data Center Replication (CDCR) allows you to create multiple SolrCloud data centers and keep them in sync.
== What is CDCR?
The <<solrcloud.adoc#solrcloud,SolrCloud>> architecture is designed to support <<near-real-time-searching.adoc#near-real-time-searching,Near Real Time (NRT)>> searches on a Solr collection that usual consists of multiple nodes in a single data center. CDCR augments this model by forwarding updates from a Solr collection in one data center to a parallel Solr collection in another data center where the network latencies are greater than the SolrCloud model was designed to accommodate.
For more information about CDCR, see the following sections:
* *<<cdcr-architecture.adoc#cdcr-architecture,CDCR Architecture>>*: A detailed overview of how CDCR works.
* *<<cdcr-config.adoc#cdcr-config,CDCR Configuration>>*: How to set up and initialize CDCR for your cluster.
* *<<cdcr-operations.adoc#cdcr-operations,CDCR Operations>>*: Information on monitoring CDCR and upgrading your cluster when using CDCR.
* *<<cdcr-api.adoc#cdcr-api,CDCR API>>*: Reference for the CDCR API.
// Are there any terms here that are new? If not, I think we should remove this.
== CDCR Glossary
For the purposes of discussing CDCR, the following terminology is used. If you are already familiar with SolrCloud, many of these terms will already be familiar to you.
[glossary]
Node:: A JVM instance running Solr; a server.
Cluster:: A set of Solr nodes managed as a single unit by a ZooKeeper ensemble hosting one or more Collections.
Data Center:: A group of networked servers hosting a Solr cluster. For CDCR, the terms _Cluster_ and _Data Center_ are interchangeable as we assume that each Solr cluster is hosted in a different group of networked servers.
Shard:: A sub-index of a single logical collection. This may be spread across multiple nodes of the cluster. Each shard can have 1-N replicas.
Leader:: Each shard has replica identified as its leader. All the writes for documents belonging to a shard are routed through the leader.
Replica:: A copy of a shard for use in failover or load balancing. Replicas comprising a shard can either be leaders or non-leaders.
Follower:: A convenience term for a replica that is _not_ the leader of a shard.
Collection:: A logical index, consisting of one or more shards. A cluster can have multiple collections.
Update:: An operation that changes the collection's index in any way. This could be adding a new document, deleting documents or changing a document.
Update Log(s):: An append-only log of write operations maintained by each node.