solr/solr-ref-guide/src/solrcloud-query-routing-and-read-tolerance.adoc - lucene-solr - Git at Google

 = SolrCloud Query Routing And Read Tolerance
 // Licensed to the Apache Software Foundation (ASF) under one
 // or more contributor license agreements.  See the NOTICE file
 // distributed with this work for additional information
 // regarding copyright ownership.  The ASF licenses this file
 // to you under the Apache License, Version 2.0 (the
 // "License"); you may not use this file except in compliance
 // with the License.  You may obtain a copy of the License at
 //
 //   http://www.apache.org/licenses/LICENSE-2.0
 //
 // Unless required by applicable law or agreed to in writing,
 // software distributed under the License is distributed on an
 // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 // KIND, either express or implied.  See the License for the
 // specific language governing permissions and limitations
 // under the License.

 SolrCloud is highly available and fault tolerant in reads and writes.


 == Read Side Fault Tolerance

 In a SolrCloud cluster each individual node load balances read requests across all the replicas in a collection. You still need a load balancer on the 'outside' that talks to the cluster, or you need a smart client which understands how to read and interact with Solr's metadata in ZooKeeper and only requests the ZooKeeper ensemble's address to start discovering to which nodes it should send requests. (Solr provides a smart Java SolrJ client called {solr-javadocs}/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrClient.html[CloudSolrClient].)

 Even if some nodes in the cluster are offline or unreachable, a Solr node will be able to correctly respond to a search request as long as it can communicate with at least one replica of every shard, or one replica of every _relevant_ shard if the user limited the search via the `shards` or `\_route_` parameters. The more replicas there are of every shard, the more likely that the Solr cluster will be able to handle search results in the event of node failures.

 == Query Parameters for Query Routing

 === zkConnected Parameter

 A Solr node will return the results of a search request as long as it can communicate with at least one replica of every shard that it knows about, even if it can _not_ communicate with ZooKeeper at the time it receives the request. This is normally the preferred behavior from a fault tolerance standpoint, but may result in stale or incorrect results if there have been major changes to the collection structure that the node has not been informed of via ZooKeeper (i.e., shards may have been added or removed, or split into sub-shards).

 A `zkConnected` header is included in every search response indicating if the node that processed the request was connected with ZooKeeper at the time:

 .Solr Response with zkConnected
 [source,json]
 ----
 {
   "responseHeader": {
     "status": 0,
     "zkConnected": true,
     "QTime": 20,
     "params": {
       "q": "*:*"
     }
   },
   "response": {
     "numFound": 107,
     "start": 0,
     "docs": [ "..." ]
   }
 }
 ----

 To prevent stale or incorrect results in the event that the request-serving node can't communicate with ZooKeeper, set the <<shards-tolerant-parameter,`shards.tolerant`>> parameter to `requireZkConnected`.  This will cause requests to fail rather than setting a `zkConnected` header to `false`.

 === shards Parameter

 By default, SolrCloud will run searches on all shards and combine the results if the shards parameter is not specified. You can specify one or more shard names as the value of the `shards` parameter to limit the shards that you want to search against.

 [source,plain]
 ----
 http://localhost:8983/solr/collection1/select?q=*:*&shards=shard1
 http://localhost:8983/solr/collection1/select?q=*:*&shards=shard2,shard3
 ----

 === shards.tolerant Parameter

 In the event that one or more shards queried are unavailable, then Solr's default behavior is to fail the request. However, there are many use-cases where partial results are acceptable and so Solr provides a boolean `shards.tolerant` parameter (default `false`).  In addition to `true` and `false`, `shards.tolerant` may also be set to `requireZkConnected` - see below.

 If `shards.tolerant=true` then partial results may be returned. If the returned response does not contain results from all the appropriate shards then the response header contains a special flag called `partialResults`.

 If `shards.tolerant=requireZkConnected` and the node serving the search request cannot communicate with ZooKeeper, the request will fail, rather than returning potentially stale or incorrect results.  This will also cause requests to fail when one or more queried shards are completely unavailable, just like when `shards.tolerant=false`.

 The client can specify '<<distributed-search-with-index-sharding.adoc#,`shards.info`>>' along with the `shards.tolerant` parameter to retrieve more fine-grained details.

 Example response with `partialResults` flag set to `true`:

 .Solr Response with partialResults
 [source,json]
 ----
 {
   "responseHeader": {
     "status": 0,
     "zkConnected": true,
     "partialResults": true,
     "QTime": 20,
     "params": {
       "q": "*:*"
     }
   },
   "response": {
     "numFound": 77,
     "start": 0,
     "docs": [ "..." ]
   }
 }
 ----

 === collection Parameter

 The `collection` parameter allows you to specify a collection or a number of collections on which the query should be executed. This allows you to query multiple collections at once and all the feature of Solr which work in a distributed manner can work across collections.

 [source,plain]
 ----
 http://localhost:8983/solr/collection1/select?collection=collection1,collection2,collection3
 ----

 === \_route_ Parameter

 The `\_route_` parameter can be used to specify a route key which is used to figure out the corresponding shards. For example, if you have a document with a unique key "user1!123", then specifying the route key as "_route_=user1!" (notice the trailing '!' character) will route the request to the shard which hosts that user. You can specify multiple route keys separated by comma.
 This parameter can be leveraged when we have shard data by users. See '<<shards-and-indexing-data-in-solrcloud.adoc#document-routing,`Document Routing`>>' for more information

 [source,plain]
 ----
 http://localhost:8983/solr/collection1/select?q=*:*&_route_=user1!
 http://localhost:8983/solr/collection1/select?q=*:*&_route_=user1!,user2!
 ----

 == Distributed Tracing and Debugging

 The `debug` parameter with a value of `track` can be used to trace the request as well as find timing information for each phase of a distributed request.


 == Optimization

 === distrib.singlePass Parameter

 If set to `true`, the `distrib.singlePass` parameter changes the distributed search algorithm to fetch all requested stored fields from each shard in the first phase itself. This eliminates the need for making a second request to fetch the stored fields.

 This can be faster when requesting a very small number of fields containing small values. However, if large fields are requested or if a lot of fields are requested then the overhead of fetching them over the network from all shards can make the request slower as compared to the normal distributed search path.

 Note that this optimization only applies to distributed search. Certain features such as faceting may make additional network requests for refinements, etc.
	= SolrCloud Query Routing And Read Tolerance
	// Licensed to the Apache Software Foundation (ASF) under one
	// or more contributor license agreements. See the NOTICE file
	// distributed with this work for additional information
	// regarding copyright ownership. The ASF licenses this file
	// to you under the Apache License, Version 2.0 (the
	// "License"); you may not use this file except in compliance
	// with the License. You may obtain a copy of the License at
	//
	// http://www.apache.org/licenses/LICENSE-2.0
	//
	// Unless required by applicable law or agreed to in writing,
	// software distributed under the License is distributed on an
	// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	// KIND, either express or implied. See the License for the
	// specific language governing permissions and limitations
	// under the License.

	SolrCloud is highly available and fault tolerant in reads and writes.


	== Read Side Fault Tolerance

	In a SolrCloud cluster each individual node load balances read requests across all the replicas in a collection. You still need a load balancer on the 'outside' that talks to the cluster, or you need a smart client which understands how to read and interact with Solr's metadata in ZooKeeper and only requests the ZooKeeper ensemble's address to start discovering to which nodes it should send requests. (Solr provides a smart Java SolrJ client called {solr-javadocs}/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrClient.html[CloudSolrClient].)

	Even if some nodes in the cluster are offline or unreachable, a Solr node will be able to correctly respond to a search request as long as it can communicate with at least one replica of every shard, or one replica of every _relevant_ shard if the user limited the search via the `shards` or `\_route_` parameters. The more replicas there are of every shard, the more likely that the Solr cluster will be able to handle search results in the event of node failures.

	== Query Parameters for Query Routing

	=== zkConnected Parameter

	A Solr node will return the results of a search request as long as it can communicate with at least one replica of every shard that it knows about, even if it can _not_ communicate with ZooKeeper at the time it receives the request. This is normally the preferred behavior from a fault tolerance standpoint, but may result in stale or incorrect results if there have been major changes to the collection structure that the node has not been informed of via ZooKeeper (i.e., shards may have been added or removed, or split into sub-shards).

	A `zkConnected` header is included in every search response indicating if the node that processed the request was connected with ZooKeeper at the time:

	.Solr Response with zkConnected
	[source,json]
	----
	{
	"responseHeader": {
	"status": 0,
	"zkConnected": true,
	"QTime": 20,
	"params": {
	"q": ":"
	}
	},
	"response": {
	"numFound": 107,
	"start": 0,
	"docs": [ "..." ]
	}
	}
	----

	To prevent stale or incorrect results in the event that the request-serving node can't communicate with ZooKeeper, set the <<shards-tolerant-parameter,`shards.tolerant`>> parameter to `requireZkConnected`. This will cause requests to fail rather than setting a `zkConnected` header to `false`.

	=== shards Parameter

	By default, SolrCloud will run searches on all shards and combine the results if the shards parameter is not specified. You can specify one or more shard names as the value of the `shards` parameter to limit the shards that you want to search against.

	[source,plain]
	----
	http://localhost:8983/solr/collection1/select?q=:&shards=shard1
	http://localhost:8983/solr/collection1/select?q=:&shards=shard2,shard3
	----

	=== shards.tolerant Parameter

	In the event that one or more shards queried are unavailable, then Solr's default behavior is to fail the request. However, there are many use-cases where partial results are acceptable and so Solr provides a boolean `shards.tolerant` parameter (default `false`). In addition to `true` and `false`, `shards.tolerant` may also be set to `requireZkConnected` - see below.

	If `shards.tolerant=true` then partial results may be returned. If the returned response does not contain results from all the appropriate shards then the response header contains a special flag called `partialResults`.

	If `shards.tolerant=requireZkConnected` and the node serving the search request cannot communicate with ZooKeeper, the request will fail, rather than returning potentially stale or incorrect results. This will also cause requests to fail when one or more queried shards are completely unavailable, just like when `shards.tolerant=false`.

	The client can specify '<<distributed-search-with-index-sharding.adoc#,`shards.info`>>' along with the `shards.tolerant` parameter to retrieve more fine-grained details.

	Example response with `partialResults` flag set to `true`:

	.Solr Response with partialResults
	[source,json]
	----
	{
	"responseHeader": {
	"status": 0,
	"zkConnected": true,
	"partialResults": true,
	"QTime": 20,
	"params": {
	"q": ":"
	}
	},
	"response": {
	"numFound": 77,
	"start": 0,
	"docs": [ "..." ]
	}
	}
	----

	=== collection Parameter

	The `collection` parameter allows you to specify a collection or a number of collections on which the query should be executed. This allows you to query multiple collections at once and all the feature of Solr which work in a distributed manner can work across collections.

	[source,plain]
	----
	http://localhost:8983/solr/collection1/select?collection=collection1,collection2,collection3
	----

	=== \_route_ Parameter

	The `\_route_` parameter can be used to specify a route key which is used to figure out the corresponding shards. For example, if you have a document with a unique key "user1!123", then specifying the route key as "_route_=user1!" (notice the trailing '!' character) will route the request to the shard which hosts that user. You can specify multiple route keys separated by comma.
	This parameter can be leveraged when we have shard data by users. See '<<shards-and-indexing-data-in-solrcloud.adoc#document-routing,`Document Routing`>>' for more information

	[source,plain]
	----
	http://localhost:8983/solr/collection1/select?q=:&_route_=user1!
	http://localhost:8983/solr/collection1/select?q=:&_route_=user1!,user2!
	----

	== Distributed Tracing and Debugging

	The `debug` parameter with a value of `track` can be used to trace the request as well as find timing information for each phase of a distributed request.


	== Optimization

	=== distrib.singlePass Parameter

	If set to `true`, the `distrib.singlePass` parameter changes the distributed search algorithm to fetch all requested stored fields from each shard in the first phase itself. This eliminates the need for making a second request to fetch the stored fields.

	This can be faster when requesting a very small number of fields containing small values. However, if large fields are requested or if a lot of fields are requested then the overhead of fetching them over the network from all shards can make the request slower as compared to the normal distributed search path.

	Note that this optimization only applies to distributed search. Certain features such as faceting may make additional network requests for refinements, etc.