= Replica Placement Plugins
:toc: macro
:toclevels: 5
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
== Introduction
When a user wants to create a new collection or add a replica to an existing collection,
Solr needs to first determine on which node to put the replica so that cluster resources
are allocated in a well-balanced manner (according to some criteria).
A replica placement plugin is the configurable component that determines these placements.
It can also enforce additional constraints on operations such as collection or replica removal,
for example when the plugin requires some collections to always be co-located on the same nodes,
or to always be present when some other collection is present.
(In Solr 8.x this functionality was provided either by per-collection rules or by the
autoscaling framework.)
=== Plugin Configuration
Replica placement plugin configurations are maintained using the `/cluster/plugin` API.
There can be only one cluster-wide plugin configuration at a time, and it uses a pre-defined
plugin name: `.placement-plugin`.
There are several placement plugins included in the Solr distribution. Additional placement
plugins can be added; they must implement the `PlacementPluginFactory` interface. The
configuration entry may also contain a `config` element if the plugin implements the
`ConfigurablePlugin` interface.
=== Legacy replica placement
By default Solr 9.0 uses a legacy method for replica placements, which doesn't use a placement
plugin at all. This method is used whenever the placement plugin configuration is missing or
invalid.
Legacy placement simply assigns new replicas to live nodes in a round-robin fashion: first it
prepares a list of live nodes sorted by the number of existing replicas of the collection, fewest first.
Then for each shard in the request it adds the replicas to consecutive nodes in this order,
wrapping around to the first node if the number of replicas is larger than the number of nodes.
This placement strategy doesn't guarantee that at most one replica of a shard is placed on a
given node. Also, the round-robin assignment only roughly approximates an even spread of replicas
across the nodes.
=== Placement plugins included in Solr
The following placement plugins are available out-of-the-box in Solr 9.0 (however, as
described above, the default configuration doesn't use any of them; instead it uses the legacy
replica placement method).
In order to use a plugin, its configuration must be added using the `/cluster/plugin` API.
For example, in order to use the `AffinityPlacementFactory` plugin the following command
should be executed:
[source,bash]
----
curl -X POST -H 'Content-type: application/json' -d '{
  "add": {
    "name": ".placement-plugin",
    "class": "org.apache.solr.cluster.placement.plugins.AffinityPlacementFactory",
    "config": {
      "minimalFreeDiskGB": 20,
      "prioritizedFreeDiskGB": 100,
      "withCollection": {
        "A_primary": "A_secondary",
        "B_primary": "B_secondary"
      },
      "nodeType": {
        "collection_A": "searchNode,indexNode",
        "collection_B": "analyticsNode"
      }
    }
  }}' \
  http://localhost:8983/api/cluster/plugin
----
Similarly, the configuration can be updated or removed.
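For example, the following command removes the placement plugin configuration, which causes Solr
to fall back to the legacy replica placement method (a minimal sketch, assuming the standard
`remove` operation of the `/cluster/plugin` API; the API also provides an `update` operation that
accepts the same payload as `add` and replaces an existing configuration):
[source,bash]
----
curl -X POST -H 'Content-type: application/json' -d '{
  "remove": ".placement-plugin"}' \
  http://localhost:8983/api/cluster/plugin
----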
NOTE: Placement plugin configuration MUST use the predefined name `.placement-plugin`.
At most one placement configuration can be defined at a time.
==== `RandomPlacementFactory`
This plugin creates random placements, while preventing two replicas of the same shard from being
placed on the same node. If there are too few nodes to satisfy these constraints an exception is
thrown, and the request is rejected.
Due to its simplistic algorithm this plugin should only be used in simple deployments.
This plugin doesn't require any configuration.
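For illustration, enabling this plugin therefore only requires registering it by class name
(a minimal sketch, assuming the class lives in the same `org.apache.solr.cluster.placement.plugins`
package as `AffinityPlacementFactory` and that no `config` element is needed):
[source,bash]
----
curl -X POST -H 'Content-type: application/json' -d '{
  "add": {
    "name": ".placement-plugin",
    "class": "org.apache.solr.cluster.placement.plugins.RandomPlacementFactory"
  }}' \
  http://localhost:8983/api/cluster/plugin
----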
==== `MinimizeCoresPlacementFactory`
This plugin creates placements that minimize the number of cores per node across all live nodes,
while not placing two replicas of the same shard on the same node. If there are too few nodes
to satisfy these constraints an exception is thrown, and the request is rejected.
Due to its simplistic algorithm this plugin should only be used in simple deployments.
This plugin doesn't require any configuration.
==== `AffinityPlacementFactory`
This plugin implements a replica placement algorithm that roughly replicates the Solr 8.x autoscaling
configuration defined https://github.com/lucidworks/fusion-cloud-native/blob/master/policy.json#L16[here].
The autoscaling specification in the configuration linked above aimed to do the following:
* spread replicas of each shard as evenly as possible across multiple availability zones (given by a system property),
* assign replicas to specific kinds of nodes based on replica type (another system property),
* avoid having more than one replica of a shard on the same node, and
* only after the above constraints are satisfied:
** minimize the number of cores per node, or
** minimize disk usage.
It also supports additional per-collection constraints:
* The `withCollection` constraint enforces the placement of co-located collections' replicas on the
same nodes, and prevents deletions of collections and replicas that would break this constraint.
* The `nodeType` constraint limits the nodes eligible for placement to only those that match one or
more of the specified node types.
See below for more details on these constraints.
The overall strategy of this plugin is as follows:
* The set of nodes in the cluster is obtained, and if the `withCollection` mapping is present
and applicable to the current collection then this candidate set is filtered so that only
nodes eligible according to this constraint remain.
* The resulting node set is transformed into 3 independent (possibly overlapping) sets of nodes accepting each of the three replica types (NRT, TLOG, and PULL).
* For each shard for which replicas must be placed, and then for each replica type to place (starting with NRT, then TLOG, then PULL):
** The set of candidate nodes corresponding to the replica type is taken, and nodes that already have a replica (of any type) of that shard are removed from it.
** If there are not enough nodes, an error is thrown (this is checked further down during processing).
** The number of (already existing) replicas of the current type in each availability zone is collected.
** The set of available nodes is separated into as many subsets (some possibly empty) as there are availability zones defined for the candidate nodes.
** In each AZ node subset, the nodes are sorted by increasing total core count.
** Iterate over the number of replicas to place (for the current replica type and the current shard):
*** Based on the number of replicas per AZ collected previously, pick the non-empty set of nodes having the lowest number of replicas. Then pick the first node in that set. That's the node the replica is placed on. Remove the node from the set of available nodes for the given AZ and increase the count of replicas placed in that AZ.
** During this process, the total number of cores on each node is tracked so that placement decisions take one another into account and not all shards put their replicas on the same nodes (although they might, if those are the least loaded nodes).
NOTE: At the moment the names of the availability zone property and the replica type property
are not configurable, and are set to `availability_zone` and `replica_type`, respectively.
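For example, a node could be labeled at startup by setting these system properties (a sketch only;
the property values shown are illustrative, and any other mechanism that sets JVM system properties,
such as `SOLR_OPTS` in `solr.in.sh`, works equally well):
[source,bash]
----
# Start a SolrCloud node located in a specific availability zone
# that should only host NRT and TLOG replicas.
bin/solr start -c -Davailability_zone=us-east-1a -Dreplica_type=NRT,TLOG
----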
===== `withCollection` constraint
This plugin supports enforcing an additional constraint named `withCollection`, which causes
replicas of two paired collections to be placed on the same nodes.
Users can define the collection pairs in the plugin configuration, in the `config/withCollection`
element, which is a JSON map where keys are the primary collection names and values are the
secondary collection names (currently only a 1:1 mapping is supported; however, multiple primary
collections may use the same secondary collection, which effectively relaxes this to an N:1 mapping).
Unlike previous versions of Solr, this plugin does NOT automatically place replicas of the
secondary collection. Those replicas are assumed to be already in place, and it's the
responsibility of the user to place them on the right nodes beforehand (most likely simply by
using this plugin to create the secondary collection first, with a replication factor large
enough to ensure that the target node set is populated with secondary replicas).
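For instance, to prepare for co-locating `A_primary` with `A_secondary`, the secondary collection
could be created first using the regular Collections API (a sketch; the collection name, shard
count, and replication factor are only illustrative and should be chosen so that secondary replicas
end up on all nodes intended to host the primary replicas):
[source,bash]
----
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=A_secondary&numShards=1&replicationFactor=4"
----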
When a request to compute placements is processed for a primary collection that has a
key in the `withCollection` map, the set of candidate nodes is first filtered to eliminate nodes
that don't contain replicas of the secondary collection. Please note that this may
result in an empty set, and consequently an exception; in this case a sufficient number of
secondary replicas needs to be created first.
The plugin preserves this co-location by rejecting delete operations on secondary collections
(or their replicas) if they are still in use on the nodes where primary replicas are located;
such requests are rejected with errors. In order to delete a secondary collection
(or its replicas) from these nodes, the replicas of the primary collection must first be
removed from the co-located nodes, or the configuration must be changed to remove the
co-location mapping for the primary collection.
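For example, to allow deleting `A_secondary`, the co-location mapping for `A_primary` could be
dropped by replacing the plugin configuration (a sketch, assuming the `update` operation of the
`/cluster/plugin` API accepts the same payload as `add`; only the `B_primary` mapping is kept):
[source,bash]
----
curl -X POST -H 'Content-type: application/json' -d '{
  "update": {
    "name": ".placement-plugin",
    "class": "org.apache.solr.cluster.placement.plugins.AffinityPlacementFactory",
    "config": {
      "withCollection": {
        "B_primary": "B_secondary"
      }
    }
  }}' \
  http://localhost:8983/api/cluster/plugin
----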
===== `nodeType` constraint
This plugin also supports the `nodeType` constraint, which limits the set of nodes eligible for a
collection's replicas to those labeled with at least one of the required node types. Nodes are
labeled using the `node_type` system property, whose value is an arbitrary comma-separated list of
labels. The required node types for each collection are defined in the `nodeType` configuration
parameter described below.
===== Configuration
This plugin supports the following configuration parameters:
`minimalFreeDiskGB`::
(optional, integer) if a node has strictly less GB of free disk than this value, the node is
excluded from assignment decisions. Set to 0 or less to disable. Default value is 10.
`prioritizedFreeDiskGB`::
(optional, integer) when possible, replica allocation assigns replicas to nodes with at least this
number of GB of free disk space, regardless of the number of cores on those nodes, in preference
to nodes with less than this amount of free disk space. If no such nodes are available, replicas
can still be assigned to nodes with less than this amount of free space.
Default value is 100.
`withCollection`::
(optional, map) this property defines an additional constraint that primary collections (keys)
must be located on the same nodes as the secondary collections (values). The plugin will
assume that the secondary collection replicas are already in place and ignore candidate
nodes where they are not already present. Default value is none.
`nodeType`::
(optional, map) this property defines an additional constraint that collections (keys)
must be located only on the nodes that are labeled with one or more of the matching
"node type" labels (values in the map are comma-separated labels). Nodes are labeled using the
`node_type` system property with the value being an arbitrary comma-separated list of labels.
Correspondingly, the plugin configuration can specify that a particular collection must be placed
only on the nodes that match at least one of the (comma-separated) labels defined here.
=== Example configurations
This is a simple configuration that uses default values:
[source,bash]
----
curl -X POST -H 'Content-type: application/json' -d '{
  "add": {
    "name": ".placement-plugin",
    "class": "org.apache.solr.cluster.placement.plugins.AffinityPlacementFactory"
  }}' \
  http://localhost:8983/api/cluster/plugin
----
This configuration specifies the base parameters:
[source,bash]
----
curl -X POST -H 'Content-type: application/json' -d '{
  "add": {
    "name": ".placement-plugin",
    "class": "org.apache.solr.cluster.placement.plugins.AffinityPlacementFactory",
    "config": {
      "minimalFreeDiskGB": 20,
      "prioritizedFreeDiskGB": 100
    }
  }}' \
  http://localhost:8983/api/cluster/plugin
----
This configuration defines that collection `A_primary` must be co-located with
collection `Common_secondary`, and collection `B_primary` must also be co-located with
collection `Common_secondary`:
[source,bash]
----
curl -X POST -H 'Content-type: application/json' -d '{
  "add": {
    "name": ".placement-plugin",
    "class": "org.apache.solr.cluster.placement.plugins.AffinityPlacementFactory",
    "config": {
      "withCollection": {
        "A_primary": "Common_secondary",
        "B_primary": "Common_secondary"
      }
    }
  }}' \
  http://localhost:8983/api/cluster/plugin
----
This configuration defines that collection `collection_A` must be placed only on the nodes with
the `node_type` system property containing either `searchNode` or `indexNode` (for example, a node
may be labeled as `-Dnode_type=searchNode,indexNode,uiNode,zkNode`). Similarly, the
collection `collection_B` must be placed only on the nodes that contain the `analyticsNode` label:
[source,bash]
----
curl -X POST -H 'Content-type: application/json' -d '{
  "add": {
    "name": ".placement-plugin",
    "class": "org.apache.solr.cluster.placement.plugins.AffinityPlacementFactory",
    "config": {
      "nodeType": {
        "collection_A": "searchNode,indexNode",
        "collection_B": "analyticsNode"
      }
    }
  }}' \
  http://localhost:8983/api/cluster/plugin
----