solr/solr-ref-guide/src/migrate-to-policy-rule.adoc - lucene-solr - Git at Google

 = Migrating Rule-Based Replica Rules to Autoscaling Policies
 // Licensed to the Apache Software Foundation (ASF) under one
 // or more contributor license agreements.  See the NOTICE file
 // distributed with this work for additional information
 // regarding copyright ownership.  The ASF licenses this file
 // to you under the Apache License, Version 2.0 (the
 // "License"); you may not use this file except in compliance
 // with the License.  You may obtain a copy of the License at
 //
 //   http://www.apache.org/licenses/LICENSE-2.0
 //
 // Unless required by applicable law or agreed to in writing,
 // software distributed under the License is distributed on an
 // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 // KIND, either express or implied.  See the License for the
 // specific language governing permissions and limitations
 // under the License.

 Creating rules for replica placement in a Solr cluster is now done with the <<solrcloud-autoscaling.adoc#,autoscaling framework>>.

 This document outlines how to migrate from the legacy <<rule-based-replica-placement.adoc#,rule-based replica placement>> to an <<solrcloud-autoscaling-policy-preferences.adoc#,autoscaling policy>>.

 The autoscaling framework is designed to fully automate your cluster management.
 However, if you do not want actions taken on your cluster in an automatic way, you can still use the framework to set rules and preferences.
 With a set of rules and preferences in place, instead of taking action directly the system will suggest actions you can take manually.

 The section <<solrcloud-autoscaling-policy-preferences.adoc#cluster-preferences-specification,Cluster Preferences Specification>> describes the capabilities of an autoscaling policy in detail.
 Below we'll walk through a few examples to show how you would express the your legacy rules in the autoscaling syntax.
 Every rule in the legacy rule-based replica framework can be expressed in the new syntax.

 == How Rules are Defined

 One key difference between the frameworks is the way rules are defined.

 With the rule-based replica placement framework, rules are defined with the Collections API at the time of collection creation.

 The autoscaling framework, however, has its own <<solrcloud-autoscaling-api.adoc#,API>>.
 Policies can be configured for the entire cluster or for individual collections depending on your needs.

 The following is the legacy syntax for a rule that limits the cluster to one replica for each shard in any Solr node:

 [source,text]
 ----
 replica:<2,node:*,shard:**
 ----

 The equivalent rule in the autoscaling policy is:

 [source,json]
 ----
 {"replica":"<2", "node":"#ANY", "shard":"#EACH"}
 ----

 == Differences in Rule Syntaxes

 Many elements of defining rules are similar in both frameworks, but some elements are different.

 [[rule-operators1]]
 === Rule Operators

 All of the following operators can be directly used in the new policy syntax and they mean the same in both frameworks.

 * *equals (no operator required)*: `tag:x` means the value for a tag must be equal to `'x'`.
 * *greater than (>)*: `tag:>x` means the tag value must be greater than `'x'`. In this case, `'x'` must be a number.
 * *less than (<)*: `tag:<x` means the tag value must be less than `‘x’`. In this case also, `'x'` must be a number.
 * *not equal (!)*: `tag:!x` means tag value MUST NOT be equal to `‘x’`. The equals check is performed on a String value.

 [[fuzzy-operator1]]
 ==== Fuzzy Operator (~)

 There is no `~` operator in the autoscaling policy syntax.
 Instead, it uses the `strict` parameter, which can be `true` or `false`.
 To replace the `~` operator, use the attribute `"strict":false` instead.

 For example:

 .Rule-based replica placement framework:
 [source,text]
 ----
 replica:<2~,node:*,shard:**
 ----

 .Autoscaling framework:
 [source,json]
 ----
 {"replica":"<2", "node":"#ANY", "shard":"#EACH", "strict": false}
 ----

 [[tag-names1]]
 === Attributes

 Attributes were known as "tags" in the rule-based replica placement framework.
 In the autoscaling framework, they are attributes that are used for node selection or to set global cluster-wide rules.

 The available attributes in the autoscaling framework are similar to the tags that were available with rule-based replica placement. Attributes with the same name mean the same in both frameworks.

 * *cores*: Number of cores in the node
 * *freedisk*: Disk space available in the node
 * *host*: host name of the node
 * *port*: port of the node
 * *node*: node name
 * *role*: The role of the node. The only supported role is 'overseer'
 * *ip_1, ip_2, ip_3, ip_4*: These are ip fragments for each node. For example, in a host with ip `192.168.1.2`, `ip_1 = 2`, `ip_2 =1`, `ip_3 = 168` and` ip_4 = 192`
 * *sysprop.\{PROPERTY_NAME}*: These are values available from system properties. `sysprop.key` means a value that is passed to the node as `-Dkey=keyValue` during the node startup. It is possible to use rules like `sysprop.key:expectedVal,shard:*`

 [[snitches1]]
 === Snitches

 There is no equivalent for a snitch in the autoscaling policy framework.

 == Porting Existing Replica Placement Rules

 There is no automatic way to move from using rule-based replica placement rules to an autoscaling policy.
 Instead you will need to remove your replica rules from each collection and institute a policy using the <<solrcloud-autoscaling-api.adoc#,autoscaling API>>.

 The following examples are intended to help you translate your existing rules into new rules that fit the autoscaling framework.

 *Keep less than 2 replicas (at most 1 replica) of this collection on any node*

 For this rule, we define the `replica` condition with operators for "less than 2", and use a pre-defined tag named `node` to define nodes with any name.

 .Rule-based replica placement framework:
 [source,text]
 ----
 replica:<2,node:*
 ----

 .Autoscaling framework:
 [source,json]
 ----
 {"replica":"<2","node":"#ANY"}
 ----

 *For a given shard, keep less than 2 replicas on any node*

 For this rule, we use the `shard` condition to define any shard, the `replica` condition with operators for "less than 2", and finally a pre-defined tag named `node` to define nodes with any name.

 .Rule-based replica placement framework:
 [source,text]
 ----
 shard:*,replica:<2,node:*
 ----

 .Autoscaling framework:
 [source,json]
 ----
 {"replica":"<2","shard":"#EACH", "node":"#ANY"}
 ----

 *Assign all replicas in shard1 to rack 730*

 This rule limits the `shard` condition to 'shard1', but any number of replicas. We're also referencing a custom tag named `rack`.

 .Rule-based replica placement framework:
 [source,text]
 ----
 shard:shard1,replica:*,rack:730
 ----

 .Autoscaling framework:
 [source,json]
 ----
 {"replica":"#ALL", "shard":"shard1", "sysprop.rack":"730"}
 ----

 In the rule-based replica placement framework, we needed to configure a custom Snitch which provides values for the tag `rack`.

 With the autoscaling framework, however, we need to start all nodes with a system property to define the rack values. For example, `bin/solr start -c -Drack=<rack-number>`.

 *Create replicas in nodes with less than 5 cores only*

 This rule uses the `replica` condition to define any number of replicas, but adds a pre-defined tag named `core` and uses operators for "less than 5".

 .Rule-based replica placement framework:
 [source,text]
 ----
 cores:<5
 ----

 .Autoscaling framework:
 [source,json]
 ----
 {"cores":"<5", "node":"#ANY"}
 ----

 *Do not create any replicas in host 192.45.67.3*

 .legacy syntax:
 [source,text]
 ----
 host:!192.45.67.3
 ----

 .autoscaling framework:
 [source,json]
 ----
 {"replica": 0, "host":"192.45.67.3"}
 ----
	= Migrating Rule-Based Replica Rules to Autoscaling Policies
	// Licensed to the Apache Software Foundation (ASF) under one
	// or more contributor license agreements. See the NOTICE file
	// distributed with this work for additional information
	// regarding copyright ownership. The ASF licenses this file
	// to you under the Apache License, Version 2.0 (the
	// "License"); you may not use this file except in compliance
	// with the License. You may obtain a copy of the License at
	//
	// http://www.apache.org/licenses/LICENSE-2.0
	//
	// Unless required by applicable law or agreed to in writing,
	// software distributed under the License is distributed on an
	// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	// KIND, either express or implied. See the License for the
	// specific language governing permissions and limitations
	// under the License.

	Creating rules for replica placement in a Solr cluster is now done with the <<solrcloud-autoscaling.adoc#,autoscaling framework>>.

	This document outlines how to migrate from the legacy <<rule-based-replica-placement.adoc#,rule-based replica placement>> to an <<solrcloud-autoscaling-policy-preferences.adoc#,autoscaling policy>>.

	The autoscaling framework is designed to fully automate your cluster management.
	However, if you do not want actions taken on your cluster in an automatic way, you can still use the framework to set rules and preferences.
	With a set of rules and preferences in place, instead of taking action directly the system will suggest actions you can take manually.

	The section <<solrcloud-autoscaling-policy-preferences.adoc#cluster-preferences-specification,Cluster Preferences Specification>> describes the capabilities of an autoscaling policy in detail.
	Below we'll walk through a few examples to show how you would express the your legacy rules in the autoscaling syntax.
	Every rule in the legacy rule-based replica framework can be expressed in the new syntax.

	== How Rules are Defined

	One key difference between the frameworks is the way rules are defined.

	With the rule-based replica placement framework, rules are defined with the Collections API at the time of collection creation.

	The autoscaling framework, however, has its own <<solrcloud-autoscaling-api.adoc#,API>>.
	Policies can be configured for the entire cluster or for individual collections depending on your needs.

	The following is the legacy syntax for a rule that limits the cluster to one replica for each shard in any Solr node:

	[source,text]
	----
	replica:<2,node:,shard:*
	----

	The equivalent rule in the autoscaling policy is:

	[source,json]
	----
	{"replica":"<2", "node":"#ANY", "shard":"#EACH"}
	----

	== Differences in Rule Syntaxes

	Many elements of defining rules are similar in both frameworks, but some elements are different.

	[[rule-operators1]]
	=== Rule Operators

	All of the following operators can be directly used in the new policy syntax and they mean the same in both frameworks.

	* equals (no operator required): `tag:x` means the value for a tag must be equal to `'x'`.
	* greater than (>): `tag:>x` means the tag value must be greater than `'x'`. In this case, `'x'` must be a number.
	* less than (<): `tag:<x` means the tag value must be less than `‘x’`. In this case also, `'x'` must be a number.
	* not equal (!): `tag:!x` means tag value MUST NOT be equal to `‘x’`. The equals check is performed on a String value.

	[[fuzzy-operator1]]
	==== Fuzzy Operator (~)

	There is no `~` operator in the autoscaling policy syntax.
	Instead, it uses the `strict` parameter, which can be `true` or `false`.
	To replace the `~` operator, use the attribute `"strict":false` instead.

	For example:

	.Rule-based replica placement framework:
	[source,text]
	----
	replica:<2~,node:,shard:*
	----

	.Autoscaling framework:
	[source,json]
	----
	{"replica":"<2", "node":"#ANY", "shard":"#EACH", "strict": false}
	----

	[[tag-names1]]
	=== Attributes

	Attributes were known as "tags" in the rule-based replica placement framework.
	In the autoscaling framework, they are attributes that are used for node selection or to set global cluster-wide rules.

	The available attributes in the autoscaling framework are similar to the tags that were available with rule-based replica placement. Attributes with the same name mean the same in both frameworks.

	* cores: Number of cores in the node
	* freedisk: Disk space available in the node
	* host: host name of the node
	* port: port of the node
	* node: node name
	* role: The role of the node. The only supported role is 'overseer'
	* ip_1, ip_2, ip_3, ip_4: These are ip fragments for each node. For example, in a host with ip `192.168.1.2`, `ip_1 = 2`, `ip_2 =1`, `ip_3 = 168` and` ip_4 = 192`
	* sysprop.\{PROPERTY_NAME}: These are values available from system properties. `sysprop.key` means a value that is passed to the node as `-Dkey=keyValue` during the node startup. It is possible to use rules like `sysprop.key:expectedVal,shard:*`

	[[snitches1]]
	=== Snitches

	There is no equivalent for a snitch in the autoscaling policy framework.

	== Porting Existing Replica Placement Rules

	There is no automatic way to move from using rule-based replica placement rules to an autoscaling policy.
	Instead you will need to remove your replica rules from each collection and institute a policy using the <<solrcloud-autoscaling-api.adoc#,autoscaling API>>.

	The following examples are intended to help you translate your existing rules into new rules that fit the autoscaling framework.

	Keep less than 2 replicas (at most 1 replica) of this collection on any node

	For this rule, we define the `replica` condition with operators for "less than 2", and use a pre-defined tag named `node` to define nodes with any name.

	.Rule-based replica placement framework:
	[source,text]
	----
	replica:<2,node:*
	----

	.Autoscaling framework:
	[source,json]
	----
	{"replica":"<2","node":"#ANY"}
	----

	For a given shard, keep less than 2 replicas on any node

	For this rule, we use the `shard` condition to define any shard, the `replica` condition with operators for "less than 2", and finally a pre-defined tag named `node` to define nodes with any name.

	.Rule-based replica placement framework:
	[source,text]
	----
	shard:,replica:<2,node:
	----

	.Autoscaling framework:
	[source,json]
	----
	{"replica":"<2","shard":"#EACH", "node":"#ANY"}
	----

	Assign all replicas in shard1 to rack 730

	This rule limits the `shard` condition to 'shard1', but any number of replicas. We're also referencing a custom tag named `rack`.

	.Rule-based replica placement framework:
	[source,text]
	----
	shard:shard1,replica:*,rack:730
	----

	.Autoscaling framework:
	[source,json]
	----
	{"replica":"#ALL", "shard":"shard1", "sysprop.rack":"730"}
	----

	In the rule-based replica placement framework, we needed to configure a custom Snitch which provides values for the tag `rack`.

	With the autoscaling framework, however, we need to start all nodes with a system property to define the rack values. For example, `bin/solr start -c -Drack=<rack-number>`.

	Create replicas in nodes with less than 5 cores only

	This rule uses the `replica` condition to define any number of replicas, but adds a pre-defined tag named `core` and uses operators for "less than 5".

	.Rule-based replica placement framework:
	[source,text]
	----
	cores:<5
	----

	.Autoscaling framework:
	[source,json]
	----
	{"cores":"<5", "node":"#ANY"}
	----

	Do not create any replicas in host 192.45.67.3

	.legacy syntax:
	[source,text]
	----
	host:!192.45.67.3
	----

	.autoscaling framework:
	[source,json]
	----
	{"replica": 0, "host":"192.45.67.3"}
	----