| = SolrCloud Autoscaling Fault Tolerance |
| // Licensed to the Apache Software Foundation (ASF) under one |
| // or more contributor license agreements. See the NOTICE file |
| // distributed with this work for additional information |
| // regarding copyright ownership. The ASF licenses this file |
| // to you under the Apache License, Version 2.0 (the |
| // "License"); you may not use this file except in compliance |
| // with the License. You may obtain a copy of the License at |
| // |
| // http://www.apache.org/licenses/LICENSE-2.0 |
| // |
| // Unless required by applicable law or agreed to in writing, |
| // software distributed under the License is distributed on an |
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| // KIND, either express or implied. See the License for the |
| // specific language governing permissions and limitations |
| // under the License. |
| |
| [WARNING] |
| .Autoscaling is deprecated |
| ==== |
| The autoscaling framework in its current form is deprecated and will be removed in Solr 9.0. |
| |
| Some features relating to replica placement will be replaced in 9.0, but other features are not likely to be replaced. |
| |
| There is no planned replacement for the functionality described on this page. |
| ==== |
| |
| The autoscaling framework uses a few strategies to ensure it can still trigger actions when the system changes unexpectedly. |
| |
| == Node Added or Lost Markers |
| Triggers execute on the node that runs the Overseer. If that node goes down, the `nodeLost` |
| event would be lost because no mechanism would remain to generate it. Similarly, if a node is |
| added before the Overseer leader change has completed, the `nodeAdded` event would not be |
| generated. |
| |
| For this reason Solr implements additional mechanisms to ensure that these events are generated |
| reliably. |
| |
| With standard SolrCloud behavior, when a node joins a cluster its presence is marked by the ephemeral ZooKeeper path `/live_nodes/<nodeName>`. With autoscaling, an ephemeral path is also created under `/autoscaling/nodeAdded/<nodeName>`. |
| When a new Overseer leader starts, it runs the `nodeAdded` trigger (if one is configured), |
| discovers this ZooKeeper path, removes it, and generates a `nodeAdded` event. |
| |
| When a node leaves the cluster, up to three of the remaining nodes try to create a persistent ZooKeeper path |
| `/autoscaling/nodeLost/<nodeName>`, and eventually one of them succeeds. When a new Overseer leader |
| starts, it runs the `nodeLost` trigger (if one is configured), discovers this ZooKeeper |
| path, removes it, and generates a `nodeLost` event. |
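
The marker mechanism above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not Solr's implementation: `FakeZk`, `node_joins`, `node_leaves`, and `new_overseer_scan` are hypothetical names, and a plain dictionary stands in for ZooKeeper. Only the paths come from the description above.

```python
class FakeZk:
    """Hypothetical in-memory stand-in for ZooKeeper.

    Maps each path to the session that owns it; None marks a persistent path.
    """

    def __init__(self):
        self.nodes = {}

    def create(self, path, session=None):
        # Creation fails if some other client already created the path.
        if path in self.nodes:
            return False
        self.nodes[path] = session
        return True

    def delete(self, path):
        self.nodes.pop(path, None)

    def children(self, parent):
        prefix = parent.rstrip("/") + "/"
        return sorted(p[len(prefix):] for p in self.nodes if p.startswith(prefix))

    def expire_session(self, session):
        # Ephemeral paths disappear together with the session that owns them.
        for path in [p for p, s in self.nodes.items() if s == session]:
            del self.nodes[path]


def node_joins(zk, node):
    # Both the live-node entry and the nodeAdded marker are ephemeral.
    zk.create(f"/live_nodes/{node}", session=node)
    zk.create(f"/autoscaling/nodeAdded/{node}", session=node)


def node_leaves(zk, node, survivors):
    zk.expire_session(node)
    # Up to three remaining nodes race to create the persistent marker;
    # only the first attempt succeeds.
    return [zk.create(f"/autoscaling/nodeLost/{node}") for _ in survivors[:3]]


def new_overseer_scan(zk):
    """What a newly started Overseer leader does with leftover markers."""
    events = []
    for kind in ("nodeAdded", "nodeLost"):
        for node in zk.children(f"/autoscaling/{kind}"):
            zk.delete(f"/autoscaling/{kind}/{node}")  # remove the marker
            events.append((kind, node))               # then generate the event
    return events
```

In this model, a marker left behind while no Overseer was running is turned into exactly one event by the next scan, and a second scan finds nothing left to report.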
| |
| == Trigger State Checkpointing |
| Triggers generate events based on their internal state. If the Overseer leader goes down while the trigger is |
| about to generate a new event, it's likely that the event would be lost because a new trigger instance |
| running on the new Overseer leader would start from a clean slate. |
| |
| For this reason, each time a trigger runs, its internal state is persisted to ZooKeeper, and |
| when the Overseer starts, that state is restored. |
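
As a rough sketch of the checkpointing idea, the example below persists a trigger's state after every run and restores it on construction. The class name `CheckpointedTrigger`, the `state/<name>` key, and the JSON layout are assumptions made for illustration, and a dictionary stands in for ZooKeeper. The real state format is not described on this page.

```python
import json


class CheckpointedTrigger:
    """Hypothetical trigger that remembers which nodes it last saw."""

    def __init__(self, name, store):
        self.name = name
        self.store = store        # dict standing in for ZooKeeper
        self.known_nodes = set()
        self.restore()            # pick up any state left by a previous Overseer

    def _key(self):
        # Illustrative key only; not the path Solr uses.
        return f"state/{self.name}"

    def restore(self):
        raw = self.store.get(self._key())
        if raw is not None:
            self.known_nodes = set(json.loads(raw)["known_nodes"])

    def save(self):
        payload = {"known_nodes": sorted(self.known_nodes)}
        self.store[self._key()] = json.dumps(payload)

    def run(self, live_nodes):
        """Return the nodes that disappeared since the previous run."""
        live = set(live_nodes)
        lost = sorted(self.known_nodes - live)
        self.known_nodes = live
        self.save()               # checkpoint after every execution
        return lost
```

A second instance built on the same store behaves as if it had been running all along, whereas an instance built on an empty store starts from a clean slate and cannot tell that a node went missing.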
| |
| == Trigger Event Queues |
| The autoscaling framework limits the rate at which events are processed using several mechanisms. |
| One is a locking mechanism that prevents concurrent |
| processing of events, and another is a single-threaded executor that runs trigger actions. |
| |
| As a result, processing an event may take significant time, during which the |
| Overseer may go down. To avoid losing events that were already generated but not yet fully |
| processed, events are queued before processing starts. |
| |
| A separate ZooKeeper queue is created for each trigger, and events produced by a trigger are put on its |
| queue. When a new Overseer leader starts, it first checks |
| these queues and processes the events accumulated there, and only then does it continue to run triggers |
| normally. Queued events that fail processing during this "replay" stage are discarded. |
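
The queue-and-replay behavior can be approximated as follows. This is a minimal sketch under stated assumptions, not the framework's code: `TriggerQueues` and `replay` are hypothetical names, in-memory deques stand in for the per-trigger ZooKeeper queues, and the order in which triggers are replayed (alphabetical here) is chosen only to make the example deterministic.

```python
from collections import deque


class TriggerQueues:
    """Hypothetical stand-in for the per-trigger ZooKeeper queues."""

    def __init__(self):
        self.queues = {}  # trigger name -> deque of pending events

    def offer(self, trigger, event):
        # Events are queued before any processing starts.
        self.queues.setdefault(trigger, deque()).append(event)

    def process_next(self, trigger, action):
        queue = self.queues.get(trigger)
        if not queue:
            return False
        action(queue[0])  # the event stays queued while the action runs
        queue.popleft()   # and is removed only after the action completes
        return True


def replay(queues, action):
    """Drain leftover events on a new Overseer before triggers run normally.

    Returns (processed, discarded). Events whose action fails are dropped.
    """
    processed, discarded = [], []
    for trigger in sorted(queues.queues):
        queue = queues.queues[trigger]
        while queue:
            event = queue.popleft()
            try:
                action(event)
                processed.append(event)
            except Exception:
                discarded.append(event)
    return processed, discarded
```

Because an event is removed from its queue only after its action completes, an Overseer failure in the middle of processing leaves the event in place for the next leader to replay.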