In this chapter, we'll learn about the state models provided by Helix, and how to create your own custom state model.
Helix comes with three default state models that are commonly used. A cluster can have multiple state models. Every resource that is added should be configured to use a state model that governs its ideal state.
In addition to the state machine configuration, one can specify the constraints of states and transitions.
For example, one can say:
MASTER:1
    At most 1 replica can be in the MASTER state at any time.
OFFLINE-SLAVE:5
    At most 5 OFFLINE-SLAVE transitions can happen concurrently in the system.
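Helix also supports two dynamic upper bounds for the number of replicas in each state:
N: the number of replicas in the state is at most the number of live participants in the cluster
R: the number of replicas in the state is at most the specified replica count for the partition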
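As a rough illustration of how a dynamic bound turns into a concrete limit, the sketch below resolves a configured bound string to a number. The class and method names (`UpperBoundResolver`, `resolve`) are hypothetical and not part of the Helix API. It assumes that "N" stands for the number of live participants and "R" for the configured replica count of the partition, and it treats any other value as a fixed count.

```java
final class UpperBoundResolver {
    /**
     * Resolves a configured upper bound to a concrete replica limit.
     * Assumption: "N" means the live participant count and "R" means the
     * replica count of the partition; any other value is a fixed number.
     */
    static int resolve(String bound, int liveParticipants, int replicaCount) {
        switch (bound) {
            case "N":
                return liveParticipants;
            case "R":
                return replicaCount;
            default:
                return Integer.parseInt(bound);
        }
    }
}
```

With 4 live participants and a replica count of 3, a bound of "N" would resolve to 4, "R" to 3, and "1" to 1.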
Helix uses a greedy approach to satisfy the state constraints. For example, if the state machine configuration calls for 1 MASTER and 2 SLAVEs but only 1 node is active, Helix must promote that node to MASTER. This behavior is achieved by providing the state priority list as [MASTER, SLAVE].
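The greedy idea can be sketched in a few lines of plain Java. This is not the Helix rebalancer, only a simplified model of it: `GreedyStateAssigner` and `assign` are hypothetical names, and the sketch assumes that nodes are considered in the order given and that each state has a fixed upper bound.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GreedyStateAssigner {
    /**
     * Walks the states from highest to lowest priority and hands each state
     * to the next unassigned live node until the state's upper bound is hit.
     * Nodes left over once every bound is reached get no state in the result.
     */
    static Map<String, String> assign(List<String> liveNodes,
                                      List<String> statesByPriority,
                                      Map<String, Integer> upperBounds) {
        Map<String, String> assignment = new LinkedHashMap<>();
        int next = 0; // index of the next node that has no state yet
        for (String state : statesByPriority) {
            // States without a configured bound are treated as unbounded.
            int limit = upperBounds.getOrDefault(state, Integer.MAX_VALUE);
            for (int used = 0; used < limit && next < liveNodes.size(); used++) {
                assignment.put(liveNodes.get(next++), state);
            }
        }
        return assignment;
    }
}
```

With the priority list [MASTER, SLAVE], bounds of 1 MASTER and 2 SLAVEs, and a single live node, the only node ends up as MASTER because the higher-priority state is filled first.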
Helix tries to fire as many transitions as possible in parallel to reach the stable state without violating constraints. By default, Helix simply sorts the transitions alphabetically and fires as many as it can without violating the constraints. You can control this by overriding the priority order.
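A minimal sketch of this selection step, under simplifying assumptions: transitions are identified by names such as "OFFLINE-SLAVE", a lower priority value fires first, transitions without a priority sort last and fall back to alphabetical order, and the only constraint modelled is a per-transition concurrency cap. `TransitionScheduler` and `selectToFire` are hypothetical names, not Helix classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TransitionScheduler {
    /**
     * Picks the transitions to fire in one round.
     * pending: one entry per replica that needs the named transition.
     * priorities: lower value fires first; missing entries sort last.
     * maxConcurrent: cap per transition name; missing entries are uncapped.
     */
    static List<String> selectToFire(List<String> pending,
                                     Map<String, Integer> priorities,
                                     Map<String, Integer> maxConcurrent) {
        List<String> ordered = new ArrayList<>(pending);
        // Sort by configured priority, then alphabetically as the default order.
        ordered.sort(Comparator
            .comparingInt((String t) -> priorities.getOrDefault(t, Integer.MAX_VALUE))
            .thenComparing(Comparator.naturalOrder()));

        Map<String, Integer> fired = new HashMap<>();
        List<String> selected = new ArrayList<>();
        for (String transition : ordered) {
            int cap = maxConcurrent.getOrDefault(transition, Integer.MAX_VALUE);
            int count = fired.getOrDefault(transition, 0);
            if (count < cap) { // skip transitions that would exceed their cap
                selected.add(transition);
                fired.put(transition, count + 1);
            }
        }
        return selected;
    }
}
```

Passing an empty priority map reproduces the alphabetical default, while giving SLAVE-MASTER a lower value than OFFLINE-SLAVE moves promotions to the front of the round.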
There are a few Helix-defined states that are important to be aware of.
The DROPPED state is used to signify a replica that was served by a given participant, but is no longer served. This allows Helix and its participants to effectively clean up. There are two requirements that every new state model should follow with respect to the DROPPED state:
The DROPPED state must be defined
There must be a path to DROPPED for every state in the model
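One way to check that every state has a path to DROPPED before registering a custom model is a reverse reachability search over the transitions. The sketch below is a standalone illustration and not how Helix validates a model internally; `DroppedReachability` and `statesWithoutPathToDropped` are hypothetical names, and transitions are assumed to be given as {from, to} pairs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DroppedReachability {
    /**
     * Returns the states that have no path to DROPPED, in sorted order.
     * transitions: each entry is a {from, to} pair.
     */
    static Set<String> statesWithoutPathToDropped(Set<String> states,
                                                  List<String[]> transitions) {
        // Reverse the edges so one search from DROPPED finds every state
        // that can eventually reach it.
        Map<String, List<String>> reverse = new HashMap<>();
        for (String[] t : transitions) {
            reverse.computeIfAbsent(t[1], k -> new ArrayList<>()).add(t[0]);
        }
        Set<String> reachable = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        reachable.add("DROPPED");
        queue.add("DROPPED");
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String prev : reverse.getOrDefault(current, Collections.<String>emptyList())) {
                if (reachable.add(prev)) { // true only the first time a state is seen
                    queue.add(prev);
                }
            }
        }
        Set<String> missing = new TreeSet<>(states);
        missing.removeAll(reachable);
        return missing;
    }
}
```

An empty result means every state can reach DROPPED. Removing the only transition into DROPPED leaves every other state in the result.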
The ERROR state is used whenever the participant serving a partition encounters an error and cannot continue to serve the partition. HelixAdmin has "reset" functionality that allows participants to recover from the ERROR state.
Below is a complete definition of a Master-Slave state model. Notice the fields marked REQUIRED; these are essential for any state model definition.
    StateModelDefinition stateModel = new StateModelDefinition.Builder("MasterSlave")
        // OFFLINE is the state that the system starts in (initial state is REQUIRED)
        .initialState("OFFLINE")

        // Lowest number here indicates highest priority, no value indicates lowest priority
        .addState("MASTER", 1)
        .addState("SLAVE", 2)
        .addState("OFFLINE")

        // Note the special inclusion of the DROPPED state (REQUIRED)
        .addState(HelixDefinedState.DROPPED.toString())

        // No more than one master allowed
        .upperBound("MASTER", 1)

        // R indicates an upper bound of number of replicas for each partition
        .dynamicUpperBound("SLAVE", "R")

        // Add some high-priority transitions
        .addTransition("SLAVE", "MASTER", 1)
        .addTransition("OFFLINE", "SLAVE", 2)

        // Using the same priority value indicates that these transitions can fire in any order
        .addTransition("MASTER", "SLAVE", 3)
        .addTransition("SLAVE", "OFFLINE", 3)

        // Not specifying a value defaults to lowest priority
        // Notice the inclusion of the OFFLINE to DROPPED transition
        // Since every state has a path to OFFLINE, they each now have a path to DROPPED (REQUIRED)
        .addTransition("OFFLINE", HelixDefinedState.DROPPED.toString())

        // Create the StateModelDefinition instance
        .build();

    // Use the isValid() function to make sure the StateModelDefinition will work without issues
    Assert.assertTrue(stateModel.isValid());