Fix unexpected result when resuming a cluster from paused/maintenance mode. (#1698)

This PR fixes a race condition in which the resume event could be pushed to the event queue before the cluster status had been updated. In that case, when the event is processed, the controller might still consider the cluster to be in paused/maintenance mode. As a result, the resume event is discarded and the rebalance is not performed until the next event arrives.
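
The ordering problem described above can be illustrated with a small, single-threaded simulation. This is only a sketch under stated assumptions, not the code changed by this PR: `ResumeOrderingSketch`, `Controller`, `resumeRacy`, `resumeFixed`, and `processNext` are hypothetical names, and the "event thread" is simulated by calling `processNext` at the point where the interleaving would occur.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicBoolean;

public class ResumeOrderingSketch {
    enum EventType { RESUME }

    /** Hypothetical stand-in for a controller; not Helix's actual class. */
    static class Controller {
        private final AtomicBoolean paused = new AtomicBoolean(true);
        private final Queue<EventType> queue = new ArrayDeque<>();
        private int rebalanceCount = 0;

        /** Racy ordering: the event is enqueued before the status is updated. */
        void resumeRacy() {
            queue.add(EventType.RESUME);
            processNext();       // simulates the event being handled in the gap
            paused.set(false);   // status update arrives too late
        }

        /** Safe ordering: the status is updated first, then the event is enqueued. */
        void resumeFixed() {
            paused.set(false);
            queue.add(EventType.RESUME);
            processNext();
        }

        /** Handles one queued event; events seen while paused are discarded. */
        void processNext() {
            EventType event = queue.poll();
            if (event == null || paused.get()) {
                return;          // nothing to do, or event dropped because still paused
            }
            rebalanceCount++;    // stands in for running the rebalance pipeline
        }

        int rebalanceCount() {
            return rebalanceCount;
        }
    }

    public static void main(String[] args) {
        Controller racy = new Controller();
        racy.resumeRacy();
        System.out.println("racy rebalances: " + racy.rebalanceCount());

        Controller fixed = new Controller();
        fixed.resumeFixed();
        System.out.println("fixed rebalances: " + fixed.rebalanceCount());
    }
}
```

In the racy ordering the resume event is consumed while the controller still reports paused, so no rebalance runs; updating the status before enqueueing avoids the dropped event.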
1 file changed
tree: fdc708f2c57353fd9a778216b560cfef4ea1944e
  1. .github/
  2. helix-admin-webapp/
  3. helix-agent/
  4. helix-common/
  5. helix-core/
  6. helix-front/
  7. helix-lock/
  8. helix-rest/
  9. metadata-store-directory-common/
  10. metrics-common/
  11. recipes/
  12. scripts/
  13. website/
  14. zookeeper-api/
  15. .gitignore
  16. build
  17. bump-up.command
  18. deploySite.sh
  19. helix-style-intellij.xml
  20. helix-style.xml
  21. hpost-review.sh
  22. LICENSE
  23. NOTICE
  24. pom.xml
  25. README.md
README.md

Apache Helix

Helix is part of the Apache Software Foundation.

Project page: http://helix.apache.org/

Mailing list: http://helix.apache.org/mail-lists.html

Build

mvn clean install -Dmaven.test.skip.exec=true

WHAT IS HELIX

Helix is a generic cluster management framework used for automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes. Helix provides the following features:

  1. Automatic assignment of resource/partition to nodes
  2. Node failure detection and recovery
  3. Dynamic addition of resources
  4. Dynamic addition of nodes to the cluster
  5. Pluggable distributed state machine to manage the state of a resource via state transitions
  6. Automatic load balancing and throttling of transitions
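
To make the idea of managing a resource's state through state transitions (item 5 above) more concrete, here is a minimal sketch of a transition table for a single replica. It does not use the Helix API: `StateModelSketch`, its `State` enum, and `transitionTo` are hypothetical names, and the OFFLINE/SLAVE/MASTER states are assumed here purely for illustration.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class StateModelSketch {
    enum State { OFFLINE, SLAVE, MASTER }

    // Legal transitions for this illustrative model (an assumption, not Helix's definition).
    private static final Map<State, Set<State>> LEGAL = new EnumMap<>(State.class);
    static {
        LEGAL.put(State.OFFLINE, EnumSet.of(State.SLAVE));
        LEGAL.put(State.SLAVE, EnumSet.of(State.MASTER, State.OFFLINE));
        LEGAL.put(State.MASTER, EnumSet.of(State.SLAVE));
    }

    private State current = State.OFFLINE;

    State current() {
        return current;
    }

    /** Applies the transition if it is legal; returns false and keeps the state otherwise. */
    boolean transitionTo(State target) {
        if (!LEGAL.get(current).contains(target)) {
            return false;
        }
        current = target;
        return true;
    }

    public static void main(String[] args) {
        StateModelSketch replica = new StateModelSketch();
        System.out.println(replica.transitionTo(State.MASTER)); // rejected: must pass through SLAVE
        System.out.println(replica.transitionTo(State.SLAVE));
        System.out.println(replica.transitionTo(State.MASTER));
        System.out.println(replica.current());
    }
}
```

Keeping the legal transitions in a table, rather than hard-coding them in the replica logic, is what makes such a model pluggable: a different table yields a different state machine without changing the code that drives it.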