Merge WAGED rebalancer branch code to master. (#724)

* Define the WAGED rebalancer interfaces.

This is the initial check-in for the future development of the WAGED rebalancer.
All the components are placeholders. They will be implemented gradually.

* Adding the configuration items of the WAGED rebalancer. (#348)

* Adding the configuration items of the WAGED rebalancer.

Including: Instance Capacity Keys, Rebalance Preferences, Instance Capacity Details, and Partition Capacity (weight) Details.
Also add tests to cover the new configuration items.

* Implement the WAGED rebalancer cluster model (#362)

* Introduce the cluster model classes to support the WAGED rebalancer.

Implement the cluster model classes with the minimum necessary information to support rebalancing.
Additional fields/logic might be added later once the detailed rebalance logic is implemented.

Also add related tests.

* Change the rebalancer assignment record to be ResourceAssignment instead of IdealState. (#398)

ResourceAssignment fits the usage better, and no unnecessary information will be recorded or read during the rebalance calculation.

* Convert all the internal assignment state objects to be ResourceAssignment. (#399)

This is to avoid unnecessary information being recorded or read.

* Implement Cluster Model Provider. (#392)

* Implement Cluster Model Provider.

The model provider is called by the WAGED rebalancer to generate a Cluster Model based on the current cluster status.
The provider's major responsibility is to parse all the assignable replicas and identify which replicas need to be reassigned. Note that if the current best possible assignment is still valid, the rebalancer won't need to recalculate the partition assignment.

Also, add unit tests to verify the main logic.

* Add ChangeDetector interface and ResourceChangeDetector implementation (#388)

Add ChangeDetector interface and ResourceChangeDetector implementation

In order to react efficiently to changes happening in the cluster, a new component called ChangeDetector was added for the new WAGED rebalancer.

Changelist:
1. Add ChangeDetector interface
2. Implement ResourceChangeDetector
3. Add ResourceChangeCache, a wrapper for critical cluster metadata
4. Add an integration test, TestResourceChangeDetector

* Add cluster level default instance config. (#413)

This config will be applied to an instance when there is no (or an empty) capacity configuration in its InstanceConfig.
Also add unit tests.

* Redefine the hard/soft constraints (#422)

* Refactor the hard/soft constraint interfaces and add a central place to keep the soft constraint weights

* Refine the WAGED rebalancer related interfaces for integration (#431)

* Refine the WAGED rebalancer related interfaces and initially integrate with the BestPossibleStateCalcStage.

- Modify the BestPossibleStateCalcStage logic to plug in the WAGED rebalancer.
- Refine ClusterModel to integrate with the ClusterDataDetector implementation.
- Enable getting the change details for ClusterConfig in the change detector, which is required by the WAGED rebalancer.

* Resubmit the change: Refine the WAGED rebalancer related interfaces for integration (#431)

* Refine the WAGED rebalancer related interfaces and initially integrate with the BestPossibleStateCalcStage.

- Modify the BestPossibleStateCalcStage logic to plug in the WAGED rebalancer.
- Refine ClusterModel to integrate with the ClusterDataDetector implementation.
- Enable getting the change details for ClusterConfig in the change detector, which is required by the WAGED rebalancer.

* Bring back the interface class and algorithm placeholder class that were removed prematurely.

* Revert "Refine the WAGED rebalancer related interfaces for integration (#431)" (#437)

This reverts commit 08a2015c617ddd3c93525afc572081a7836f9476.

* Modify the expected change type from CONFIG to CLUSTER_CONFIG in the WAGED rebalancer. (#438)

CONFIG is for generic configuration items, which is too generic for the rebalancer.
Modify the check to use CLUSTER_CONFIG to avoid confusion.

* Add special treatment for ClusterConfig

This diff allows callers to iterate over the result of getChangeType() by changing determinePropertyMapByType so that it simply returns an empty map for ClusterConfig.

* Record the replica objects in the AssignableNode in addition to the partition name (#440)

The replica objects are required while the rebalance algorithm generates ResourceAssignments based on the AssignableNode instances.
Refine the methods of AssignableNode for better code style and readability.
Also, modify the related test cases to verify the state information and the new methods.

* Add BucketDataAccessor for large writes

For the new WAGED rebalancer, it's necessary to have a data accessor that allows writes of data exceeding 1MB. ZooKeeper's ZNode size is capped at 1MB, so the BucketDataAccessor interface and its ZkBucketDataAccessor implementation help us achieve this (sketched after the changelist).
Changelist:
1. Add BucketDataAccessor and ZkBucketDataAccessor
2. Add necessary serializers
3. Add an integration test against ZK
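
For illustration, a minimal sketch of the bucketing idea, assuming a hypothetical ZkWriter stand-in for the ZK client (not the actual ZkBucketDataAccessor internals):

    import java.util.Arrays;

    class BucketWriteSketch {
      static final int BUCKET_SIZE = 1024 * 1024; // ZooKeeper's 1MB ZNode cap

      // Split a serialized payload into <=1MB chunks, one chunk per child ZNode,
      // plus a metadata node recording the bucket count for the read path.
      static void bucketWrite(ZkWriter zk, String path, byte[] payload) {
        int numBuckets = (payload.length + BUCKET_SIZE - 1) / BUCKET_SIZE;
        for (int i = 0; i < numBuckets; i++) {
          int from = i * BUCKET_SIZE;
          int to = Math.min(from + BUCKET_SIZE, payload.length);
          zk.write(path + "/bucket_" + i, Arrays.copyOfRange(payload, from, to));
        }
        zk.write(path + "/metadata", Integer.toString(numBuckets).getBytes());
      }

      interface ZkWriter { // hypothetical stand-in for the underlying ZK write
        void write(String path, byte[] data);
      }
    }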

* Implement the basic constraint based algorithm (#381)

Implement the basic constraint-based algorithm: it is greedy; for each replica, it picks the node with the best score and assigns the replica to that node. It guarantees a locally optimal result, but not a globally optimal one.

The algorithm is driven by a given set of constraints (see the sketch after the list):

* HardConstraint: Approves or denies an assignment given its condition; no assignment can bypass any "hard constraint".
* SoftConstraint: Evaluates an assignment by points/rewards/scores; a higher score means a better assignment.
The goal is to satisfy all "hard constraints" while accumulating the most points (rewards) from the "soft constraints".
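
As a rough sketch of that greedy loop (the interfaces here are illustrative stand-ins, not the actual Helix HardConstraint/SoftConstraint classes):

    import java.util.Comparator;
    import java.util.List;
    import java.util.Optional;
    import java.util.function.BiPredicate;
    import java.util.function.ToDoubleBiFunction;

    class GreedyAssignSketch<N, R> {
      // For one replica: filter nodes by hard constraints, then pick the node
      // with the highest accumulated soft-constraint score.
      Optional<N> pickNode(R replica, List<N> nodes,
          List<BiPredicate<N, R>> hardConstraints,
          List<ToDoubleBiFunction<N, R>> softConstraints) {
        return nodes.stream()
            .filter(n -> hardConstraints.stream().allMatch(h -> h.test(n, replica)))
            .max(Comparator.comparingDouble(n -> softConstraints.stream()
                .mapToDouble(s -> s.applyAsDouble(n, replica)).sum()));
      }
    }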

* Validate the instance capacity/partition weight configuration while constructing the assignable instances (#451)

Compare the configured items with the required capacity keys defined in the ClusterConfig when building the assignable instances.
- According to the design, all the required capacity keys must appear in the instance capacity config.
- As for the partition weights, the corresponding weight item will be filled with the value 0 if the required capacity key is not specified in the resource config (see the sketch below).
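
A minimal sketch of that fill-with-zero rule (names are hypothetical):

    import java.util.List;
    import java.util.Map;

    class PartitionWeightSketch {
      // Every capacity key required by the ClusterConfig must be present in a
      // partition's weight map; keys missing from the ResourceConfig default to 0.
      static void fillMissingWeights(Map<String, Integer> partitionWeights,
          List<String> requiredCapacityKeys) {
        for (String key : requiredCapacityKeys) {
          partitionWeights.putIfAbsent(key, 0);
        }
      }
    }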

* Implement the WAGED rebalancer with the limited functionality. (#443)

The implemented rebalancer supports the basic rebalance logic. It does not contain the logic to support delayed rebalance or user-defined preference lists.

Add unit tests to cover the main workflow of the WAGED rebalancer.

* HardConstraints Implementation and unit tests (#433)

* Implement all of the basic Hard Constraints:
1. The partition count cannot exceed the instance's upper limit
2. Fault zone aware (no replicas of the same partition in the same zone)
3. Partition weights cannot exceed the instance's capacity
4. Cannot assign inactive partitions
5. Replicas of the same partition in different states cannot coexist on one instance
6. Deny assignment if the instance doesn't have the tag of the replica

* Implement AssignmentMetadataStore (#453)

Implement AssignmentMetadataStore

AssignmentMetadataStore is a component for the new WAGED Rebalancer. It provides APIs that allow the rebalancer to read and write the Baseline and Best Possible assignments using BucketDataAccessor.

Changelist:
1. Add AssignmentMetadataStore
2. Add an integration test: TestAssignmentMetadataStore

* Fix TestWagedRebalancer and add constructor in AssignmentMetadataStore

TestWagedRebalancer was failing because it was not using a proper HelixManager to instantiate a mock version of AssignmentMetadataStore. This diff refactors the constructors in AssignmentMetadataStore and fixes the failing test.

* Implement one of the soft constraints (#450)

Implement the Instance Partitions Count soft constraint.
It evaluates an instance's current partition count against the estimated max partition count.
Intuitively, it encourages the assignment if the instance's occupancy rate is below average
and discourages the assignment if the occupancy rate is above average.

The final normalized score will be within [0, 1].
The implementation depends on the cluster's current total partition count as the max score (see the sketch below).
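
A possible shape of that normalization, as a hedged sketch (not the exact Helix formula):

    class InstancePartitionsCountSketch {
      // Higher score when the instance holds fewer partitions than the estimated
      // max; the result is clamped into [0, 1].
      static double score(int currentPartitionCount, int estimatedMaxPartitionCount) {
        if (estimatedMaxPartitionCount <= 0) {
          return 1.0; // nothing assigned cluster-wide yet; any node is fine
        }
        double utilization = (double) currentPartitionCount / estimatedMaxPartitionCount;
        return Math.max(0.0, Math.min(1.0, 1.0 - utilization));
      }
    }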

* Add soft constraint: ResourceTopStateAntiAffinityConstraint (#465)

Add ResourceTopStateAntiAffinityConstraint

The more top-state partitions assigned to the instance, the lower the score, and vice versa.

* Implement MaxCapacityUsageInstanceConstraint soft constraint (#463)

The constraint evaluates the score by checking the maximum used capacity across all the capacity
keys.
The higher the maximum usage value for a capacity key, the lower the score will be, implying
that it is that much less desirable to assign anything to the given node.
It is a greedy approach since it evaluates only the most used capacity key (see the sketch below).
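
A hedged sketch of that greedy evaluation (the map layout is an assumption):

    import java.util.Map;

    class MaxCapacityUsageSketch {
      // Score by the single most-used capacity key: the higher the max usage,
      // the lower the score for placing anything else on this node.
      static double score(Map<String, Integer> usedCapacity,
          Map<String, Integer> totalCapacity) {
        double maxUsage = 0.0;
        for (Map.Entry<String, Integer> e : totalCapacity.entrySet()) {
          if (e.getValue() > 0) {
            double usage = (double) usedCapacity.getOrDefault(e.getKey(), 0) / e.getValue();
            maxUsage = Math.max(maxUsage, usage);
          }
        }
        return 1.0 - Math.min(1.0, maxUsage);
      }
    }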

* Add soft constraint: ResourcePartitionAntiAffinityConstraint (#464)

If the partition's resource overall has a light load on the instance, the score is higher than in the case where the resource is heavily loaded on the instance.

* Improve ResourceTopStateAntiAffinityConstraint (#475)

- Fix the min/max range to be [0, 1]
- Add a unit test for the normalized score

* Adjust the expected replica count according to fault zone count. (#476)

The rebalancer should determine the expected replica count according to the fault zone count instead of the node count only.

* PartitionMovementSoftConstraint Implementation (#474)

Add soft constraint: partition movement constraint

Evaluate the proposed assignment according to the potential partition movement cost.
The cost is evaluated based on the difference between the old assignment and the new assignment (see the sketch below).
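
One plausible cost model, sketched under the assumption that staying on the same instance is cheap and moving instances is expensive (the exact weights in Helix may differ):

    class PartitionMovementSketch {
      // Score a proposed allocation against the old one: an identical allocation
      // scores 1.0, the same instance with a different state scores partially,
      // and a different instance (a real movement) scores 0.0.
      static double score(String oldInstance, String oldState,
          String newInstance, String newState) {
        if (!newInstance.equals(oldInstance)) {
          return 0.0;
        }
        return newState.equals(oldState) ? 1.0 : 0.5;
      }
    }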

* Add the remaining implementation of ConstraintBasedAlgorithmFactory (#478)

Implement ConstraintBasedAlgorithmFactory and the soft constraint weight model.
Remove the SoftConstraintWeightModel class.
Get the rebalance preference and adjust the corresponding weights.
Pass the preference keys instead of the ClusterConfig.

* Integrate the WAGED rebalancer with all the related components. (#466)

1. Integrate with the algorithm, assignment metadata store, etc. Fix several conflicting interfaces and logic so that the rebalancer runs correctly.
2. Complete OptimalAssignment.
3. Add integration tests to ensure the correctness of rebalancing logic.

* Separate AssignableNode properties by Immutable and Mutable (#485)

Separate AssignableNode properties into immutable and mutable groups
- This helps detect any wrong usage of these properties early

* Enable maintenance mode for the WAGED rebalancer.

The maintenance mode rebalance logic stays the same as in the previous feature.
Add more tests for partition migration and node swap that require maintenance mode.

* Add delayed rebalance and user-defined preference list features to the WAGED rebalancer. (#456)

- Add delayed rebalance and user-defined preference list features to the WAGED rebalancer.
- Refine the delayed rebalance usage in the WAGED rebalancer.
- Add the delayed rebalance scheduling logic.
- Add the necessary tests, and fix TestMixedModeAutoRebalance and all delayed rebalance tests.

* Adjust the topology processing logic for instances to ensure backward compatibility.

* Load soft constraint weight from resources/properties file (#492)

Load the soft constraints' weights from a properties file.
This makes it easier for us to adjust the weights in the future (see the sketch below).
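
A minimal sketch of such loading; the file name matches the SOFT_CONSTRAINT_WEIGHTS key added in this merge (see SystemPropertyKeys in the diff below), while the loading code itself is illustrative:

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Properties;

    class SoftConstraintWeightsSketch {
      // Load per-constraint weights from a properties file on the classpath.
      static Properties load() throws IOException {
        Properties weights = new Properties();
        try (InputStream in = Thread.currentThread().getContextClassLoader()
            .getResourceAsStream("soft-constraint-weight.properties")) {
          if (in != null) {
            weights.load(in);
          }
        }
        return weights;
      }
    }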

* Add latency metric components for WAGED rebalancer (#490)

Add WAGED rebalancer metric framework and latency metric implementation

Changelist:
1. Add WAGED rebalancer metric interface
2. Implement latency-related metrics
3. Integrate latency metrics into WAGED rebalancer
4. Add tests

* Fix the rebalance cache issue and stabilize the tests. (#510)

1. Fix the DelayedAutoRebalancer cache issue where a ClusterConfig change won't trigger a rebalance. The current workaround in our code blocks the WAGED rebalancer logic, so we need to fix it while merging the WAGED rebalancer code.
2. Refine the ResourceChangeDetector's usage in the WAGED rebalancer so as to avoid unnecessary global rebalances.
3. Extend the StrictMatchExternalViewVerifier so it can be used to test the WAGED rebalance feature.

* Stricter partition weight validation while creating the cluster model. (#511)

1. If any capacity key is not configured as the partition weight in the ResourceConfig (or the default weight), the config is invalid.
2. If any partition weight is configured with a negative number, the config is invalid.
Note that the rebalancer will not compute a new assignment if any capacity/weight config is invalid.

* Increase parallelism for ZkBucketDataAccessor (#506)

* Increase parallelism for ZkBucketDataAccessor

This diff improves parallelism and throughput for ZkBucketDataAccessor. It implements the following ideas (sketched after the list):
1. Optimistic Concurrency Control
2. Monotonically Increasing Version Number
3. Garbage Collection of Stale Metadata
4. Retrying Reads Upon Failure
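
A hedged sketch of how ideas 1-3 can fit together on the write path (the ZkStore interface is a hypothetical stand-in, not the real accessor API):

    class VersionedWriteSketch {
      interface ZkStore { // hypothetical stand-in for the ZK-backed store
        long readLastWriteVersion(String rootPath);
        void write(String path, byte[] data);
        void updateLastSuccessVersion(String rootPath, long version);
        void scheduleGc(String rootPath, long keepVersion);
      }

      // Write a new, monotonically increasing version; readers only follow the
      // last-success pointer, so a concurrent reader never sees a partial write,
      // and superseded versions become garbage to collect asynchronously.
      static long writeNewVersion(ZkStore zk, String rootPath, byte[] payload) {
        long version = zk.readLastWriteVersion(rootPath) + 1;
        zk.write(rootPath + "/" + version, payload);
        zk.updateLastSuccessVersion(rootPath, version);
        zk.scheduleGc(rootPath, version);
        return version;
      }
    }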

* The WAGED rebalancer returns the previously calculated assignment on calculation failure (#514)

* The WAGED rebalancer returns the previously calculated assignment on calculation failure.

This is to protect the cluster assignment against a rebalance algorithm failure, for example when the cluster is out of capacity. In this case, the rebalancer will keep using the previously calculated mapping.
Also, refine the new metric interface and add the RebalanceFailureCount metric for recording the failures.

Modify the test cases so that DBs from different test cases have different names. This avoids previous test records being returned by the rebalancer on a calculation error.

* Make the log clearer after finishing calculateAssignment. (#531)

Make the log clearer after finishing calculateAssignment.

* Implement monitoring mbeans for the WAGED rebalancer. (#525)

Change list:
1. GlobalBaselineCalcCounter: counter of global rebalances done.
2. PartialRebalanceCounter: counter of partial rebalances done.
3. BaselineDivergenceGauge: gauge of the replica-level difference between the Baseline and the Best Possible assignments.

* Refine the rebalance scope calculating logic in the WAGED rebalancer. (#519)

* Refine the rebalance scope calculating logic in the WAGED rebalancer.

1. Ignore the IdealState mapping/list fields if the resource is in FULL_AUTO mode.
2. On an IdealState change, the resource shall be fully rebalanced, since some filter conditions (such as an instance tag) might have changed.
3. A live instance change (a newly connected node) shall trigger a full rebalance so partitions will be reassigned to the new node.
4. Modify the related test cases.
5. Add an option to the change detector so that, if it is used elsewhere, the caller can choose to listen to any change.

* Make WagedRebalancer static by creating a ThreadLocal (#540)

ZkBucketDataAccessor has GC logic, but it is only valid while the ZkClient inside it is active and not closed. Currently, the WAGED rebalancer creates an AssignmentMetadataStore instance every time it rebalances, which prevents the internal ZkBucketDataAccessor from garbage-collecting the assignment metadata it wrote previously.

This diff makes the entire WagedRebalancer object a ThreadLocal, which has the effect of making it essentially static across different runs of the pipeline (see the sketch below).
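
The pattern, as a minimal sketch (the factory indirection is illustrative; the real WagedRebalancer constructor takes a HelixManager):

    import java.util.function.Supplier;

    class PerThreadRebalancerSketch<T> {
      // One rebalancer instance per pipeline thread, reused across runs, so the
      // ZkBucketDataAccessor inside it stays open and its GC logic keeps working.
      private final ThreadLocal<T> rebalancerPerThread;

      PerThreadRebalancerSketch(Supplier<T> factory) {
        this.rebalancerPerThread = ThreadLocal.withInitial(factory);
      }

      T get() {
        return rebalancerPerThread.get(); // same instance for all runs on this thread
      }
    }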

* Change the change detector to a regular field in the WAGED rebalancer instead of a static ThreadLocal. (#543)

* Change the change detector to a regular field instead of a static thread-local.

The rebalancer has been modified to be a thread-local object, so there is no need to keep the change detector thread-local as well; keeping it thread-local may cause potential problems.
In addition, in order to avoid resource leakage, implement the finalize method of the WagedRebalancer to close all connections.

* Refactor soft constraints to simplify the algorithm and fix potential issues. (#520)

* Refactor soft constraints to simplify the algorithm and fix potential issues.

1. Check for zero weight so as to avoid unnecessary calculations.
2. Simplify the soft constraint interfaces and implementations; avoid duplicate code.
3. Adjust the partition movements constraint logic to reduce the chance of moving partitions when the Baseline and Best Possible assignments diverge.
4. Estimate utilization in addition to the other usage estimations. The estimate is used as a base when calculating the capacity usage score. This ensures the algorithm treats clusters with different overall usage in the same way.
5. Fix the issue that the high-utilization calculation does not consider the currently proposed replica's usage.
6. Use a sigmoid to calculate the usage-based soft constraint scores. This enhances the assignment result of the algorithm (see the sketch after this list).
7. Adjust the related test cases.
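
For item 6, a hedged sketch of the sigmoid mapping (the commons-math3 dependency added in this merge provides a Sigmoid class; the center/steepness choices below are illustrative):

    class SigmoidScoreSketch {
      // Map projected utilization to (0, 1): nodes below the estimated cluster
      // utilization score above 0.5, nodes above it score below 0.5.
      static double usageScore(double projectedUtilization, double estimatedUtilization,
          double steepness) {
        return 1.0 / (1.0 + Math.exp(steepness * (projectedUtilization - estimatedUtilization)));
      }
    }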

* Minor fix for the constraint-related tests. (#545)

Minor fix for the constraint-related tests.

* Adjust the replica rebalance calculation ordering to avoid a static order. (#535)

* Adjust the replica rebalance calculation ordering to avoid a static order.

The problem with a static order is that the same set of replicas will always be the ones that are moved or state-transitioned during a rebalance.
This randomization won't change the algorithm's performance, but it helps Helix eliminate very unstable partitions.

* Implement increment() method in CountMetric class. (#537)

The abstract method increaseCount() in CountMetric is a generic method used in the inherited classes. Implementing it directly in CountMetric reduces duplicate code in those classes (see the sketch after the change list).
Change list:
1. Move increaseCount() to CountMetric.
2. Rename it to increment() and implement the method.
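
A minimal sketch of the refactoring (names follow the description above, not the exact Helix class):

    abstract class CountMetricSketch {
      private long count = 0L;

      // Previously every subclass implemented its own increaseCount(); now the
      // shared implementation lives in the base class as increment().
      public void increment(long delta) {
        count += delta;
      }

      public long getCount() {
        return count;
      }
    }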

* Modify the ivy file to add the new math3 lib dependency. (#546)

Modify the ivy file to add the new math3 lib dependency.

* Fix a missing parameter when the WAGED rebalancer initializes the change detector. (#547)

This parameter was missed during the previous change.

* Add the new Rebalancer monitor domain to the active domain list. (#550)

Add the new Rebalancer monitor domain to the active domain list.

* Refine the ivy file config. The org attributes were not configured correctly. (#551)

* Use a deep copy of the new best possible assignment for measuring baseline divergence. (#542)

The new assignment is critical in the WAGED rebalancer; if it were mutated while measuring baseline divergence, the rebalancer might not work correctly.
To prevent changes to the new assignment and make it safe to use for measuring baseline divergence, use a deep copy of the new assignment.

* Add max capacity usage metric for instance monitor. (#548)

We need to monitor an instance's max utilization in order to understand its max capacity usage and know the status of the instance.

Change list:
1. Change the instance monitor to extend dynamic metric, and change the logic in ClusterStatusMonitor to adapt to the InstanceMonitor changes.
2. Add APIs to get/update MaxCapacityUsage.
3. Add an API in the cluster status monitor to update max capacity usage.
4. Add unit tests for the instance monitor and for updating max capacity usage.

* Fix an incorrect formula in the comment for measuring baseline divergence. (#559)

Fix incorrect formula in the comment for measuring baseline divergence.

* Avoid redundant writes in AssignmentMetadataStore (#564)

For the WAGED rebalancer, we persist the cluster's mapping via AssignmentMetadataStore on every pipeline run. However, if no changes were made to the new assignment relative to the old one, this write is unnecessary. This diff checks whether they are equal and skips the write if the old and new assignments are the same (see the sketch below).
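
The check, as a hedged sketch (generic over the assignment type; the writer callback stands in for the BucketDataAccessor write):

    import java.util.Objects;
    import java.util.function.Consumer;

    class PersistIfChangedSketch<A> {
      private A lastPersisted;

      // Skip the metadata-store write when the newly computed assignment equals
      // what was last persisted.
      void persistIfChanged(A newAssignment, Consumer<A> writer) {
        if (Objects.equals(lastPersisted, newAssignment)) {
          return; // identical assignments: no write needed
        }
        writer.accept(newAssignment);
        lastPersisted = newAssignment;
      }
    }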

* Filter resource map with ideal states for instance capacity metrics. (#574)

The resourceToRebalance map also contains resources from current states, and this causes null pointer exceptions in the parse-all-replicas stage when a resource is not in the ideal states. This diff fixes the issue by using only the resources in the ideal states to parse all replicas.

* Introduce Dry-run Waged Rebalancer for the verifiers and tests. (#573)

Use a dry-run rebalancer to avoid updating the persisted rebalancer status in the verifiers or tests.
Also, refine several rebalancer-related interfaces so as to simplify the dry-run rebalancer implementation.
Convert the test cases back to use the BestPossibleExternalViewVerifier.

Additional fixes:
- Update the rebalancer preference on every rebalancer.compute() call, since the preference might be updated at runtime.
- Fix one minor metric domain name bug in the WagedRebalancerMetricCollector.
- Minor test case fixes to make the tests more stable after the change.

* Change ClusterConfig.setDefaultCapacityMap to be private. (#590)

Change ClusterConfig.setDefaultCapacityMap to be private.

* Add Java API for adding and validating resources for WAGED rebalancer (#570)

Add Java API methods for adding and validating resources for the WAGED rebalancer. This is a set of convenience APIs, provided through HelixAdmin, that users can call to more easily add resources and validate them for WAGED rebalancer usage.
Changelist:
1. Add API methods in HelixAdmin
2. Implement the said methods
3. Add tests

* Change calculation for baseline divergence. (#598)

Change the calculation for baseline divergence: 0.0 means no difference, 1.0 means all placements are different (see the sketch below).
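
A hedged sketch of the metric, assuming placements are flattened into a map from a replica key (e.g. "resource/partition/replicaIndex") to an instance:

    import java.util.Map;

    class BaselineDivergenceSketch {
      // 0.0 = Best Possible matches the Baseline exactly; 1.0 = no placement matches.
      static double divergence(Map<String, String> baseline,
          Map<String, String> bestPossible) {
        if (baseline.isEmpty()) {
          return 0.0;
        }
        long matched = baseline.entrySet().stream()
            .filter(e -> e.getValue().equals(bestPossible.get(e.getKey())))
            .count();
        return 1.0 - (double) matched / baseline.size();
      }
    }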

* Improve the WAGED rebalancer performance. (#586)

This change improves the rebalance speed by 2x to 5x, depending on the host capacity.

Parallelize the loop processing wherever possible to help improve performance. This does not change the logic.
Avoid duplicate logic in the loops: put invariant calculations outside the loop so they are done only once.

* Fix the unstable test TestZeroReplicaAvoidance. (#603)

Fix the unstable test TestZeroReplicaAvoidance by adding a wait.
This is a temporary resolution until we fix issue #526. Marked with a TODO comment so it is easier for us to remove the waits in a batch later.

* Add REST API endpoints for WAGED Rebalancer (#611)

We want to make the WAGED rebalancer (weight-aware) easier to use. One way to do this is to allow the user to easily add resources with weight configuration by providing REST endpoints. This change adds the relevant REST endpoints based on the HelixAdmin APIs added in (#570).

Basically, this commit uses the existing REST endpoints, whose hierarchy is defined by REST resource. What this commit does to the existing endpoints is 1) add extra commands and 2) add a WAGED command as a QueryParam so that the WAGED logic can be included.

This change is backward-compatible because it keeps the original behavior when no commands are provided, using the @DefaultValue annotation.

* Fix a potential issue in the ResourceChangeSnapshot. (#635)

The trim logic in the ResourceChangeSnapshot for cleaning up the IdealState should not clear the whole map. Doing so would cause the WAGED rebalancer to ignore changes such as new partitions being added to the partition list.
Modify the test case accordingly.

* Simplify and enhance the RebalanceLatencyGauge so it can be used across multiple threads. (#636)

The previous design of RebalanceLatencyGauge did not support asynchronous metric data emission. This PR adds that support by using a ThreadLocal object.
The metric logic is not changed.

* Add new WAGED rebalancer config item "GLOBAL_REBALANCE_ASYNC_MODE". (#637)

This option will be used by the WAGED rebalancer to determine if the global rebalance should be performed asynchronously.

* Decouple the event type and the scheduled rebalance cache refresh option. (#638)

The previous design was that both the on-demand and the periodic rebalance scheduling tasks request a cache refresh. This won't always be true moving forward.
For example, the WAGED rebalancer's async baseline calculation requests a scheduled rebalance, but a cache refresh is not necessary there.
This PR does not change any business logic. It prepares for a future feature change.
This PR ensures strict backward compatibility.

* Improve the algorithm so it prioritizes the assignment to the idle nodes when the constraint evaluation results are the same (#651)

This is to get rid of the randomness when the algorithm result is a tie. Usually, when the algorithm picks randomly among nodes with the same score, more partition movements are triggered on a cluster change (see the sketch below).
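
A sketch of such a deterministic tie-breaker (illustrative; the actual comparator in Helix may differ):

    import java.util.Comparator;
    import java.util.function.ToDoubleFunction;
    import java.util.function.ToIntFunction;

    class TieBreakSketch {
      // Maximize the soft-constraint score; among equal scores, prefer the node
      // with fewer assigned replicas (the idler node) instead of picking randomly.
      static <N> Comparator<N> scoreThenIdleness(ToDoubleFunction<N> score,
          ToIntFunction<N> assignedReplicaCount) {
        return Comparator.comparingDouble(score)
            .thenComparing(Comparator.comparingInt(assignedReplicaCount).reversed());
      }
    }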

* Refine the WAGED rebalancer to minimize the partial rebalance workload. (#639)

* Refine the WAGED rebalancer to minimize the partial rebalance workload.

Split the cluster model calculation method so that different rebalance logic can have its own rebalance scope calculation logic.
Also, refine the WAGED rebalancer logic to reduce duplicate code.

* Refine method names and comments. (#664)

* Refine method names and comments.

* Asynchronously calculate the Baseline (#632)

* Enable the Baseline calculation to be done asynchronously.

This greatly speeds up the rebalance. Basically, the WAGED rebalancer first performs a partial rebalance to recover the invalid replica allocations (for example, the ones on a disabled instance). Then it calculates the new Baseline via a global rebalance.

* Reorganize the test cases so the new WAGED expand-cluster tests are not skipped. (#670)

TestNG cannot handle test class inheritance well, and some of the tests are skipped with the current design. Move the logic to a new test class that is no longer a child of another test class. This ensures all the test cases run.

* Fix the Helix rest tests by cleaning up the environment before testing. (#679)

The validateWeight test methods in TestInstanceAccessor and TestPerInstanceAccessor test against the same instance config fields, so there was a conflict if both test cases executed in a certain order. This change adds cleanup logic so the shared fields are empty before each test method starts.

* Add instance capacity gauge (#557)

We need to monitor instance utilization in order to understand the instance capacity.

Change list:
- Change instance monitor to update capacity
- Change getAttribute to throw AttributeNotFoundException in DynamicMBeanProvider
- Combine max usage and instance capacity update into one method in cluster status monitor
- Add unit test

* Add resource partition weight gauge (#686)

We would like to monitor each capacity usage for the resource partitions: a gauge of the average partition weight for each capacity key.

Change list:
- Add partition weight gauge metric to resource monitor.
- Add two unit tests to cover new code.

* Add WAGED rebalancer reset method to clean up cached status. (#696)

The reset method is for cleaning up any in-memory records within the WAGED rebalancer, so we don't need to recreate the rebalancer.

Detailed change list:
1. Add reset methods to all the stateful objects that are used in the WAGED rebalancer.
2. Refine some potential race conditions in the WAGED rebalancer components.
3. Adjust the tests accordingly. Also add new tests to cover the component reset and the WAGED rebalancer reset logic.

* Reset the WAGED rebalancer once the controller newly acquires leadership. (#690)

This is to prevent any cached assignment information recorded during the previous session from impacting the rebalance result.
Detailed change list:

Move the stateful WAGED rebalancer into the GenericHelixController object instead of the rebalance stage. This resolves a possible race condition between the event processing thread and the leader switch handling thread.
Add a new test regarding leadership switch to verify that the WAGED rebalancer is reset after the processing.

Co-authored-by: Hunter Lee <narendly@gmail.com>
Co-authored-by: Yi Wang <ywang4@linkedin.com>
Co-authored-by: Huizhi Lu <hulu@linkedin.com>
diff --git a/helix-core/helix-core-0.9.2-SNAPSHOT.ivy b/helix-core/helix-core-0.9.2-SNAPSHOT.ivy
index 2d6e298..07dd266 100644
--- a/helix-core/helix-core-0.9.2-SNAPSHOT.ivy
+++ b/helix-core/helix-core-0.9.2-SNAPSHOT.ivy
@@ -57,7 +57,8 @@
     <dependency org="org.codehaus.jackson" name="jackson-mapper-asl" rev="1.8.5" conf="compile->compile(default);runtime->runtime(default);default->default"/>
     <dependency org="commons-io" name="commons-io" rev="1.4" conf="compile->compile(default);runtime->runtime(default);default->default"/>
     <dependency org="commons-cli" name="commons-cli" rev="1.2" conf="compile->compile(default);runtime->runtime(default);default->default"/>
-    <dependency org="commons-math" name="commons-math" rev="2.1" conf="compile->compile(default);runtime->runtime(default);default->default"/>
+    <dependency org="org.apache.commons" name="commons-math" rev="2.1" conf="compile->compile(default);runtime->runtime(default);default->default"/>
+    <dependency org="org.apache.commons" name="commons-math3" rev="3.6.1" conf="compile->compile(default);runtime->runtime(default);default->default"/>
     <dependency org="com.101tec" name="zkclient" rev="0.5" conf="compile->compile(default);runtime->runtime(default);default->default"/>
     <dependency org="com.google.guava" name="guava" rev="15.0" conf="compile->compile(default);runtime->runtime(default);default->default"/>
     <dependency org="org.yaml" name="snakeyaml" rev="1.12" conf="compile->compile(default);runtime->runtime(default);default->default"/>
diff --git a/helix-core/pom.xml b/helix-core/pom.xml
index 45b6552..1077cc0 100644
--- a/helix-core/pom.xml
+++ b/helix-core/pom.xml
@@ -37,7 +37,7 @@
       org.I0Itec.zkclient*,
       org.apache.commons.cli*;version="[1.2,2)",
       org.apache.commons.io*;version="[1.4,2)",
-      org.apache.commons.math*;version="[2.1,3)",
+      org.apache.commons.math*;version="[2.1,4)",
       org.apache.jute*;resolution:=optional,
       org.apache.zookeeper.server.persistence*;resolution:=optional,
       org.apache.zookeeper.server.util*;resolution:=optional,
@@ -140,6 +140,11 @@
       <version>2.1</version>
     </dependency>
     <dependency>
+      <groupId>org.apache.commons</groupId>
+      <artifactId>commons-math3</artifactId>
+      <version>3.6.1</version>
+    </dependency>
+    <dependency>
       <groupId>commons-codec</groupId>
       <artifactId>commons-codec</artifactId>
       <version>1.6</version>
diff --git a/helix-core/src/main/java/org/apache/helix/BucketDataAccessor.java b/helix-core/src/main/java/org/apache/helix/BucketDataAccessor.java
new file mode 100644
index 0000000..2008c23
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/BucketDataAccessor.java
@@ -0,0 +1,53 @@
+package org.apache.helix;
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+
+public interface BucketDataAccessor {
+
+  /**
+   * Write a HelixProperty in buckets, compressed.
+   * @param path path to which the metadata will be written to
+   * @param value HelixProperty to write
+   * @param <T>
+   * @throws IOException
+   */
+  <T extends HelixProperty> boolean compressedBucketWrite(String path, T value) throws IOException;
+
+  /**
+   * Read a HelixProperty that was written in buckets, compressed.
+   * @param path
+   * @param helixPropertySubType the subtype of HelixProperty the data was written in
+   * @param <T>
+   */
+  <T extends HelixProperty> HelixProperty compressedBucketRead(String path,
+      Class<T> helixPropertySubType);
+
+  /**
+   * Delete the HelixProperty in the given path.
+   * @param path
+   */
+  void compressedBucketDelete(String path);
+
+  /**
+   * Close the connection to the metadata store.
+   */
+  void disconnect();
+}
diff --git a/helix-core/src/main/java/org/apache/helix/HelixAdmin.java b/helix-core/src/main/java/org/apache/helix/HelixAdmin.java
index a11b235..423f879 100644
--- a/helix-core/src/main/java/org/apache/helix/HelixAdmin.java
+++ b/helix-core/src/main/java/org/apache/helix/HelixAdmin.java
@@ -31,6 +31,7 @@
 import org.apache.helix.model.IdealState;
 import org.apache.helix.model.InstanceConfig;
 import org.apache.helix.model.MaintenanceSignal;
+import org.apache.helix.model.ResourceConfig;
 import org.apache.helix.model.StateModelDefinition;
 
 /*
@@ -579,4 +580,50 @@
   default void close() {
     System.out.println("Default close() was invoked! No operation was executed.");
   }
+
+  /**
+   * Adds a resource with IdealState and ResourceConfig to be rebalanced by WAGED rebalancer with validation.
+   * Validation includes the following:
+   * 1. Check ResourceConfig has the WEIGHT field
+   * 2. Check that all capacity keys from ClusterConfig are set up in the WEIGHT field
+   * 3. Check that all ResourceConfig's weightMap fields have all of the capacity keys
+   * @param clusterName
+   * @param idealState
+   * @param resourceConfig
+   * @return true if the resource has been added successfully. False otherwise
+   */
+  boolean addResourceWithWeight(String clusterName, IdealState idealState,
+      ResourceConfig resourceConfig);
+
+  /**
+   * Batch-enables Waged rebalance for the names of resources given.
+   * @param clusterName
+   * @param resourceNames
+   * @return
+   */
+  boolean enableWagedRebalance(String clusterName, List<String> resourceNames);
+
+  /**
+   * Validates the resources to see if their weight configs have been set properly.
+   * Validation includes the following:
+   * 1. Check ResourceConfig has the WEIGHT field
+   * 2. Check that all capacity keys from ClusterConfig are set up in the WEIGHT field
+   * 3. Check that all ResourceConfig's weightMap fields have all of the capacity keys
+   * @param resourceNames
+   * @return for each resource, true if the weight configs have been set properly, false otherwise
+   */
+  Map<String, Boolean> validateResourcesForWagedRebalance(String clusterName,
+      List<String> resourceNames);
+
+  /**
+   * Validates the instances to ensure their weights in InstanceConfigs have been set up properly.
+   * Validation includes the following:
+   * 1. If default instance capacity is not set, check that the InstanceConfigs have the CAPACITY field
+   * 2. Check that all capacity keys defined in ClusterConfig are present in the CAPACITY field
+   * @param clusterName
+   * @param instancesNames
+   * @return
+   */
+  Map<String, Boolean> validateInstancesForWagedRebalance(String clusterName,
+      List<String> instancesNames);
 }
diff --git a/helix-core/src/main/java/org/apache/helix/HelixRebalanceException.java b/helix-core/src/main/java/org/apache/helix/HelixRebalanceException.java
new file mode 100644
index 0000000..d54853f
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/HelixRebalanceException.java
@@ -0,0 +1,51 @@
+package org.apache.helix;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/**
+ * Exception thrown by Helix due to rebalance failures.
+ */
+public class HelixRebalanceException extends Exception {
+  // TODO: Adding static description or other necessary fields into the enum instances for
+  // TODO: supporting the rebalance monitor to understand the exception.
+  public enum Type {
+    INVALID_CLUSTER_STATUS,
+    INVALID_REBALANCER_STATUS,
+    FAILED_TO_CALCULATE,
+    INVALID_INPUT,
+    UNKNOWN_FAILURE
+  }
+
+  private final Type _type;
+
+  public HelixRebalanceException(String message, Type type, Throwable cause) {
+    super(String.format("%s Failure Type: %s", message, type.name()), cause);
+    _type = type;
+  }
+
+  public HelixRebalanceException(String message, Type type) {
+    super(String.format("%s Failure Type: %s", message, type.name()));
+    _type = type;
+  }
+
+  public Type getFailureType() {
+    return _type;
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/InstanceType.java b/helix-core/src/main/java/org/apache/helix/InstanceType.java
index 84e9d87..92b0e80 100644
--- a/helix-core/src/main/java/org/apache/helix/InstanceType.java
+++ b/helix-core/src/main/java/org/apache/helix/InstanceType.java
@@ -36,7 +36,8 @@
   CONTROLLER(new String[] {
       MonitorDomainNames.ClusterStatus.name(),
       MonitorDomainNames.HelixZkClient.name(),
-      MonitorDomainNames.HelixCallback.name()
+      MonitorDomainNames.HelixCallback.name(),
+      MonitorDomainNames.Rebalancer.name()
   }),
 
   PARTICIPANT(new String[] {
@@ -51,7 +52,8 @@
       MonitorDomainNames.HelixZkClient.name(),
       MonitorDomainNames.HelixCallback.name(),
       MonitorDomainNames.HelixThreadPoolExecutor.name(),
-      MonitorDomainNames.CLMParticipantReport.name()
+      MonitorDomainNames.CLMParticipantReport.name(),
+      MonitorDomainNames.Rebalancer.name()
   }),
 
   SPECTATOR(new String[] {
diff --git a/helix-core/src/main/java/org/apache/helix/SystemPropertyKeys.java b/helix-core/src/main/java/org/apache/helix/SystemPropertyKeys.java
index 1a6a797..d316986 100644
--- a/helix-core/src/main/java/org/apache/helix/SystemPropertyKeys.java
+++ b/helix-core/src/main/java/org/apache/helix/SystemPropertyKeys.java
@@ -6,6 +6,8 @@
 
   // ZKHelixManager
   public static final String CLUSTER_MANAGER_VERSION = "cluster-manager-version.properties";
+  // soft constraints weight definitions
+  public static final String SOFT_CONSTRAINT_WEIGHTS = "soft-constraint-weight.properties";
 
   public static final String FLAPPING_TIME_WINDOW = "helixmanager.flappingTimeWindow";
 
diff --git a/helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java b/helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
index 39a5ad7..e47c420 100644
--- a/helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
+++ b/helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
@@ -25,6 +25,7 @@
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
+import java.util.Optional;
 import java.util.Set;
 import java.util.Timer;
 import java.util.TimerTask;
@@ -61,6 +62,8 @@
 import org.apache.helix.controller.pipeline.AsyncWorkerType;
 import org.apache.helix.controller.pipeline.Pipeline;
 import org.apache.helix.controller.pipeline.PipelineRegistry;
+import org.apache.helix.controller.rebalancer.StatefulRebalancer;
+import org.apache.helix.controller.rebalancer.waged.WagedRebalancer;
 import org.apache.helix.controller.stages.AttributeName;
 import org.apache.helix.controller.stages.BestPossibleStateCalcStage;
 import org.apache.helix.controller.stages.ClusterEvent;
@@ -168,7 +171,6 @@
   Timer _onDemandRebalanceTimer = null;
   AtomicReference<RebalanceTask> _nextRebalanceTask = new AtomicReference<>();
 
-
   /**
    * A cache maintained across pipelines
    */
@@ -186,6 +188,17 @@
 
   private HelixManager _helixManager;
 
+  // Since the stateful rebalancer needs to be lazily constructed when the HelixManager instance is
+  // ready, the GenericHelixController is not constructed with a stateful rebalancer. This wrapper
+  // is to avoid the complexity of handling a nullable value in the event handling process.
+  // TODO Create the required stateful rebalancer only when it is used by any resource.
+  private final StatefulRebalancerRef _rebalancerRef = new StatefulRebalancerRef() {
+    @Override
+    protected StatefulRebalancer createRebalancer(HelixManager helixManager) {
+      return new WagedRebalancer(helixManager);
+    }
+  };
+
   /**
    * TODO: We should get rid of this once we move to:
    *  1) ZK callback should go to ClusterDataCache and trigger data cache refresh only
@@ -221,17 +234,29 @@
   class RebalanceTask extends TimerTask {
     final HelixManager _manager;
     final ClusterEventType _clusterEventType;
+    private final Optional<Boolean> _shouldRefreshCacheOption;
     private long _nextRebalanceTime;
 
     public RebalanceTask(HelixManager manager, ClusterEventType clusterEventType) {
       this(manager, clusterEventType, -1);
-
     }
 
-    public RebalanceTask(HelixManager manager, ClusterEventType clusterEventType, long nextRebalanceTime) {
+    public RebalanceTask(HelixManager manager, ClusterEventType clusterEventType,
+        long nextRebalanceTime) {
+      this(manager, clusterEventType, nextRebalanceTime, Optional.empty());
+    }
+
+    public RebalanceTask(HelixManager manager, ClusterEventType clusterEventType,
+        long nextRebalanceTime, boolean shouldRefreshCache) {
+      this(manager, clusterEventType, nextRebalanceTime, Optional.of(shouldRefreshCache));
+    }
+
+    private RebalanceTask(HelixManager manager, ClusterEventType clusterEventType,
+        long nextRebalanceTime, Optional<Boolean> shouldRefreshCacheOption) {
       _manager = manager;
       _clusterEventType = clusterEventType;
       _nextRebalanceTime = nextRebalanceTime;
+      _shouldRefreshCacheOption = shouldRefreshCacheOption;
     }
 
     public long getNextRebalanceTime() {
@@ -241,8 +266,9 @@
     @Override
     public void run() {
       try {
-        if (_clusterEventType.equals(ClusterEventType.PeriodicalRebalance) || _clusterEventType
-            .equals(ClusterEventType.OnDemandRebalance)) {
+        if (_shouldRefreshCacheOption.orElse(
+            _clusterEventType.equals(ClusterEventType.PeriodicalRebalance) || _clusterEventType
+                .equals(ClusterEventType.OnDemandRebalance))) {
           requestDataProvidersFullRefresh();
 
           HelixDataAccessor accessor = _manager.getHelixDataAccessor();
@@ -360,7 +386,17 @@
    * Schedule an on demand rebalance pipeline.
    * @param delay
    */
+  @Deprecated
   public void scheduleOnDemandRebalance(long delay) {
+    scheduleOnDemandRebalance(delay, true);
+  }
+
+  /**
+   * Schedule an on demand rebalance pipeline.
+   * @param delay
+   * @param shouldRefreshCache true if refresh the cache before scheduling a rebalance.
+   */
+  public void scheduleOnDemandRebalance(long delay, boolean shouldRefreshCache) {
     if (_helixManager == null) {
       logger.error("Failed to schedule a future pipeline run for cluster {}. Helix manager is null!",
           _clusterName);
@@ -378,7 +414,8 @@
     }
 
     RebalanceTask newTask =
-        new RebalanceTask(_helixManager, ClusterEventType.OnDemandRebalance, rebalanceTime);
+        new RebalanceTask(_helixManager, ClusterEventType.OnDemandRebalance, rebalanceTime,
+            shouldRefreshCache);
 
     _onDemandRebalanceTimer.schedule(newTask, delay);
     logger.info("Scheduled instant pipeline run for cluster {}." , _helixManager.getClusterName());
@@ -601,6 +638,22 @@
       return;
     }
 
+    // Event handling happens in a different thread from the onControllerChange processing thread.
+    // Thus, there are several possible conditions.
+    // 1. Event handled after leadership acquired. So we will have a valid rebalancer for the
+    // event processing.
+    // 2. Event handled shortly after leadership relinquished. And the rebalancer has not been
+    // marked as invalid yet. So the event will be processed the same as case one.
+    // 3. Event is leftover from the previous session, and it is handled when the controller
+    // regains the leadership. The rebalancer will be reset before being used. That is the
+    // expected behavior so as to avoid inconsistent rebalance result.
+    // 4. Event handled shortly after leadership relinquished. And the rebalancer has been marked
+    // as invalid. So we reset the rebalancer. But the later isLeader() check will return false and
+    // the pipeline will be triggered. So the reset rebalancer won't be used before the controller
+    // regains leadership.
+    event.addAttribute(AttributeName.STATEFUL_REBALANCER.name(),
+        _rebalancerRef.getRebalancer(manager));
+
     if (!manager.isLeader()) {
       logger.error("Cluster manager: " + manager.getInstanceName() + " is not leader for " + manager
           .getClusterName() + ". Pipeline will not be invoked");
@@ -997,6 +1050,12 @@
       _clusterStatusMonitor.setMaintenance(_inMaintenanceMode);
     } else {
       enableClusterStatusMonitor(false);
+      // Note that onControllerChange is executed in parallel with the event processing thread. It
+      // is possible that the current WAGED rebalancer object is in use for handling callback. So
+      // mark the rebalancer invalid only, instead of closing it here.
+      // This to-be-closed WAGED rebalancer will be reset later on a later event processing if
+      // the controller becomes leader again.
+      _rebalancerRef.invalidateRebalancer();
     }
 
     logger.info("END: GenericClusterController.onControllerChange() for cluster " + _clusterName);
@@ -1100,6 +1159,8 @@
 
     enableClusterStatusMonitor(false);
 
+    _rebalancerRef.closeRebalancer();
+
     // TODO controller shouldn't be used in anyway after shutdown.
     // Need to record shutdown and throw Exception if the controller is used again.
   }
@@ -1177,7 +1238,6 @@
     return statusFlag;
   }
 
-
   // TODO: refactor this to use common/ClusterEventProcessor.
   @Deprecated
   private class ClusterEventProcessor extends Thread {
@@ -1233,4 +1293,59 @@
     eventThread.setDaemon(true);
     eventThread.start();
   }
-}
\ No newline at end of file
+
+  /**
+   * A wrapper class for the stateful rebalancer instance that will be tracked in the
+   * GenericHelixController.
+   */
+  private abstract class StatefulRebalancerRef<T extends StatefulRebalancer> {
+    private T _rebalancer = null;
+    private boolean _isRebalancerValid = true;
+
+    /**
+     * @param helixManager
+     * @return A new stateful rebalancer instance with initial state.
+     */
+    protected abstract T createRebalancer(HelixManager helixManager);
+
+    /**
+     * Mark the current rebalancer object to be invalid, which indicates it needs to be reset before
+     * the next usage.
+     */
+    synchronized void invalidateRebalancer() {
+      _isRebalancerValid = false;
+    }
+
+    /**
+     * @return A valid rebalancer object.
+     *         If the rebalancer is no longer valid, it will be reset before returning.
+     * TODO: Make rebalancer volatile or make it singleton, if this method is called in multiple
+     * TODO: threads outside the controller object.
+     */
+    synchronized T getRebalancer(HelixManager helixManager) {
+      // Lazily initialize the stateful rebalancer instance since the GenericHelixController
+      // instance is instantiated without the HelixManager information that is required.
+      if (_rebalancer == null) {
+        _rebalancer = createRebalancer(helixManager);
+        _isRebalancerValid = true;
+      }
+      // If the rebalance exists but has been marked as invalid (due to leadership switch), it needs
+      // to be reset before return.
+      if (!_isRebalancerValid) {
+        _rebalancer.reset();
+        _isRebalancerValid = true;
+      }
+      return _rebalancer;
+    }
+
+    /**
+     * Proactively close the rebalance object to release the resources.
+     */
+    synchronized void closeRebalancer() {
+      if (_rebalancer != null) {
+        _rebalancer.close();
+        _rebalancer = null;
+      }
+    }
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/changedetector/ChangeDetector.java b/helix-core/src/main/java/org/apache/helix/controller/changedetector/ChangeDetector.java
new file mode 100644
index 0000000..fbe4afc
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/changedetector/ChangeDetector.java
@@ -0,0 +1,57 @@
+package org.apache.helix.controller.changedetector;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collection;
+import org.apache.helix.HelixConstants;
+
+/**
+ * ChangeDetector interface that will be used to track deltas in the cluster from one pipeline run
+ * to another. The interface methods are designed to be flexible for both the resource pipeline and
+ * the task pipeline.
+ * TODO: Consider splitting this up into two different ChangeDetector interfaces:
+ * TODO: PropertyBasedChangeDetector and PathBasedChangeDetector.
+ */
+public interface ChangeDetector {
+
+  /**
+   * Returns all types of changes detected.
+   * @return a collection of ChangeTypes
+   */
+  Collection<HelixConstants.ChangeType> getChangeTypes();
+
+  /**
+   * Returns the names of items that changed based on the change type given.
+   * @return a collection of names of items that changed
+   */
+  Collection<String> getChangesByType(HelixConstants.ChangeType changeType);
+
+  /**
+   * Returns the names of items that were added based on the change type given.
+   * @return a collection of names of items that were added
+   */
+  Collection<String> getAdditionsByType(HelixConstants.ChangeType changeType);
+
+  /**
+   * Returns the names of items that were removed based on the change type given.
+   * @return a collection of names of items that were removed
+   */
+  Collection<String> getRemovalsByType(HelixConstants.ChangeType changeType);
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/changedetector/ResourceChangeDetector.java b/helix-core/src/main/java/org/apache/helix/controller/changedetector/ResourceChangeDetector.java
new file mode 100644
index 0000000..27f4c50
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/changedetector/ResourceChangeDetector.java
@@ -0,0 +1,199 @@
+package org.apache.helix.controller.changedetector;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import com.google.common.collect.Sets;
+import org.apache.helix.HelixConstants;
+import org.apache.helix.HelixProperty;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.model.ClusterConfig;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * ResourceChangeDetector implements ChangeDetector. It caches resource-related metadata from
+ * Helix's main resource pipeline cache (DataProvider) and the computation results of change
+ * detection.
+ * WARNING: the methods of this class are not thread-safe.
+ */
+public class ResourceChangeDetector implements ChangeDetector {
+  private static final Logger LOG = LoggerFactory.getLogger(ResourceChangeDetector.class.getName());
+
+  private final boolean _ignoreControllerGeneratedFields;
+  private ResourceChangeSnapshot _oldSnapshot; // snapshot for previous pipeline run
+  private ResourceChangeSnapshot _newSnapshot; // snapshot for this pipeline run
+
+  // The following caches the computation results
+  private Map<HelixConstants.ChangeType, Collection<String>> _changedItems = new HashMap<>();
+  private Map<HelixConstants.ChangeType, Collection<String>> _addedItems = new HashMap<>();
+  private Map<HelixConstants.ChangeType, Collection<String>> _removedItems = new HashMap<>();
+
+  public ResourceChangeDetector(boolean ignoreControllerGeneratedFields) {
+    _newSnapshot = new ResourceChangeSnapshot();
+    _ignoreControllerGeneratedFields = ignoreControllerGeneratedFields;
+  }
+
+  public ResourceChangeDetector() {
+    this(false);
+  }
+
+  /**
+   * Compare the underlying HelixProperty objects and produce a collection of names of changed
+   * properties.
+   * @return
+   */
+  private Collection<String> getChangedItems(Map<String, ? extends HelixProperty> oldPropertyMap,
+      Map<String, ? extends HelixProperty> newPropertyMap) {
+    Collection<String> changedItems = new HashSet<>();
+    oldPropertyMap.forEach((name, property) -> {
+      if (newPropertyMap.containsKey(name)
+          && !property.getRecord().equals(newPropertyMap.get(name).getRecord())) {
+        changedItems.add(name);
+      }
+    });
+    return changedItems;
+  }
+
+  /**
+   * Return a collection of names that are newly added.
+   * @return
+   */
+  private Collection<String> getAddedItems(Map<String, ? extends HelixProperty> oldPropertyMap,
+      Map<String, ? extends HelixProperty> newPropertyMap) {
+    return Sets.difference(newPropertyMap.keySet(), oldPropertyMap.keySet());
+  }
+
+  /**
+   * Return a collection of names that were removed.
+   * @return
+   */
+  private Collection<String> getRemovedItems(Map<String, ? extends HelixProperty> oldPropertyMap,
+      Map<String, ? extends HelixProperty> newPropertyMap) {
+    return Sets.difference(oldPropertyMap.keySet(), newPropertyMap.keySet());
+  }
+
+  private void clearCachedComputation() {
+    _changedItems.clear();
+    _addedItems.clear();
+    _removedItems.clear();
+  }
+
+  /**
+   * Based on the change type given and propertyMap type, call the right getters for propertyMap.
+   * @param changeType
+   * @param snapshot
+   * @return
+   */
+  private Map<String, ? extends HelixProperty> determinePropertyMapByType(
+      HelixConstants.ChangeType changeType, ResourceChangeSnapshot snapshot) {
+    switch (changeType) {
+    case INSTANCE_CONFIG:
+      return snapshot.getInstanceConfigMap();
+    case IDEAL_STATE:
+      return snapshot.getIdealStateMap();
+    case RESOURCE_CONFIG:
+      return snapshot.getResourceConfigMap();
+    case LIVE_INSTANCE:
+      return snapshot.getLiveInstances();
+    case CLUSTER_CONFIG:
+      ClusterConfig config = snapshot.getClusterConfig();
+      if (config == null) {
+        return Collections.emptyMap();
+      } else {
+        return Collections.singletonMap(config.getClusterName(), config);
+      }
+    default:
+      LOG.warn(
+          "ResourceChangeDetector cannot determine propertyMap for the given ChangeType: {}. Returning an empty map.",
+          changeType);
+      return Collections.emptyMap();
+    }
+  }
+
+  /**
+   * Makes the current newSnapshot the oldSnapshot and reads in the up-to-date snapshot for change
+   * computation. To be called in the controller pipeline.
+   * @param dataProvider newly refreshed DataProvider (cache)
+   */
+  public synchronized void updateSnapshots(ResourceControllerDataProvider dataProvider) {
+    // If there are changes, update internal states
+    _oldSnapshot = new ResourceChangeSnapshot(_newSnapshot);
+    _newSnapshot = new ResourceChangeSnapshot(dataProvider, _ignoreControllerGeneratedFields);
+    dataProvider.clearRefreshedChangeTypes();
+
+    // Invalidate cached computation
+    clearCachedComputation();
+  }
+
+  public synchronized void resetSnapshots() {
+    _newSnapshot = new ResourceChangeSnapshot();
+    clearCachedComputation();
+  }
+
+  @Override
+  public synchronized Collection<HelixConstants.ChangeType> getChangeTypes() {
+    return Collections.unmodifiableSet(_newSnapshot.getChangedTypes());
+  }
+
+  @Override
+  public synchronized Collection<String> getChangesByType(HelixConstants.ChangeType changeType) {
+    return _changedItems.computeIfAbsent(changeType,
+        changedItems -> getChangedItems(determinePropertyMapByType(changeType, _oldSnapshot),
+            determinePropertyMapByType(changeType, _newSnapshot)));
+  }
+
+  @Override
+  public synchronized Collection<String> getAdditionsByType(HelixConstants.ChangeType changeType) {
+    return _addedItems.computeIfAbsent(changeType,
+        changedItems -> getAddedItems(determinePropertyMapByType(changeType, _oldSnapshot),
+            determinePropertyMapByType(changeType, _newSnapshot)));
+  }
+
+  @Override
+  public synchronized Collection<String> getRemovalsByType(HelixConstants.ChangeType changeType) {
+    return _removedItems.computeIfAbsent(changeType,
+        changedItems -> getRemovedItems(determinePropertyMapByType(changeType, _oldSnapshot),
+            determinePropertyMapByType(changeType, _newSnapshot)));
+  }
+
+  /**
+   * @return A map containing all the changed items, keyed by change type. Change types without
+   *         any changes are omitted.
+   */
+  public Map<HelixConstants.ChangeType, Set<String>> getAllChanges() {
+    return getChangeTypes().stream()
+        .collect(Collectors.toMap(changeType -> changeType, changeType -> {
+          Set<String> itemKeys = new HashSet<>();
+          itemKeys.addAll(getAdditionsByType(changeType));
+          itemKeys.addAll(getChangesByType(changeType));
+          itemKeys.addAll(getRemovalsByType(changeType));
+          return itemKeys;
+        })).entrySet().stream().filter(changeEntry -> !changeEntry.getValue().isEmpty()).collect(
+            Collectors
+                .toMap(changeEntry -> changeEntry.getKey(), changeEntry -> changeEntry.getValue()));
+  }
+}
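
The detector above is driven once per pipeline run: a stage swaps in the freshly refreshed cache via updateSnapshots() and then queries the per-type deltas. A minimal usage sketch, assuming a refreshed ResourceControllerDataProvider named dataProvider (the variable name is illustrative):

    ResourceChangeDetector detector = new ResourceChangeDetector();
    detector.updateSnapshots(dataProvider); // snapshot the refreshed cache
    for (HelixConstants.ChangeType type : detector.getChangeTypes()) {
      Collection<String> changed = detector.getChangesByType(type);
      Collection<String> added = detector.getAdditionsByType(type);
      Collection<String> removed = detector.getRemovalsByType(type);
      // React to the per-type deltas, e.g. trigger a rebalance on IDEAL_STATE changes.
    }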
diff --git a/helix-core/src/main/java/org/apache/helix/controller/changedetector/ResourceChangeSnapshot.java b/helix-core/src/main/java/org/apache/helix/controller/changedetector/ResourceChangeSnapshot.java
new file mode 100644
index 0000000..fc8c5c4
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/changedetector/ResourceChangeSnapshot.java
@@ -0,0 +1,157 @@
+package org.apache.helix.controller.changedetector;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import org.apache.helix.HelixConstants;
+import org.apache.helix.ZNRecord;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.LiveInstance;
+import org.apache.helix.model.ResourceConfig;
+
+/**
+ * ResourceChangeSnapshot is a POJO that contains the following Helix metadata:
+ * 1. InstanceConfig
+ * 2. IdealState
+ * 3. ResourceConfig
+ * 4. LiveInstance
+ * 5. ClusterConfig
+ * 6. Changed property types
+ * It serves as a snapshot of the main controller cache to enable the difference (change)
+ * calculation between two rounds of the pipeline run.
+ */
+class ResourceChangeSnapshot {
+
+  private Set<HelixConstants.ChangeType> _changedTypes;
+  private Map<String, InstanceConfig> _instanceConfigMap;
+  private Map<String, IdealState> _idealStateMap;
+  private Map<String, ResourceConfig> _resourceConfigMap;
+  private Map<String, LiveInstance> _liveInstances;
+  private ClusterConfig _clusterConfig;
+
+  /**
+   * Default constructor that constructs an empty snapshot.
+   */
+  ResourceChangeSnapshot() {
+    _changedTypes = new HashSet<>();
+    _instanceConfigMap = new HashMap<>();
+    _idealStateMap = new HashMap<>();
+    _resourceConfigMap = new HashMap<>();
+    _liveInstances = new HashMap<>();
+    _clusterConfig = null;
+  }
+
+  /**
+   * Constructor using the controller cache (ResourceControllerDataProvider).
+   *
+   * @param dataProvider the controller cache to snapshot
+   * @param ignoreControllerGeneratedFields if true, the snapshot will not record any fields
+   *                                        that are modified by the controller.
+   */
+  ResourceChangeSnapshot(ResourceControllerDataProvider dataProvider,
+      boolean ignoreControllerGeneratedFields) {
+    _changedTypes = new HashSet<>(dataProvider.getRefreshedChangeTypes());
+    _instanceConfigMap = new HashMap<>(dataProvider.getInstanceConfigMap());
+    _idealStateMap = new HashMap<>(dataProvider.getIdealStates());
+    if (ignoreControllerGeneratedFields && (
+        dataProvider.getClusterConfig().isPersistBestPossibleAssignment() || dataProvider
+            .getClusterConfig().isPersistIntermediateAssignment())) {
+      for (String resourceName : _idealStateMap.keySet()) {
+        _idealStateMap.put(resourceName, trimIdealState(_idealStateMap.get(resourceName)));
+      }
+    }
+    _resourceConfigMap = new HashMap<>(dataProvider.getResourceConfigMap());
+    _liveInstances = new HashMap<>(dataProvider.getLiveInstances());
+    _clusterConfig = dataProvider.getClusterConfig();
+  }
+
+  /**
+   * Copy constructor for ResourceChangeSnapshot.
+   * @param snapshot the snapshot to copy
+   */
+  ResourceChangeSnapshot(ResourceChangeSnapshot snapshot) {
+    _changedTypes = new HashSet<>(snapshot._changedTypes);
+    _instanceConfigMap = new HashMap<>(snapshot._instanceConfigMap);
+    _idealStateMap = new HashMap<>(snapshot._idealStateMap);
+    _resourceConfigMap = new HashMap<>(snapshot._resourceConfigMap);
+    _liveInstances = new HashMap<>(snapshot._liveInstances);
+    _clusterConfig = snapshot._clusterConfig;
+  }
+
+  Set<HelixConstants.ChangeType> getChangedTypes() {
+    return _changedTypes;
+  }
+
+  Map<String, InstanceConfig> getInstanceConfigMap() {
+    return _instanceConfigMap;
+  }
+
+  Map<String, IdealState> getIdealStateMap() {
+    return _idealStateMap;
+  }
+
+  Map<String, ResourceConfig> getResourceConfigMap() {
+    return _resourceConfigMap;
+  }
+
+  Map<String, LiveInstance> getLiveInstances() {
+    return _liveInstances;
+  }
+
+  ClusterConfig getClusterConfig() {
+    return _clusterConfig;
+  }
+
+  // Trim the IdealState to exclude any controller-modified information.
+  private IdealState trimIdealState(IdealState originalIdealState) {
+    // Clone the IdealState to avoid modifying the objects in the Cluster Data Cache, which might
+    // be used by the other stages in the pipeline.
+    // WARNING: the IdealState copy constructor does not perform a deep copy, so we must not
+    // modify the existing field values in place or the cached values will be changed as well.
+    IdealState trimmedIdealState = new IdealState(originalIdealState.getRecord());
+    ZNRecord trimmedIdealStateRecord = trimmedIdealState.getRecord();
+    switch (originalIdealState.getRebalanceMode()) {
+      case FULL_AUTO:
+        // For FULL_AUTO resources, both map fields and list fields are not considered as data input
+        // for the controller. The controller will write to these two types of fields for persisting
+        // the assignment mapping.
+        trimmedIdealStateRecord.setListFields(trimmedIdealStateRecord.getListFields().keySet().stream().collect(
+            Collectors.toMap(partition -> partition, partition -> Collections.emptyList())));
+        // Intentional fall-through: FULL_AUTO resources also need their map fields cleared,
+        // which happens in the SEMI_AUTO case below.
+      case SEMI_AUTO:
+        // For SEMI_AUTO resources, map fields are not considered as data input for the controller.
+        // The controller will write to the map fields for persisting the assignment mapping.
+        trimmedIdealStateRecord.setMapFields(trimmedIdealStateRecord.getMapFields().keySet().stream().collect(
+            Collectors.toMap(partition -> partition, partition -> Collections.emptyMap())));
+        break;
+      default:
+        break;
+    }
+    return trimmedIdealState;
+  }
+}
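
To illustrate the intentional fall-through in trimIdealState above, consider a hypothetical FULL_AUTO resource named testDB: the FULL_AUTO case empties the list fields and then falls into the SEMI_AUTO case, which also empties the map fields, while the partition keys are preserved. A sketch under those assumptions:

    IdealState idealState = new IdealState("testDB");
    idealState.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);
    idealState.getRecord().setListField("testDB_0", Arrays.asList("host1", "host2"));
    idealState.getRecord().setMapField("testDB_0",
        Collections.singletonMap("host1", "MASTER"));
    // After trimming, the snapshot effectively stores:
    //   listFields: {"testDB_0": []}
    //   mapFields:  {"testDB_0": {}}
    // so controller-persisted mappings do not show up as changes between snapshots.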
diff --git a/helix-core/src/main/java/org/apache/helix/controller/dataproviders/ResourceControllerDataProvider.java b/helix-core/src/main/java/org/apache/helix/controller/dataproviders/ResourceControllerDataProvider.java
index b1dc215..1631d50 100644
--- a/helix-core/src/main/java/org/apache/helix/controller/dataproviders/ResourceControllerDataProvider.java
+++ b/helix-core/src/main/java/org/apache/helix/controller/dataproviders/ResourceControllerDataProvider.java
@@ -25,6 +25,7 @@
 import java.util.Map;
 import java.util.Set;
 
+import java.util.concurrent.ConcurrentHashMap;
 import org.apache.helix.HelixConstants;
 import org.apache.helix.HelixDataAccessor;
 import org.apache.helix.PropertyKey;
@@ -64,6 +65,9 @@
   private Map<String, Map<String, MissingTopStateRecord>> _missingTopStateMap;
   private Map<String, Map<String, String>> _lastTopStateLocationMap;
 
+  // Maintain a set of all ChangeTypes for change detection
+  private Set<HelixConstants.ChangeType> _refreshedChangeTypes;
+
   public ResourceControllerDataProvider() {
     this(AbstractDataCache.UNKNOWN_CLUSTER);
   }
@@ -106,19 +110,22 @@
     _idealMappingCache = new HashMap<>();
     _missingTopStateMap = new HashMap<>();
     _lastTopStateLocationMap = new HashMap<>();
+    _refreshedChangeTypes = ConcurrentHashMap.newKeySet();
   }
 
   public synchronized void refresh(HelixDataAccessor accessor) {
     long startTime = System.currentTimeMillis();
 
     // Refresh base
-    Set<HelixConstants.ChangeType> propertyRefreshed = super.doRefresh(accessor);
+    Set<HelixConstants.ChangeType> changedTypes = super.doRefresh(accessor);
+    _refreshedChangeTypes.addAll(changedTypes);
 
     // Invalidate cached information if any of the important data has been refreshed
-    if (propertyRefreshed.contains(HelixConstants.ChangeType.IDEAL_STATE)
-        || propertyRefreshed.contains(HelixConstants.ChangeType.LIVE_INSTANCE)
-        || propertyRefreshed.contains(HelixConstants.ChangeType.INSTANCE_CONFIG)
-        || propertyRefreshed.contains(HelixConstants.ChangeType.RESOURCE_CONFIG)) {
+    if (changedTypes.contains(HelixConstants.ChangeType.IDEAL_STATE)
+        || changedTypes.contains(HelixConstants.ChangeType.LIVE_INSTANCE)
+        || changedTypes.contains(HelixConstants.ChangeType.INSTANCE_CONFIG)
+        || changedTypes.contains(HelixConstants.ChangeType.RESOURCE_CONFIG)
+        || changedTypes.contains(HelixConstants.ChangeType.CLUSTER_CONFIG)) {
       clearCachedResourceAssignments();
     }
 
@@ -261,6 +268,23 @@
     _idealMappingCache.put(resource, mapping);
   }
 
+  /**
+   * Return the set of all ChangeTypes that have been refreshed since the last time the set was
+   * cleared. The caller should clear this set by calling {@link #clearRefreshedChangeTypes()}
+   * once the change types have been consumed.
+   * @return the set of accumulated change types
+   */
+  public Set<HelixConstants.ChangeType> getRefreshedChangeTypes() {
+    return _refreshedChangeTypes;
+  }
+
+  /**
+   * Clears the set of accumulated ChangeTypes. This should be called once the caller has
+   * consumed all change types via {@link #getRefreshedChangeTypes()}.
+   */
+  public void clearRefreshedChangeTypes() {
+    _refreshedChangeTypes.clear();
+  }
+
   public void clearCachedResourceAssignments() {
     _resourceAssignmentCache.clear();
     _idealMappingCache.clear();
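
The intended produce/consume contract for the new _refreshedChangeTypes set, sketched below on the assumption that the change detector is its only consumer: refresh() keeps accumulating change types across calls, and the consumer reads and then clears them once per pipeline run.

    dataProvider.refresh(accessor); // accumulates into _refreshedChangeTypes
    Set<HelixConstants.ChangeType> refreshed = dataProvider.getRefreshedChangeTypes();
    // ... snapshot or otherwise consume the accumulated change types ...
    dataProvider.clearRefreshedChangeTypes(); // reset for the next pipeline run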
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/DelayedAutoRebalancer.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/DelayedAutoRebalancer.java
index 6ae7076..63870ec 100644
--- a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/DelayedAutoRebalancer.java
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/DelayedAutoRebalancer.java
@@ -33,11 +33,10 @@
 import org.apache.helix.ZNRecord;
 import org.apache.helix.api.config.StateTransitionThrottleConfig;
 import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
-import org.apache.helix.controller.rebalancer.util.RebalanceScheduler;
+import org.apache.helix.controller.rebalancer.util.DelayedRebalanceUtil;
 import org.apache.helix.controller.stages.CurrentStateOutput;
 import org.apache.helix.model.ClusterConfig;
 import org.apache.helix.model.IdealState;
-import org.apache.helix.model.InstanceConfig;
 import org.apache.helix.model.Partition;
 import org.apache.helix.model.Resource;
 import org.apache.helix.model.ResourceAssignment;
@@ -51,7 +50,6 @@
  */
 public class DelayedAutoRebalancer extends AbstractRebalancer<ResourceControllerDataProvider> {
   private static final Logger LOG = LoggerFactory.getLogger(DelayedAutoRebalancer.class);
-  private static RebalanceScheduler _rebalanceScheduler = new RebalanceScheduler();
 
   @Override
   public IdealState computeNewIdealState(String resourceName,
@@ -80,7 +78,8 @@
 
     ClusterConfig clusterConfig = clusterData.getClusterConfig();
     ResourceConfig resourceConfig = clusterData.getResourceConfig(resourceName);
-    boolean delayRebalanceEnabled = isDelayRebalanceEnabled(currentIdealState, clusterConfig);
+    boolean delayRebalanceEnabled =
+        DelayedRebalanceUtil.isDelayRebalanceEnabled(currentIdealState, clusterConfig);
 
     if (resourceConfig != null) {
       userDefinedPreferenceList = resourceConfig.getPreferenceLists();
@@ -111,16 +110,18 @@
 
     Set<String> activeNodes = liveEnabledNodes;
     if (delayRebalanceEnabled) {
-      long delay = getRebalanceDelay(currentIdealState, clusterConfig);
-      activeNodes = getActiveInstances(allNodes, currentIdealState, liveEnabledNodes,
-          clusterData.getInstanceOfflineTimeMap(), clusterData.getLiveInstances().keySet(),
-          clusterData.getInstanceConfigMap(), delay, clusterConfig);
+      long delay = DelayedRebalanceUtil.getRebalanceDelay(currentIdealState, clusterConfig);
+      activeNodes = DelayedRebalanceUtil
+          .getActiveNodes(allNodes, currentIdealState, liveEnabledNodes,
+              clusterData.getInstanceOfflineTimeMap(), clusterData.getLiveInstances().keySet(),
+              clusterData.getInstanceConfigMap(), delay, clusterConfig);
 
       Set<String> offlineOrDisabledInstances = new HashSet<>(activeNodes);
       offlineOrDisabledInstances.removeAll(liveEnabledNodes);
-      setRebalanceScheduler(currentIdealState, offlineOrDisabledInstances,
-          clusterData.getInstanceOfflineTimeMap(), clusterData.getLiveInstances().keySet(),
-          clusterData.getInstanceConfigMap(), delay, clusterConfig);
+      DelayedRebalanceUtil.setRebalanceScheduler(currentIdealState.getResourceName(), true,
+          offlineOrDisabledInstances, clusterData.getInstanceOfflineTimeMap(),
+          clusterData.getLiveInstances().keySet(), clusterData.getInstanceConfigMap(), delay,
+          clusterConfig, _manager);
     }
 
     if (allNodes.isEmpty() || activeNodes.isEmpty()) {
@@ -163,16 +164,16 @@
         .computePartitionAssignment(allNodeList, liveEnabledNodeList, currentMapping, clusterData);
     ZNRecord finalMapping = newIdealMapping;
 
-    if (isDelayRebalanceEnabled(currentIdealState, clusterConfig)) {
+    if (DelayedRebalanceUtil.isDelayRebalanceEnabled(currentIdealState, clusterConfig)) {
       List<String> activeNodeList = new ArrayList<>(activeNodes);
       Collections.sort(activeNodeList);
-      int minActiveReplicas = getMinActiveReplica(currentIdealState, replicaCount);
+      int minActiveReplicas =
+          DelayedRebalanceUtil.getMinActiveReplica(currentIdealState, replicaCount);
 
       ZNRecord newActiveMapping = _rebalanceStrategy
           .computePartitionAssignment(allNodeList, activeNodeList, currentMapping, clusterData);
-      finalMapping =
-          getFinalDelayedMapping(currentIdealState, newIdealMapping, newActiveMapping, liveEnabledNodes,
-              replicaCount, minActiveReplicas);
+      finalMapping = getFinalDelayedMapping(currentIdealState, newIdealMapping, newActiveMapping,
+          liveEnabledNodes, replicaCount, minActiveReplicas);
     }
 
     finalMapping.getListFields().putAll(userDefinedPreferenceList);
@@ -203,162 +204,15 @@
     return newIdealState;
   }
 
-  /* get all active instances (live instances plus offline-yet-active instances */
-  private Set<String> getActiveInstances(Set<String> allNodes, IdealState idealState,
-      Set<String> liveEnabledNodes, Map<String, Long> instanceOfflineTimeMap, Set<String> liveNodes,
-      Map<String, InstanceConfig> instanceConfigMap, long delay, ClusterConfig clusterConfig) {
-    Set<String> activeInstances = new HashSet<>(liveEnabledNodes);
-
-    if (!isDelayRebalanceEnabled(idealState, clusterConfig)) {
-      return activeInstances;
-    }
-
-    Set<String> offlineOrDisabledInstances = new HashSet<>(allNodes);
-    offlineOrDisabledInstances.removeAll(liveEnabledNodes);
-
-    long currentTime = System.currentTimeMillis();
-    for (String ins : offlineOrDisabledInstances) {
-      long inactiveTime = getInactiveTime(ins, liveNodes, instanceOfflineTimeMap.get(ins), delay,
-          instanceConfigMap.get(ins), clusterConfig);
-      InstanceConfig instanceConfig = instanceConfigMap.get(ins);
-      if (inactiveTime > currentTime && instanceConfig != null && instanceConfig
-          .isDelayRebalanceEnabled()) {
-        activeInstances.add(ins);
-      }
-    }
-
-    return activeInstances;
-  }
-
-  /* Set a rebalance scheduler for the closest future rebalance time. */
-  private void setRebalanceScheduler(IdealState idealState, Set<String> offlineOrDisabledInstances,
-      Map<String, Long> instanceOfflineTimeMap, Set<String> liveNodes,
-      Map<String, InstanceConfig> instanceConfigMap,  long delay,
-      ClusterConfig clusterConfig) {
-    String resourceName = idealState.getResourceName();
-    if (!isDelayRebalanceEnabled(idealState, clusterConfig)) {
-      _rebalanceScheduler.removeScheduledRebalance(resourceName);
-      return;
-    }
-
-    long currentTime = System.currentTimeMillis();
-    long nextRebalanceTime = Long.MAX_VALUE;
-    // calculate the closest future rebalance time
-    for (String ins : offlineOrDisabledInstances) {
-      long inactiveTime = getInactiveTime(ins, liveNodes, instanceOfflineTimeMap.get(ins), delay,
-          instanceConfigMap.get(ins), clusterConfig);
-      if (inactiveTime != -1 && inactiveTime > currentTime && inactiveTime < nextRebalanceTime) {
-        nextRebalanceTime = inactiveTime;
-      }
-    }
-
-    if (nextRebalanceTime == Long.MAX_VALUE) {
-      long startTime = _rebalanceScheduler.removeScheduledRebalance(resourceName);
-      if (LOG.isDebugEnabled()) {
-        LOG.debug(String
-            .format("Remove exist rebalance timer for resource %s at %d\n", resourceName, startTime));
-      }
-    } else {
-      long currentScheduledTime = _rebalanceScheduler.getRebalanceTime(resourceName);
-      if (currentScheduledTime < 0 || currentScheduledTime > nextRebalanceTime) {
-        _rebalanceScheduler.scheduleRebalance(_manager, resourceName, nextRebalanceTime);
-        if (LOG.isDebugEnabled()) {
-          LOG.debug(String
-              .format("Set next rebalance time for resource %s at time %d\n", resourceName,
-                  nextRebalanceTime));
-        }
-      }
-    }
-  }
-
-  /**
-   * The time when an offline or disabled instance should be treated as inactive. return -1 if it is
-   * inactive now.
-   *
-   * @return
-   */
-  private long getInactiveTime(String instance, Set<String> liveInstances, Long offlineTime,
-      long delay, InstanceConfig instanceConfig, ClusterConfig clusterConfig) {
-    long inactiveTime = Long.MAX_VALUE;
-
-    // check the time instance went offline.
-    if (!liveInstances.contains(instance)) {
-      if (offlineTime != null && offlineTime > 0 && offlineTime + delay < inactiveTime) {
-        inactiveTime = offlineTime + delay;
-      }
-    }
-
-    // check the time instance got disabled.
-    if (!instanceConfig.getInstanceEnabled() || (clusterConfig.getDisabledInstances() != null
-        && clusterConfig.getDisabledInstances().containsKey(instance))) {
-      long disabledTime = instanceConfig.getInstanceEnabledTime();
-      if (clusterConfig.getDisabledInstances() != null && clusterConfig.getDisabledInstances()
-          .containsKey(instance)) {
-        // Update batch disable time
-        long batchDisableTime = Long.parseLong(clusterConfig.getDisabledInstances().get(instance));
-        if (disabledTime == -1 || disabledTime > batchDisableTime) {
-          disabledTime = batchDisableTime;
-        }
-      }
-      if (disabledTime > 0 && disabledTime + delay < inactiveTime) {
-        inactiveTime = disabledTime + delay;
-      }
-    }
-
-    if (inactiveTime == Long.MAX_VALUE) {
-      return -1;
-    }
-
-    return inactiveTime;
-  }
-
-  private long getRebalanceDelay(IdealState idealState, ClusterConfig clusterConfig) {
-    long delayTime = idealState.getRebalanceDelay();
-    if (delayTime < 0) {
-      delayTime = clusterConfig.getRebalanceDelayTime();
-    }
-    return delayTime;
-  }
-
-  private boolean isDelayRebalanceEnabled(IdealState idealState, ClusterConfig clusterConfig) {
-    long delay = getRebalanceDelay(idealState, clusterConfig);
-    return (delay > 0 && idealState.isDelayRebalanceEnabled() && clusterConfig
-        . isDelayRebalaceEnabled());
-  }
-
   private ZNRecord getFinalDelayedMapping(IdealState idealState, ZNRecord newIdealMapping,
       ZNRecord newActiveMapping, Set<String> liveInstances, int numReplica, int minActiveReplica) {
     if (minActiveReplica >= numReplica) {
       return newIdealMapping;
     }
     ZNRecord finalMapping = new ZNRecord(idealState.getResourceName());
-    for (String partition : newIdealMapping.getListFields().keySet()) {
-      List<String> idealList = newIdealMapping.getListField(partition);
-      List<String> activeList = newActiveMapping.getListField(partition);
-
-      List<String> liveList = new ArrayList<>();
-      int activeReplica = 0;
-      for (String ins : activeList) {
-        if (liveInstances.contains(ins)) {
-          activeReplica++;
-          liveList.add(ins);
-        }
-      }
-
-      if (activeReplica >= minActiveReplica) {
-        finalMapping.setListField(partition, activeList);
-      } else {
-        List<String> candidates = new ArrayList<String>(idealList);
-        candidates.removeAll(activeList);
-        for (String liveIns : candidates) {
-          liveList.add(liveIns);
-          if (liveList.size() >= minActiveReplica) {
-            break;
-          }
-        }
-        finalMapping.setListField(partition, liveList);
-      }
-    }
+    finalMapping.setListFields(DelayedRebalanceUtil
+        .getFinalDelayedMapping(newIdealMapping.getListFields(), newActiveMapping.getListFields(),
+            liveInstances, minActiveReplica));
     return finalMapping;
   }
 
@@ -392,10 +246,11 @@
     Set<String> liveNodes = cache.getLiveInstances().keySet();
 
     ClusterConfig clusterConfig = cache.getClusterConfig();
-    long delayTime = getRebalanceDelay(idealState, clusterConfig);
-    Set<String> activeNodes = getActiveInstances(allNodes, idealState, liveNodes,
-        cache.getInstanceOfflineTimeMap(), cache.getLiveInstances().keySet(),
-        cache.getInstanceConfigMap(), delayTime, clusterConfig);
+    long delayTime = DelayedRebalanceUtil.getRebalanceDelay(idealState, clusterConfig);
+    Set<String> activeNodes = DelayedRebalanceUtil
+        .getActiveNodes(allNodes, idealState, liveNodes, cache.getInstanceOfflineTimeMap(),
+            cache.getLiveInstances().keySet(), cache.getInstanceConfigMap(), delayTime,
+            clusterConfig);
 
     String stateModelDefName = idealState.getStateModelDefRef();
     StateModelDefinition stateModelDef = cache.getStateModelDef(stateModelDefName);
@@ -420,14 +275,6 @@
     return partitionMapping;
   }
 
-  private int getMinActiveReplica(IdealState idealState, int replicaCount) {
-    int minActiveReplicas = idealState.getMinActiveReplicas();
-    if (minActiveReplicas < 0) {
-      minActiveReplicas = replicaCount;
-    }
-    return minActiveReplicas;
-  }
-
   /**
    * compute best state for resource in AUTO ideal state mode
    * @param liveInstances
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/StatefulRebalancer.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/StatefulRebalancer.java
new file mode 100644
index 0000000..94567bb
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/StatefulRebalancer.java
@@ -0,0 +1,37 @@
+package org.apache.helix.controller.rebalancer;
+
+import java.util.Map;
+
+import org.apache.helix.HelixRebalanceException;
+import org.apache.helix.controller.dataproviders.BaseControllerDataProvider;
+import org.apache.helix.controller.stages.CurrentStateOutput;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.Resource;
+
+
+/**
+ * Allows one to provide a custom implementation of a stateful rebalancer, i.e. a rebalancer
+ * that keeps internal state across pipeline runs.<br/>
+ */
+public interface StatefulRebalancer<T extends BaseControllerDataProvider> {
+
+  /**
+   * Reset the rebalancer to the initial state.
+   */
+  void reset();
+
+  /**
+   * Release all the resources and clean up all the rebalancer state.
+   */
+  void close();
+
+  /**
+   * Compute the new IdealStates for all the input resources. The IdealStates include both the new
+   * partition assignment (in the listFields) and the new replica state mapping (in the mapFields).
+   * @param clusterData The Cluster status data provider.
+   * @param resourceMap A map containing all the rebalancing resources.
+   * @param currentStateOutput The present Current States of the resources.
+   * @return A map of the new IdealStates with the resource name as key.
+   */
+  Map<String, IdealState> computeNewIdealStates(T clusterData, Map<String, Resource> resourceMap,
+      final CurrentStateOutput currentStateOutput) throws HelixRebalanceException;
+}
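
A minimal, hypothetical skeleton of an implementation (the class name and trivial bodies are illustrative only; the WAGED rebalancer is the real implementation this interface anticipates):

    public class NoOpStatefulRebalancer
        implements StatefulRebalancer<ResourceControllerDataProvider> {
      @Override
      public void reset() {
        // Drop any internally cached rebalancer state.
      }

      @Override
      public void close() {
        // Release any held resources (connections, executors, etc.).
      }

      @Override
      public Map<String, IdealState> computeNewIdealStates(
          ResourceControllerDataProvider clusterData, Map<String, Resource> resourceMap,
          CurrentStateOutput currentStateOutput) throws HelixRebalanceException {
        // A real rebalancer computes new assignments here; this stub assigns nothing.
        return java.util.Collections.emptyMap();
      }
    }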
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/util/DelayedRebalanceUtil.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/util/DelayedRebalanceUtil.java
new file mode 100644
index 0000000..1342860
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/util/DelayedRebalanceUtil.java
@@ -0,0 +1,267 @@
+package org.apache.helix.controller.rebalancer.util;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.helix.HelixManager;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * The util for supporting delayed rebalance logic.
+ */
+public class DelayedRebalanceUtil {
+  private static final Logger LOG = LoggerFactory.getLogger(DelayedRebalanceUtil.class);
+
+  private static final RebalanceScheduler REBALANCE_SCHEDULER = new RebalanceScheduler();
+
+  /**
+   * @return true if delay rebalance is configured and enabled in the ClusterConfig configurations.
+   */
+  public static boolean isDelayRebalanceEnabled(ClusterConfig clusterConfig) {
+    long delay = clusterConfig.getRebalanceDelayTime();
+    return (delay > 0 && clusterConfig.isDelayRebalaceEnabled());
+  }
+
+  /**
+   * @return true if delay rebalance is configured and enabled in Resource IdealState and the
+   * ClusterConfig configurations.
+   */
+  public static boolean isDelayRebalanceEnabled(IdealState idealState,
+      ClusterConfig clusterConfig) {
+    long delay = getRebalanceDelay(idealState, clusterConfig);
+    return (delay > 0 && idealState.isDelayRebalanceEnabled() && clusterConfig
+        .isDelayRebalaceEnabled());
+  }
+
+  /**
+   * @return the rebalance delay based on Resource IdealState and the ClusterConfig configurations.
+   */
+  public static long getRebalanceDelay(IdealState idealState, ClusterConfig clusterConfig) {
+    long delayTime = idealState.getRebalanceDelay();
+    if (delayTime < 0) {
+      delayTime = clusterConfig.getRebalanceDelayTime();
+    }
+    return delayTime;
+  }
+
+  /**
+   * @return all active nodes (live nodes plus offline-yet-active nodes) while considering cluster
+   * delay rebalance configurations.
+   */
+  public static Set<String> getActiveNodes(Set<String> allNodes, Set<String> liveEnabledNodes,
+      Map<String, Long> instanceOfflineTimeMap, Set<String> liveNodes,
+      Map<String, InstanceConfig> instanceConfigMap, ClusterConfig clusterConfig) {
+    if (!isDelayRebalanceEnabled(clusterConfig)) {
+      return new HashSet<>(liveEnabledNodes);
+    }
+    return getActiveNodes(allNodes, liveEnabledNodes, instanceOfflineTimeMap, liveNodes,
+        instanceConfigMap, clusterConfig.getRebalanceDelayTime(), clusterConfig);
+  }
+
+  /**
+   * @return all active nodes (live nodes plus offline-yet-active nodes) while considering cluster
+   * and the resource delay rebalance configurations.
+   */
+  public static Set<String> getActiveNodes(Set<String> allNodes, IdealState idealState,
+      Set<String> liveEnabledNodes, Map<String, Long> instanceOfflineTimeMap, Set<String> liveNodes,
+      Map<String, InstanceConfig> instanceConfigMap, long delay, ClusterConfig clusterConfig) {
+    if (!isDelayRebalanceEnabled(idealState, clusterConfig)) {
+      return new HashSet<>(liveEnabledNodes);
+    }
+    return getActiveNodes(allNodes, liveEnabledNodes, instanceOfflineTimeMap, liveNodes,
+        instanceConfigMap, delay, clusterConfig);
+  }
+
+  private static Set<String> getActiveNodes(Set<String> allNodes, Set<String> liveEnabledNodes,
+      Map<String, Long> instanceOfflineTimeMap, Set<String> liveNodes,
+      Map<String, InstanceConfig> instanceConfigMap, long delay, ClusterConfig clusterConfig) {
+    Set<String> activeNodes = new HashSet<>(liveEnabledNodes);
+    Set<String> offlineOrDisabledInstances = new HashSet<>(allNodes);
+    offlineOrDisabledInstances.removeAll(liveEnabledNodes);
+    long currentTime = System.currentTimeMillis();
+    for (String ins : offlineOrDisabledInstances) {
+      long inactiveTime = getInactiveTime(ins, liveNodes, instanceOfflineTimeMap.get(ins), delay,
+          instanceConfigMap.get(ins), clusterConfig);
+      InstanceConfig instanceConfig = instanceConfigMap.get(ins);
+      if (inactiveTime > currentTime && instanceConfig != null && instanceConfig
+          .isDelayRebalanceEnabled()) {
+        activeNodes.add(ins);
+      }
+    }
+    return activeNodes;
+  }
+
+  /**
+   * @return The time when an offline or disabled instance should be treated as inactive.
+   * Return -1 if it is inactive now.
+   */
+  private static long getInactiveTime(String instance, Set<String> liveInstances, Long offlineTime,
+      long delay, InstanceConfig instanceConfig, ClusterConfig clusterConfig) {
+    long inactiveTime = Long.MAX_VALUE;
+
+    // check the time instance went offline.
+    if (!liveInstances.contains(instance)) {
+      if (offlineTime != null && offlineTime > 0 && offlineTime + delay < inactiveTime) {
+        inactiveTime = offlineTime + delay;
+      }
+    }
+
+    // check the time instance got disabled.
+    if (!instanceConfig.getInstanceEnabled() || (clusterConfig.getDisabledInstances() != null
+        && clusterConfig.getDisabledInstances().containsKey(instance))) {
+      long disabledTime = instanceConfig.getInstanceEnabledTime();
+      if (clusterConfig.getDisabledInstances() != null && clusterConfig.getDisabledInstances()
+          .containsKey(instance)) {
+        // Update batch disable time
+        long batchDisableTime = Long.parseLong(clusterConfig.getDisabledInstances().get(instance));
+        if (disabledTime == -1 || disabledTime > batchDisableTime) {
+          disabledTime = batchDisableTime;
+        }
+      }
+      if (disabledTime > 0 && disabledTime + delay < inactiveTime) {
+        inactiveTime = disabledTime + delay;
+      }
+    }
+
+    if (inactiveTime == Long.MAX_VALUE) {
+      return -1;
+    }
+
+    return inactiveTime;
+  }
+
+  /**
+   * Merge the new ideal preference list with the delayed mapping that is calculated based on the
+   * delayed rebalance configurations.
+   * The method will prioritize the "active" preference list so as to avoid unnecessary transient
+   * state transitions.
+   *
+   * @param newIdealPreferenceList  the ideal mapping that was calculated based on the current
+   *                                instance status
+   * @param newDelayedPreferenceList the delayed mapping that was calculated based on the delayed
+   *                                 instance status
+   * @param liveEnabledInstances    the set of all the instances that are both live and enabled.
+   * @param minActiveReplica        the minimum replica count to ensure a valid mapping.
+   *                                If the active list does not have enough replica assignment,
+   *                                this method will fill the list with the new ideal mapping until
+   *                                the replica count satisfies the minimum requirement.
+   * @return the merged state mapping.
+   */
+  public static Map<String, List<String>> getFinalDelayedMapping(
+      Map<String, List<String>> newIdealPreferenceList,
+      Map<String, List<String>> newDelayedPreferenceList, Set<String> liveEnabledInstances,
+      int minActiveReplica) {
+    Map<String, List<String>> finalPreferenceList = new HashMap<>();
+    for (String partition : newIdealPreferenceList.keySet()) {
+      List<String> idealList = newIdealPreferenceList.get(partition);
+      List<String> delayedIdealList = newDelayedPreferenceList.get(partition);
+
+      List<String> liveList = new ArrayList<>();
+      for (String ins : delayedIdealList) {
+        if (liveEnabledInstances.contains(ins)) {
+          liveList.add(ins);
+        }
+      }
+
+      if (liveList.size() >= minActiveReplica) {
+        finalPreferenceList.put(partition, delayedIdealList);
+      } else {
+        List<String> candidates = new ArrayList<>(idealList);
+        candidates.removeAll(delayedIdealList);
+        for (String liveIns : candidates) {
+          liveList.add(liveIns);
+          if (liveList.size() >= minActiveReplica) {
+            break;
+          }
+        }
+        finalPreferenceList.put(partition, liveList);
+      }
+    }
+    return finalPreferenceList;
+  }
+
+  /**
+   * Get the minimum active replica count threshold that allows delayed rebalance.
+   *
+   * @param idealState   the resource Ideal State
+   * @param replicaCount the full replica count of the resource, used as the fallback when the
+   *                     minimum active replica count is not configured.
+   * @return the minimum active replica count that must be maintained
+   */
+  public static int getMinActiveReplica(IdealState idealState, int replicaCount) {
+    int minActiveReplicas = idealState.getMinActiveReplicas();
+    if (minActiveReplicas < 0) {
+      minActiveReplicas = replicaCount;
+    }
+    return minActiveReplicas;
+  }
+
+  /**
+   * Set a rebalance scheduler for the closest future rebalance time.
+   */
+  public static void setRebalanceScheduler(String resourceName, boolean isDelayedRebalanceEnabled,
+      Set<String> offlineOrDisabledInstances, Map<String, Long> instanceOfflineTimeMap,
+      Set<String> liveNodes, Map<String, InstanceConfig> instanceConfigMap, long delay,
+      ClusterConfig clusterConfig, HelixManager manager) {
+    if (!isDelayedRebalanceEnabled) {
+      REBALANCE_SCHEDULER.removeScheduledRebalance(resourceName);
+      return;
+    }
+
+    long currentTime = System.currentTimeMillis();
+    long nextRebalanceTime = Long.MAX_VALUE;
+    // calculate the closest future rebalance time
+    for (String ins : offlineOrDisabledInstances) {
+      long inactiveTime = getInactiveTime(ins, liveNodes, instanceOfflineTimeMap.get(ins), delay,
+          instanceConfigMap.get(ins), clusterConfig);
+      if (inactiveTime != -1 && inactiveTime > currentTime && inactiveTime < nextRebalanceTime) {
+        nextRebalanceTime = inactiveTime;
+      }
+    }
+
+    if (nextRebalanceTime == Long.MAX_VALUE) {
+      long startTime = REBALANCE_SCHEDULER.removeScheduledRebalance(resourceName);
+      if (LOG.isDebugEnabled()) {
+        LOG.debug(String
+            .format("Removed the existing rebalance timer for resource %s at %d\n", resourceName,
+                startTime));
+      }
+    } else {
+      long currentScheduledTime = REBALANCE_SCHEDULER.getRebalanceTime(resourceName);
+      if (currentScheduledTime < 0 || currentScheduledTime > nextRebalanceTime) {
+        REBALANCE_SCHEDULER.scheduleRebalance(manager, resourceName, nextRebalanceTime);
+        if (LOG.isDebugEnabled()) {
+          LOG.debug(String
+              .format("Set next rebalance time for resource %s at time %d\n", resourceName,
+                  nextRebalanceTime));
+        }
+      }
+    }
+  }
+}
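
A worked sketch of getFinalDelayedMapping with hypothetical hosts: the delayed mapping keeps the offline host4, leaving only one live replica, which is below minActiveReplica = 2, so one candidate from the ideal list is backfilled.

    Map<String, List<String>> ideal =
        Collections.singletonMap("db_0", Arrays.asList("host1", "host2", "host3"));
    Map<String, List<String>> delayed =
        Collections.singletonMap("db_0", Arrays.asList("host1", "host4")); // host4 is offline
    Set<String> liveEnabled = new HashSet<>(Arrays.asList("host1", "host2", "host3"));
    Map<String, List<String>> merged =
        DelayedRebalanceUtil.getFinalDelayedMapping(ideal, delayed, liveEnabled, 2);
    // merged => {db_0=[host1, host2]}: only host1 of the delayed list is live,
    // so host2 is backfilled from the ideal list to satisfy minActiveReplica.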
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/util/ResourceUsageCalculator.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/util/ResourceUsageCalculator.java
index c2d472a..e7a1b94 100644
--- a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/util/ResourceUsageCalculator.java
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/util/ResourceUsageCalculator.java
@@ -1,11 +1,31 @@
 package org.apache.helix.controller.rebalancer.util;
 
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 import java.util.HashMap;
 import java.util.Map;
 
 import org.apache.helix.api.rebalancer.constraint.dataprovider.PartitionWeightProvider;
 import org.apache.helix.controller.common.ResourcesStateMap;
 import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceAssignment;
 
 public class ResourceUsageCalculator {
   /**
@@ -33,4 +53,176 @@
     }
     return newParticipantUsage;
   }
+
+  /**
+   * Measure baseline divergence between baseline assignment and best possible assignment at
+   * replica level. Example as below:
+   * baseline =
+   * {
+   *    resource1={
+   *       partition1={
+   *          instance1=master,
+   *          instance2=slave
+   *       },
+   *       partition2={
+   *          instance2=slave
+   *       }
+   *    }
+   * }
+   * bestPossible =
+   * {
+   *    resource1={
+   *       partition1={
+   *          instance1=master,  <--- matched
+   *          instance3=slave    <--- doesn't match
+   *       },
+   *       partition2={
+   *          instance3=master   <--- doesn't match
+   *       }
+   *    }
+   * }
+   * baseline divergence = (not matched: 2) / (total(matched + not matched): 3) = 2/3 ~= 0.66667
+   * If divergence == 1.0, all replicas are different (no match); if divergence == 0.0, there is
+   * no difference.
+   *
+   * @param baseline baseline assignment
+   * @param bestPossibleAssignment best possible assignment
+   * @return double value range at [0.0, 1.0]
+   */
+  public static double measureBaselineDivergence(Map<String, ResourceAssignment> baseline,
+      Map<String, ResourceAssignment> bestPossibleAssignment) {
+    int numMatchedReplicas = 0;
+    int numTotalBestPossibleReplicas = 0;
+
+    // 1. Check resource assignment names.
+    for (Map.Entry<String, ResourceAssignment> resourceEntry : bestPossibleAssignment.entrySet()) {
+      String resourceKey = resourceEntry.getKey();
+      if (!baseline.containsKey(resourceKey)) {
+        continue;
+      }
+
+      // Resource assignment names are matched.
+      // 2. check partitions.
+      Map<String, Map<String, String>> bestPossiblePartitions =
+          resourceEntry.getValue().getRecord().getMapFields();
+      Map<String, Map<String, String>> baselinePartitions =
+          baseline.get(resourceKey).getRecord().getMapFields();
+
+      for (Map.Entry<String, Map<String, String>> partitionEntry
+          : bestPossiblePartitions.entrySet()) {
+        String partitionName = partitionEntry.getKey();
+        if (!baselinePartitions.containsKey(partitionName)) {
+          continue;
+        }
+
+        // Partition names are matched.
+        // 3. Check replicas.
+        Map<String, String> bestPossibleReplicas = partitionEntry.getValue();
+        Map<String, String> baselineReplicas = baselinePartitions.get(partitionName);
+
+        for (Map.Entry<String, String> replicaEntry : bestPossibleReplicas.entrySet()) {
+          String replicaName = replicaEntry.getKey();
+          if (!baselineReplicas.containsKey(replicaName)) {
+            continue;
+          }
+
+          // Replica names are matched.
+          // 4. Check replica values.
+          String bestPossibleReplica = replicaEntry.getValue();
+          String baselineReplica = baselineReplicas.get(replicaName);
+          if (bestPossibleReplica.equals(baselineReplica)) {
+            numMatchedReplicas++;
+          }
+        }
+
+        // Count total best possible replicas.
+        numTotalBestPossibleReplicas += bestPossibleReplicas.size();
+      }
+    }
+
+    return numTotalBestPossibleReplicas == 0 ? 1.0d
+        : (1.0d - (double) numMatchedReplicas / (double) numTotalBestPossibleReplicas);
+  }
+
+  /**
+   * Calculates average partition weight per capacity key for a resource config. Example as below:
+   * Input =
+   * {
+   *   "partition1": {
+   *     "capacity1": 20,
+   *     "capacity2": 40
+   *   },
+   *   "partition2": {
+   *     "capacity1": 30,
+   *     "capacity2": 50
+   *   },
+   *   "partition3": {
+   *     "capacity1": 16,
+   *     "capacity2": 30
+   *   }
+   * }
+   *
+   * Total weight for key "capacity1" = 20 + 30 + 16 = 66;
+   * Total weight for key "capacity2" = 40 + 50 + 30 = 120;
+   * Total partitions = 3;
+   * Average partition weight for "capacity1" = 66 / 3 = 22;
+   * Average partition weight for "capacity2" = 120 / 3 = 40;
+   *
+   * Output =
+   * {
+   *   "capacity1": 22,
+   *   "capacity2": 40
+   * }
+   *
+   * @param partitionCapacityMap A map of partition capacity:
+   *        <PartitionName or DEFAULT_PARTITION_KEY, <Capacity Key, Capacity Number>>
+   * @return A map of partition weight: capacity key -> average partition weight
+   */
+  public static Map<String, Integer> calculateAveragePartitionWeight(
+      Map<String, Map<String, Integer>> partitionCapacityMap) {
+    // capacity key -> [number of partitions, total weight per capacity key]
+    Map<String, PartitionWeightCounterEntry> countPartitionWeightMap = new HashMap<>();
+
+    // Aggregates partition weight for each capacity key.
+    partitionCapacityMap.values().forEach(partitionCapacityEntry ->
+        partitionCapacityEntry.forEach((capacityKey, weight) -> countPartitionWeightMap
+            .computeIfAbsent(capacityKey, counterEntry -> new PartitionWeightCounterEntry())
+            .increase(1, weight)));
+
+    // capacity key -> average partition weight
+    Map<String, Integer> averagePartitionWeightMap = new HashMap<>();
+
+    // Calculate average partition weight for each capacity key.
+    // Per capacity key level:
+    // average partition weight = (total partition weight) / (number of partitions)
+    for (Map.Entry<String, PartitionWeightCounterEntry> entry
+        : countPartitionWeightMap.entrySet()) {
+      String capacityKey = entry.getKey();
+      PartitionWeightCounterEntry weightEntry = entry.getValue();
+      int averageWeight = (int) (weightEntry.getWeight() / weightEntry.getPartitions());
+      averagePartitionWeightMap.put(capacityKey, averageWeight);
+    }
+
+    return averagePartitionWeightMap;
+  }
+
+  /*
+   * Represents total number of partitions and total partition weight for a capacity key.
+   */
+  private static class PartitionWeightCounterEntry {
+    private int partitions;
+    private long weight;
+
+    private int getPartitions() {
+      return partitions;
+    }
+
+    private long getWeight() {
+      return weight;
+    }
+
+    private void increase(int partitions, int weight) {
+      this.partitions += partitions;
+      this.weight += weight;
+    }
+  }
 }
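
The divergence example from the javadoc above can be reproduced with a short sketch (using Guava's ImmutableMap for brevity); only one of the three best-possible replicas matches the baseline, so the result is 1 - 1/3 ≈ 0.6667:

    ResourceAssignment baseline = new ResourceAssignment("resource1");
    baseline.addReplicaMap(new Partition("partition1"),
        ImmutableMap.of("instance1", "master", "instance2", "slave"));
    baseline.addReplicaMap(new Partition("partition2"),
        ImmutableMap.of("instance2", "slave"));

    ResourceAssignment bestPossible = new ResourceAssignment("resource1");
    bestPossible.addReplicaMap(new Partition("partition1"),
        ImmutableMap.of("instance1", "master", "instance3", "slave"));
    bestPossible.addReplicaMap(new Partition("partition2"),
        ImmutableMap.of("instance3", "master"));

    double divergence = ResourceUsageCalculator.measureBaselineDivergence(
        Collections.singletonMap("resource1", baseline),
        Collections.singletonMap("resource1", bestPossible));
    // divergence == 1.0 - 1/3 ≈ 0.6667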
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/util/WagedValidationUtil.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/util/WagedValidationUtil.java
new file mode 100644
index 0000000..e9f86e7
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/util/WagedValidationUtil.java
@@ -0,0 +1,91 @@
+package org.apache.helix.controller.rebalancer.util;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.helix.HelixException;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.ResourceConfig;
+
+
+/**
+ * A util class that contains validation-related static methods for WAGED rebalancer.
+ */
+public class WagedValidationUtil {
+  /**
+   * Validates and returns instance capacities. The validation logic ensures that all required
+   * capacity keys (in ClusterConfig) are present in the merged instance capacity map.
+   * @param clusterConfig the cluster config that defines the required capacity keys and the
+   *                      default instance capacities
+   * @param instanceConfig the instance config that may override the default capacities
+   * @return a map of capacity key to capacity value for the given instance
+   */
+  public static Map<String, Integer> validateAndGetInstanceCapacity(ClusterConfig clusterConfig,
+      InstanceConfig instanceConfig) {
+    // Fetch the capacity of instance from 2 possible sources according to the following priority.
+    // 1. The instance capacity that is configured in the instance config.
+    // 2. If the default instance capacity that is configured in the cluster config contains more capacity keys, fill the capacity map with those additional values.
+    Map<String, Integer> instanceCapacity =
+        new HashMap<>(clusterConfig.getDefaultInstanceCapacityMap());
+    instanceCapacity.putAll(instanceConfig.getInstanceCapacityMap());
+
+    List<String> requiredCapacityKeys = clusterConfig.getInstanceCapacityKeys();
+    // All the required keys must exist in the instance config.
+    if (!instanceCapacity.keySet().containsAll(requiredCapacityKeys)) {
+      throw new HelixException(String.format(
+          "The required capacity keys: %s are not fully configured in the instance: %s, capacity map: %s.",
+          requiredCapacityKeys.toString(), instanceConfig.getInstanceName(),
+          instanceCapacity.toString()));
+    }
+    return instanceCapacity;
+  }
+
+  /**
+   * Validates and returns partition capacities. The validation logic ensures that all required
+   * capacity keys (from ClusterConfig) are present in the merged partition capacity map.
+   * @param partitionName the name of the partition to resolve the capacity for
+   * @param resourceConfig the resource config of the partition's resource
+   * @param capacityMap the partition capacity map parsed from the resource config
+   * @param clusterConfig the cluster config that defines the required capacity keys and the
+   *                      default partition weights
+   * @return a map of capacity key to capacity value for the given partition
+   */
+  public static Map<String, Integer> validateAndGetPartitionCapacity(String partitionName,
+      ResourceConfig resourceConfig, Map<String, Map<String, Integer>> capacityMap,
+      ClusterConfig clusterConfig) {
+    // Fetch the capacity of partition from 3 possible sources according to the following priority.
+    // 1. The partition capacity that is explicitly configured in the resource config.
+    // 2. Or, the default partition capacity that is configured under partition name DEFAULT_PARTITION_KEY in the resource config.
+    // 3. If the default partition capacity that is configured in the cluster config contains more capacity keys, fill the capacity map with those additional values.
+    Map<String, Integer> partitionCapacity =
+        new HashMap<>(clusterConfig.getDefaultPartitionWeightMap());
+    partitionCapacity.putAll(capacityMap.getOrDefault(partitionName,
+        capacityMap.getOrDefault(ResourceConfig.DEFAULT_PARTITION_KEY, new HashMap<>())));
+
+    List<String> requiredCapacityKeys = clusterConfig.getInstanceCapacityKeys();
+    // If any required capacity key is not configured in the resource config, fail the model creation.
+    if (!partitionCapacity.keySet().containsAll(requiredCapacityKeys)) {
+      throw new HelixException(String.format(
+          "The required capacity keys: %s are not fully configured in the resource: %s, partition: %s, weight map: %s.",
+          requiredCapacityKeys.toString(), resourceConfig.getResourceName(), partitionName,
+          partitionCapacity.toString()));
+    }
+    return partitionCapacity;
+  }
+}
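
A sketch of the merge-and-validate flow for instance capacity, assuming the capacity config setters introduced earlier in this patch series (the cluster, instance, and key names are illustrative): cluster-level defaults are applied first, instance-level values override them key by key, and a missing required key raises a HelixException.

    ClusterConfig clusterConfig = new ClusterConfig("testCluster");
    clusterConfig.setInstanceCapacityKeys(Arrays.asList("CU", "DISK"));
    clusterConfig.setDefaultInstanceCapacityMap(ImmutableMap.of("CU", 100, "DISK", 50));

    InstanceConfig instanceConfig = new InstanceConfig("instance1");
    instanceConfig.setInstanceCapacityMap(ImmutableMap.of("CU", 200)); // overrides the default

    Map<String, Integer> capacity =
        WagedValidationUtil.validateAndGetInstanceCapacity(clusterConfig, instanceConfig);
    // capacity => {CU=200, DISK=50}; if "DISK" were missing from both configs,
    // validateAndGetInstanceCapacity would throw a HelixException.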
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/AssignmentMetadataStore.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/AssignmentMetadataStore.java
new file mode 100644
index 0000000..afd0187
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/AssignmentMetadataStore.java
@@ -0,0 +1,213 @@
+package org.apache.helix.controller.rebalancer.waged;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.I0Itec.zkclient.exception.ZkNoNodeException;
+import org.I0Itec.zkclient.serialize.ZkSerializer;
+import org.apache.helix.BucketDataAccessor;
+import org.apache.helix.HelixException;
+import org.apache.helix.HelixProperty;
+import org.apache.helix.ZNRecord;
+import org.apache.helix.manager.zk.ZNRecordJacksonSerializer;
+import org.apache.helix.manager.zk.ZkBucketDataAccessor;
+import org.apache.helix.model.ResourceAssignment;
+
+
+/**
+ * A placeholder before we have the real assignment metadata store.
+ */
+public class AssignmentMetadataStore {
+  private static final String ASSIGNMENT_METADATA_KEY = "ASSIGNMENT_METADATA";
+  private static final String BASELINE_TEMPLATE = "/%s/%s/BASELINE";
+  private static final String BEST_POSSIBLE_TEMPLATE = "/%s/%s/BEST_POSSIBLE";
+  private static final String BASELINE_KEY = "BASELINE";
+  private static final String BEST_POSSIBLE_KEY = "BEST_POSSIBLE";
+  private static final ZkSerializer SERIALIZER = new ZNRecordJacksonSerializer();
+
+  private BucketDataAccessor _dataAccessor;
+  private String _baselinePath;
+  private String _bestPossiblePath;
+  protected Map<String, ResourceAssignment> _globalBaseline;
+  protected Map<String, ResourceAssignment> _bestPossibleAssignment;
+
+  AssignmentMetadataStore(String metadataStoreAddrs, String clusterName) {
+    this(new ZkBucketDataAccessor(metadataStoreAddrs), clusterName);
+  }
+
+  protected AssignmentMetadataStore(BucketDataAccessor bucketDataAccessor, String clusterName) {
+    _dataAccessor = bucketDataAccessor;
+    _baselinePath = String.format(BASELINE_TEMPLATE, clusterName, ASSIGNMENT_METADATA_KEY);
+    _bestPossiblePath = String.format(BEST_POSSIBLE_TEMPLATE, clusterName, ASSIGNMENT_METADATA_KEY);
+  }
+
+  public synchronized Map<String, ResourceAssignment> getBaseline() {
+    // Return the in-memory baseline. If null, read from ZK. This is to minimize reads from ZK.
+    if (_globalBaseline == null) {
+      try {
+        HelixProperty baseline =
+            _dataAccessor.compressedBucketRead(_baselinePath, HelixProperty.class);
+        _globalBaseline = splitAssignments(baseline);
+      } catch (ZkNoNodeException ex) {
+        // Metadata does not exist, so return an empty map
+        _globalBaseline = Collections.emptyMap();
+      }
+    }
+    return _globalBaseline;
+  }
+
+  public synchronized Map<String, ResourceAssignment> getBestPossibleAssignment() {
+    // Return the in-memory best possible assignment. If null, read from ZK. This is to minimize
+    // reads from ZK.
+    if (_bestPossibleAssignment == null) {
+      try {
+        HelixProperty bestPossible =
+            _dataAccessor.compressedBucketRead(_bestPossiblePath, HelixProperty.class);
+        _bestPossibleAssignment = splitAssignments(bestPossible);
+      } catch (ZkNoNodeException ex) {
+        // Metadata does not exist, so return an empty map
+        _bestPossibleAssignment = Collections.emptyMap();
+      }
+    }
+    return _bestPossibleAssignment;
+  }
+
+  /**
+   * @return true if a new baseline was persisted.
+   * @throws HelixException if the method failed to persist the baseline.
+   */
+  // TODO: Enhance the return value so it is more intuitive to understand when the persist fails and
+  // TODO: when it is skipped.
+  public synchronized boolean persistBaseline(Map<String, ResourceAssignment> globalBaseline) {
+    // TODO: Make the write async?
+    // If baseline hasn't changed, skip writing to metadata store
+    if (compareAssignments(_globalBaseline, globalBaseline)) {
+      return false;
+    }
+    // Persist to ZK
+    HelixProperty combinedAssignments = combineAssignments(BASELINE_KEY, globalBaseline);
+    try {
+      _dataAccessor.compressedBucketWrite(_baselinePath, combinedAssignments);
+    } catch (IOException e) {
+      // TODO: Improve failure handling
+      throw new HelixException("Failed to persist baseline!", e);
+    }
+
+    // Update the in-memory reference
+    _globalBaseline = globalBaseline;
+    return true;
+  }
+
+  /**
+   * @return true if a new best possible assignment was persisted.
+   * @throws HelixException if the method failed to persist the best possible assignment.
+   */
+  // TODO: Enhance the return value so it is more intuitive to understand when the persist fails and
+  // TODO: when it is skipped.
+  public synchronized boolean persistBestPossibleAssignment(
+      Map<String, ResourceAssignment> bestPossibleAssignment) {
+    // TODO: Make the write async?
+    // If bestPossibleAssignment hasn't changed, skip writing to metadata store
+    if (compareAssignments(_bestPossibleAssignment, bestPossibleAssignment)) {
+      return false;
+    }
+    // Persist to ZK
+    HelixProperty combinedAssignments =
+        combineAssignments(BEST_POSSIBLE_KEY, bestPossibleAssignment);
+    try {
+      _dataAccessor.compressedBucketWrite(_bestPossiblePath, combinedAssignments);
+    } catch (IOException e) {
+      // TODO: Improve failure handling
+      throw new HelixException("Failed to persist BestPossibleAssignment!", e);
+    }
+
+    // Update the in-memory reference
+    _bestPossibleAssignment = bestPossibleAssignment;
+    return true;
+  }
+
+  protected synchronized void reset() {
+    if (_bestPossibleAssignment != null) {
+      _bestPossibleAssignment.clear();
+      _bestPossibleAssignment = null;
+    }
+    if (_globalBaseline != null) {
+      _globalBaseline.clear();
+      _globalBaseline = null;
+    }
+  }
+
+  @Override
+  protected void finalize() {
+    // To ensure all resources are released.
+    close();
+  }
+
+  // Close the store to release all the resources.
+  public void close() {
+    _dataAccessor.disconnect();
+  }
+
+  /**
+   * Produces one HelixProperty that contains all assignment data.
+   * @param name the name of the combined record, e.g. BASELINE or BEST_POSSIBLE
+   * @param assignmentMap a map of (resource name, assignment) pairs to combine
+   * @return one HelixProperty that carries every resource's assignment as a simple field
+   */
+  private HelixProperty combineAssignments(String name,
+      Map<String, ResourceAssignment> assignmentMap) {
+    HelixProperty property = new HelixProperty(name);
+    // Add each resource's assignment as a simple field in one ZNRecord.
+    // Note: do not use Arrays.toString() to convert the record; deserialization would fail.
+    assignmentMap.forEach((resource, assignment) -> property.getRecord()
+        .setSimpleField(resource, new String(SERIALIZER.serialize(assignment.getRecord()))));
+    return property;
+  }
+
+  /**
+   * Returns a Map of (ResourceName, ResourceAssignment) pairs.
+   * @param property the combined HelixProperty written by combineAssignments
+   * @return a map of (resource name, assignment) pairs parsed from the property
+   */
+  private Map<String, ResourceAssignment> splitAssignments(HelixProperty property) {
+    Map<String, ResourceAssignment> assignmentMap = new HashMap<>();
+    // Convert each resource's assignment String into a ResourceAssignment object and put it in a
+    // map
+    property.getRecord().getSimpleFields().forEach((resource, assignmentStr) -> assignmentMap
+        .put(resource,
+            new ResourceAssignment((ZNRecord) SERIALIZER.deserialize(assignmentStr.getBytes()))));
+    return assignmentMap;
+  }
+
+  /**
+   * Returns whether two assignments are the same.
+   * @param oldAssignment the previously known assignment
+   * @param newAssignment the newly calculated assignment
+   * @return true if they are the same; false otherwise, or if oldAssignment is null
+   */
+  protected boolean compareAssignments(Map<String, ResourceAssignment> oldAssignment,
+      Map<String, ResourceAssignment> newAssignment) {
+    // If oldAssignment is null, that means that we haven't read from/written to
+    // the metadata store yet. In that case, we return false so that we write to metadata store.
+    return oldAssignment != null && oldAssignment.equals(newAssignment);
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/RebalanceAlgorithm.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/RebalanceAlgorithm.java
new file mode 100644
index 0000000..1374162
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/RebalanceAlgorithm.java
@@ -0,0 +1,43 @@
+package org.apache.helix.controller.rebalancer.waged;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.HelixRebalanceException;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModel;
+import org.apache.helix.controller.rebalancer.waged.model.OptimalAssignment;
+
+/**
+ * A generic interface to generate the optimal assignment given the runtime cluster environment.
+ *
+ * <pre>
+ * @see <a href="https://github.com/apache/helix/wiki/
+ * Design-Proposal---Weight-Aware-Globally-Even-Distribute-Rebalancer
+ * #rebalance-algorithm-adapter">Rebalance Algorithm</a>
+ * </pre>
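+ *
+ * <pre>{@code
+ * // An illustrative, non-production sketch: since the interface has a single abstract method,
+ * // a trivial algorithm that returns an empty OptimalAssignment can be written as a lambda.
+ * RebalanceAlgorithm noOpAlgorithm = clusterModel -> new OptimalAssignment();
+ * }</pre>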
+ */
+public interface RebalanceAlgorithm {
+
+  /**
+   * Rebalance the Helix resource partitions based on the input cluster model.
+   * @param clusterModel The run time cluster model that contains all necessary information
+   * @return An instance of {@link OptimalAssignment}
+   */
+  OptimalAssignment calculate(ClusterModel clusterModel) throws HelixRebalanceException;
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java
new file mode 100644
index 0000000..8a21bbb
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java
@@ -0,0 +1,787 @@
+package org.apache.helix.controller.rebalancer.waged;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.stream.Collectors;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import org.apache.helix.HelixConstants;
+import org.apache.helix.HelixManager;
+import org.apache.helix.HelixRebalanceException;
+import org.apache.helix.controller.changedetector.ResourceChangeDetector;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.controller.rebalancer.DelayedAutoRebalancer;
+import org.apache.helix.controller.rebalancer.StatefulRebalancer;
+import org.apache.helix.controller.rebalancer.internal.MappingCalculator;
+import org.apache.helix.controller.rebalancer.util.DelayedRebalanceUtil;
+import org.apache.helix.controller.rebalancer.waged.constraints.ConstraintBasedAlgorithmFactory;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModel;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModelProvider;
+import org.apache.helix.controller.rebalancer.waged.model.OptimalAssignment;
+import org.apache.helix.controller.stages.CurrentStateOutput;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.Resource;
+import org.apache.helix.model.ResourceAssignment;
+import org.apache.helix.model.ResourceConfig;
+import org.apache.helix.monitoring.metrics.MetricCollector;
+import org.apache.helix.monitoring.metrics.WagedRebalancerMetricCollector;
+import org.apache.helix.monitoring.metrics.implementation.BaselineDivergenceGauge;
+import org.apache.helix.monitoring.metrics.model.CountMetric;
+import org.apache.helix.monitoring.metrics.model.LatencyMetric;
+import org.apache.helix.util.RebalanceUtil;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Weight-Aware Globally-Even Distribute Rebalancer.
+ * @see <a
+ *      href="https://github.com/apache/helix/wiki/Design-Proposal---Weight-Aware-Globally-Even-Distribute-Rebalancer">
+ *      Design Document
+ *      </a>
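+ *
+ * <pre>{@code
+ * // Illustrative controller-side usage; the inputs normally come from the rebalance pipeline.
+ * WagedRebalancer rebalancer = new WagedRebalancer(helixManager);
+ * Map<String, IdealState> newIdealStates =
+ *     rebalancer.computeNewIdealStates(clusterData, resourceMap, currentStateOutput);
+ * }</pre>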
+ */
+public class WagedRebalancer implements StatefulRebalancer<ResourceControllerDataProvider> {
+  private static final Logger LOG = LoggerFactory.getLogger(WagedRebalancer.class);
+
+  // When any of the following changes happens, the rebalancer needs to do a global rebalance,
+  // which consists of 1. a baseline recalculation, and 2. a partial rebalance based on the new baseline.
+  private static final Set<HelixConstants.ChangeType> GLOBAL_REBALANCE_REQUIRED_CHANGE_TYPES =
+      ImmutableSet
+          .of(HelixConstants.ChangeType.RESOURCE_CONFIG, HelixConstants.ChangeType.IDEAL_STATE,
+              HelixConstants.ChangeType.CLUSTER_CONFIG, HelixConstants.ChangeType.INSTANCE_CONFIG);
+  // A sentinel value used to identify whether the preference has been configured or not.
+  private static final Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer>
+      NOT_CONFIGURED_PREFERENCE = ImmutableMap
+      .of(ClusterConfig.GlobalRebalancePreferenceKey.EVENNESS, -1,
+          ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, -1);
+  // The default algorithm to use when there is no preference configured.
+  private static final RebalanceAlgorithm DEFAULT_REBALANCE_ALGORITHM =
+      ConstraintBasedAlgorithmFactory
+          .getInstance(ClusterConfig.DEFAULT_GLOBAL_REBALANCE_PREFERENCE);
+
+  // To calculate the baseline asynchronously
+  private final ExecutorService _baselineCalculateExecutor;
+  private final ResourceChangeDetector _changeDetector;
+  private final HelixManager _manager;
+  private final MappingCalculator<ResourceControllerDataProvider> _mappingCalculator;
+  private final AssignmentMetadataStore _assignmentMetadataStore;
+
+  private final MetricCollector _metricCollector;
+  private final CountMetric _rebalanceFailureCount;
+  private final CountMetric _baselineCalcCounter;
+  private final LatencyMetric _baselineCalcLatency;
+  private final LatencyMetric _writeLatency;
+  private final CountMetric _partialRebalanceCounter;
+  private final LatencyMetric _partialRebalanceLatency;
+  private final LatencyMetric _stateReadLatency;
+  private final BaselineDivergenceGauge _baselineDivergenceGauge;
+
+  private boolean _asyncGlobalRebalanceEnabled;
+
+  // Note, the rebalance algorithm field is mutable, so it should not be referenced directly
+  // except in the public method computeNewIdealStates.
+  private RebalanceAlgorithm _rebalanceAlgorithm;
+  private Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> _preference =
+      NOT_CONFIGURED_PREFERENCE;
+
+  private static AssignmentMetadataStore constructAssignmentStore(String metadataStoreAddrs,
+      String clusterName) {
+    if (metadataStoreAddrs != null && clusterName != null) {
+      return new AssignmentMetadataStore(metadataStoreAddrs, clusterName);
+    }
+    return null;
+  }
+
+  public WagedRebalancer(HelixManager helixManager) {
+    this(helixManager == null ? null
+            : constructAssignmentStore(helixManager.getMetadataStoreConnectionString(),
+                helixManager.getClusterName()),
+        DEFAULT_REBALANCE_ALGORITHM,
+        // Use DelayedAutoRebalancer as the mapping calculator for the final assignment output.
+        // Mapping calculator will translate the best possible assignment into the applicable state
+        // mapping based on the current states.
+        // TODO abstract and separate the main mapping calculator logic from DelayedAutoRebalancer
+        new DelayedAutoRebalancer(),
+        // Helix Manager is required for the rebalancer scheduler
+        helixManager,
+        // If HelixManager is null, we just pass in a non-functioning WagedRebalancerMetricCollector
+        // that will not be registered to MBean.
+        // This is to handle two cases: 1. HelixManager is null for non-testing cases. In this case,
+        // WagedRebalancer will not read/write to metadata store and just use CurrentState-based
+        // rebalancing. 2. Tests that require instrumenting the rebalancer for verifying whether the
+        // cluster has converged.
+        helixManager == null ? new WagedRebalancerMetricCollector()
+            : new WagedRebalancerMetricCollector(helixManager.getClusterName()),
+        ClusterConfig.DEFAULT_GLOBAL_REBALANCE_ASYNC_MODE_ENABLED);
+    _preference = ImmutableMap.copyOf(ClusterConfig.DEFAULT_GLOBAL_REBALANCE_PREFERENCE);
+  }
+
+  /**
+   * This constructor will use null for HelixManager. With a null HelixManager, the rebalancer
+   * will not schedule a future delayed rebalance.
+   * @param assignmentMetadataStore the assignment metadata store, or null to skip persistence
+   * @param algorithm the rebalance algorithm to use
+   * @param metricCollectorOptional an optional metric collector; when absent, a collector that
+   *          does not register metrics is used
+   */
+  protected WagedRebalancer(AssignmentMetadataStore assignmentMetadataStore,
+      RebalanceAlgorithm algorithm, Optional<MetricCollector> metricCollectorOptional) {
+    this(assignmentMetadataStore, algorithm, new DelayedAutoRebalancer(), null,
+        // If metricCollector is not provided, instantiate a version that does not register metrics
+        // in order to allow rebalancer to proceed
+        metricCollectorOptional.orElse(new WagedRebalancerMetricCollector()),
+        false);
+  }
+
+  private WagedRebalancer(AssignmentMetadataStore assignmentMetadataStore,
+      RebalanceAlgorithm algorithm, MappingCalculator mappingCalculator, HelixManager manager,
+      MetricCollector metricCollector, boolean isAsyncGlobalRebalanceEnabled) {
+    if (assignmentMetadataStore == null) {
+      LOG.warn("Assignment Metadata Store is not configured properly."
+          + " The rebalancer will not access the assignment store during the rebalance.");
+    }
+    _assignmentMetadataStore = assignmentMetadataStore;
+    _rebalanceAlgorithm = algorithm;
+    _mappingCalculator = mappingCalculator;
+    if (manager == null) {
+      LOG.warn("HelixManager is not provided. The rebalancer is not going to schedule for a future "
+          + "rebalance even when delayed rebalance is enabled.");
+    }
+    _manager = manager;
+
+    _metricCollector = metricCollector;
+    _rebalanceFailureCount = _metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.RebalanceFailureCounter.name(),
+        CountMetric.class);
+    _baselineCalcCounter = _metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.GlobalBaselineCalcCounter.name(),
+        CountMetric.class);
+    _baselineCalcLatency = _metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.GlobalBaselineCalcLatencyGauge
+            .name(),
+        LatencyMetric.class);
+    _partialRebalanceCounter = _metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.PartialRebalanceCounter.name(),
+        CountMetric.class);
+    _partialRebalanceLatency = _metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.PartialRebalanceLatencyGauge
+            .name(),
+        LatencyMetric.class);
+    _writeLatency = _metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.StateWriteLatencyGauge.name(),
+        LatencyMetric.class);
+    _stateReadLatency = _metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.StateReadLatencyGauge.name(),
+        LatencyMetric.class);
+    _baselineDivergenceGauge = _metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.BaselineDivergenceGauge.name(),
+        BaselineDivergenceGauge.class);
+
+    _changeDetector = new ResourceChangeDetector(true);
+
+    _baselineCalculateExecutor = Executors.newSingleThreadExecutor();
+    _asyncGlobalRebalanceEnabled = isAsyncGlobalRebalanceEnabled;
+  }
+
+  // Update the global rebalance mode to be asynchronous or synchronous
+  public void setGlobalRebalanceAsyncMode(boolean isAsyncGlobalRebalanceEnabled) {
+    _asyncGlobalRebalanceEnabled = isAsyncGlobalRebalanceEnabled;
+  }
+
+  // Update the rebalancer preference if the new options are different from the current preference.
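+  // For example (illustrative weights; a larger value gives the corresponding goal more emphasis):
+  //   updateRebalancePreference(ImmutableMap.of(
+  //       ClusterConfig.GlobalRebalancePreferenceKey.EVENNESS, 5,
+  //       ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 3));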
+  public synchronized void updateRebalancePreference(
+      Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> newPreference) {
+    // 1. If the preference was not configured during construction, there is no need to update.
+    // 2. If the preference equals the new preference, there is no need to update.
+    if (!_preference.equals(NOT_CONFIGURED_PREFERENCE) && !_preference.equals(newPreference)) {
+      _rebalanceAlgorithm = ConstraintBasedAlgorithmFactory.getInstance(newPreference);
+      _preference = ImmutableMap.copyOf(newPreference);
+    }
+  }
+
+  @Override
+  public void reset() {
+    if (_assignmentMetadataStore != null) {
+      _assignmentMetadataStore.reset();
+    }
+    _changeDetector.resetSnapshots();
+  }
+
+  // TODO the rebalancer should reject any other computing request after being closed.
+  @Override
+  public void close() {
+    if (_baselineCalculateExecutor != null) {
+      _baselineCalculateExecutor.shutdownNow();
+    }
+    if (_assignmentMetadataStore != null) {
+      _assignmentMetadataStore.close();
+    }
+    _metricCollector.unregister();
+  }
+
+  @Override
+  public Map<String, IdealState> computeNewIdealStates(ResourceControllerDataProvider clusterData,
+      Map<String, Resource> resourceMap, final CurrentStateOutput currentStateOutput)
+      throws HelixRebalanceException {
+    if (resourceMap.isEmpty()) {
+      LOG.warn("There is no resource to be rebalanced by {}", this.getClass().getSimpleName());
+      return Collections.emptyMap();
+    }
+
+    LOG.info("Start computing new ideal states for resources: {}", resourceMap.keySet().toString());
+    validateInput(clusterData, resourceMap);
+
+    Map<String, IdealState> newIdealStates;
+    try {
+      // Calculate the target assignment based on the current cluster status.
+      newIdealStates = computeBestPossibleStates(clusterData, resourceMap, currentStateOutput,
+          _rebalanceAlgorithm);
+    } catch (HelixRebalanceException ex) {
+      LOG.error("Failed to calculate the new assignments.", ex);
+      // Record the failure in metrics.
+      _rebalanceFailureCount.increment(1L);
+
+      HelixRebalanceException.Type failureType = ex.getFailureType();
+      if (failureType.equals(HelixRebalanceException.Type.INVALID_REBALANCER_STATUS) || failureType
+          .equals(HelixRebalanceException.Type.UNKNOWN_FAILURE)) {
+        // If the failure is unknown or because of assignment store access failure, throw the
+        // rebalance exception.
+        throw ex;
+      } else { // return the previously calculated assignment.
+        LOG.warn(
+            "Returning the last known-good best possible assignment from metadata store due to "
+                + "rebalance failure of type: {}", failureType);
+        // Note that an empty CurrentStateOutput is passed in on purpose, so this fallback does
+        // not return an assignment based on the current states when there is no previously
+        // calculated result.
+        Map<String, ResourceAssignment> assignmentRecord =
+            getBestPossibleAssignment(_assignmentMetadataStore, new CurrentStateOutput(),
+                resourceMap.keySet());
+        newIdealStates = convertResourceAssignment(clusterData, assignmentRecord);
+      }
+    }
+
+    // Construct the new best possible states according to the current state and target assignment.
+    // Note that the new ideal state might be an intermediate state between the current state and
+    // the target assignment.
+    newIdealStates.values().parallelStream().forEach(idealState -> {
+      String resourceName = idealState.getResourceName();
+      // Adjust the states according to the current state.
+      ResourceAssignment finalAssignment = _mappingCalculator
+          .computeBestPossiblePartitionState(clusterData, idealState, resourceMap.get(resourceName),
+              currentStateOutput);
+
+      // Clean up the state mapping fields. Use the final assignment that is calculated by the
+      // mapping calculator to replace them.
+      idealState.getRecord().getMapFields().clear();
+      for (Partition partition : finalAssignment.getMappedPartitions()) {
+        Map<String, String> newStateMap = finalAssignment.getReplicaMap(partition);
+        // if the final states cannot be generated, override the best possible state with empty map.
+        idealState.setInstanceStateMap(partition.getPartitionName(),
+            newStateMap == null ? Collections.emptyMap() : newStateMap);
+      }
+    });
+    LOG.info("Finish computing new ideal states for resources: {}",
+        resourceMap.keySet().toString());
+    return newIdealStates;
+  }
+
+  // Coordinate global rebalance and partial rebalance according to the cluster changes.
+  private Map<String, IdealState> computeBestPossibleStates(
+      ResourceControllerDataProvider clusterData, Map<String, Resource> resourceMap,
+      final CurrentStateOutput currentStateOutput, RebalanceAlgorithm algorithm)
+      throws HelixRebalanceException {
+    Set<String> activeNodes = DelayedRebalanceUtil
+        .getActiveNodes(clusterData.getAllInstances(), clusterData.getEnabledLiveInstances(),
+            clusterData.getInstanceOfflineTimeMap(), clusterData.getLiveInstances().keySet(),
+            clusterData.getInstanceConfigMap(), clusterData.getClusterConfig());
+
+    // Schedule (or unschedule) delayed rebalance according to the delayed rebalance config.
+    delayedRebalanceSchedule(clusterData, activeNodes, resourceMap.keySet());
+
+    Map<String, IdealState> newIdealStates = convertResourceAssignment(clusterData,
+        computeBestPossibleAssignment(clusterData, resourceMap, activeNodes, currentStateOutput,
+            algorithm));
+
+    // The additional rebalance overwrite is required since the calculated mapping may contain
+    // some delayed rebalanced assignments.
+    if (!activeNodes.equals(clusterData.getEnabledLiveInstances())) {
+      applyRebalanceOverwrite(newIdealStates, clusterData, resourceMap,
+          getBaselineAssignment(_assignmentMetadataStore, currentStateOutput, resourceMap.keySet()),
+          algorithm);
+    }
+    // Replace the assignment if user-defined preference list is configured.
+    // Note the user-defined list is intentionally applied to the final mapping after calculation.
+    // This is to avoid persisting it into the assignment store, which impacts the long term
+    // assignment evenness and partition movements.
+    newIdealStates.entrySet().stream().forEach(idealStateEntry -> applyUserDefinedPreferenceList(
+        clusterData.getResourceConfig(idealStateEntry.getKey()), idealStateEntry.getValue()));
+
+    return newIdealStates;
+  }
+
+  // Perform the global rebalance and the partial rebalance to compute the new best possible assignment.
+  protected Map<String, ResourceAssignment> computeBestPossibleAssignment(
+      ResourceControllerDataProvider clusterData, Map<String, Resource> resourceMap,
+      Set<String> activeNodes, final CurrentStateOutput currentStateOutput,
+      RebalanceAlgorithm algorithm)
+      throws HelixRebalanceException {
+    // Perform global rebalance for a new baseline assignment
+    globalRebalance(clusterData, resourceMap, currentStateOutput, algorithm);
+    // Perform partial rebalance for a new best possible assignment
+    Map<String, ResourceAssignment> newAssignment =
+        partialRebalance(clusterData, resourceMap, activeNodes, currentStateOutput, algorithm);
+    return newAssignment;
+  }
+
+  /**
+   * Convert the resource assignment map into an IdealState map.
+   */
+  private Map<String, IdealState> convertResourceAssignment(
+      ResourceControllerDataProvider clusterData, Map<String, ResourceAssignment> assignments)
+      throws HelixRebalanceException {
+    // Convert the assignments into IdealState for the following state mapping calculation.
+    Map<String, IdealState> finalIdealStateMap = new HashMap<>();
+    for (String resourceName : assignments.keySet()) {
+      try {
+        IdealState currentIdealState = clusterData.getIdealState(resourceName);
+        Map<String, Integer> statePriorityMap =
+            clusterData.getStateModelDef(currentIdealState.getStateModelDefRef())
+                .getStatePriorityMap();
+        // Create a new IdealState instance which contains the new calculated assignment in the
+        // preference list.
+        IdealState newIdealState = new IdealState(resourceName);
+        // Copy the simple fields
+        newIdealState.getRecord().setSimpleFields(currentIdealState.getRecord().getSimpleFields());
+        // Sort the preference list according to state priority.
+        newIdealState.setPreferenceLists(
+            getPreferenceLists(assignments.get(resourceName), statePriorityMap));
+        // Note the state mapping in the new assignment won't directly propagate to the map fields.
+        // The rebalancer will calculate for the final state mapping considering the current states.
+        finalIdealStateMap.put(resourceName, newIdealState);
+      } catch (Exception ex) {
+        throw new HelixRebalanceException(
+            "Failed to calculate the new IdealState for resource: " + resourceName,
+            HelixRebalanceException.Type.INVALID_CLUSTER_STATUS, ex);
+      }
+    }
+    return finalIdealStateMap;
+  }
+
+  /**
+   * Global rebalance calculates a new baseline assignment.
+   * The new baseline assignment will be persisted and leveraged by the partial rebalance.
+   * @param clusterData the cluster data cache
+   * @param resourceMap the map of resources to be rebalanced
+   * @param currentStateOutput the current states of the resources
+   * @param algorithm the rebalance algorithm
+   * @throws HelixRebalanceException if the baseline calculation or persistence fails
+   */
+  private void globalRebalance(ResourceControllerDataProvider clusterData,
+      Map<String, Resource> resourceMap, final CurrentStateOutput currentStateOutput,
+      RebalanceAlgorithm algorithm)
+      throws HelixRebalanceException {
+    _changeDetector.updateSnapshots(clusterData);
+    // Get all the changed items' information. Filter for the items that have content changed.
+    final Map<HelixConstants.ChangeType, Set<String>> clusterChanges =
+        _changeDetector.getAllChanges();
+
+    if (clusterChanges.keySet().stream()
+        .anyMatch(GLOBAL_REBALANCE_REQUIRED_CHANGE_TYPES::contains)) {
+      // Build the cluster model for rebalance calculation.
+      // Note, for a Baseline calculation,
+      // 1. Ignore node status (disable/offline).
+      // 2. Use the previous Baseline as the only parameter about the previous assignment.
+      Map<String, ResourceAssignment> currentBaseline =
+          getBaselineAssignment(_assignmentMetadataStore, currentStateOutput, resourceMap.keySet());
+      ClusterModel clusterModel;
+      try {
+        clusterModel = ClusterModelProvider
+            .generateClusterModelForBaseline(clusterData, resourceMap,
+                clusterData.getAllInstances(), clusterChanges, currentBaseline);
+      } catch (Exception ex) {
+        throw new HelixRebalanceException("Failed to generate cluster model for global rebalance.",
+            HelixRebalanceException.Type.INVALID_CLUSTER_STATUS, ex);
+      }
+
+      final boolean waitForGlobalRebalance = !_asyncGlobalRebalanceEnabled;
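+      // In synchronous mode, the pipeline blocks on the future below. In asynchronous mode, the
+      // baseline is calculated in the background and a new on-demand pipeline run is scheduled
+      // once a changed baseline has been persisted.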
+      final String clusterName = clusterData.getClusterName();
+      // Calculate the Baseline assignment for global rebalance.
+      Future<Boolean> result = _baselineCalculateExecutor.submit(() -> {
+        try {
+          // Note that we should schedule a new partial rebalance for a future rebalance pipeline
+          // if the planned partial rebalance in the current pipeline won't wait for the new
+          // baseline to be calculated.
+          // So shouldSchedulePartialRebalance is set to !waitForGlobalRebalance.
+          calculateAndUpdateBaseline(clusterModel, algorithm, !waitForGlobalRebalance, clusterName);
+        } catch (HelixRebalanceException e) {
+          LOG.error("Failed to calculate baseline assignment!", e);
+          return false;
+        }
+        return true;
+      });
+      if (waitForGlobalRebalance) {
+        try {
+          if (!result.get()) {
+            throw new HelixRebalanceException("Failed to calculate for the new Baseline.",
+                HelixRebalanceException.Type.FAILED_TO_CALCULATE);
+          }
+        } catch (InterruptedException | ExecutionException e) {
+          throw new HelixRebalanceException("Failed to execute new Baseline calculation.",
+              HelixRebalanceException.Type.FAILED_TO_CALCULATE, e);
+        }
+      }
+    }
+  }
+
+  /**
+   * Calculate and update the Baseline assignment.
+   * @param clusterModel the cluster model used for the calculation
+   * @param algorithm the rebalance algorithm
+   * @param shouldSchedulePartialRebalance true if the call should trigger a following partial
+   *          rebalance so the new Baseline can be applied to the cluster
+   * @param clusterName the name of the cluster
+   * @throws HelixRebalanceException if the calculation or the persistence fails
+   */
+  private void calculateAndUpdateBaseline(ClusterModel clusterModel, RebalanceAlgorithm algorithm,
+      boolean shouldSchedulePartialRebalance, String clusterName)
+      throws HelixRebalanceException {
+    LOG.info("Start calculating the new baseline.");
+    _baselineCalcCounter.increment(1L);
+    _baselineCalcLatency.startMeasuringLatency();
+
+    boolean isBaselineChanged = false;
+    Map<String, ResourceAssignment> newBaseline = calculateAssignment(clusterModel, algorithm);
+    // Write the new baseline to metadata store
+    if (_assignmentMetadataStore != null) {
+      try {
+        _writeLatency.startMeasuringLatency();
+        isBaselineChanged = _assignmentMetadataStore.persistBaseline(newBaseline);
+        _writeLatency.endMeasuringLatency();
+      } catch (Exception ex) {
+        throw new HelixRebalanceException("Failed to persist the new baseline assignment.",
+            HelixRebalanceException.Type.INVALID_REBALANCER_STATUS, ex);
+      }
+    } else {
+      LOG.debug("Assignment Metadata Store is null. Skip persisting the baseline assignment.");
+    }
+    _baselineCalcLatency.endMeasuringLatency();
+    LOG.info("Global baseline calculation completed and has been persisted into metadata store.");
+
+    if (isBaselineChanged && shouldSchedulePartialRebalance) {
+      LOG.info("Schedule a new rebalance after the new baseline calculation has finished.");
+      RebalanceUtil.scheduleOnDemandPipeline(clusterName, 0L, false);
+    }
+  }
+
+  private Map<String, ResourceAssignment> partialRebalance(
+      ResourceControllerDataProvider clusterData, Map<String, Resource> resourceMap,
+      Set<String> activeNodes, final CurrentStateOutput currentStateOutput,
+      RebalanceAlgorithm algorithm)
+      throws HelixRebalanceException {
+    LOG.info("Start calculating the new best possible assignment.");
+    _partialRebalanceCounter.increment(1L);
+    _partialRebalanceLatency.startMeasuringLatency();
+    // TODO: Consider combining the metrics for both baseline/best possible?
+    // Read the baseline from metadata store
+    Map<String, ResourceAssignment> currentBaseline =
+        getBaselineAssignment(_assignmentMetadataStore, currentStateOutput, resourceMap.keySet());
+
+    // Read the best possible assignment from metadata store
+    Map<String, ResourceAssignment> currentBestPossibleAssignment =
+        getBestPossibleAssignment(_assignmentMetadataStore, currentStateOutput,
+            resourceMap.keySet());
+    ClusterModel clusterModel;
+    try {
+      clusterModel = ClusterModelProvider
+          .generateClusterModelForPartialRebalance(clusterData, resourceMap, activeNodes,
+              currentBaseline, currentBestPossibleAssignment);
+    } catch (Exception ex) {
+      throw new HelixRebalanceException("Failed to generate cluster model for partial rebalance.",
+          HelixRebalanceException.Type.INVALID_CLUSTER_STATUS, ex);
+    }
+    Map<String, ResourceAssignment> newAssignment = calculateAssignment(clusterModel, algorithm);
+
+    // Asynchronously report the baseline divergence metric before persisting to the metadata
+    // store, so that even if persisting fails, we still have the metric.
+    // Use a deep copy of the new assignment so that later changes to it cannot affect the
+    // baseline divergence measurement.
+    Map<String, ResourceAssignment> newAssignmentCopy = new HashMap<>();
+    for (Map.Entry<String, ResourceAssignment> entry : newAssignment.entrySet()) {
+      newAssignmentCopy.put(entry.getKey(), new ResourceAssignment(entry.getValue().getRecord()));
+    }
+
+    _baselineDivergenceGauge.asyncMeasureAndUpdateValue(clusterData.getAsyncTasksThreadPool(),
+        currentBaseline, newAssignmentCopy);
+
+    if (_assignmentMetadataStore != null) {
+      try {
+        _writeLatency.startMeasuringLatency();
+        _assignmentMetadataStore.persistBestPossibleAssignment(newAssignment);
+        _writeLatency.endMeasuringLatency();
+      } catch (Exception ex) {
+        throw new HelixRebalanceException("Failed to persist the new best possible assignment.",
+            HelixRebalanceException.Type.INVALID_REBALANCER_STATUS, ex);
+      }
+    } else {
+      LOG.debug("Assignment Metadata Store is null. Skip persisting the baseline assignment.");
+    }
+    _partialRebalanceLatency.endMeasuringLatency();
+    LOG.info("Finish calculating the new best possible assignment.");
+    return newAssignment;
+  }
+
+  /**
+   * @param clusterModel the cluster model that contains all the cluster status for the purpose of
+   *                     rebalancing.
+   * @param algorithm the rebalance algorithm that performs the calculation.
+   * @return the new optimal assignment for the resources.
+   */
+  private Map<String, ResourceAssignment> calculateAssignment(ClusterModel clusterModel,
+      RebalanceAlgorithm algorithm) throws HelixRebalanceException {
+    long startTime = System.currentTimeMillis();
+    LOG.info("Start calculating for an assignment with algorithm {}",
+        algorithm.getClass().getSimpleName());
+    OptimalAssignment optimalAssignment = algorithm.calculate(clusterModel);
+    Map<String, ResourceAssignment> newAssignment =
+        optimalAssignment.getOptimalResourceAssignment();
+    LOG.info("Finish calculating an assignment with algorithm {}. Took: {} ms.",
+        algorithm.getClass().getSimpleName(), System.currentTimeMillis() - startTime);
+    return newAssignment;
+  }
+
+  // Generate the preference lists from the state mapping based on state priority.
+  private Map<String, List<String>> getPreferenceLists(ResourceAssignment newAssignment,
+      Map<String, Integer> statePriorityMap) {
+    Map<String, List<String>> preferenceList = new HashMap<>();
+    for (Partition partition : newAssignment.getMappedPartitions()) {
+      List<String> nodes = new ArrayList<>(newAssignment.getReplicaMap(partition).keySet());
+      // To ensure backward compatibility, sort the preference list according to state priority.
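+      // A smaller priority value denotes a higher-priority state (for example, the top state
+      // MASTER sorts before SLAVE); ties fall back to the node name for a deterministic order.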
+      nodes.sort((node1, node2) -> {
+        int statePriority1 =
+            statePriorityMap.get(newAssignment.getReplicaMap(partition).get(node1));
+        int statePriority2 =
+            statePriorityMap.get(newAssignment.getReplicaMap(partition).get(node2));
+        if (statePriority1 == statePriority2) {
+          return node1.compareTo(node2);
+        } else {
+          return statePriority1 - statePriority2;
+        }
+      });
+      preferenceList.put(partition.getPartitionName(), nodes);
+    }
+    return preferenceList;
+  }
+
+  private void validateInput(ResourceControllerDataProvider clusterData,
+      Map<String, Resource> resourceMap) throws HelixRebalanceException {
+    Set<String> nonCompatibleResources = resourceMap.entrySet().stream().filter(resourceEntry -> {
+      IdealState is = clusterData.getIdealState(resourceEntry.getKey());
+      return is == null || !is.getRebalanceMode().equals(IdealState.RebalanceMode.FULL_AUTO)
+          || !WagedRebalancer.class.getName().equals(is.getRebalancerClassName());
+    }).map(Map.Entry::getKey).collect(Collectors.toSet());
+    if (!nonCompatibleResources.isEmpty()) {
+      throw new HelixRebalanceException(String.format(
+          "Input contains invalid resource(s) that cannot be rebalanced by the WAGED rebalancer. %s",
+          nonCompatibleResources.toString()), HelixRebalanceException.Type.INVALID_INPUT);
+    }
+  }
+
+  /**
+   * @param assignmentMetadataStore the assignment metadata store, possibly null
+   * @param currentStateOutput the current states of the resources
+   * @param resources the set of resource names to retain in the result
+   * @return The current baseline assignment. If the record does not exist in the
+   *         assignmentMetadataStore, return the current state assignment.
+   * @throws HelixRebalanceException if reading from the metadata store fails
+   */
+  private Map<String, ResourceAssignment> getBaselineAssignment(
+      AssignmentMetadataStore assignmentMetadataStore, CurrentStateOutput currentStateOutput,
+      Set<String> resources) throws HelixRebalanceException {
+    Map<String, ResourceAssignment> currentBaseline = Collections.emptyMap();
+    if (assignmentMetadataStore != null) {
+      try {
+        _stateReadLatency.startMeasuringLatency();
+        currentBaseline = assignmentMetadataStore.getBaseline();
+        _stateReadLatency.endMeasuringLatency();
+      } catch (Exception ex) {
+        throw new HelixRebalanceException(
+            "Failed to get the current baseline assignment because of unexpected error.",
+            HelixRebalanceException.Type.INVALID_REBALANCER_STATUS, ex);
+      }
+    }
+    if (currentBaseline.isEmpty()) {
+      LOG.warn("The current baseline assignment record is empty. Use the current states instead.");
+      currentBaseline = currentStateOutput.getAssignment(resources);
+    }
+    currentBaseline.keySet().retainAll(resources);
+    return currentBaseline;
+  }
+
+  /**
+   * @param assignmentMetadataStore the assignment metadata store, possibly null
+   * @param currentStateOutput the current states of the resources
+   * @param resources the set of resource names to retain in the result
+   * @return The current best possible assignment. If the record does not exist in the
+   *         assignmentMetadataStore, return the current state assignment.
+   * @throws HelixRebalanceException if reading from the metadata store fails
+   */
+  protected Map<String, ResourceAssignment> getBestPossibleAssignment(
+      AssignmentMetadataStore assignmentMetadataStore, CurrentStateOutput currentStateOutput,
+      Set<String> resources) throws HelixRebalanceException {
+    Map<String, ResourceAssignment> currentBestAssignment = Collections.emptyMap();
+    if (assignmentMetadataStore != null) {
+      try {
+        _stateReadLatency.startMeasuringLatency();
+        currentBestAssignment = assignmentMetadataStore.getBestPossibleAssignment();
+        _stateReadLatency.endMeasuringLatency();
+      } catch (Exception ex) {
+        throw new HelixRebalanceException(
+            "Failed to get the current best possible assignment because of unexpected error.",
+            HelixRebalanceException.Type.INVALID_REBALANCER_STATUS, ex);
+      }
+    }
+    if (currentBestAssignment.isEmpty()) {
+      LOG.warn(
+          "The current best possible assignment record is empty. Use the current states instead.");
+      currentBestAssignment = currentStateOutput.getAssignment(resources);
+    }
+    currentBestAssignment.keySet().retainAll(resources);
+    return currentBestAssignment;
+  }
+
+  /**
+   * Schedule rebalance according to the delayed rebalance logic.
+   * @param clusterData the current cluster data cache
+   * @param delayedActiveNodes the active nodes set that is calculated with the delay time window
+   * @param resourceSet the rebalanced resourceSet
+   */
+  private void delayedRebalanceSchedule(ResourceControllerDataProvider clusterData,
+      Set<String> delayedActiveNodes, Set<String> resourceSet) {
+    if (_manager != null) {
+      // Schedule for the next delayed rebalance in case no cluster change event happens.
+      ClusterConfig clusterConfig = clusterData.getClusterConfig();
+      boolean delayedRebalanceEnabled = DelayedRebalanceUtil.isDelayRebalanceEnabled(clusterConfig);
+      Set<String> offlineOrDisabledInstances = new HashSet<>(delayedActiveNodes);
+      offlineOrDisabledInstances.removeAll(clusterData.getEnabledLiveInstances());
+      for (String resource : resourceSet) {
+        DelayedRebalanceUtil
+            .setRebalanceScheduler(resource, delayedRebalanceEnabled, offlineOrDisabledInstances,
+                clusterData.getInstanceOfflineTimeMap(), clusterData.getLiveInstances().keySet(),
+                clusterData.getInstanceConfigMap(), clusterConfig.getRebalanceDelayTime(),
+                clusterConfig, _manager);
+      }
+    } else {
+      LOG.warn("Skip scheduling a delayed rebalancer since HelixManager is not specified.");
+    }
+  }
+
+  /**
+   * Update the rebalanced ideal states according to the real active nodes.
+   * Since the rebalancing might be done with the delayed logic, the rebalanced ideal states
+   * might include inactive nodes.
+   * This overwrite will adjust the final mapping, so as to ensure the result is completely valid.
+   * @param idealStateMap the calculated ideal states.
+   * @param clusterData the cluster data cache.
+   * @param resourceMap the rebalanced resource map.
+   * @param baseline the baseline assignment.
+   * @param algorithm the rebalance algorithm.
+   */
+  private void applyRebalanceOverwrite(Map<String, IdealState> idealStateMap,
+      ResourceControllerDataProvider clusterData, Map<String, Resource> resourceMap,
+      Map<String, ResourceAssignment> baseline, RebalanceAlgorithm algorithm)
+      throws HelixRebalanceException {
+    ClusterModel clusterModel;
+    try {
+      // Note this calculation uses the baseline as the best possible assignment input here.
+      // This is for minimizing unnecessary partition movement.
+      clusterModel = ClusterModelProvider
+          .generateClusterModelFromExistingAssignment(clusterData, resourceMap, baseline);
+    } catch (Exception ex) {
+      throw new HelixRebalanceException(
+          "Failed to generate cluster model for delayed rebalance overwrite.",
+          HelixRebalanceException.Type.INVALID_CLUSTER_STATUS, ex);
+    }
+    Map<String, IdealState> activeIdealStates =
+        convertResourceAssignment(clusterData, calculateAssignment(clusterModel, algorithm));
+    for (String resourceName : idealStateMap.keySet()) {
+      // The new calculated ideal state before overwrite
+      IdealState newIdealState = idealStateMap.get(resourceName);
+      if (!activeIdealStates.containsKey(resourceName)) {
+        throw new HelixRebalanceException(
+            "Failed to calculate the complete partition assignment with all active nodes. Cannot find the resource assignment for "
+                + resourceName, HelixRebalanceException.Type.FAILED_TO_CALCULATE);
+      }
+      // The ideal state that is calculated based on the real alive/enabled instances list
+      IdealState newActiveIdealState = activeIdealStates.get(resourceName);
+      // The current ideal state that exists in the IdealState znode
+      IdealState currentIdealState = clusterData.getIdealState(resourceName);
+      Set<String> enabledLiveInstances = clusterData.getEnabledLiveInstances();
+      int numReplica = currentIdealState.getReplicaCount(enabledLiveInstances.size());
+      int minActiveReplica =
+          DelayedRebalanceUtil.getMinActiveReplica(currentIdealState, numReplica);
+      Map<String, List<String>> finalPreferenceLists = DelayedRebalanceUtil
+          .getFinalDelayedMapping(newActiveIdealState.getPreferenceLists(),
+              newIdealState.getPreferenceLists(), enabledLiveInstances,
+              Math.min(minActiveReplica, numReplica));
+
+      newIdealState.setPreferenceLists(finalPreferenceLists);
+    }
+  }
+
+  private void applyUserDefinedPreferenceList(ResourceConfig resourceConfig,
+      IdealState idealState) {
+    if (resourceConfig != null) {
+      Map<String, List<String>> userDefinedPreferenceList = resourceConfig.getPreferenceLists();
+      if (!userDefinedPreferenceList.isEmpty()) {
+        LOG.info("Using user defined preference list for partitions.");
+        for (String partition : userDefinedPreferenceList.keySet()) {
+          idealState.setPreferenceList(partition, userDefinedPreferenceList.get(partition));
+        }
+      }
+    }
+  }
+
+  protected AssignmentMetadataStore getAssignmentMetadataStore() {
+    return _assignmentMetadataStore;
+  }
+
+  protected MetricCollector getMetricCollector() {
+    return _metricCollector;
+  }
+
+  @Override
+  protected void finalize()
+      throws Throwable {
+    super.finalize();
+    close();
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithm.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithm.java
new file mode 100644
index 0000000..dcadff6
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithm.java
@@ -0,0 +1,228 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Optional;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.stream.Collectors;
+
+import com.google.common.collect.Maps;
+import org.apache.helix.HelixRebalanceException;
+import org.apache.helix.controller.rebalancer.waged.RebalanceAlgorithm;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModel;
+import org.apache.helix.controller.rebalancer.waged.model.OptimalAssignment;
+import org.apache.helix.model.ResourceAssignment;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * The algorithm is based on a given set of constraints.
+ * - HardConstraint: Approves or denies an assignment given its condition; no assignment may
+ * violate any "hard constraint".
+ * - SoftConstraint: Evaluates an assignment by points/rewards/scores; a higher score means a
+ * better assignment.
+ * The goal is to accumulate the most points (rewards) from the "soft constraints" while not
+ * violating any "hard constraint".
+ */
+class ConstraintBasedAlgorithm implements RebalanceAlgorithm {
+  private static final Logger LOG = LoggerFactory.getLogger(ConstraintBasedAlgorithm.class);
+  private final List<HardConstraint> _hardConstraints;
+  private final Map<SoftConstraint, Float> _softConstraints;
+
+  ConstraintBasedAlgorithm(List<HardConstraint> hardConstraints,
+      Map<SoftConstraint, Float> softConstraints) {
+    _hardConstraints = hardConstraints;
+    _softConstraints = softConstraints;
+  }
+
+  @Override
+  public OptimalAssignment calculate(ClusterModel clusterModel)
+      throws HelixRebalanceException {
+    OptimalAssignment optimalAssignment = new OptimalAssignment();
+    List<AssignableNode> nodes = new ArrayList<>(clusterModel.getAssignableNodes().values());
+    Set<String> busyInstances =
+        getBusyInstances(clusterModel.getContext().getBestPossibleAssignment().values());
+    // Sort the replicas so the input is stable for the greedy algorithm.
+    // For other algorithm implementations, this sorting could be unnecessary.
+    for (AssignableReplica replica : getOrderedAssignableReplica(clusterModel)) {
+      Optional<AssignableNode> maybeBestNode =
+          getNodeWithHighestPoints(replica, nodes, clusterModel.getContext(), busyInstances,
+              optimalAssignment);
+      // Stop immediately if any replica cannot find a valid assignable node.
+      if (optimalAssignment.hasAnyFailure()) {
+        String errorMessage = String.format(
+            "Unable to find any available candidate node for partition %s; Fail reasons: %s",
+            replica.getPartitionName(), optimalAssignment.getFailures());
+        throw new HelixRebalanceException(errorMessage,
+            HelixRebalanceException.Type.FAILED_TO_CALCULATE);
+      }
+      maybeBestNode.ifPresent(node -> clusterModel
+          .assign(replica.getResourceName(), replica.getPartitionName(), replica.getReplicaState(),
+              node.getInstanceName()));
+    }
+    optimalAssignment.updateAssignments(clusterModel);
+    return optimalAssignment;
+  }
+
+  private Optional<AssignableNode> getNodeWithHighestPoints(AssignableReplica replica,
+      List<AssignableNode> assignableNodes, ClusterContext clusterContext,
+      Set<String> busyInstances, OptimalAssignment optimalAssignment) {
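+    // A concurrent map is required here because the failure reasons are recorded from within a
+    // parallel stream below.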
+    Map<AssignableNode, List<HardConstraint>> hardConstraintFailures = new ConcurrentHashMap<>();
+    List<AssignableNode> candidateNodes = assignableNodes.parallelStream().filter(candidateNode -> {
+      boolean isValid = true;
+      // Record all the failure reasons, which gives us the ability to debug/fix the runtime
+      // cluster environment.
+      for (HardConstraint hardConstraint : _hardConstraints) {
+        if (!hardConstraint.isAssignmentValid(candidateNode, replica, clusterContext)) {
+          hardConstraintFailures.computeIfAbsent(candidateNode, node -> new ArrayList<>())
+              .add(hardConstraint);
+          isValid = false;
+        }
+      }
+      return isValid;
+    }).collect(Collectors.toList());
+
+    if (candidateNodes.isEmpty()) {
+      optimalAssignment.recordAssignmentFailure(replica,
+          Maps.transformValues(hardConstraintFailures, this::convertFailureReasons));
+      return Optional.empty();
+    }
+
+    return candidateNodes.parallelStream().map(node -> new HashMap.SimpleEntry<>(node,
+        getAssignmentNormalizedScore(node, replica, clusterContext)))
+        .max((nodeEntry1, nodeEntry2) -> {
+          int scoreCompareResult = nodeEntry1.getValue().compareTo(nodeEntry2.getValue());
+          if (scoreCompareResult == 0) {
+            // If the evaluation scores of 2 nodes are the same, the algorithm assigns the replica
+            // to the idle node first.
+            int idleScore1 = busyInstances.contains(nodeEntry1.getKey().getInstanceName()) ? 0 : 1;
+            int idleScore2 = busyInstances.contains(nodeEntry2.getKey().getInstanceName()) ? 0 : 1;
+            return idleScore1 - idleScore2;
+          } else {
+            return scoreCompareResult;
+          }
+        }).map(Map.Entry::getKey);
+  }
+
+  private double getAssignmentNormalizedScore(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    double sum = 0;
+    for (Map.Entry<SoftConstraint, Float> softConstraintEntry : _softConstraints.entrySet()) {
+      SoftConstraint softConstraint = softConstraintEntry.getKey();
+      float weight = softConstraintEntry.getValue();
+      if (weight != 0) {
+        // Skip calculating zero weighted constraints.
+        sum += weight * softConstraint.getAssignmentNormalizedScore(node, replica, clusterContext);
+      }
+    }
+    return sum;
+  }
+
+  private List<String> convertFailureReasons(List<HardConstraint> hardConstraints) {
+    return hardConstraints.stream().map(HardConstraint::getDescription)
+        .collect(Collectors.toList());
+  }
+
+  private List<AssignableReplica> getOrderedAssignableReplica(ClusterModel clusterModel) {
+    Map<String, Set<AssignableReplica>> replicasByResource = clusterModel.getAssignableReplicaMap();
+    List<AssignableReplica> orderedAssignableReplicas =
+        replicasByResource.values().stream().flatMap(replicas -> replicas.stream())
+            .collect(Collectors.toList());
+
+    Map<String, ResourceAssignment> bestPossibleAssignment =
+        clusterModel.getContext().getBestPossibleAssignment();
+    Map<String, ResourceAssignment> baselineAssignment =
+        clusterModel.getContext().getBaselineAssignment();
+
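+    // Precompute a hash per replica that depends on the cluster topology (the assignable node
+    // set). It is used below to break ties in a deterministic, pseudo-random order.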
+    Map<String, Integer> replicaHashCodeMap = orderedAssignableReplicas.parallelStream().collect(
+        Collectors.toMap(AssignableReplica::toString,
+            replica -> Objects.hash(replica.toString(), clusterModel.getAssignableNodes().keySet()),
+            (hash1, hash2) -> hash2));
+
+    // 1. Sort according to whether the assignment exists in the best possible and/or baseline assignment.
+    // 2. Sort according to the state priority. Note that prioritizing the top state is required.
+    // Otherwise, the greedy algorithm will unnecessarily shuffle the states between replicas.
+    // 3. Sort according to the resource/partition name.
+    orderedAssignableReplicas.sort((replica1, replica2) -> {
+      String resourceName1 = replica1.getResourceName();
+      String resourceName2 = replica2.getResourceName();
+      if (bestPossibleAssignment.containsKey(resourceName1) == bestPossibleAssignment
+          .containsKey(resourceName2)) {
+        if (baselineAssignment.containsKey(resourceName1) == baselineAssignment
+            .containsKey(resourceName2)) {
+          // If the two resources are equally present (or absent) in the baseline assignment,
+          // compare on additional dimensions.
+          int statePriority1 = replica1.getStatePriority();
+          int statePriority2 = replica2.getStatePriority();
+          if (statePriority1 == statePriority2) {
+            // If state priorities are the same, try to randomize the replica order. Otherwise,
+            // the same replicas might always be moved in each rebalancing, because their
+            // placement calculation would always happen at the critical moment when the cluster
+            // is almost at the expected utilization.
+            //
+            // Note that to ensure the algorithm is deterministic with the same inputs, do not use
+            // Random functions here. Using a hashcode based on the cluster topology information
+            // to get a controlled randomized order is good enough.
+            Integer replicaHash1 = replicaHashCodeMap.get(replica1.toString());
+            Integer replicaHash2 = replicaHashCodeMap.get(replica2.toString());
+            if (!replicaHash1.equals(replicaHash2)) {
+              return replicaHash1.compareTo(replicaHash2);
+            } else {
+              // In case of hash collision, return order according to the name.
+              return replica1.toString().compareTo(replica2.toString());
+            }
+          } else {
+            // Note that we shall prioritize the replica with a higher state priority;
+            // a smaller priority number means a higher priority.
+            return statePriority1 - statePriority2;
+          }
+        } else {
+          // If the baseline assignment contains the assignment, prioritize the replica.
+          return baselineAssignment.containsKey(resourceName1) ? -1 : 1;
+        }
+      } else {
+        // If the best possible assignment contains the assignment, prioritize the replica.
+        return bestPossibleAssignment.containsKey(resourceName1) ? -1 : 1;
+      }
+    });
+    return orderedAssignableReplicas;
+  }
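+  // Illustrative ordering sketch (hypothetical replicas): given r1 (resource in the best
+  // possible assignment, MASTER), r2 (same resource, SLAVE) and r3 (a brand-new resource),
+  // the sorted result is [r1, r2, r3] -- known resources first, then the higher state
+  // priority (smaller priority number), with the topology-based hash as the tie breaker.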
+
+  /**
+   * @param assignments A collection of resource assignments.
+   * @return A set of instance names that have at least one replica assigned in the input assignments.
+   */
+  private Set<String> getBusyInstances(Collection<ResourceAssignment> assignments) {
+    return assignments.stream().flatMap(
+        resourceAssignment -> resourceAssignment.getRecord().getMapFields().values().stream()
+            .flatMap(instanceStateMap -> instanceStateMap.keySet().stream()))
+        .collect(Collectors.toSet());
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithmFactory.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithmFactory.java
new file mode 100644
index 0000000..934bfa7
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithmFactory.java
@@ -0,0 +1,82 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.Maps;
+import org.apache.helix.HelixManagerProperties;
+import org.apache.helix.SystemPropertyKeys;
+import org.apache.helix.controller.rebalancer.waged.RebalanceAlgorithm;
+import org.apache.helix.model.ClusterConfig;
+
+/**
+ * The factory class to create an instance of {@link ConstraintBasedAlgorithm}
+ */
+public class ConstraintBasedAlgorithmFactory {
+  private static final Map<String, Float> MODEL = new HashMap<String, Float>() {
+    {
+      // The default setting
+      put(PartitionMovementConstraint.class.getSimpleName(), 2f);
+      put(InstancePartitionsCountConstraint.class.getSimpleName(), 1f);
+      put(ResourcePartitionAntiAffinityConstraint.class.getSimpleName(), 1f);
+      put(ResourceTopStateAntiAffinityConstraint.class.getSimpleName(), 3f);
+      put(MaxCapacityUsageInstanceConstraint.class.getSimpleName(), 5f);
+    }
+  };
+
+  static {
+    Properties properties =
+        new HelixManagerProperties(SystemPropertyKeys.SOFT_CONSTRAINT_WEIGHTS).getProperties();
+    // Overwrite the default values with the weights loaded from the property file.
+    properties.forEach((constraintName, weight) -> MODEL.put(String.valueOf(constraintName),
+        Float.valueOf(String.valueOf(weight))));
+  }
+
+  public static RebalanceAlgorithm getInstance(
+      Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> preferences) {
+    List<HardConstraint> hardConstraints =
+        ImmutableList.of(new FaultZoneAwareConstraint(), new NodeCapacityConstraint(),
+            new ReplicaActivateConstraint(), new NodeMaxPartitionLimitConstraint(),
+            new ValidGroupTagConstraint(), new SamePartitionOnInstanceConstraint());
+
+    int evennessPreference =
+        preferences.getOrDefault(ClusterConfig.GlobalRebalancePreferenceKey.EVENNESS, 1);
+    int movementPreference =
+        preferences.getOrDefault(ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 1);
+
+    List<SoftConstraint> softConstraints = ImmutableList
+        .of(new PartitionMovementConstraint(), new InstancePartitionsCountConstraint(),
+            new ResourcePartitionAntiAffinityConstraint(),
+            new ResourceTopStateAntiAffinityConstraint(), new MaxCapacityUsageInstanceConstraint());
+    Map<SoftConstraint, Float> softConstraintsWithWeight = Maps.toMap(softConstraints, key -> {
+      String name = key.getClass().getSimpleName();
+      float weight = MODEL.get(name);
+      return name.equals(PartitionMovementConstraint.class.getSimpleName()) ?
+          movementPreference * weight : evennessPreference * weight;
+    });
+
+    return new ConstraintBasedAlgorithm(hardConstraints, softConstraintsWithWeight);
+  }
+}
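+// A minimal usage sketch (assumed preference values, for illustration only):
+//   Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> prefs = new HashMap<>();
+//   prefs.put(ClusterConfig.GlobalRebalancePreferenceKey.EVENNESS, 3);
+//   prefs.put(ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 1);
+//   RebalanceAlgorithm algorithm = ConstraintBasedAlgorithmFactory.getInstance(prefs);
+// With these values, PartitionMovementConstraint weighs 1 * 2f = 2f while
+// MaxCapacityUsageInstanceConstraint weighs 3 * 5f = 15f, so the algorithm favors even
+// distribution over minimizing movement.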
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/FaultZoneAwareConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/FaultZoneAwareConstraint.java
new file mode 100644
index 0000000..c33419e
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/FaultZoneAwareConstraint.java
@@ -0,0 +1,43 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+class FaultZoneAwareConstraint extends HardConstraint {
+
+  @Override
+  boolean isAssignmentValid(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    if (!node.hasFaultZone()) {
+      return true;
+    }
+    return !clusterContext
+        .getPartitionsForResourceAndFaultZone(replica.getResourceName(), node.getFaultZone())
+        .contains(replica.getPartitionName());
+  }
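+  // Illustrative example (hypothetical names): if fault zone "z1" already hosts partition
+  // "db_0" of resource "db", any further replica of "db_0" proposed for a node in "z1" is
+  // rejected, while nodes in other fault zones remain valid candidates.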
+
+  @Override
+  String getDescription() {
+    return "A fault zone cannot contain more than 1 replica of same partition";
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/HardConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/HardConstraint.java
new file mode 100644
index 0000000..f544d4b
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/HardConstraint.java
@@ -0,0 +1,47 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+/**
+ * Evaluate a partition allocation proposal and return YES or NO based on the cluster context.
+ * Any proposal that fails one or more hard constraints will be rejected.
+ */
+abstract class HardConstraint {
+
+  /**
+   * Check if the replica could be assigned to the node
+   * @return True if the proposed assignment is valid; False otherwise
+   */
+  abstract boolean isAssignmentValid(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext);
+
+  /**
+   * Returns the class name as the description by default. If that is not explanatory enough,
+   * a child class can override this method and provide a more detailed description.
+   * @return The detailed description of the hard constraint
+   */
+  String getDescription() {
+    return getClass().getName();
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/InstancePartitionsCountConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/InstancePartitionsCountConstraint.java
new file mode 100644
index 0000000..948a7d0
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/InstancePartitionsCountConstraint.java
@@ -0,0 +1,41 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+/**
+ * Evaluate an instance's current partition count against the estimated max partition count.
+ * Intuitively, the assignment is encouraged if the instance's occupancy rate is below average,
+ * and discouraged if the instance's occupancy rate is above average.
+ * The normalized score will be within [0, 1].
+ */
+class InstancePartitionsCountConstraint extends UsageSoftConstraint {
+
+  @Override
+  protected double getAssignmentScore(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    int estimatedMaxPartitionCount = clusterContext.getEstimatedMaxPartitionCount();
+    int currentPartitionCount = node.getAssignedReplicaCount();
+    return computeUtilizationScore(estimatedMaxPartitionCount, currentPartitionCount);
+  }
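+  // Illustrative example (hypothetical numbers): if the estimated max is 20 partitions per
+  // instance and this node already hosts 15 replicas, the utilization score is 15 / 20 = 0.75,
+  // which the sigmoid normalizer maps close to the max score, encouraging the assignment.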
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/MaxCapacityUsageInstanceConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/MaxCapacityUsageInstanceConstraint.java
new file mode 100644
index 0000000..8f41f5e
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/MaxCapacityUsageInstanceConstraint.java
@@ -0,0 +1,42 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+/**
+ * The constraint evaluates the score by checking the most used capacity key out of all the
+ * capacity keys.
+ * The higher the maximum usage value for the capacity key, the lower the score will be,
+ * implying that it is that much less desirable to assign anything to the given node.
+ * It is a greedy approach since it evaluates only the most used capacity key.
+ */
+class MaxCapacityUsageInstanceConstraint extends UsageSoftConstraint {
+
+  @Override
+  protected double getAssignmentScore(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    float estimatedMaxUtilization = clusterContext.getEstimatedMaxUtilization();
+    float projectedHighestUtilization = node.getProjectedHighestUtilization(replica.getCapacity());
+    return computeUtilizationScore(estimatedMaxUtilization, projectedHighestUtilization);
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/NodeCapacityConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/NodeCapacityConstraint.java
new file mode 100644
index 0000000..827d6ce
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/NodeCapacityConstraint.java
@@ -0,0 +1,50 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Map;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+class NodeCapacityConstraint extends HardConstraint {
+
+  @Override
+  boolean isAssignmentValid(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    Map<String, Integer> nodeCapacity = node.getRemainingCapacity();
+    Map<String, Integer> replicaCapacity = replica.getCapacity();
+
+    for (String key : replicaCapacity.keySet()) {
+      if (nodeCapacity.containsKey(key)) {
+        if (nodeCapacity.get(key) < replicaCapacity.get(key)) {
+          return false;
+        }
+      }
+    }
+    return true;
+  }
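+  // Illustrative example (hypothetical capacities): a remaining node capacity of
+  // {CPU: 10, DISK: 50} rejects a replica requiring {CPU: 20}, while a replica requiring
+  // {GPU: 1} is accepted, because a capacity key the node does not define is treated as
+  // unlimited by this constraint.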
+
+  @Override
+  String getDescription() {
+    return "Node has insufficient capacity";
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/NodeMaxPartitionLimitConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/NodeMaxPartitionLimitConstraint.java
new file mode 100644
index 0000000..cda5329
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/NodeMaxPartitionLimitConstraint.java
@@ -0,0 +1,43 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+class NodeMaxPartitionLimitConstraint extends HardConstraint {
+
+  @Override
+  boolean isAssignmentValid(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    boolean withinNodePartitionLimit =
+        node.getMaxPartition() < 0 || node.getAssignedReplicaCount() < node.getMaxPartition();
+    boolean withinResourcePartitionLimit = replica.getResourceMaxPartitionsPerInstance() < 0
+        || node.getAssignedPartitionsByResource(replica.getResourceName()).size() < replica
+        .getResourceMaxPartitionsPerInstance();
+    return withinNodePartitionLimit && withinResourcePartitionLimit;
+  }
+
+  @Override
+  String getDescription() {
+    return "Cannot exceed the maximum number of partitions limitation on node";
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/PartitionMovementConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/PartitionMovementConstraint.java
new file mode 100644
index 0000000..dc19c19
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/PartitionMovementConstraint.java
@@ -0,0 +1,96 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collections;
+import java.util.Map;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceAssignment;
+
+/**
+ * Evaluate the proposed assignment according to the potential partition movements cost.
+ * The cost is evaluated based on the difference between the old assignment and the new assignment.
+ * In detail, we consider the following two previous assignments as the base.
+ * - Baseline assignment that is calculated regardless of the node state (online/offline).
+ * - Previous Best Possible assignment.
+ * Any change from these two assignments will increase the partition movement cost, so the
+ * evaluated score will become lower.
+ */
+class PartitionMovementConstraint extends SoftConstraint {
+  private static final double MAX_SCORE = 1f;
+  private static final double MIN_SCORE = 0f;
+  //TODO: these factors will be tuned based on user's preference
+  // This factor indicates the default score that is evaluated if only partition allocation matches
+  // (states are different).
+  private static final double ALLOCATION_MATCH_FACTOR = 0.5;
+
+  PartitionMovementConstraint() {
+    super(MAX_SCORE, MIN_SCORE);
+  }
+
+  @Override
+  protected double getAssignmentScore(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    // Prioritize the previous Best Possible assignment
+    Map<String, String> bestPossibleAssignment =
+        getStateMap(replica, clusterContext.getBestPossibleAssignment());
+    if (!bestPossibleAssignment.isEmpty()) {
+      return calculateAssignmentScale(node, replica, bestPossibleAssignment);
+    }
+    // Otherwise, compare against the baseline only, since the best possible assignment does
+    // not contain the replica.
+    Map<String, String> baselineAssignment =
+        getStateMap(replica, clusterContext.getBaselineAssignment());
+    if (!baselineAssignment.isEmpty()) {
+      return calculateAssignmentScale(node, replica, baselineAssignment);
+    }
+    return 0;
+  }
+
+  private Map<String, String> getStateMap(AssignableReplica replica,
+      Map<String, ResourceAssignment> assignment) {
+    String resourceName = replica.getResourceName();
+    String partitionName = replica.getPartitionName();
+    if (assignment == null || !assignment.containsKey(resourceName)) {
+      return Collections.emptyMap();
+    }
+    return assignment.get(resourceName).getReplicaMap(new Partition(partitionName));
+  }
+
+  private double calculateAssignmentScale(AssignableNode node, AssignableReplica replica,
+      Map<String, String> instanceToStateMap) {
+    String instanceName = node.getInstanceName();
+    if (!instanceToStateMap.containsKey(instanceName)) {
+      return 0;
+    } else {
+      return (instanceToStateMap.get(instanceName).equals(replica.getReplicaState()) ? 1 :
+          ALLOCATION_MATCH_FACTOR);
+    }
+  }
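+  // Illustrative example (hypothetical placement): if the previous best possible assignment
+  // placed this partition on instanceA as MASTER, then proposing (instanceA, MASTER) scores
+  // 1, proposing (instanceA, SLAVE) scores ALLOCATION_MATCH_FACTOR (0.5), and proposing any
+  // other instance scores 0.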
+
+  @Override
+  protected NormalizeFunction getNormalizeFunction() {
+    // PartitionMovementConstraint already scales the score properly.
+    return (score) -> score;
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ReplicaActivateConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ReplicaActivateConstraint.java
new file mode 100644
index 0000000..9152efe
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ReplicaActivateConstraint.java
@@ -0,0 +1,41 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.List;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+class ReplicaActivateConstraint extends HardConstraint {
+  @Override
+  boolean isAssignmentValid(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    List<String> disabledPartitions =
+        node.getDisabledPartitionsMap().get(replica.getResourceName());
+    return disabledPartitions == null || !disabledPartitions.contains(replica.getPartitionName());
+  }
+
+  @Override
+  String getDescription() {
+    return "Cannot assign the inactive replica";
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ResourcePartitionAntiAffinityConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ResourcePartitionAntiAffinityConstraint.java
new file mode 100644
index 0000000..a3b701f
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ResourcePartitionAntiAffinityConstraint.java
@@ -0,0 +1,43 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+/**
+ * This constraint exists to make partitions belonging to the same resource be assigned as far
+ * from each other as possible. Assigning many partitions of the same resource to the same node
+ * is undesirable because it magnifies the impact of node failure scenarios.
+ * The fewer partitions of the same resource on the node, the higher the score.
+ */
+class ResourcePartitionAntiAffinityConstraint extends UsageSoftConstraint {
+  @Override
+  protected double getAssignmentScore(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    String resource = replica.getResourceName();
+    int curPartitionCountForResource = node.getAssignedPartitionsByResource(resource).size();
+    int estimatedMaxPartitionCountForResource =
+        clusterContext.getEstimatedMaxPartitionByResource(resource);
+    return computeUtilizationScore(estimatedMaxPartitionCountForResource,
+        curPartitionCountForResource);
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ResourceTopStateAntiAffinityConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ResourceTopStateAntiAffinityConstraint.java
new file mode 100644
index 0000000..f0f9e13
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ResourceTopStateAntiAffinityConstraint.java
@@ -0,0 +1,44 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+/**
+ * Evaluate the proposed assignment according to the top state replica count on the instance.
+ * The higher the number of top state partitions assigned to the instance, the lower the score,
+ * and vice versa.
+ */
+class ResourceTopStateAntiAffinityConstraint extends UsageSoftConstraint {
+  @Override
+  protected double getAssignmentScore(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    if (!replica.isReplicaTopState()) {
+      // For a non-top-state replica, this constraint is not applicable.
+      // So return zero for any assignable node candidate.
+      return 0;
+    }
+    int curTopStatePartitionCount = node.getAssignedTopStatePartitionsCount();
+    int estimatedMaxTopStateCount = clusterContext.getEstimatedMaxTopStateCount();
+    return computeUtilizationScore(estimatedMaxTopStateCount, curTopStatePartitionCount);
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/SamePartitionOnInstanceConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/SamePartitionOnInstanceConstraint.java
new file mode 100644
index 0000000..202e49a
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/SamePartitionOnInstanceConstraint.java
@@ -0,0 +1,39 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+class SamePartitionOnInstanceConstraint extends HardConstraint {
+
+  @Override
+  boolean isAssignmentValid(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    return !node.getAssignedPartitionsByResource(replica.getResourceName())
+        .contains(replica.getPartitionName());
+  }
+
+  @Override
+  String getDescription() {
+    return "Same partition of different states cannot co-exist in one instance";
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/SoftConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/SoftConstraint.java
new file mode 100644
index 0000000..21bed84
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/SoftConstraint.java
@@ -0,0 +1,90 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+/**
+ * The "soft" constraint evaluates the optimality of an assignment by giving it a score on a
+ * scale of [minScore, maxScore].
+ * The higher the score, the better the assignment; intuitively, the assignment is encouraged.
+ * The lower the score, the worse the assignment; intuitively, the assignment is penalized.
+ */
+abstract class SoftConstraint {
+  private final double _maxScore;
+  private final double _minScore;
+
+  interface NormalizeFunction {
+    /**
+     * Scale the original score to a normalized range (0, 1).
+     * The purpose is to make scores comparable between different soft constraints.
+     * @param originScore The original score
+     * @return The normalized value between (0, 1)
+     */
+    double scale(double originScore);
+  }
+
+  /**
+   * Child classes customize the min/max score on their own.
+   * @param maxScore The max score
+   * @param minScore The min score
+   */
+  SoftConstraint(double maxScore, double minScore) {
+    _maxScore = maxScore;
+    _minScore = minScore;
+  }
+
+  protected double getMaxScore() {
+    return _maxScore;
+  }
+
+  protected double getMinScore() {
+    return _minScore;
+  }
+
+  /**
+   * Evaluate and give a score for a potential assignment (partition -> instance).
+   * Child classes only need to care about how the score is computed.
+   * @return The score of the assignment as a double value
+   */
+  protected abstract double getAssignmentScore(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext);
+
+  /**
+   * Evaluate and give a score for a potential assignment (partition -> instance).
+   * This is the only method exposed to the caller.
+   * @return The score, normalized to be within MinScore and MaxScore
+   */
+  double getAssignmentNormalizedScore(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    return getNormalizeFunction().scale(getAssignmentScore(node, replica, clusterContext));
+  }
+
+  /**
+   * The default scaler function that squashes any score within (minScore, maxScore) to (0, 1).
+   * Child classes can override this method and customize it on their own.
+   * @return The min-max scaler instance by default
+   */
+  protected NormalizeFunction getNormalizeFunction() {
+    return (score) -> (score - getMinScore()) / (getMaxScore() - getMinScore());
+  }
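+  // Illustrative example (assumed bounds): a constraint constructed with maxScore = 10 and
+  // minScore = 0 that returns a raw score of 7 yields a default normalized score of
+  // (7 - 0) / (10 - 0) = 0.7, which makes scores comparable across soft constraints.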
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/UsageSoftConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/UsageSoftConstraint.java
new file mode 100644
index 0000000..c8bc521
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/UsageSoftConstraint.java
@@ -0,0 +1,85 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.commons.math3.analysis.function.Sigmoid;
+
+/**
+ * The soft constraint that evaluates the assignment proposal based on usage.
+ */
+abstract class UsageSoftConstraint extends SoftConstraint {
+  private static final double MAX_SCORE = 1f;
+  private static final double MIN_SCORE = 0f;
+  /**
+   * Alpha is used to adjust the curve of the sigmoid function.
+   * Intuitively, this is for tolerating the inaccuracy of the estimation.
+   * Ideally, if we had a perfect estimation, we could use a segmented function here, which
+   * scores the assignment with 1.0 if the projected usage is below the estimation, and with
+   * 0.0 if the projected usage exceeds the estimation. However, in reality, it is hard to get
+   * a perfect estimation. With the curve of the sigmoid, the algorithm still reacts
+   * differently and reasonably even when the usage is a little more or less than the
+   * estimation, to a certain extent.
+   * As tested, when the input numbers surround 1, the default alpha value ensures a curve
+   * with sigmoid(0.95) = 0.90 and sigmoid(1.05) = 0.10, meaning the constraint can handle an
+   * estimation inaccuracy of +-5%.
+   * To adjust the curve:
+   * 1. A smaller alpha widens the curve's scope, so the function handles a wider range of
+   * inaccuracy. However, the downside is more random movements, since the evenness score
+   * becomes more changeable and less definitive.
+   * 2. A larger alpha narrows the curve's scope. In that case, we might want to switch to a
+   * segmented function so as to speed up the algorithm.
+   */
+  private static final int DEFAULT_ALPHA = 44;
+  private static final Sigmoid SIGMOID = new Sigmoid();
+
+  UsageSoftConstraint() {
+    super(MAX_SCORE, MIN_SCORE);
+  }
+
+  /**
+   * Compute the utilization score based on the estimated and current usage numbers.
+   * The score = currentUsage / estimatedUsage.
+   * In short, a smaller score means a better assignment proposal.
+   *
+   * @param estimatedUsage The estimated usage
+   * @param currentUsage   The current usage
+   * @return The score that evaluates the utilization; normally around [0.0, 1.0], but it can
+   *         exceed 1.0 when the current usage is above the estimation.
+   */
+  protected double computeUtilizationScore(double estimatedUsage, double currentUsage) {
+    if (estimatedUsage == 0) {
+      return 0;
+    }
+    return currentUsage / estimatedUsage;
+  }
+
+  /**
+   * Compute the evaluation score based on the utilization data.
+   * The normalized score is evaluated using a sigmoid function.
+   * When the utilization score is smaller than 1.0, the constraint returns a value that is
+   * very close to the max score.
+   * When the utilization score is close to or larger than 1.0, the constraint returns a score
+   * that is very close to the min score. Note that even in this case, a higher usage is still
+   * assigned a smaller score.
+   */
+  @Override
+  protected NormalizeFunction getNormalizeFunction() {
+    return (score) -> SIGMOID.value(-(score - 1) * DEFAULT_ALPHA) * (MAX_SCORE - MIN_SCORE);
+  }
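+  // Worked example of the normalizer (values computed from the formula above, rounded):
+  // with DEFAULT_ALPHA = 44, scale(0.90) ~= 0.99, scale(0.95) ~= 0.90, scale(1.00) = 0.50,
+  // scale(1.05) ~= 0.10 and scale(1.10) ~= 0.01, so a node slightly below the estimated
+  // usage is strongly preferred over one slightly above it.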
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ValidGroupTagConstraint.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ValidGroupTagConstraint.java
new file mode 100644
index 0000000..e31864f
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ValidGroupTagConstraint.java
@@ -0,0 +1,41 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+
+class ValidGroupTagConstraint extends HardConstraint {
+  @Override
+  boolean isAssignmentValid(AssignableNode node, AssignableReplica replica,
+      ClusterContext clusterContext) {
+    if (!replica.hasResourceInstanceGroupTag()) {
+      return true;
+    }
+
+    return node.getInstanceTags().contains(replica.getResourceInstanceGroupTag());
+  }
+
+  @Override
+  String getDescription() {
+    return "Instance doesn't have the tag of the replica";
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/AssignableNode.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/AssignableNode.java
new file mode 100644
index 0000000..06d4976
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/AssignableNode.java
@@ -0,0 +1,374 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import org.apache.helix.HelixException;
+import org.apache.helix.controller.rebalancer.util.WagedValidationUtil;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.InstanceConfig;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * This class represents a node (instance) that replicas can possibly be allocated to.
+ * Note that any usage updates to the AssignableNode are not thread safe.
+ */
+public class AssignableNode implements Comparable<AssignableNode> {
+  private static final Logger LOG = LoggerFactory.getLogger(AssignableNode.class.getName());
+
+  // Immutable Instance Properties
+  private final String _instanceName;
+  private final String _faultZone;
+  // maximum number of the partitions that can be assigned to the instance.
+  private final int _maxPartition;
+  private final ImmutableSet<String> _instanceTags;
+  private final ImmutableMap<String, List<String>> _disabledPartitionsMap;
+  private final ImmutableMap<String, Integer> _maxAllowedCapacity;
+
+  // Mutable (Dynamic) Instance Properties
+  // A map of <resource name, <partition name, replica>> that tracks the replicas assigned to the
+  // node.
+  private Map<String, Map<String, AssignableReplica>> _currentAssignedReplicaMap;
+  // A map of <capacity key, capacity value> that tracks the current available node capacity
+  private Map<String, Integer> _remainingCapacity;
+
+  /**
+   * Initialize the node with the given cluster and instance configurations. This resets the
+   * current assignment and recalculates the remaining capacity.
+   * NOTE: This is done in the constructor under the assumption that the capacity mappings of
+   * InstanceConfig and ResourceConfig could be subject to change; when they do, the node needs
+   * to be reconstructed with the refreshed cluster cache.
+   */
+  AssignableNode(ClusterConfig clusterConfig, InstanceConfig instanceConfig, String instanceName) {
+    _instanceName = instanceName;
+    Map<String, Integer> instanceCapacity = fetchInstanceCapacity(clusterConfig, instanceConfig);
+    _faultZone = computeFaultZone(clusterConfig, instanceConfig);
+    _instanceTags = ImmutableSet.copyOf(instanceConfig.getTags());
+    _disabledPartitionsMap = ImmutableMap.copyOf(instanceConfig.getDisabledPartitionsMap());
+    // make a copy of max capacity
+    _maxAllowedCapacity = ImmutableMap.copyOf(instanceCapacity);
+    _remainingCapacity = new HashMap<>(instanceCapacity);
+    _maxPartition = clusterConfig.getMaxPartitionsPerInstance();
+    _currentAssignedReplicaMap = new HashMap<>();
+  }
+
+  /**
+   * This function should only be used to assign a set of new partitions that are not yet
+   * allocated on this node, because an exception could occur in the middle of the batch
+   * assignment and the previously finished assignments cannot be reverted.
+   * Using this function avoids the overhead of updating the capacity repeatedly.
+   */
+  void assignInitBatch(Collection<AssignableReplica> replicas) {
+    Map<String, Integer> totalPartitionCapacity = new HashMap<>();
+    for (AssignableReplica replica : replicas) {
+      // TODO: an exception could occur in the middle of this loop, and the previously added
+      // records cannot be reverted.
+      addToAssignmentRecord(replica);
+      // increment the capacity requirement according to partition's capacity configuration.
+      for (Map.Entry<String, Integer> capacity : replica.getCapacity().entrySet()) {
+        totalPartitionCapacity.compute(capacity.getKey(),
+            (key, totalValue) -> (totalValue == null) ? capacity.getValue()
+                : totalValue + capacity.getValue());
+      }
+    }
+
+    // Update the global state after all single replications' calculation is done.
+    for (String capacityKey : totalPartitionCapacity.keySet()) {
+      updateRemainingCapacity(capacityKey, totalPartitionCapacity.get(capacityKey));
+    }
+  }
+
+  /**
+   * Assign a replica to the node.
+   * @param assignableReplica - the replica to be assigned
+   */
+  void assign(AssignableReplica assignableReplica) {
+    addToAssignmentRecord(assignableReplica);
+    assignableReplica.getCapacity().entrySet().stream()
+            .forEach(capacity -> updateRemainingCapacity(capacity.getKey(), capacity.getValue()));
+  }
+
+  /**
+   * Release a replica from the node.
+   * If the replica is not on this node, the assignable node is not updated.
+   * @param replica - the replica to be released
+   */
+  void release(AssignableReplica replica)
+      throws IllegalArgumentException {
+    String resourceName = replica.getResourceName();
+    String partitionName = replica.getPartitionName();
+
+    // Check if the release is necessary
+    if (!_currentAssignedReplicaMap.containsKey(resourceName)) {
+      LOG.warn("Resource {} is not on node {}. Ignore the release call.", resourceName,
+          getInstanceName());
+      return;
+    }
+
+    Map<String, AssignableReplica> partitionMap = _currentAssignedReplicaMap.get(resourceName);
+    if (!partitionMap.containsKey(partitionName) || !partitionMap.get(partitionName)
+        .equals(replica)) {
+      LOG.warn("Replica {} is not assigned to node {}. Ignore the release call.",
+          replica.toString(), getInstanceName());
+      return;
+    }
+
+    AssignableReplica removedReplica = partitionMap.remove(partitionName);
+    removedReplica.getCapacity().entrySet().stream()
+        .forEach(entry -> updateRemainingCapacity(entry.getKey(), -1 * entry.getValue()));
+  }
+
+  /**
+   * @return A set of all assigned replicas on the node.
+   */
+  Set<AssignableReplica> getAssignedReplicas() {
+    return _currentAssignedReplicaMap.values().stream()
+        .flatMap(replicaMap -> replicaMap.values().stream()).collect(Collectors.toSet());
+  }
+
+  /**
+   * @return The current assignment in a map of <resource name, set of partition names>
+   */
+  Map<String, Set<String>> getAssignedPartitionsMap() {
+    Map<String, Set<String>> assignmentMap = new HashMap<>();
+    for (String resourceName : _currentAssignedReplicaMap.keySet()) {
+      assignmentMap.put(resourceName, _currentAssignedReplicaMap.get(resourceName).keySet());
+    }
+    return assignmentMap;
+  }
+
+  /**
+   * @param resource Resource name
+   * @return A set of the current assigned replicas' partition names in the specified resource.
+   */
+  public Set<String> getAssignedPartitionsByResource(String resource) {
+    return _currentAssignedReplicaMap.getOrDefault(resource, Collections.emptyMap()).keySet();
+  }
+
+  /**
+   * @param resource Resource name
+   * @return A set of the current assigned replicas' partition names with the top state in the
+   *         specified resource.
+   */
+  Set<String> getAssignedTopStatePartitionsByResource(String resource) {
+    return _currentAssignedReplicaMap.getOrDefault(resource, Collections.emptyMap()).entrySet()
+        .stream().filter(partitionEntry -> partitionEntry.getValue().isReplicaTopState())
+        .map(partitionEntry -> partitionEntry.getKey()).collect(Collectors.toSet());
+  }
+
+  /**
+   * @return The total count of assigned top state partitions.
+   */
+  public int getAssignedTopStatePartitionsCount() {
+    return (int) _currentAssignedReplicaMap.values().stream()
+        .flatMap(replicaMap -> replicaMap.values().stream())
+        .filter(AssignableReplica::isReplicaTopState).count();
+  }
+
+  /**
+   * @return The total count of assigned replicas.
+   */
+  public int getAssignedReplicaCount() {
+    return _currentAssignedReplicaMap.values().stream().mapToInt(Map::size).sum();
+  }
+
+  /**
+   * @return The current available capacity.
+   */
+  public Map<String, Integer> getRemainingCapacity() {
+    return _remainingCapacity;
+  }
+
+  /**
+   * @return A map of <capacity category, capacity number> that describes the max capacity of the
+   *         node.
+   */
+  public Map<String, Integer> getMaxCapacity() {
+    return _maxAllowedCapacity;
+  }
+
+  /**
+   * Return the most concerning capacity utilization number for even partition assignment.
+   * The method dynamically calculates the projected highest utilization number among all the
+   * capacity categories assuming the new capacity usage is added to the node.
+   * For example, if the current node usage is {CPU: 0.9, MEM: 0.4, DISK: 0.6}. Then this call shall
+   * return 0.9.
+   * @param newUsage the proposed new additional capacity usage.
+   * @return The highest utilization number of the node among all the capacity categories.
+   */
+  public float getProjectedHighestUtilization(Map<String, Integer> newUsage) {
+    float highestCapacityUtilization = 0;
+    for (String capacityKey : _maxAllowedCapacity.keySet()) {
+      float capacityValue = _maxAllowedCapacity.get(capacityKey);
+      float utilization = (capacityValue - _remainingCapacity.get(capacityKey) + newUsage
+          .getOrDefault(capacityKey, 0)) / capacityValue;
+      highestCapacityUtilization = Math.max(highestCapacityUtilization, utilization);
+    }
+    return highestCapacityUtilization;
+  }
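+  // Illustrative example (hypothetical capacities): with a max capacity of
+  // {CPU: 100, MEM: 100}, remaining capacity {CPU: 40, MEM: 70} and a proposed newUsage of
+  // {CPU: 10}, the projected utilizations are CPU: (100 - 40 + 10) / 100 = 0.7 and
+  // MEM: (100 - 70) / 100 = 0.3, so this method returns 0.7.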
+
+  public String getInstanceName() {
+    return _instanceName;
+  }
+
+  public Set<String> getInstanceTags() {
+    return _instanceTags;
+  }
+
+  public String getFaultZone() {
+    return _faultZone;
+  }
+
+  public boolean hasFaultZone() {
+    return _faultZone != null;
+  }
+
+  /**
+   * @return A map of <resource name, list of partition names> containing all the partitions
+   *         that are disabled on the node.
+   */
+  public Map<String, List<String>> getDisabledPartitionsMap() {
+    return _disabledPartitionsMap;
+  }
+
+  /**
+   * @return The maximum number of partitions that are allowed to be allocated on the node.
+   */
+  public int getMaxPartition() {
+    return _maxPartition;
+  }
+
+  /**
+   * Computes the fault zone id based on the domain and fault zone type when topology is
+   * enabled. For example, when the domain is "zone=2, instance=testInstance" and the fault
+   * zone type is "zone", this function returns "2".
+   * If the fault zone type cannot be found, this function falls back to using the instance
+   * name as the fault zone id.
+   * Note that the WAGED rebalancer does not require the full topology tree to be created, so
+   * this logic is simpler than in the CRUSH-based rebalancer.
+   */
+  private String computeFaultZone(ClusterConfig clusterConfig, InstanceConfig instanceConfig) {
+    if (!clusterConfig.isTopologyAwareEnabled()) {
+      // The instance name is the default fault zone if topology awareness is disabled.
+      return instanceConfig.getInstanceName();
+    }
+    String topologyStr = clusterConfig.getTopology();
+    String faultZoneType = clusterConfig.getFaultZoneType();
+    if (topologyStr == null || faultZoneType == null) {
+      LOG.debug("Topology configuration is not complete. Topology definition: {}, Fault Zone Type: {}",
+          topologyStr, faultZoneType);
+      // Use the instance name, or the deprecated ZoneId field if it exists, as the default
+      // fault zone.
+      String zoneId = instanceConfig.getZoneId();
+      return zoneId == null ? instanceConfig.getInstanceName() : zoneId;
+    } else {
+      // Get the fault zone information from the complete topology definition.
+      String[] topologyKeys = topologyStr.trim().split("/");
+      if (topologyKeys.length == 0 || Arrays.stream(topologyKeys)
+          .noneMatch(type -> type.equals(faultZoneType))) {
+        throw new HelixException(
+            "The configured topology definition is empty or does not contain the fault zone type.");
+      }
+
+      Map<String, String> domainAsMap = instanceConfig.getDomainAsMap();
+      StringBuilder faultZoneStringBuilder = new StringBuilder();
+      for (String key : topologyKeys) {
+        if (!key.isEmpty()) {
+          // if a key does not exist in the instance domain config, apply the default domain value.
+          faultZoneStringBuilder.append(domainAsMap.getOrDefault(key, "Default_" + key));
+          if (key.equals(faultZoneType)) {
+            break;
+          } else {
+            faultZoneStringBuilder.append('/');
+          }
+        }
+      }
+      return faultZoneStringBuilder.toString();
+    }
+  }
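+
+  // Editorial aside: an illustrative trace of computeFaultZone with assumed values. With topology
+  // "/rack/zone/instance", fault zone type "zone", and an instance domain of
+  // "rack=r1,zone=z2,instance=i3", the loop appends "r1", '/', and then "z2" before breaking on
+  // the fault zone type, so the returned fault zone id is "r1/z2". An instance that is missing
+  // the "rack" key would yield "Default_rack/z2" instead.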
+
+  /**
+   * @throws HelixException if the replica has already been assigned to the node.
+   */
+  private void addToAssignmentRecord(AssignableReplica replica) {
+    String resourceName = replica.getResourceName();
+    String partitionName = replica.getPartitionName();
+    if (_currentAssignedReplicaMap.containsKey(resourceName) && _currentAssignedReplicaMap
+        .get(resourceName).containsKey(partitionName)) {
+      throw new HelixException(String
+          .format("Resource %s already has a replica with state %s from partition %s on node %s",
+              replica.getResourceName(), replica.getReplicaState(), replica.getPartitionName(),
+              getInstanceName()));
+    } else {
+      _currentAssignedReplicaMap.computeIfAbsent(resourceName, key -> new HashMap<>())
+          .put(partitionName, replica);
+    }
+  }
+
+  private void updateRemainingCapacity(String capacityKey, int usage) {
+    if (!_remainingCapacity.containsKey(capacityKey)) {
+      // If the capacityKey required by a replica does not exist in the instance's capacity
+      // configuration, the instance is treated as if it has unlimited capacity for that key.
+      return;
+    }
+    _remainingCapacity.put(capacityKey, _remainingCapacity.get(capacityKey) - usage);
+  }
+
+  /**
+   * Get and validate the instance capacity from instance config.
+   * @throws HelixException if any required capacity key is not configured in the instance config.
+   */
+  private Map<String, Integer> fetchInstanceCapacity(ClusterConfig clusterConfig,
+      InstanceConfig instanceConfig) {
+    Map<String, Integer> instanceCapacity =
+        WagedValidationUtil.validateAndGetInstanceCapacity(clusterConfig, instanceConfig);
+    // Remove all the non-required capacity items from the map.
+    instanceCapacity.keySet().retainAll(clusterConfig.getInstanceCapacityKeys());
+    return instanceCapacity;
+  }
+
+  @Override
+  public int hashCode() {
+    return _instanceName.hashCode();
+  }
+
+  @Override
+  public int compareTo(AssignableNode o) {
+    return _instanceName.compareTo(o.getInstanceName());
+  }
+
+  @Override
+  public String toString() {
+    return _instanceName;
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/AssignableReplica.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/AssignableReplica.java
new file mode 100644
index 0000000..fdcc03a
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/AssignableReplica.java
@@ -0,0 +1,161 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.helix.HelixException;
+import org.apache.helix.controller.rebalancer.util.WagedValidationUtil;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.ResourceConfig;
+import org.apache.helix.model.StateModelDefinition;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * This class represents a partition replica that needs to be allocated.
+ */
+public class AssignableReplica implements Comparable<AssignableReplica> {
+  private static final Logger LOG = LoggerFactory.getLogger(AssignableReplica.class);
+
+  private final String _replicaKey;
+  private final String _partitionName;
+  private final String _resourceName;
+  private final String _resourceInstanceGroupTag;
+  private final int _resourceMaxPartitionsPerInstance;
+  private final Map<String, Integer> _capacityUsage;
+  // The priority of the replica's state
+  private final int _statePriority;
+  // The state of the replica
+  private final String _replicaState;
+
+  /**
+   * @param clusterConfig  The cluster config.
+   * @param resourceConfig The resource config of the resource that contains the replica.
+   * @param partitionName  The replica's partition name.
+   * @param replicaState   The state of the replica.
+   * @param statePriority  The priority of the replica's state.
+   */
+  AssignableReplica(ClusterConfig clusterConfig, ResourceConfig resourceConfig,
+      String partitionName, String replicaState, int statePriority) {
+    _partitionName = partitionName;
+    _replicaState = replicaState;
+    _statePriority = statePriority;
+    _resourceName = resourceConfig.getResourceName();
+    _capacityUsage = fetchCapacityUsage(partitionName, resourceConfig, clusterConfig);
+    _resourceInstanceGroupTag = resourceConfig.getInstanceGroupTag();
+    _resourceMaxPartitionsPerInstance = resourceConfig.getMaxPartitionsPerInstance();
+    _replicaKey = generateReplicaKey(_resourceName, _partitionName, _replicaState);
+  }
+
+  public Map<String, Integer> getCapacity() {
+    return _capacityUsage;
+  }
+
+  public String getPartitionName() {
+    return _partitionName;
+  }
+
+  public String getReplicaState() {
+    return _replicaState;
+  }
+
+  public boolean isReplicaTopState() {
+    return _statePriority == StateModelDefinition.TOP_STATE_PRIORITY;
+  }
+
+  public int getStatePriority() {
+    return _statePriority;
+  }
+
+  public String getResourceName() {
+    return _resourceName;
+  }
+
+  public String getResourceInstanceGroupTag() {
+    return _resourceInstanceGroupTag;
+  }
+
+  public boolean hasResourceInstanceGroupTag() {
+    return _resourceInstanceGroupTag != null && !_resourceInstanceGroupTag.isEmpty();
+  }
+
+  public int getResourceMaxPartitionsPerInstance() {
+    return _resourceMaxPartitionsPerInstance;
+  }
+
+  @Override
+  public String toString() {
+    return _replicaKey;
+  }
+
+  @Override
+  public int compareTo(AssignableReplica replica) {
+    if (!_resourceName.equals(replica._resourceName)) {
+      return _resourceName.compareTo(replica._resourceName);
+    }
+    if (!_partitionName.equals(replica._partitionName)) {
+      return _partitionName.compareTo(replica._partitionName);
+    }
+    if (!_replicaState.equals(replica._replicaState)) {
+      return _replicaState.compareTo(replica._replicaState);
+    }
+    return 0;
+  }
+
+  @Override
+  public boolean equals(Object obj) {
+    if (obj == null) {
+      return false;
+    }
+    if (obj instanceof AssignableReplica) {
+      return compareTo((AssignableReplica) obj) == 0;
+    } else {
+      return false;
+    }
+  }
+
+  public static String generateReplicaKey(String resourceName, String partitionName, String state) {
+    return String.format("%s-%s-%s", resourceName, partitionName, state);
+  }
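+
+  // Editorial aside: for example (hypothetical names), generateReplicaKey("MyDB", "MyDB_0",
+  // "MASTER") produces the index key "MyDB-MyDB_0-MASTER", which is also what toString() returns.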
+
+  /**
+   * Parse the resource config for the partition weight.
+   */
+  private Map<String, Integer> fetchCapacityUsage(String partitionName,
+      ResourceConfig resourceConfig, ClusterConfig clusterConfig) {
+    Map<String, Map<String, Integer>> capacityMap;
+    try {
+      capacityMap = resourceConfig.getPartitionCapacityMap();
+    } catch (IOException ex) {
+      throw new IllegalArgumentException(
+          "Invalid partition capacity configuration of resource: " + resourceConfig
+              .getResourceName(), ex);
+    }
+    Map<String, Integer> partitionCapacity = WagedValidationUtil
+        .validateAndGetPartitionCapacity(partitionName, resourceConfig, capacityMap, clusterConfig);
+    // Remove the non-required capacity items.
+    partitionCapacity.keySet().retainAll(clusterConfig.getInstanceCapacityKeys());
+    return partitionCapacity;
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterContext.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterContext.java
new file mode 100644
index 0000000..4705be5
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterContext.java
@@ -0,0 +1,172 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import org.apache.helix.HelixException;
+import org.apache.helix.model.ResourceAssignment;
+
+
+/**
+ * This class tracks the rebalance-related global cluster status.
+ */
+public class ClusterContext {
+  // This estimation helps to ensure global partition count evenness
+  private final int _estimatedMaxPartitionCount;
+  // This estimation helps to ensure global top state replica count evenness
+  private final int _estimatedMaxTopStateCount;
+  // This estimation helps to ensure per-resource partition count evenness
+  private final Map<String, Integer> _estimatedMaxPartitionByResource = new HashMap<>();
+  // This estimation helps to ensure global resource usage evenness.
+  private final float _estimatedMaxUtilization;
+
+  // map{zoneName : map{resourceName : set(partitionNames)}}
+  private Map<String, Map<String, Set<String>>> _assignmentForFaultZoneMap = new HashMap<>();
+  // Records about the previous assignment
+  // <ResourceName, ResourceAssignment contains the baseline assignment>
+  private final Map<String, ResourceAssignment> _baselineAssignment;
+  // <ResourceName, ResourceAssignment contains the best possible assignment>
+  private final Map<String, ResourceAssignment> _bestPossibleAssignment;
+
+  /**
+   * Construct the cluster context based on the current instance status.
+   * @param replicaSet All the partition replicas that are managed by the rebalancer
+   * @param nodeSet All the active nodes that are managed by the rebalancer
+   */
+  ClusterContext(Set<AssignableReplica> replicaSet, Set<AssignableNode> nodeSet,
+      Map<String, ResourceAssignment> baselineAssignment, Map<String, ResourceAssignment> bestPossibleAssignment) {
+    int instanceCount = nodeSet.size();
+    int totalReplicas = 0;
+    int totalTopStateReplicas = 0;
+    Map<String, Integer> totalUsage = new HashMap<>();
+    Map<String, Integer> totalCapacity = new HashMap<>();
+
+    for (Map.Entry<String, List<AssignableReplica>> entry : replicaSet.stream()
+        .collect(Collectors.groupingBy(AssignableReplica::getResourceName))
+        .entrySet()) {
+      int replicas = entry.getValue().size();
+      totalReplicas += replicas;
+
+      int replicaCnt = Math.max(1, estimateAvgReplicaCount(replicas, instanceCount));
+      _estimatedMaxPartitionByResource.put(entry.getKey(), replicaCnt);
+
+      for (AssignableReplica replica : entry.getValue()) {
+        if (replica.isReplicaTopState()) {
+          totalTopStateReplicas += 1;
+        }
+        replica.getCapacity().entrySet().stream().forEach(capacityEntry -> totalUsage
+            .compute(capacityEntry.getKey(),
+                (k, v) -> (v == null) ? capacityEntry.getValue() : (v + capacityEntry.getValue())));
+      }
+    }
+    nodeSet.stream().forEach(node -> node.getMaxCapacity().entrySet().stream().forEach(
+        capacityEntry -> totalCapacity.compute(capacityEntry.getKey(),
+            (k, v) -> (v == null) ? capacityEntry.getValue() : (v + capacityEntry.getValue()))));
+
+    if (totalCapacity.isEmpty()) {
+      // If no capacity is configured, we treat the cluster as fully utilized.
+      _estimatedMaxUtilization = 1f;
+    } else {
+      float estimatedMaxUsage = 0;
+      for (String capacityKey : totalCapacity.keySet()) {
+        int maxCapacity = totalCapacity.get(capacityKey);
+        int usage = totalUsage.getOrDefault(capacityKey, 0);
+        float utilization = (maxCapacity == 0) ? 1 : (float) usage / maxCapacity;
+        estimatedMaxUsage = Math.max(estimatedMaxUsage, utilization);
+      }
+      _estimatedMaxUtilization = estimatedMaxUsage;
+    }
+    _estimatedMaxPartitionCount = estimateAvgReplicaCount(totalReplicas, instanceCount);
+    _estimatedMaxTopStateCount = estimateAvgReplicaCount(totalTopStateReplicas, instanceCount);
+    _baselineAssignment = baselineAssignment;
+    _bestPossibleAssignment = bestPossibleAssignment;
+  }
+
+  public Map<String, ResourceAssignment> getBaselineAssignment() {
+    return _baselineAssignment == null || _baselineAssignment.isEmpty() ? Collections.emptyMap() : _baselineAssignment;
+  }
+
+  public Map<String, ResourceAssignment> getBestPossibleAssignment() {
+    return _bestPossibleAssignment == null || _bestPossibleAssignment.isEmpty() ? Collections.emptyMap()
+        : _bestPossibleAssignment;
+  }
+
+  public Map<String, Map<String, Set<String>>> getAssignmentForFaultZoneMap() {
+    return _assignmentForFaultZoneMap;
+  }
+
+  public int getEstimatedMaxPartitionCount() {
+    return _estimatedMaxPartitionCount;
+  }
+
+  public int getEstimatedMaxPartitionByResource(String resourceName) {
+    return _estimatedMaxPartitionByResource.get(resourceName);
+  }
+
+  public int getEstimatedMaxTopStateCount() {
+    return _estimatedMaxTopStateCount;
+  }
+
+  public float getEstimatedMaxUtilization() {
+    return _estimatedMaxUtilization;
+  }
+
+  public Set<String> getPartitionsForResourceAndFaultZone(String resourceName, String faultZoneId) {
+    return _assignmentForFaultZoneMap.getOrDefault(faultZoneId, Collections.emptyMap())
+        .getOrDefault(resourceName, Collections.emptySet());
+  }
+
+  void addPartitionToFaultZone(String faultZoneId, String resourceName, String partition) {
+    if (!_assignmentForFaultZoneMap.computeIfAbsent(faultZoneId, k -> new HashMap<>())
+        .computeIfAbsent(resourceName, k -> new HashSet<>())
+        .add(partition)) {
+      throw new HelixException(
+          String.format("Resource %s already has a replica from partition %s in fault zone %s", resourceName, partition,
+              faultZoneId));
+    }
+  }
+
+  boolean removePartitionFromFaultZone(String faultZoneId, String resourceName, String partition) {
+    return _assignmentForFaultZoneMap.getOrDefault(faultZoneId, Collections.emptyMap())
+        .getOrDefault(resourceName, Collections.emptySet())
+        .remove(partition);
+  }
+
+  void setAssignmentForFaultZoneMap(Map<String, Map<String, Set<String>>> assignmentForFaultZoneMap) {
+    _assignmentForFaultZoneMap = assignmentForFaultZoneMap;
+  }
+
+  private int estimateAvgReplicaCount(int replicaCount, int instanceCount) {
+    // Use the floor to ensure evenness.
+    // Note that if we based the estimation on the ceiling, we might end up with some low-usage
+    // participants. For example, suppose the average is between 1 and 2. If we use 2, many
+    // participants will be allocated 2 partitions while the others get 0 partitions. If we use 1
+    // instead, most participants will have 1 partition assigned and several will have 2. The
+    // latter scenario is what we want to achieve.
+    return (int) Math.floor((float) replicaCount / instanceCount);
+  }
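+
+  // Editorial aside: a quick numeric check of the floor-based estimation (hypothetical numbers).
+  // With replicaCount = 10 and instanceCount = 4, the true average is 2.5; this method returns
+  // floor(2.5) = 2, so most instances are estimated to hold 2 replicas and a few hold 3.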
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModel.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModel.java
new file mode 100644
index 0000000..57ffa42
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModel.java
@@ -0,0 +1,132 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import org.apache.helix.HelixException;
+
+/**
+ * This class wraps the required input for the rebalance algorithm.
+ */
+public class ClusterModel {
+  private final ClusterContext _clusterContext;
+  // Map to track all the assignable replicas. <Resource Name, Set<Replicas>>
+  private final Map<String, Set<AssignableReplica>> _assignableReplicaMap;
+  // The index to find the replica with a certain state. <Resource, <Key(resource_partition_state), Replica>>
+  // Note that the identical replicas are deduped in the index.
+  private final Map<String, Map<String, AssignableReplica>> _assignableReplicaIndex;
+  private final Map<String, AssignableNode> _assignableNodeMap;
+
+  /**
+   * @param clusterContext         The initialized cluster context.
+   * @param assignableReplicas     The replicas to be assigned.
+   *                               Note that the replicas in this set shall not be included when initializing the context and the assignable nodes.
+   * @param assignableNodes        The active instances.
+   */
+  ClusterModel(ClusterContext clusterContext, Set<AssignableReplica> assignableReplicas,
+      Set<AssignableNode> assignableNodes) {
+    _clusterContext = clusterContext;
+
+    // Save all the replicas to be assigned.
+    _assignableReplicaMap = assignableReplicas.stream()
+        .collect(Collectors.groupingBy(AssignableReplica::getResourceName, Collectors.toSet()));
+
+    // Index all the replicas to be assigned. Dedup replicas that have the same resource/partition/state.
+    _assignableReplicaIndex = assignableReplicas.stream().collect(Collectors
+        .groupingBy(AssignableReplica::getResourceName, Collectors
+            .toMap(AssignableReplica::toString, replica -> replica,
+                (oldValue, newValue) -> oldValue)));
+
+    _assignableNodeMap = assignableNodes.parallelStream()
+        .collect(Collectors.toMap(AssignableNode::getInstanceName, node -> node));
+  }
+
+  public ClusterContext getContext() {
+    return _clusterContext;
+  }
+
+  public Map<String, AssignableNode> getAssignableNodes() {
+    return _assignableNodeMap;
+  }
+
+  public Map<String, Set<AssignableReplica>> getAssignableReplicaMap() {
+    return _assignableReplicaMap;
+  }
+
+  /**
+   * Assign the given replica to the specified instance and record the assignment in the cluster model.
+   * The cluster usage information will be updated accordingly.
+   *
+   * @param resourceName  The name of the replica's resource.
+   * @param partitionName The name of the replica's partition.
+   * @param state         The replica state to be assigned.
+   * @param instanceName  The name of the target instance.
+   */
+  public void assign(String resourceName, String partitionName, String state, String instanceName) {
+    AssignableNode node = locateAssignableNode(instanceName);
+    AssignableReplica replica = locateAssignableReplica(resourceName, partitionName, state);
+
+    node.assign(replica);
+    _clusterContext.addPartitionToFaultZone(node.getFaultZone(), resourceName, partitionName);
+  }
+
+  /**
+   * Revert the proposed assignment from the cluster model.
+   * The cluster usage information will be updated accordingly.
+   *
+   * @param resourceName  The name of the replica's resource.
+   * @param partitionName The name of the replica's partition.
+   * @param state         The replica state to be released.
+   * @param instanceName  The name of the instance that currently holds the replica.
+   */
+  public void release(String resourceName, String partitionName, String state,
+      String instanceName) {
+    AssignableNode node = locateAssignableNode(instanceName);
+    AssignableReplica replica = locateAssignableReplica(resourceName, partitionName, state);
+
+    node.release(replica);
+    _clusterContext.removePartitionFromFaultZone(node.getFaultZone(), resourceName, partitionName);
+  }
+
+  private AssignableNode locateAssignableNode(String instanceName) {
+    AssignableNode node = _assignableNodeMap.get(instanceName);
+    if (node == null) {
+      throw new HelixException("Cannot find the instance: " + instanceName);
+    }
+    return node;
+  }
+
+  private AssignableReplica locateAssignableReplica(String resourceName, String partitionName,
+      String state) {
+    AssignableReplica sampleReplica =
+        _assignableReplicaIndex.getOrDefault(resourceName, Collections.emptyMap())
+            .get(AssignableReplica.generateReplicaKey(resourceName, partitionName, state));
+    if (sampleReplica == null) {
+      throw new HelixException(String
+          .format("Cannot find the replication with resource name %s, partition name %s, state %s.",
+              resourceName, partitionName, state));
+    }
+    return sampleReplica;
+  }
+}
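
Editorial aside: a minimal usage sketch of the ClusterModel mutation API above. The setup is
hypothetical (in the real pipeline the model is produced by ClusterModelProvider); it only
illustrates the assign/release round trip and the lookup failure behavior.

    // Propose a placement, inspect the projected load on the node, then revert the proposal.
    clusterModel.assign("MyDB", "MyDB_0", "MASTER", "instance1");
    float projected = clusterModel.getAssignableNodes().get("instance1")
        .getProjectedHighestUtilization(java.util.Collections.emptyMap());
    clusterModel.release("MyDB", "MyDB_0", "MASTER", "instance1");
    // Both assign() and release() throw a HelixException for an unknown instance or replica.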
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModelProvider.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModelProvider.java
new file mode 100644
index 0000000..41c43d6
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModelProvider.java
@@ -0,0 +1,532 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import org.apache.helix.HelixConstants;
+import org.apache.helix.HelixException;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.Resource;
+import org.apache.helix.model.ResourceAssignment;
+import org.apache.helix.model.ResourceConfig;
+import org.apache.helix.model.StateModelDefinition;
+
+/**
+ * This util class generates Cluster Model object based on the controller's data cache.
+ */
+public class ClusterModelProvider {
+
+  private enum RebalanceScopeType {
+    // Set the rebalance scope to cover the difference between the current assignment and the
+    // Baseline assignment only.
+    PARTIAL,
+    // Set the rebalance scope to cover all replicas that need relocation based on the cluster
+    // changes.
+    GLOBAL_BASELINE
+  }
+
+  /**
+   * Generate a new Cluster Model object according to the current cluster status for partial
+   * rebalance. The rebalance scope is configured to recover only the missing replicas that are in
+   * the Baseline assignment but not in the current Best Possible assignment.
+   * @param dataProvider           The controller's data cache.
+   * @param resourceMap            The full list of the resources to be rebalanced. Note that any
+   *                               resources that are not in this list will be removed from the
+   *                               final assignment.
+   * @param activeInstances        The active instances that will be used in the calculation.
+   *                               Note this list can be different from the real active node list
+   *                               according to the rebalancer logic.
+   * @param baselineAssignment     The persisted Baseline assignment.
+   * @param bestPossibleAssignment The persisted Best Possible assignment that was generated in the
+   *                               previous rebalance.
+   * @return the new cluster model
+   */
+  public static ClusterModel generateClusterModelForPartialRebalance(
+      ResourceControllerDataProvider dataProvider, Map<String, Resource> resourceMap,
+      Set<String> activeInstances, Map<String, ResourceAssignment> baselineAssignment,
+      Map<String, ResourceAssignment> bestPossibleAssignment) {
+    return generateClusterModel(dataProvider, resourceMap, activeInstances, Collections.emptyMap(),
+        baselineAssignment, bestPossibleAssignment, RebalanceScopeType.PARTIAL);
+  }
+
+  /**
+   * Generate a new Cluster Model object according to the current cluster status for the Baseline
+   * calculation. The rebalance scope is determined according to the cluster changes.
+   * @param dataProvider           The controller's data cache.
+   * @param resourceMap            The full list of the resources to be rebalanced. Note that any
+   *                               resources that are not in this list will be removed from the
+   *                               final assignment.
+   * @param allInstances           All the instances that will be used in the calculation.
+   * @param clusterChanges         All the cluster changes that happened after the previous rebalance.
+   * @param baselineAssignment     The previous Baseline assignment.
+   * @return the new cluster model
+   */
+  public static ClusterModel generateClusterModelForBaseline(
+      ResourceControllerDataProvider dataProvider, Map<String, Resource> resourceMap,
+      Set<String> allInstances, Map<HelixConstants.ChangeType, Set<String>> clusterChanges,
+      Map<String, ResourceAssignment> baselineAssignment) {
+    return generateClusterModel(dataProvider, resourceMap, allInstances, clusterChanges,
+        Collections.emptyMap(), baselineAssignment, RebalanceScopeType.GLOBAL_BASELINE);
+  }
+
+  /**
+   * Generate a cluster model based on the current state output and data cache. The rebalance scope
+   * is configured for recovering the missing replicas only.
+   * @param dataProvider           The controller's data cache.
+   * @param resourceMap            The full list of the resources to be rebalanced. Note that any
+   *                               resources that are not in this list will be removed from the
+   *                               final assignment.
+   * @param currentStateAssignment The resource assignment built from current state output.
+   * @return the new cluster model
+   */
+  public static ClusterModel generateClusterModelFromExistingAssignment(
+      ResourceControllerDataProvider dataProvider, Map<String, Resource> resourceMap,
+      Map<String, ResourceAssignment> currentStateAssignment) {
+    return generateClusterModel(dataProvider, resourceMap, dataProvider.getEnabledLiveInstances(),
+        Collections.emptyMap(), Collections.emptyMap(), currentStateAssignment,
+        RebalanceScopeType.GLOBAL_BASELINE);
+  }
+
+  /**
+   * Generate a new Cluster Model object according to the current cluster status.
+   * @param dataProvider           The controller's data cache.
+   * @param resourceMap            The full list of the resources to be rebalanced. Note that any
+   *                               resources that are not in this list will be removed from the
+   *                               final assignment.
+   * @param activeInstances        The active instances that will be used in the calculation.
+   *                               Note this list can be different from the real active node list
+   *                               according to the rebalancer logic.
+   * @param clusterChanges         All the cluster changes that happened after the previous rebalance.
+   * @param idealAssignment        The ideal assignment.
+   * @param currentAssignment      The current assignment that was generated in the previous rebalance.
+   * @param scopeType              Specify how to determine the rebalance scope.
+   * @return the new cluster model
+   */
+  private static ClusterModel generateClusterModel(ResourceControllerDataProvider dataProvider,
+      Map<String, Resource> resourceMap, Set<String> activeInstances,
+      Map<HelixConstants.ChangeType, Set<String>> clusterChanges,
+      Map<String, ResourceAssignment> idealAssignment,
+      Map<String, ResourceAssignment> currentAssignment, RebalanceScopeType scopeType) {
+    // Construct all the assignable nodes and initialize with the allocated replicas.
+    Set<AssignableNode> assignableNodes =
+        getAllAssignableNodes(dataProvider.getClusterConfig(), dataProvider.getInstanceConfigMap(),
+            activeInstances);
+
+    // Generate replica objects for all the resource partitions.
+    // <resource, replica set>
+    Map<String, Set<AssignableReplica>> replicaMap =
+        getAllAssignableReplicas(dataProvider, resourceMap, assignableNodes);
+
+    // Check if the replicas need to be reassigned.
+    Map<String, Set<AssignableReplica>> allocatedReplicas =
+        new HashMap<>(); // <instanceName, replica set>
+    Set<AssignableReplica> toBeAssignedReplicas;
+    switch (scopeType) {
+      case GLOBAL_BASELINE:
+        toBeAssignedReplicas = findToBeAssignedReplicasByClusterChanges(replicaMap, activeInstances,
+            dataProvider.getLiveInstances().keySet(), clusterChanges, currentAssignment,
+            allocatedReplicas);
+        break;
+      case PARTIAL:
+        // Filter to remove the replicas that do not exist in the given ideal assignment but exist
+        // in the replicaMap. This is because such replicas are new additions that do not need to be
+        // rebalanced right away.
+        retainExistingReplicas(replicaMap, idealAssignment);
+        toBeAssignedReplicas =
+            findToBeAssignedReplicasByComparingWithIdealAssignment(replicaMap, activeInstances,
+                idealAssignment, currentAssignment, allocatedReplicas);
+        break;
+      default:
+        throw new HelixException("Unknown rebalance scope type: " + scopeType);
+    }
+
+    // Update the allocated replicas to the assignable nodes.
+    assignableNodes.parallelStream().forEach(node -> node.assignInitBatch(
+        allocatedReplicas.getOrDefault(node.getInstanceName(), Collections.emptySet())));
+
+    // Construct and initialize cluster context.
+    ClusterContext context = new ClusterContext(
+        replicaMap.values().stream().flatMap(Set::stream).collect(Collectors.toSet()),
+        assignableNodes, idealAssignment, currentAssignment);
+    // Initialize the cluster context with the allocated assignments.
+    context.setAssignmentForFaultZoneMap(mapAssignmentToFaultZone(assignableNodes));
+
+    return new ClusterModel(context, toBeAssignedReplicas, assignableNodes);
+  }
+
+  // Filter the replicas map so only the replicas that have been allocated in the existing
+  // assignmentMap remain in the map.
+  private static void retainExistingReplicas(Map<String, Set<AssignableReplica>> replicaMap,
+      Map<String, ResourceAssignment> assignmentMap) {
+    replicaMap.entrySet().parallelStream().forEach(replicaSetEntry -> {
+      // <partition, <state, instances set>>
+      Map<String, Map<String, Set<String>>> stateInstanceMap =
+          getStateInstanceMap(assignmentMap.get(replicaSetEntry.getKey()));
+      // Iterate the replicas of the resource to find the ones that require reallocating.
+      Iterator<AssignableReplica> replicaIter = replicaSetEntry.getValue().iterator();
+      while (replicaIter.hasNext()) {
+        AssignableReplica replica = replicaIter.next();
+        Set<String> validInstances =
+            stateInstanceMap.getOrDefault(replica.getPartitionName(), Collections.emptyMap())
+                .getOrDefault(replica.getReplicaState(), Collections.emptySet());
+        if (validInstances.isEmpty()) {
+          // Remove the replica if it is not known in the assignment map.
+          replicaIter.remove();
+        } else {
+          // Remove the instance from the state map record after processing so it won't be
+          // double-processed as we loop through all replicas.
+          validInstances.remove(validInstances.iterator().next());
+        }
+      }
+    });
+  }
+
+  /**
+   * Find the minimum set of replicas that need to be reassigned by comparing the current assignment
+   * with the ideal assignment.
+   * A replica needs to be reassigned or newly assigned if either of the following conditions is true:
+   * 1. The partition allocation (the instance the replica is placed on) differs between the ideal
+   * assignment and the current assignment, and the allocation in the ideal assignment is valid,
+   * so it is worthwhile to move the replica.
+   * 2. The partition allocation is in neither the ideal assignment nor the current assignment, or
+   * those allocations are not valid due to offline or disabled instances.
+   * Otherwise, the rebalancer just keeps the current allocation.
+   *
+   * @param replicaMap             A map contains all the replicas grouped by resource name.
+   * @param activeInstances        All the instances that are live and enabled according to the delay rebalance configuration.
+   * @param idealAssignment        The ideal assignment.
+   * @param currentAssignment      The current assignment that was generated in the previous rebalance.
+   * @param allocatedReplicas      A map of <Instance -> replicas> to return the allocated replicas grouped by the target instance name.
+   * @return The replicas that need to be reassigned.
+   */
+  private static Set<AssignableReplica> findToBeAssignedReplicasByComparingWithIdealAssignment(
+      Map<String, Set<AssignableReplica>> replicaMap, Set<String> activeInstances,
+      Map<String, ResourceAssignment> idealAssignment,
+      Map<String, ResourceAssignment> currentAssignment,
+      Map<String, Set<AssignableReplica>> allocatedReplicas) {
+    Set<AssignableReplica> toBeAssignedReplicas = new HashSet<>();
+    // check each resource to identify the allocated replicas and to-be-assigned replicas.
+    for (String resourceName : replicaMap.keySet()) {
+      // <partition, <state, instances set>>
+      Map<String, Map<String, Set<String>>> idealPartitionStateMap =
+          getValidStateInstanceMap(idealAssignment.get(resourceName), activeInstances);
+      Map<String, Map<String, Set<String>>> currentPartitionStateMap =
+          getValidStateInstanceMap(currentAssignment.get(resourceName), activeInstances);
+      // Iterate the replicas of the resource to find the ones that require reallocating.
+      for (AssignableReplica replica : replicaMap.get(resourceName)) {
+        String partitionName = replica.getPartitionName();
+        String replicaState = replica.getReplicaState();
+        Set<String> idealAllocations =
+            idealPartitionStateMap.getOrDefault(partitionName, Collections.emptyMap())
+                .getOrDefault(replicaState, Collections.emptySet());
+        Set<String> currentAllocations =
+            currentPartitionStateMap.getOrDefault(partitionName, Collections.emptyMap())
+                .getOrDefault(replicaState, Collections.emptySet());
+
+        // Compare the current assignments with the ideal assignment for the common part.
+        List<String> commonAllocations = new ArrayList<>(currentAllocations);
+        commonAllocations.retainAll(idealAllocations);
+        if (!commonAllocations.isEmpty()) {
+          // 1. If the partition is allocated at the same location in both ideal and current
+          // assignments, there is no need to reassign it.
+          String allocatedInstance = commonAllocations.get(0);
+          allocatedReplicas.computeIfAbsent(allocatedInstance, key -> new HashSet<>()).add(replica);
+          // Remove the instance from the record to prevent this instance from being processed twice.
+          idealAllocations.remove(allocatedInstance);
+          currentAllocations.remove(allocatedInstance);
+        } else if (!idealAllocations.isEmpty()) {
+          // 2. If the partition is allocated at an active instance in the ideal assignment but the
+          // same allocation does not exist in the current assignment, try to rebalance the replica
+          // or assign it if the replica has not been assigned.
+          // There are two possible conditions:
+          // * This replica has been newly added and has not been assigned yet, so it appears in
+          // the ideal assignment and does not appear in the current assignment.
+          // * The allocation of this replica in the ideal assignment has been updated due to a
+          // cluster change. For example, a new instance was added, so the old allocation in the
+          // current assignment might be sub-optimal.
+          // In either condition, we add it to toBeAssignedReplicas so that it will get assigned.
+          toBeAssignedReplicas.add(replica);
+          // Remove the pending allocation from the idealAllocations after processing so that the
+          // instance won't be double-processed as we loop through all replicas
+          String pendingAllocation = idealAllocations.iterator().next();
+          idealAllocations.remove(pendingAllocation);
+        } else if (!currentAllocations.isEmpty()) {
+          // 3. This replica exists in the current assignment but does not appear in, or does not
+          // have a valid allocation in, the ideal assignment.
+          // This means either 1) the ideal assignment actually has this replica allocated on
+          // this instance, but it does not show up because the instance is temporarily offline or
+          // disabled (note that all such instances have been filtered out in an earlier part of
+          // the logic), or 2) the most recent version of the ideal assignment was not fetched
+          // correctly from the assignment metadata store.
+          // In either case, the solution is to keep the current assignment. So put this replica
+          // with the allocated instance into the allocatedReplicas map.
+          String allocatedInstance = currentAllocations.iterator().next();
+          allocatedReplicas.computeIfAbsent(allocatedInstance, key -> new HashSet<>()).add(replica);
+          // Remove the instance from the record to prevent the same location being processed again.
+          currentAllocations.remove(allocatedInstance);
+        } else {
+          // 4. This replica is not found in either the ideal assignment or the current assignment
+          // with a valid allocation. This implies that the replica was newly added but was never
+          // assigned in reality or was added so recently that it hasn't shown up in the ideal
+          // assignment (because its calculation takes longer and runs asynchronously).
+          // In that case, we add it to toBeAssignedReplicas so that it will get assigned as a
+          // result of partialRebalance.
+          toBeAssignedReplicas.add(replica);
+        }
+      }
+    }
+    return toBeAssignedReplicas;
+  }
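+
+  // Editorial aside: a compact summary of the four cases handled above, per (partition, state):
+  //   allocation in both ideal and current -> keep the current allocation (case 1)
+  //   allocation in ideal only             -> reassign the replica (case 2)
+  //   allocation in current only           -> keep the current allocation (case 3)
+  //   allocation in neither                -> assign the replica fresh (case 4)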
+
+  /**
+   * Find the minimum set of replicas that need to be reassigned according to the cluster change.
+   * A replica needs to be reassigned if one of the following conditions is true:
+   * 1. Cluster topology (the cluster config / any instance config) has been updated.
+   * 2. The resource config has been updated.
+   * 3. The current assignment does not contain a valid allocation for the partition.
+   *
+   * @param replicaMap             A map contains all the replicas grouped by resource name.
+   * @param activeInstances        All the instances that are live and enabled according to the delay rebalance configuration.
+   * @param liveInstances          All the instances that are live.
+   * @param clusterChanges         A map that contains all the important metadata updates that happened after the previous rebalance.
+   * @param currentAssignment      The current replica assignment.
+   * @param allocatedReplicas      Return the allocated replicas grouped by the target instance name.
+   * @return The replicas that need to be reassigned.
+   */
+  private static Set<AssignableReplica> findToBeAssignedReplicasByClusterChanges(
+      Map<String, Set<AssignableReplica>> replicaMap, Set<String> activeInstances,
+      Set<String> liveInstances, Map<HelixConstants.ChangeType, Set<String>> clusterChanges,
+      Map<String, ResourceAssignment> currentAssignment,
+      Map<String, Set<AssignableReplica>> allocatedReplicas) {
+    Set<AssignableReplica> toBeAssignedReplicas = new HashSet<>();
+
+    // A newly connected node = a new LiveInstance znode (or an updated session Id) & the
+    // corresponding instance is live.
+    // TODO: The assumption here is that if the LiveInstance znode is created or its session Id is
+    // TODO: updated, we need to call the algorithm to move some partitions to this new node.
+    // TODO: However, if the LiveInstance znode is changed for some other reason, the instance will
+    // TODO: still be treated as a newly connected node. We need to find a better way to identify
+    // TODO: which nodes are really newly connected.
+    Set<String> newlyConnectedNodes = clusterChanges
+        .getOrDefault(HelixConstants.ChangeType.LIVE_INSTANCE, Collections.emptySet());
+    newlyConnectedNodes.retainAll(liveInstances);
+    if (clusterChanges.containsKey(HelixConstants.ChangeType.CLUSTER_CONFIG) || clusterChanges
+        .containsKey(HelixConstants.ChangeType.INSTANCE_CONFIG) || !newlyConnectedNodes.isEmpty()) {
+      // 1. If the cluster topology has been modified, need to reassign all replicas.
+      // 2. If any node was newly connected, need to rebalance all replicas for the evenness of
+      // distribution.
+      toBeAssignedReplicas
+          .addAll(replicaMap.values().stream().flatMap(Set::stream).collect(Collectors.toSet()));
+    } else {
+      // check each resource to identify the allocated replicas and to-be-assigned replicas.
+      for (Map.Entry<String, Set<AssignableReplica>> replicaMapEntry : replicaMap.entrySet()) {
+        String resourceName = replicaMapEntry.getKey();
+        Set<AssignableReplica> replicas = replicaMapEntry.getValue();
+        // 1. if the resource config/idealstate is changed, need to reassign.
+        // 2. if the resource does not appear in the current assignment, need to reassign.
+        if (clusterChanges
+            .getOrDefault(HelixConstants.ChangeType.RESOURCE_CONFIG, Collections.emptySet())
+            .contains(resourceName) || clusterChanges
+            .getOrDefault(HelixConstants.ChangeType.IDEAL_STATE, Collections.emptySet())
+            .contains(resourceName) || !currentAssignment.containsKey(resourceName)) {
+          toBeAssignedReplicas.addAll(replicas);
+          continue; // go to check next resource
+        } else {
+          // Check every replica assignment to identify whether the related replicas need to be reassigned.
+          // <partition, <state, instances set>>
+          Map<String, Map<String, Set<String>>> stateMap =
+              getValidStateInstanceMap(currentAssignment.get(resourceName), activeInstances);
+          for (AssignableReplica replica : replicas) {
+            // Find any ACTIVE instance allocation that has the same state with the replica
+            Set<String> validInstances =
+                stateMap.getOrDefault(replica.getPartitionName(), Collections.emptyMap())
+                    .getOrDefault(replica.getReplicaState(), Collections.emptySet());
+            if (validInstances.isEmpty()) {
+              // 3. if no such instance exists in the current assignment, need to reassign the replica
+              toBeAssignedReplicas.add(replica);
+              continue; // go to check the next replica
+            } else {
+              Iterator<String> iter = validInstances.iterator();
+              // Remove the instance from the current allocation record after processing so that it
+              // won't be double-processed as we loop through all replicas
+              String instanceName = iter.next();
+              iter.remove();
+              // the current assignment for this replica is valid,
+              // add to the allocated replica list.
+              allocatedReplicas.computeIfAbsent(instanceName, key -> new HashSet<>()).add(replica);
+            }
+          }
+        }
+      }
+    }
+    return toBeAssignedReplicas;
+  }
+
+  /**
+   * Filter to remove all invalid allocations that are not on the active instances.
+   * @param assignment      The resource assignment to be filtered.
+   * @param activeInstances The set of active instance names.
+   * @return A map of <partition, <state, instances set>> that contains the valid state-to-instance mapping.
+   */
+  private static Map<String, Map<String, Set<String>>> getValidStateInstanceMap(
+      ResourceAssignment assignment, Set<String> activeInstances) {
+    Map<String, Map<String, Set<String>>> stateInstanceMap = getStateInstanceMap(assignment);
+    stateInstanceMap.values().stream().forEach(stateMap -> stateMap.values().stream()
+        .forEach(instanceSet -> instanceSet.retainAll(activeInstances)));
+    return stateInstanceMap;
+  }
+
+  // <partition, <state, instances set>>
+  private static Map<String, Map<String, Set<String>>> getStateInstanceMap(
+      ResourceAssignment assignment) {
+    if (assignment == null) {
+      return Collections.emptyMap();
+    }
+    return assignment.getMappedPartitions().stream()
+        .collect(Collectors.toMap(partition -> partition.getPartitionName(), partition -> {
+          Map<String, Set<String>> stateInstanceMap = new HashMap<>();
+          assignment.getReplicaMap(partition).entrySet().stream().forEach(
+              stateMapEntry -> stateInstanceMap
+                  .computeIfAbsent(stateMapEntry.getValue(), key -> new HashSet<>())
+                  .add(stateMapEntry.getKey()));
+          return stateInstanceMap;
+        }));
+  }
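+
+  // Editorial aside: an illustrative transformation performed by getStateInstanceMap, with
+  // hypothetical data. A ResourceAssignment that maps partition "p1" to
+  // {instance1: MASTER, instance2: SLAVE} is inverted into
+  // {p1: {MASTER: [instance1], SLAVE: [instance2]}}.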
+
+  /**
+   * Get all the nodes that can be assigned replicas based on the configurations.
+   *
+   * @param clusterConfig     The cluster configuration.
+   * @param instanceConfigMap A map of all the instance configuration.
+   *                          If any active instance has no configuration, it will be ignored.
+   * @param activeInstances   All the instances that are online and enabled.
+   * @return The set of all assignable nodes.
+   */
+  private static Set<AssignableNode> getAllAssignableNodes(ClusterConfig clusterConfig,
+      Map<String, InstanceConfig> instanceConfigMap, Set<String> activeInstances) {
+    return activeInstances.parallelStream()
+        .filter(instance -> instanceConfigMap.containsKey(instance)).map(
+            instanceName -> new AssignableNode(clusterConfig, instanceConfigMap.get(instanceName),
+                instanceName)).collect(Collectors.toSet());
+  }
+
+  /**
+   * Get all the replicas that need to be reallocated from the cluster data cache.
+   *
+   * @param dataProvider The cluster status cache that contains the current cluster status.
+   * @param resourceMap  All the valid resources that are managed by the rebalancer.
+   * @param assignableNodes All the active assignable nodes.
+   * @return A map of assignable replica sets, <ResourceName, replica set>.
+   */
+  private static Map<String, Set<AssignableReplica>> getAllAssignableReplicas(
+      ResourceControllerDataProvider dataProvider, Map<String, Resource> resourceMap,
+      Set<AssignableNode> assignableNodes) {
+    ClusterConfig clusterConfig = dataProvider.getClusterConfig();
+    int activeFaultZoneCount = assignableNodes.stream().map(node -> node.getFaultZone())
+        .collect(Collectors.toSet()).size();
+    return resourceMap.keySet().parallelStream().map(resourceName -> {
+      ResourceConfig resourceConfig = dataProvider.getResourceConfig(resourceName);
+      if (resourceConfig == null) {
+        resourceConfig = new ResourceConfig(resourceName);
+      }
+      IdealState is = dataProvider.getIdealState(resourceName);
+      if (is == null) {
+        throw new HelixException(
+            "Cannot find the resource ideal state for resource: " + resourceName);
+      }
+      String defName = is.getStateModelDefRef();
+      StateModelDefinition def = dataProvider.getStateModelDef(defName);
+      if (def == null) {
+        throw new IllegalArgumentException(String
+            .format("Cannot find state model definition %s for resource %s.",
+                is.getStateModelDefRef(), resourceName));
+      }
+      Map<String, Integer> stateCountMap =
+          def.getStateCountMap(activeFaultZoneCount, is.getReplicaCount(assignableNodes.size()));
+      mergeIdealStateWithResourceConfig(resourceConfig, is);
+      Set<AssignableReplica> replicas = new HashSet<>();
+      for (String partition : is.getPartitionSet()) {
+        for (Map.Entry<String, Integer> entry : stateCountMap.entrySet()) {
+          String state = entry.getKey();
+          for (int i = 0; i < entry.getValue(); i++) {
+            replicas.add(new AssignableReplica(clusterConfig, resourceConfig, partition, state,
+                def.getStatePriorityMap().get(state)));
+          }
+        }
+      }
+      return new HashMap.SimpleEntry<>(resourceName, replicas);
+    }).collect(Collectors.toMap(entry -> entry.getKey(), entry -> entry.getValue()));
+  }
+
+  /**
+   * For backward compatibility, propagate the critical simple fields from the IdealState to
+   * the Resource Config.
+   * Eventually, Resource Config should be the only metadata node that contains the required information.
+   */
+  private static void mergeIdealStateWithResourceConfig(ResourceConfig resourceConfig,
+      final IdealState idealState) {
+    // Note that the config fields updated in this method shall be fully compatible with the ones in the IdealState.
+    // 1. The fields shall have exactly the same meaning.
+    // 2. The value shall be exactly compatible, no additional calculation involved.
+    // 3. Resource Config items have a high priority.
+    // This is to ensure the resource config is not polluted after the merge.
+    if (null == resourceConfig.getRecord()
+        .getSimpleField(ResourceConfig.ResourceConfigProperty.INSTANCE_GROUP_TAG.name())) {
+      resourceConfig.getRecord()
+          .setSimpleField(ResourceConfig.ResourceConfigProperty.INSTANCE_GROUP_TAG.name(),
+              idealState.getInstanceGroupTag());
+    }
+    if (null == resourceConfig.getRecord()
+        .getSimpleField(ResourceConfig.ResourceConfigProperty.MAX_PARTITIONS_PER_INSTANCE.name())) {
+      resourceConfig.getRecord()
+          .setIntField(ResourceConfig.ResourceConfigProperty.MAX_PARTITIONS_PER_INSTANCE.name(),
+              idealState.getMaxPartitionsPerInstance());
+    }
+  }
+
+  /**
+   * @return A map containing the assignments for each fault zone. <fault zone, <resource, set of partitions>>
+   */
+  private static Map<String, Map<String, Set<String>>> mapAssignmentToFaultZone(
+      Set<AssignableNode> assignableNodes) {
+    Map<String, Map<String, Set<String>>> faultZoneAssignmentMap = new HashMap<>();
+    assignableNodes.stream().forEach(node -> {
+      for (Map.Entry<String, Set<String>> resourceMap : node.getAssignedPartitionsMap()
+          .entrySet()) {
+        faultZoneAssignmentMap.computeIfAbsent(node.getFaultZone(), k -> new HashMap<>())
+            .computeIfAbsent(resourceMap.getKey(), k -> new HashSet<>())
+            .addAll(resourceMap.getValue());
+      }
+    });
+    return faultZoneAssignmentMap;
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/OptimalAssignment.java b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/OptimalAssignment.java
new file mode 100644
index 0000000..1ff00c9
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/OptimalAssignment.java
@@ -0,0 +1,93 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.helix.HelixException;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceAssignment;
+
+/**
+ * This data model represents the optimal assignment of N replicas to M instances.
+ * It is mostly used as the return value of an assignment calculation algorithm. If the algorithm
+ * fails to find an optimal assignment, the user can check the recorded failure reasons.
+ * Note that this class is not thread safe.
+ */
+public class OptimalAssignment {
+  private Map<String, ResourceAssignment> _optimalAssignment = Collections.emptyMap();
+  private Map<AssignableReplica, Map<AssignableNode, List<String>>> _failedAssignments =
+      new HashMap<>();
+
+  /**
+   * Update the OptimalAssignment instance with the existing assignment recorded in the input cluster model.
+   *
+   * @param clusterModel The cluster model that records the current assignments.
+   */
+  public void updateAssignments(ClusterModel clusterModel) {
+    Map<String, ResourceAssignment> assignmentMap = new HashMap<>();
+    for (AssignableNode node : clusterModel.getAssignableNodes().values()) {
+      for (AssignableReplica replica : node.getAssignedReplicas()) {
+        String resourceName = replica.getResourceName();
+        Partition partition = new Partition(replica.getPartitionName());
+        ResourceAssignment resourceAssignment = assignmentMap
+            .computeIfAbsent(resourceName, key -> new ResourceAssignment(resourceName));
+        Map<String, String> partitionStateMap = resourceAssignment.getReplicaMap(partition);
+        if (partitionStateMap.isEmpty()) {
+          // ResourceAssignment returns an immutable empty map when no assignment has been
+          // recorded yet, so create a new mutable map before adding entries.
+          partitionStateMap = new HashMap<>();
+        }
+        partitionStateMap.put(node.getInstanceName(), replica.getReplicaState());
+        resourceAssignment.addReplicaMap(partition, partitionStateMap);
+      }
+    }
+    _optimalAssignment = assignmentMap;
+  }
+
+  /**
+   * @return The optimal assignment in the form of a <Resource Name, ResourceAssignment> map.
+   */
+  public Map<String, ResourceAssignment> getOptimalResourceAssignment() {
+    if (hasAnyFailure()) {
+      throw new HelixException(
+          "Cannot get the optimal resource assignment since a calculation failure is recorded. "
+              + getFailures());
+    }
+    return _optimalAssignment;
+  }
+
+  public void recordAssignmentFailure(AssignableReplica replica,
+      Map<AssignableNode, List<String>> failedReasons) {
+    _failedAssignments.put(replica, failedReasons);
+  }
+
+  public boolean hasAnyFailure() {
+    return !_failedAssignments.isEmpty();
+  }
+
+  public String getFailures() {
+    // TODO: format the error string
+    return _failedAssignments.toString();
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/controller/stages/AttributeName.java b/helix-core/src/main/java/org/apache/helix/controller/stages/AttributeName.java
index a2b63f8..b570568 100644
--- a/helix-core/src/main/java/org/apache/helix/controller/stages/AttributeName.java
+++ b/helix-core/src/main/java/org/apache/helix/controller/stages/AttributeName.java
@@ -38,5 +38,6 @@
   AsyncFIFOWorkerPool,
   PipelineType,
   LastRebalanceFinishTimeStamp,
-  ControllerDataProvider
+  ControllerDataProvider,
+  STATEFUL_REBALANCER
 }
diff --git a/helix-core/src/main/java/org/apache/helix/controller/stages/BestPossibleStateCalcStage.java b/helix-core/src/main/java/org/apache/helix/controller/stages/BestPossibleStateCalcStage.java
index 49a72e0..ffaac8f 100644
--- a/helix-core/src/main/java/org/apache/helix/controller/stages/BestPossibleStateCalcStage.java
+++ b/helix-core/src/main/java/org/apache/helix/controller/stages/BestPossibleStateCalcStage.java
@@ -20,13 +20,17 @@
  */
 
 import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.Callable;
+import java.util.stream.Collectors;
 
 import org.apache.helix.HelixException;
 import org.apache.helix.HelixManager;
+import org.apache.helix.HelixRebalanceException;
 import org.apache.helix.controller.LogUtil;
 import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
 import org.apache.helix.controller.pipeline.AbstractBaseStage;
@@ -37,6 +41,8 @@
 import org.apache.helix.controller.rebalancer.Rebalancer;
 import org.apache.helix.controller.rebalancer.SemiAutoRebalancer;
 import org.apache.helix.controller.rebalancer.internal.MappingCalculator;
+import org.apache.helix.controller.rebalancer.waged.WagedRebalancer;
+import org.apache.helix.model.ClusterConfig;
 import org.apache.helix.model.IdealState;
 import org.apache.helix.model.InstanceConfig;
 import org.apache.helix.model.MaintenanceSignal;
@@ -56,18 +62,19 @@
  * IdealState,StateModel,LiveInstance
  */
 public class BestPossibleStateCalcStage extends AbstractBaseStage {
-  private static final Logger logger = LoggerFactory.getLogger(BestPossibleStateCalcStage.class.getName());
+  private static final Logger logger =
+      LoggerFactory.getLogger(BestPossibleStateCalcStage.class.getName());
 
   @Override
   public void process(ClusterEvent event) throws Exception {
     _eventId = event.getEventId();
-    CurrentStateOutput currentStateOutput =
-        event.getAttribute(AttributeName.CURRENT_STATE.name());
+    CurrentStateOutput currentStateOutput = event.getAttribute(AttributeName.CURRENT_STATE.name());
     final Map<String, Resource> resourceMap =
         event.getAttribute(AttributeName.RESOURCES_TO_REBALANCE.name());
     final ClusterStatusMonitor clusterStatusMonitor =
         event.getAttribute(AttributeName.clusterStatusMonitor.name());
-    ResourceControllerDataProvider cache = event.getAttribute(AttributeName.ControllerDataProvider.name());
+    ResourceControllerDataProvider cache =
+        event.getAttribute(AttributeName.ControllerDataProvider.name());
 
     if (currentStateOutput == null || resourceMap == null || cache == null) {
       throw new StageException(
@@ -90,8 +97,7 @@
                     resourceMap, stateModelDefMap);
           }
         } catch (Exception e) {
-          LogUtil
-              .logError(logger, _eventId, "Could not update cluster status metrics!", e);
+          LogUtil.logError(logger, _eventId, "Could not update cluster status metrics!", e);
         }
         return null;
       }
@@ -100,43 +106,57 @@
 
   private BestPossibleStateOutput compute(ClusterEvent event, Map<String, Resource> resourceMap,
       CurrentStateOutput currentStateOutput) {
-    ResourceControllerDataProvider cache = event.getAttribute(AttributeName.ControllerDataProvider.name());
+    ResourceControllerDataProvider cache =
+        event.getAttribute(AttributeName.ControllerDataProvider.name());
     BestPossibleStateOutput output = new BestPossibleStateOutput();
 
     HelixManager helixManager = event.getAttribute(AttributeName.helixmanager.name());
     ClusterStatusMonitor clusterStatusMonitor =
         event.getAttribute(AttributeName.clusterStatusMonitor.name());
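+    // The stateful WAGED rebalancer instance is created by the controller pipeline and passed in
+    // via the event. It may be null if the pipeline does not provide one (see the check below).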
+    WagedRebalancer wagedRebalancer = event.getAttribute(AttributeName.STATEFUL_REBALANCER.name());
 
     // Check whether the offline/disabled instance count in the cluster reaches the set limit,
     // if yes, pause the rebalancer.
-    boolean isValid = validateOfflineInstancesLimit(cache,
-        (HelixManager) event.getAttribute(AttributeName.helixmanager.name()));
+    boolean isValid =
+        validateOfflineInstancesLimit(cache, event.getAttribute(AttributeName.helixmanager.name()));
 
     final List<String> failureResources = new ArrayList<>();
-    Iterator<Resource> itr = resourceMap.values().iterator();
+
+    Map<String, Resource> calculatedResourceMap =
+        computeResourceBestPossibleStateWithWagedRebalancer(wagedRebalancer, cache,
+            currentStateOutput, resourceMap, output, failureResources);
+
+    Map<String, Resource> remainingResourceMap = new HashMap<>(resourceMap);
+    remainingResourceMap.keySet().removeAll(calculatedResourceMap.keySet());
+
+    // Fallback to the original single resource rebalancer calculation.
+    // This is required because we support mixed cluster that uses both WAGED rebalancer and the
+    // older rebalancers.
+    Iterator<Resource> itr = remainingResourceMap.values().iterator();
     while (itr.hasNext()) {
       Resource resource = itr.next();
       boolean result = false;
       try {
-        result =
-            computeResourceBestPossibleState(event, cache, currentStateOutput, resource, output);
+        result = computeSingleResourceBestPossibleState(event, cache, currentStateOutput, resource,
+            output);
       } catch (HelixException ex) {
-        LogUtil.logError(logger, _eventId,
-            "Exception when calculating best possible states for " + resource.getResourceName(),
-            ex);
+        LogUtil.logError(logger, _eventId, String
+            .format("Exception when calculating best possible states for %s",
+                resource.getResourceName()), ex);
 
       }
       if (!result) {
         failureResources.add(resource.getResourceName());
-        LogUtil.logWarn(logger, _eventId,
-            "Failed to calculate best possible states for " + resource.getResourceName());
+        LogUtil.logWarn(logger, _eventId, String
+            .format("Failed to calculate best possible states for %s", resource.getResourceName()));
       }
     }
 
     // Check and report if resource rebalance has failure
     updateRebalanceStatus(!isValid || !failureResources.isEmpty(), failureResources, helixManager,
-        cache, clusterStatusMonitor,
-        "Failed to calculate best possible states for " + failureResources.size() + " resources.");
+        cache, clusterStatusMonitor, String
+            .format("Failed to calculate best possible states for %d resources.",
+                failureResources.size()));
 
     return output;
   }
@@ -185,8 +205,9 @@
         if (manager != null) {
           if (manager.getHelixDataAccessor()
               .getProperty(manager.getHelixDataAccessor().keyBuilder().maintenance()) == null) {
-            manager.getClusterManagmentTool().autoEnableMaintenanceMode(manager.getClusterName(),
-                true, errMsg, MaintenanceSignal.AutoTriggerReason.MAX_OFFLINE_INSTANCES_EXCEEDED);
+            manager.getClusterManagmentTool()
+                .autoEnableMaintenanceMode(manager.getClusterName(), true, errMsg,
+                    MaintenanceSignal.AutoTriggerReason.MAX_OFFLINE_INSTANCES_EXCEEDED);
             LogUtil.logWarn(logger, _eventId, errMsg);
           }
         } else {
@@ -199,8 +220,98 @@
     return true;
   }
 
-  private boolean computeResourceBestPossibleState(ClusterEvent event, ResourceControllerDataProvider cache,
-      CurrentStateOutput currentStateOutput, Resource resource, BestPossibleStateOutput output) {
+  private void updateWagedRebalancer(WagedRebalancer wagedRebalancer, ClusterConfig clusterConfig) {
+    if (clusterConfig != null) {
+      // Since the rebalance configuration can be updated at runtime, try to update the rebalancer
+      // before calculating.
+      wagedRebalancer.updateRebalancePreference(clusterConfig.getGlobalRebalancePreference());
+      wagedRebalancer
+          .setGlobalRebalanceAsyncMode(clusterConfig.isGlobalRebalanceAsyncModeEnabled());
+    }
+  }
+
+  /**
+   * Rebalance with the WAGED rebalancer.
+   * The rebalancer only calculates the new ideal assignment for all the resources that are
+   * configured to use the WAGED rebalancer.
+   *
+   * @param wagedRebalancer    The WAGED rebalancer instance.
+   * @param cache              Cluster data cache.
+   * @param currentStateOutput The current state information.
+   * @param resourceMap        The complete resource map. The method will filter the map for the compatible resources.
+   * @param output             The best possible state output.
+   * @param failureResources   The failure records that will be updated if any resource cannot be computed.
+   * @return The map of all the calculated resources.
+   */
+  private Map<String, Resource> computeResourceBestPossibleStateWithWagedRebalancer(
+      WagedRebalancer wagedRebalancer, ResourceControllerDataProvider cache,
+      CurrentStateOutput currentStateOutput, Map<String, Resource> resourceMap,
+      BestPossibleStateOutput output, List<String> failureResources) {
+    if (cache.isMaintenanceModeEnabled()) {
+      // The WAGED rebalancer won't be used while maintenance mode is enabled.
+      return Collections.emptyMap();
+    }
+
+    // Find the compatible resources: 1. FULL_AUTO 2. Configured to use the WAGED rebalancer
+    Map<String, Resource> wagedRebalancedResourceMap =
+        resourceMap.entrySet().stream().filter(resourceEntry -> {
+          IdealState is = cache.getIdealState(resourceEntry.getKey());
+          return is != null && is.getRebalanceMode().equals(IdealState.RebalanceMode.FULL_AUTO)
+              && WagedRebalancer.class.getName().equals(is.getRebalancerClassName());
+        }).collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
+
+    Map<String, IdealState> newIdealStates = new HashMap<>();
+
+    if (wagedRebalancer != null) {
+      updateWagedRebalancer(wagedRebalancer, cache.getClusterConfig());
+      try {
+        newIdealStates.putAll(wagedRebalancer
+            .computeNewIdealStates(cache, wagedRebalancedResourceMap, currentStateOutput));
+      } catch (HelixRebalanceException ex) {
+        // Note that unlike the legacy rebalancers, the WAGED rebalancer won't return a partial result.
+        // Since it calculates for all the eligible resources globally, a partial result is invalid.
+        // TODO propagate the rebalancer failure information to updateRebalanceStatus for monitoring.
+        LogUtil.logError(logger, _eventId, String
+            .format("Failed to calculate the new Ideal States using the rebalancer %s due to %s",
+                wagedRebalancer.getClass().getSimpleName(), ex.getFailureType()), ex);
+      }
+    } else {
+      LogUtil.logError(logger, _eventId,
+          "Skip rebalancing using the WAGED rebalancer since it is not configured in the rebalance pipeline.");
+    }
+
+    Iterator<Resource> itr = wagedRebalancedResourceMap.values().iterator();
+    while (itr.hasNext()) {
+      Resource resource = itr.next();
+      IdealState is = newIdealStates.get(resource.getResourceName());
+      // Check if the WAGED rebalancer has calculated the result for this resource or not.
+      if (is != null && checkBestPossibleStateCalculation(is)) {
+        // The WAGED rebalancer calculated a valid result; record it in the output.
+        updateBestPossibleStateOutput(output, resource, is);
+      } else {
+        failureResources.add(resource.getResourceName());
+        LogUtil.logWarn(logger, _eventId, String
+            .format("Failed to calculate best possible states for %s.",
+                resource.getResourceName()));
+      }
+    }
+    return wagedRebalancedResourceMap;
+  }
+
+  private void updateBestPossibleStateOutput(BestPossibleStateOutput output, Resource resource,
+      IdealState computedIdealState) {
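+    // The computed IdealState carries both the preference lists and the per-partition state
+    // maps; copy both into the stage output.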
+    output.setPreferenceLists(resource.getResourceName(), computedIdealState.getPreferenceLists());
+    for (Partition partition : resource.getPartitions()) {
+      Map<String, String> newStateMap =
+          computedIdealState.getInstanceStateMap(partition.getPartitionName());
+      output.setState(resource.getResourceName(), partition, newStateMap);
+    }
+  }
+
+  private boolean computeSingleResourceBestPossibleState(ClusterEvent event,
+      ResourceControllerDataProvider cache, CurrentStateOutput currentStateOutput,
+      Resource resource, BestPossibleStateOutput output) {
     // for each ideal state
     // read the state model def
     // for each resource
@@ -229,12 +340,13 @@
 
     Rebalancer<ResourceControllerDataProvider> rebalancer =
         getRebalancer(idealState, resourceName, cache.isMaintenanceModeEnabled());
-    MappingCalculator<ResourceControllerDataProvider> mappingCalculator = getMappingCalculator(rebalancer, resourceName);
+    MappingCalculator<ResourceControllerDataProvider> mappingCalculator =
+        getMappingCalculator(rebalancer, resourceName);
 
     if (rebalancer == null || mappingCalculator == null) {
-      LogUtil.logError(logger, _eventId,
-          "Error computing assignment for resource " + resourceName + ". no rebalancer found. rebalancer: " + rebalancer
-              + " mappingCalculator: " + mappingCalculator);
+      LogUtil.logError(logger, _eventId, "Error computing assignment for resource " + resourceName
+          + ". no rebalancer found. rebalancer: " + rebalancer + " mappingCalculator: "
+          + mappingCalculator);
     }
 
     if (rebalancer != null && mappingCalculator != null) {
@@ -299,10 +411,9 @@
     }
   }
 
-  private Rebalancer<ResourceControllerDataProvider> getRebalancer(IdealState idealState, String resourceName,
-      boolean isMaintenanceModeEnabled) {
+  private Rebalancer<ResourceControllerDataProvider> getCustomizedRebalancer(
+      String rebalancerClassName, String resourceName) {
     Rebalancer<ResourceControllerDataProvider> customizedRebalancer = null;
-    String rebalancerClassName = idealState.getRebalancerClassName();
     if (rebalancerClassName != null) {
       if (logger.isDebugEnabled()) {
         LogUtil.logDebug(logger, _eventId,
@@ -316,13 +427,19 @@
             "Exception while invoking custom rebalancer class:" + rebalancerClassName, e);
       }
     }
+    return customizedRebalancer;
+  }
 
+  private Rebalancer<ResourceControllerDataProvider> getRebalancer(IdealState idealState,
+      String resourceName, boolean isMaintenanceModeEnabled) {
     Rebalancer<ResourceControllerDataProvider> rebalancer = null;
     switch (idealState.getRebalanceMode()) {
     case FULL_AUTO:
       if (isMaintenanceModeEnabled) {
         rebalancer = new MaintenanceRebalancer();
       } else {
+        Rebalancer<ResourceControllerDataProvider> customizedRebalancer =
+            getCustomizedRebalancer(idealState.getRebalancerClassName(), resourceName);
         if (customizedRebalancer != null) {
           rebalancer = customizedRebalancer;
         } else {
@@ -338,14 +455,13 @@
       break;
     case USER_DEFINED:
     case TASK:
-      rebalancer = customizedRebalancer;
+      rebalancer = getCustomizedRebalancer(idealState.getRebalancerClassName(), resourceName);
       break;
     default:
       LogUtil.logError(logger, _eventId,
           "Fail to find the rebalancer, invalid rebalance mode " + idealState.getRebalanceMode());
       break;
     }
-
     return rebalancer;
   }
 
diff --git a/helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java b/helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
index 66da8ba..62fda33 100644
--- a/helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
+++ b/helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
@@ -20,19 +20,31 @@
  */
 
 import java.util.Collection;
+import java.util.Collections;
 import java.util.List;
 import java.util.Map;
+import java.util.concurrent.ExecutorService;
+import java.util.stream.Collectors;
 
 import org.apache.helix.controller.LogUtil;
 import org.apache.helix.controller.dataproviders.BaseControllerDataProvider;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
 import org.apache.helix.controller.pipeline.AbstractBaseStage;
 import org.apache.helix.controller.pipeline.StageException;
+import org.apache.helix.controller.rebalancer.util.ResourceUsageCalculator;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModel;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModelProvider;
 import org.apache.helix.model.CurrentState;
+import org.apache.helix.model.IdealState;
 import org.apache.helix.model.LiveInstance;
 import org.apache.helix.model.Message;
 import org.apache.helix.model.Message.MessageType;
 import org.apache.helix.model.Partition;
 import org.apache.helix.model.Resource;
+import org.apache.helix.model.ResourceAssignment;
+import org.apache.helix.model.ResourceConfig;
+import org.apache.helix.monitoring.mbeans.ClusterStatusMonitor;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -50,6 +62,8 @@
     _eventId = event.getEventId();
     BaseControllerDataProvider cache = event.getAttribute(AttributeName.ControllerDataProvider.name());
     final Map<String, Resource> resourceMap = event.getAttribute(AttributeName.RESOURCES.name());
+    final Map<String, Resource> resourceToRebalance =
+        event.getAttribute(AttributeName.RESOURCES_TO_REBALANCE.name());
 
     if (cache == null || resourceMap == null) {
       throw new StageException("Missing attributes in event:" + event
@@ -74,6 +88,16 @@
       updateCurrentStates(instance, currentStateMap.values(), currentStateOutput, resourceMap);
     }
     event.addAttribute(AttributeName.CURRENT_STATE.name(), currentStateOutput);
+
+    final ClusterStatusMonitor clusterStatusMonitor =
+        event.getAttribute(AttributeName.clusterStatusMonitor.name());
+    if (clusterStatusMonitor != null && cache instanceof ResourceControllerDataProvider) {
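+      // Capacity metrics only apply to the resource pipeline, which uses the
+      // ResourceControllerDataProvider; other pipelines skip this reporting.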
+      final ResourceControllerDataProvider dataProvider = (ResourceControllerDataProvider) cache;
+      reportInstanceCapacityMetrics(clusterStatusMonitor, dataProvider, resourceToRebalance,
+          currentStateOutput);
+      reportResourcePartitionCapacityMetrics(dataProvider.getAsyncTasksThreadPool(),
+          clusterStatusMonitor, dataProvider.getResourceConfigMap().values());
+    }
   }
 
   // update all pending messages to CurrentStateOutput.
@@ -220,4 +244,55 @@
       currentStateOutput.setCancellationMessage(resourceName, partition, instanceName, message);
     }
   }
+
+  private void reportInstanceCapacityMetrics(ClusterStatusMonitor clusterStatusMonitor,
+      ResourceControllerDataProvider dataProvider, Map<String, Resource> resourceMap,
+      CurrentStateOutput currentStateOutput) {
+    asyncExecute(dataProvider.getAsyncTasksThreadPool(), () -> {
+      try {
+        // ResourceToRebalance map also has resources from current states.
+        // Only use the resources in ideal states to parse all replicas.
+        Map<String, IdealState> idealStateMap = dataProvider.getIdealStates();
+        Map<String, Resource> resourceToMonitorMap = resourceMap.entrySet().stream()
+            .filter(idealStateMap::containsKey)
+            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
+
+        Map<String, ResourceAssignment> currentStateAssignment =
+            currentStateOutput.getAssignment(resourceToMonitorMap.keySet());
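+        // Build the cluster model from the existing assignment so that the projected utilization
+        // computed below reflects the replicas each instance currently hosts.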
+        ClusterModel clusterModel = ClusterModelProvider.generateClusterModelFromExistingAssignment(
+            dataProvider, resourceToMonitorMap, currentStateAssignment);
+
+        for (AssignableNode node : clusterModel.getAssignableNodes().values()) {
+          String instanceName = node.getInstanceName();
+          // There is no new usage adding to this node, so an empty map is passed in.
+          double usage = node.getProjectedHighestUtilization(Collections.emptyMap());
+          clusterStatusMonitor
+              .updateInstanceCapacityStatus(instanceName, usage, node.getMaxCapacity());
+        }
+      } catch (Exception ex) {
+        LOG.error("Failed to report instance capacity metrics. Exception message: {}",
+            ex.getMessage());
+      }
+
+      return null;
+    });
+  }
+
+  private void reportResourcePartitionCapacityMetrics(ExecutorService executorService,
+      ClusterStatusMonitor clusterStatusMonitor, Collection<ResourceConfig> resourceConfigs) {
+    asyncExecute(executorService, () -> {
+      try {
+        for (ResourceConfig config : resourceConfigs) {
+          Map<String, Integer> averageWeight = ResourceUsageCalculator
+              .calculateAveragePartitionWeight(config.getPartitionCapacityMap());
+          clusterStatusMonitor.updatePartitionWeight(config.getResourceName(), averageWeight);
+        }
+      } catch (Exception ex) {
+        LOG.error("Failed to report resource partition capacity metrics. Exception message: {}",
+            ex.getMessage());
+      }
+
+      return null;
+    });
+  }
 }
diff --git a/helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateOutput.java b/helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateOutput.java
index bbbf0fd..752a760 100644
--- a/helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateOutput.java
+++ b/helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateOutput.java
@@ -28,6 +28,7 @@
 import org.apache.helix.model.CurrentState;
 import org.apache.helix.model.Message;
 import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceAssignment;
 
 /**
  * The current state includes both current state and pending messages
@@ -428,4 +429,26 @@
     return sb.toString();
   }
 
+  /**
+   * Get current state assignment for a set of resources.
+   * @param resourceSet a set of resources' names
+   * @return a map of current state resource assignment, {resourceName: resourceAssignment}
+   */
+  public Map<String, ResourceAssignment> getAssignment(Set<String> resourceSet) {
+    Map<String, ResourceAssignment> currentStateAssignment = new HashMap<>();
+    for (String resourceName : resourceSet) {
+      Map<Partition, Map<String, String>> currentStateMap = getCurrentStateMap(resourceName);
+      if (!currentStateMap.isEmpty()) {
+        ResourceAssignment newResourceAssignment = new ResourceAssignment(resourceName);
+        currentStateMap.forEach(newResourceAssignment::addReplicaMap);
+        currentStateAssignment.put(resourceName, newResourceAssignment);
+      }
+    }
+
+    return currentStateAssignment;
+  }
 }
diff --git a/helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixAdmin.java b/helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixAdmin.java
index 0a978e5..61e75b3 100644
--- a/helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixAdmin.java
+++ b/helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixAdmin.java
@@ -57,6 +57,10 @@
 import org.apache.helix.controller.rebalancer.DelayedAutoRebalancer;
 import org.apache.helix.controller.rebalancer.strategy.CrushEdRebalanceStrategy;
 import org.apache.helix.controller.rebalancer.strategy.RebalanceStrategy;
+import org.apache.helix.controller.rebalancer.util.WagedValidationUtil;
+import org.apache.helix.controller.rebalancer.waged.WagedRebalancer;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
 import org.apache.helix.manager.zk.client.HelixZkClient;
 import org.apache.helix.manager.zk.client.SharedZkClientFactory;
 import org.apache.helix.model.ClusterConfig;
@@ -76,6 +80,7 @@
 import org.apache.helix.model.Message.MessageState;
 import org.apache.helix.model.Message.MessageType;
 import org.apache.helix.model.PauseSignal;
+import org.apache.helix.model.ResourceConfig;
 import org.apache.helix.model.StateModelDefinition;
 import org.apache.helix.tools.DefaultIdealStateCalculator;
 import org.apache.helix.util.HelixUtil;
@@ -84,6 +89,7 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+
 public class ZKHelixAdmin implements HelixAdmin {
   public static final String CONNECTION_TIMEOUT = "helixAdmin.timeOutInSec";
   private static final String MAINTENANCE_ZNODE_ID = "maintenance";
@@ -180,7 +186,7 @@
           // does not repeatedly write instance history)
           logger.warn("Retrying dropping instance {} with exception {}",
               instanceConfig.getInstanceName(), e.getCause().getMessage());
-          retryCnt ++;
+          retryCnt++;
         } else {
           logger.error("Failed to drop instance {} (not retryable).",
               instanceConfig.getInstanceName(), e.getCause());
@@ -403,7 +409,8 @@
     HelixDataAccessor accessor =
         new ZKHelixDataAccessor(clusterName, new ZkBaseDataAccessor<ZNRecord>(_zkClient));
     Builder keyBuilder = accessor.keyBuilder();
-    return accessor.getBaseDataAccessor().exists(keyBuilder.maintenance().getPath(), AccessOption.PERSISTENT);
+    return accessor.getBaseDataAccessor()
+        .exists(keyBuilder.maintenance().getPath(), AccessOption.PERSISTENT);
   }
 
   @Override
@@ -436,16 +443,16 @@
    * @param customFields
    * @param triggeringEntity
    */
-  private void processMaintenanceMode(String clusterName, final boolean enabled, final String reason,
-      final MaintenanceSignal.AutoTriggerReason internalReason, final Map<String, String> customFields,
+  private void processMaintenanceMode(String clusterName, final boolean enabled,
+      final String reason, final MaintenanceSignal.AutoTriggerReason internalReason,
+      final Map<String, String> customFields,
       final MaintenanceSignal.TriggeringEntity triggeringEntity) {
     HelixDataAccessor accessor =
         new ZKHelixDataAccessor(clusterName, new ZkBaseDataAccessor<ZNRecord>(_zkClient));
     Builder keyBuilder = accessor.keyBuilder();
     logger.info("Cluster {} {} {} maintenance mode for reason {}.", clusterName,
         triggeringEntity == MaintenanceSignal.TriggeringEntity.CONTROLLER ? "automatically"
-            : "manually",
-        enabled ? "enters" : "exits", reason == null ? "NULL" : reason);
+            : "manually", enabled ? "enters" : "exits", reason == null ? "NULL" : reason);
     final long currentTime = System.currentTimeMillis();
     if (!enabled) {
       // Exit maintenance mode
@@ -459,23 +466,23 @@
       maintenanceSignal.setTimestamp(currentTime);
       maintenanceSignal.setTriggeringEntity(triggeringEntity);
       switch (triggeringEntity) {
-      case CONTROLLER:
-        // autoEnable
-        maintenanceSignal.setAutoTriggerReason(internalReason);
-        break;
-      case USER:
-      case UNKNOWN:
-        // manuallyEnable
-        if (customFields != null && !customFields.isEmpty()) {
-          // Enter all custom fields provided by the user
-          Map<String, String> simpleFields = maintenanceSignal.getRecord().getSimpleFields();
-          for (Map.Entry<String, String> entry : customFields.entrySet()) {
-            if (!simpleFields.containsKey(entry.getKey())) {
-              simpleFields.put(entry.getKey(), entry.getValue());
+        case CONTROLLER:
+          // autoEnable
+          maintenanceSignal.setAutoTriggerReason(internalReason);
+          break;
+        case USER:
+        case UNKNOWN:
+          // manuallyEnable
+          if (customFields != null && !customFields.isEmpty()) {
+            // Enter all custom fields provided by the user
+            Map<String, String> simpleFields = maintenanceSignal.getRecord().getSimpleFields();
+            for (Map.Entry<String, String> entry : customFields.entrySet()) {
+              if (!simpleFields.containsKey(entry.getKey())) {
+                simpleFields.put(entry.getKey(), entry.getValue());
+              }
             }
           }
-        }
-        break;
+          break;
       }
       if (!accessor.createMaintenance(maintenanceSignal)) {
         throw new HelixException("Failed to create maintenance signal!");
@@ -483,16 +490,17 @@
     }
 
     // Record a MaintenanceSignal history
-    if (!accessor.getBaseDataAccessor().update(keyBuilder.controllerLeaderHistory().getPath(),
-        new DataUpdater<ZNRecord>() {
+    if (!accessor.getBaseDataAccessor()
+        .update(keyBuilder.controllerLeaderHistory().getPath(), new DataUpdater<ZNRecord>() {
           @Override
           public ZNRecord update(ZNRecord oldRecord) {
             try {
               if (oldRecord == null) {
                 oldRecord = new ZNRecord(PropertyType.HISTORY.toString());
               }
-              return new ControllerHistory(oldRecord).updateMaintenanceHistory(enabled, reason,
-                  currentTime, internalReason, customFields, triggeringEntity);
+              return new ControllerHistory(oldRecord)
+                  .updateMaintenanceHistory(enabled, reason, currentTime, internalReason,
+                      customFields, triggeringEntity);
             } catch (IOException e) {
               logger.error("Failed to update maintenance history! Exception: {}", e);
               return oldRecord;
@@ -1241,7 +1249,8 @@
     setResourceIdealState(clusterName, resourceName, new IdealState(idealStateRecord));
   }
 
-  private static byte[] readFile(String filePath) throws IOException {
+  private static byte[] readFile(String filePath)
+      throws IOException {
     File file = new File(filePath);
 
     int size = (int) file.length();
@@ -1264,7 +1273,8 @@
 
   @Override
   public void addStateModelDef(String clusterName, String stateModelDefName,
-      String stateModelDefFile) throws IOException {
+      String stateModelDefFile)
+      throws IOException {
     ZNRecord record =
         (ZNRecord) (new ZNRecordSerializer().deserialize(readFile(stateModelDefFile)));
     if (record == null || record.getId() == null || !record.getId().equals(stateModelDefName)) {
@@ -1287,9 +1297,9 @@
     baseAccessor.update(path, new DataUpdater<ZNRecord>() {
       @Override
       public ZNRecord update(ZNRecord currentData) {
-        ClusterConstraints constraints = currentData == null ?
-            new ClusterConstraints(constraintType) :
-            new ClusterConstraints(currentData);
+        ClusterConstraints constraints =
+            currentData == null ? new ClusterConstraints(constraintType)
+                : new ClusterConstraints(currentData);
 
         constraints.addConstraintItem(constraintId, constraintItem);
         return constraints.getRecord();
@@ -1495,9 +1505,7 @@
           + ", instance config does not exist");
     }
 
-    baseAccessor.update(path, new DataUpdater<ZNRecord>()
-
-    {
+    baseAccessor.update(path, new DataUpdater<ZNRecord>() {
       @Override
       public ZNRecord update(ZNRecord currentData) {
         if (currentData == null) {
@@ -1587,4 +1595,212 @@
       _zkClient.close();
     }
   }
+
+  @Override
+  public boolean addResourceWithWeight(String clusterName, IdealState idealState,
+      ResourceConfig resourceConfig) {
+    // Null checks
+    if (clusterName == null || clusterName.isEmpty()) {
+      throw new HelixException("Cluster name is null or empty!");
+    }
+    if (idealState == null || !idealState.isValid()) {
+      throw new HelixException("IdealState is null or invalid!");
+    }
+    if (resourceConfig == null || !resourceConfig.isValid()) {
+      // TODO This might be okay because of default weight?
+      throw new HelixException("ResourceConfig is null or invalid!");
+    }
+
+    // Make sure IdealState and ResourceConfig are for the same resource
+    if (!idealState.getResourceName().equals(resourceConfig.getResourceName())) {
+      throw new HelixException("Resource names in IdealState and ResourceConfig are different!");
+    }
+
+    // Order in which a resource should be added:
+    // 1. Validate the weights in ResourceConfig against ClusterConfig
+    // Check that all capacity keys in ClusterConfig are set up in every partition in ResourceConfig field
+    if (!validateWeightForResourceConfig(_configAccessor.getClusterConfig(clusterName),
+        resourceConfig, idealState)) {
+      throw new HelixException(String
+          .format("Could not add resource %s with weight! Failed to validate the ResourceConfig!",
+              idealState.getResourceName()));
+    }
+
+    // 2. Add the resourceConfig to ZK
+    _configAccessor
+        .setResourceConfig(clusterName, resourceConfig.getResourceName(), resourceConfig);
+
+    // 3. Add the idealState to ZK
+    setResourceIdealState(clusterName, idealState.getResourceName(), idealState);
+
+    // 4. rebalance the resource
+    rebalance(clusterName, idealState.getResourceName(), Integer.parseInt(idealState.getReplicas()),
+        idealState.getResourceName(), idealState.getInstanceGroupTag());
+
+    return true;
+  }
+
+  @Override
+  public boolean enableWagedRebalance(String clusterName, List<String> resourceNames) {
+    // Null checks
+    if (clusterName == null || clusterName.isEmpty()) {
+      throw new HelixException("Cluster name is invalid!");
+    }
+    if (resourceNames == null || resourceNames.isEmpty()) {
+      throw new HelixException("Resource name list is invalid!");
+    }
+
+    HelixDataAccessor accessor =
+        new ZKHelixDataAccessor(clusterName, new ZkBaseDataAccessor<>(_zkClient));
+    Builder keyBuilder = accessor.keyBuilder();
+    // Fetch only the requested IdealStates so the returned list aligns with resourceNames by
+    // index; missing ZNodes come back as null entries.
+    List<PropertyKey> keys = new ArrayList<>();
+    for (String resourceName : resourceNames) {
+      keys.add(keyBuilder.idealStates(resourceName));
+    }
+    List<IdealState> idealStates = accessor.getProperty(keys);
+    List<String> nullIdealStates = new ArrayList<>();
+    for (int i = 0; i < idealStates.size(); i++) {
+      if (idealStates.get(i) == null) {
+        nullIdealStates.add(resourceNames.get(i));
+      } else {
+        idealStates.get(i).setRebalancerClassName(WagedRebalancer.class.getName());
+        idealStates.get(i).setRebalanceMode(RebalanceMode.FULL_AUTO);
+      }
+    }
+    if (!nullIdealStates.isEmpty()) {
+      throw new HelixException(
+          String.format("Not all IdealStates exist in the cluster: %s", nullIdealStates));
+    }
+    List<PropertyKey> idealStateKeys = new ArrayList<>();
+    idealStates.forEach(
+        idealState -> idealStateKeys.add(keyBuilder.idealStates(idealState.getResourceName())));
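+    // Write all the updated IdealStates back in one batch; fail the whole operation if any
+    // single write fails.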
+    boolean[] success = accessor.setChildren(idealStateKeys, idealStates);
+    for (boolean s : success) {
+      if (!s) {
+        return false;
+      }
+    }
+    return true;
+  }
+
+  @Override
+  public Map<String, Boolean> validateResourcesForWagedRebalance(String clusterName,
+      List<String> resourceNames) {
+    // Null checks
+    if (clusterName == null || clusterName.isEmpty()) {
+      throw new HelixException("Cluster name is invalid!");
+    }
+    if (resourceNames == null || resourceNames.isEmpty()) {
+      throw new HelixException("Resource name list is invalid!");
+    }
+
+    // Ensure that all instances are valid
+    HelixDataAccessor accessor =
+        new ZKHelixDataAccessor(clusterName, new ZkBaseDataAccessor<>(_zkClient));
+    Builder keyBuilder = accessor.keyBuilder();
+    List<String> instances = accessor.getChildNames(keyBuilder.instanceConfigs());
+    if (validateInstancesForWagedRebalance(clusterName, instances).containsValue(false)) {
+      throw new HelixException(String
+          .format("Instance capacities haven't been configured properly for cluster %s",
+              clusterName));
+    }
+
+    Map<String, Boolean> result = new HashMap<>();
+    ClusterConfig clusterConfig = _configAccessor.getClusterConfig(clusterName);
+    for (String resourceName : resourceNames) {
+      IdealState idealState = getResourceIdealState(clusterName, resourceName);
+      if (idealState == null || !idealState.isValid()) {
+        result.put(resourceName, false);
+        continue;
+      }
+      ResourceConfig resourceConfig = _configAccessor.getResourceConfig(clusterName, resourceName);
+      result.put(resourceName,
+          validateWeightForResourceConfig(clusterConfig, resourceConfig, idealState));
+    }
+    return result;
+  }
+
+  @Override
+  public Map<String, Boolean> validateInstancesForWagedRebalance(String clusterName,
+      List<String> instanceNames) {
+    // Null checks
+    if (clusterName == null || clusterName.isEmpty()) {
+      throw new HelixException("Cluster name is invalid!");
+    }
+    if (instanceNames == null || instanceNames.isEmpty()) {
+      throw new HelixException("Instance name list is invalid!");
+    }
+
+    Map<String, Boolean> result = new HashMap<>();
+    ClusterConfig clusterConfig = _configAccessor.getClusterConfig(clusterName);
+    for (String instanceName : instanceNames) {
+      InstanceConfig instanceConfig = _configAccessor.getInstanceConfig(clusterName, instanceName);
+      if (instanceConfig == null || !instanceConfig.isValid()) {
+        result.put(instanceName, false);
+        continue;
+      }
+      // The validation util throws if the instance capacity configuration is missing required
+      // keys; translate that into a negative validation result instead of aborting the check.
+      try {
+        WagedValidationUtil.validateAndGetInstanceCapacity(clusterConfig, instanceConfig);
+        result.put(instanceName, true);
+      } catch (HelixException ex) {
+        result.put(instanceName, false);
+      }
+    }
+
+    return result;
+  }
+
+  /**
+   * Validates ResourceConfig's weight field against the given ClusterConfig.
+   * @param clusterConfig
+   * @param resourceConfig
+   * @param idealState
+   * @return true if ResourceConfig has all the required fields. False otherwise.
+   */
+  private boolean validateWeightForResourceConfig(ClusterConfig clusterConfig,
+      ResourceConfig resourceConfig, IdealState idealState) {
+    if (resourceConfig == null) {
+      if (clusterConfig.getDefaultPartitionWeightMap().isEmpty()) {
+        logger.error(
+            "ResourceConfig for {} is null, and there are no default weights set in ClusterConfig!",
+            idealState.getResourceName());
+        return false;
+      }
+      // ResourceConfig is null but a default partition weight map is defined in ClusterConfig.
+      // The default weights will be used, so this is valid as long as the map contains all the
+      // required capacity keys.
+      if (clusterConfig.getDefaultPartitionWeightMap().keySet()
+          .containsAll(clusterConfig.getInstanceCapacityKeys())) {
+        return true;
+      }
+      logger.error(
+          "ResourceConfig for {} is null, and ClusterConfig's default partition weight map doesn't have all the required keys!",
+          idealState.getResourceName());
+      return false;
+    }
+
+    // Parse the entire capacityMap from ResourceConfig
+    Map<String, Map<String, Integer>> capacityMap;
+    try {
+      capacityMap = resourceConfig.getPartitionCapacityMap();
+    } catch (IOException ex) {
+      logger.error("Invalid partition capacity configuration of resource: {}",
+          idealState.getResourceName(), ex);
+      return false;
+    }
+
+    Set<String> capacityMapSet = new HashSet<>(capacityMap.keySet());
+    boolean hasDefaultCapacity = capacityMapSet.contains(ResourceConfig.DEFAULT_PARTITION_KEY);
+    // Remove DEFAULT key
+    capacityMapSet.remove(ResourceConfig.DEFAULT_PARTITION_KEY);
+
+    // Make sure capacityMap contains all partitions defined in IdealState.
+    // The IdealState has not been rebalanced yet, so its list fields might be null, in which case
+    // getPartitionSet() would return an empty list. Check against numPartitions instead.
+    // This check lets us fail early instead of looping through all partitions.
+    if (capacityMapSet.size() != idealState.getNumPartitions() && !hasDefaultCapacity) {
+      logger.error(
+          "ResourceConfig for {} does not have all partitions defined in PartitionCapacityMap!",
+          idealState.getResourceName());
+      return false;
+    }
+
+    // Loop through all partitions and validate. The validation util throws if a partition's
+    // capacity configuration is missing required keys; treat that as a failed validation.
+    try {
+      capacityMap.keySet().forEach(partitionName -> WagedValidationUtil
+          .validateAndGetPartitionCapacity(partitionName, resourceConfig, capacityMap,
+              clusterConfig));
+    } catch (HelixException ex) {
+      logger.error("Invalid partition capacity in resource: {}", idealState.getResourceName(), ex);
+      return false;
+    }
+    return true;
+  }
 }
diff --git a/helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordJacksonSerializer.java b/helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordJacksonSerializer.java
new file mode 100644
index 0000000..b375e80
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordJacksonSerializer.java
@@ -0,0 +1,67 @@
+package org.apache.helix.manager.zk;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import org.I0Itec.zkclient.exception.ZkMarshallingError;
+import org.I0Itec.zkclient.serialize.ZkSerializer;
+import org.apache.helix.HelixException;
+import org.apache.helix.ZNRecord;
+import org.codehaus.jackson.map.ObjectMapper;
+
+/**
+ * ZNRecordJacksonSerializer serializes ZNRecord objects into a byte array using Jackson. Note that
+ * this serializer doesn't check for the size of the resulting binary.
+ */
+public class ZNRecordJacksonSerializer implements ZkSerializer {
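+  // Jackson's ObjectMapper is thread safe once configured, so a single shared instance suffices.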
+  private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
+
+  @Override
+  public byte[] serialize(Object record) throws ZkMarshallingError {
+    if (!(record instanceof ZNRecord)) {
+      // null is NOT an instance of any class
+      throw new HelixException("Input object is not of type ZNRecord (was " + record + ")");
+    }
+    ZNRecord znRecord = (ZNRecord) record;
+
+    try {
+      return OBJECT_MAPPER.writeValueAsBytes(znRecord);
+    } catch (IOException e) {
+      throw new HelixException(
+          String.format("Exception during serialization. ZNRecord id: %s", znRecord.getId()), e);
+    }
+  }
+
+  @Override
+  public Object deserialize(byte[] bytes) throws ZkMarshallingError {
+    if (bytes == null || bytes.length == 0) {
+      // reading a parent/null node
+      return null;
+    }
+
+    ZNRecord record;
+    try {
+      record = OBJECT_MAPPER.readValue(bytes, ZNRecord.class);
+    } catch (IOException e) {
+      throw new HelixException("Exception during deserialization!", e);
+    }
+    return record;
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/manager/zk/ZkBucketDataAccessor.java b/helix-core/src/main/java/org/apache/helix/manager/zk/ZkBucketDataAccessor.java
new file mode 100644
index 0000000..bc13471
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/manager/zk/ZkBucketDataAccessor.java
@@ -0,0 +1,380 @@
+package org.apache.helix.manager.zk;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Map;
+import java.util.TimerTask;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import org.I0Itec.zkclient.DataUpdater;
+import org.I0Itec.zkclient.exception.ZkMarshallingError;
+import org.I0Itec.zkclient.exception.ZkNoNodeException;
+import org.I0Itec.zkclient.serialize.ZkSerializer;
+import org.apache.helix.AccessOption;
+import org.apache.helix.BucketDataAccessor;
+import org.apache.helix.HelixException;
+import org.apache.helix.HelixProperty;
+import org.apache.helix.ZNRecord;
+import org.apache.helix.manager.zk.client.DedicatedZkClientFactory;
+import org.apache.helix.manager.zk.client.HelixZkClient;
+import org.apache.helix.util.GZipCompressionUtil;
+import org.codehaus.jackson.map.ObjectMapper;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class ZkBucketDataAccessor implements BucketDataAccessor, AutoCloseable {
+  private static final Logger LOG = LoggerFactory.getLogger(ZkBucketDataAccessor.class);
+
+  private static final int DEFAULT_BUCKET_SIZE = 50 * 1024; // 50KB
+  private static final long DEFAULT_VERSION_TTL = TimeUnit.MINUTES.toMillis(1L); // 1 min
+  private static final String BUCKET_SIZE_KEY = "BUCKET_SIZE";
+  private static final String DATA_SIZE_KEY = "DATA_SIZE";
+  private static final String METADATA_KEY = "METADATA";
+  private static final String LAST_SUCCESSFUL_WRITE_KEY = "LAST_SUCCESSFUL_WRITE";
+  private static final String LAST_WRITE_KEY = "LAST_WRITE";
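+  // ZNode layout per bucketed record:
+  //   <rootPath>/LAST_WRITE            - the highest write version number reserved so far
+  //   <rootPath>/LAST_SUCCESSFUL_WRITE - the version number of the latest complete write
+  //   <rootPath>/<version>/<0..N-1>    - the compressed data buckets for that version
+  //   <rootPath>/<version>/METADATA    - the bucket size and data size used for that version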
+  private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
+  // Thread pool for deleting stale versions
+  private static final ScheduledExecutorService GC_THREAD = Executors.newScheduledThreadPool(1);
+
+  private final int _bucketSize;
+  private final long _versionTTL;
+  private ZkSerializer _zkSerializer;
+  private HelixZkClient _zkClient;
+  private ZkBaseDataAccessor<byte[]> _zkBaseDataAccessor;
+
+  /**
+   * Constructor that allows a custom bucket size.
+   * @param zkAddr
+   * @param bucketSize
+   * @param versionTTL in ms
+   */
+  public ZkBucketDataAccessor(String zkAddr, int bucketSize, long versionTTL) {
+    _zkClient = DedicatedZkClientFactory.getInstance()
+        .buildZkClient(new HelixZkClient.ZkConnectionConfig(zkAddr));
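+    // This accessor reads and writes raw byte arrays, so install a pass-through serializer on
+    // the underlying ZkClient.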
+    _zkClient.setZkSerializer(new ZkSerializer() {
+      @Override
+      public byte[] serialize(Object data) throws ZkMarshallingError {
+        if (data instanceof byte[]) {
+          return (byte[]) data;
+        }
+        throw new HelixException("ZkBucketDataAccesor only supports a byte array as an argument!");
+      }
+
+      @Override
+      public Object deserialize(byte[] data) throws ZkMarshallingError {
+        return data;
+      }
+    });
+    _zkBaseDataAccessor = new ZkBaseDataAccessor<>(_zkClient);
+    _zkSerializer = new ZNRecordJacksonSerializer();
+    _bucketSize = bucketSize;
+    _versionTTL = versionTTL;
+  }
+
+  /**
+   * Constructor that uses a default bucket size.
+   * @param zkAddr
+   */
+  public ZkBucketDataAccessor(String zkAddr) {
+    this(zkAddr, DEFAULT_BUCKET_SIZE, DEFAULT_VERSION_TTL);
+  }
+
+  @Override
+  public <T extends HelixProperty> boolean compressedBucketWrite(String rootPath, T value)
+      throws IOException {
+    DataUpdater<byte[]> lastWriteVersionUpdater = dataInZk -> {
+      if (dataInZk == null || dataInZk.length == 0) {
+        // No last write version exists, so start with 0
+        return "0".getBytes();
+      }
+      // Last write exists, so increment and write it back
+      // **String conversion is necessary to make it display in ZK (zooinspector)**
+      String lastWriteVersionStr = new String(dataInZk);
+      long lastWriteVersion = Long.parseLong(lastWriteVersionStr);
+      lastWriteVersion++;
+      return String.valueOf(lastWriteVersion).getBytes();
+    };
+
+    // 1. Increment lastWriteVersion using DataUpdater
+    ZkBaseDataAccessor.AccessResult result = _zkBaseDataAccessor.doUpdate(
+        rootPath + "/" + LAST_WRITE_KEY, lastWriteVersionUpdater, AccessOption.PERSISTENT);
+    if (result._retCode != ZkBaseDataAccessor.RetCode.OK) {
+      throw new HelixException(
+          String.format("Failed to write the write version at path: %s!", rootPath));
+    }
+
+    // Successfully reserved a version number
+    byte[] binaryVersion = (byte[]) result._updatedValue;
+    String versionStr = new String(binaryVersion);
+    final long version = Long.parseLong(versionStr);
+
+    // 2. Write to the incremented last write version
+    String versionedDataPath = rootPath + "/" + versionStr;
+
+    // Take the ZNRecord and serialize it (get byte[])
+    byte[] serializedRecord = _zkSerializer.serialize(value.getRecord());
+    // Compress the byte[]
+    byte[] compressedRecord = GZipCompressionUtil.compress(serializedRecord);
+    // Compute N - number of buckets
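+    // Ceiling division: e.g. 120KB of data with 50KB buckets yields 3 buckets (50KB + 50KB + 20KB).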
+    int numBuckets = (compressedRecord.length + _bucketSize - 1) / _bucketSize;
+
+    List<String> paths = new ArrayList<>();
+    List<byte[]> buckets = new ArrayList<>();
+
+    int ptr = 0;
+    int counter = 0;
+    while (counter < numBuckets) {
+      paths.add(versionedDataPath + "/" + counter);
+      if (counter == numBuckets - 1) {
+        // The last bucket holds the remaining bytes. Copy up to the end of the array rather than
+        // using length % bucketSize, which would be zero when the data size is an exact multiple
+        // of the bucket size.
+        buckets.add(Arrays.copyOfRange(compressedRecord, ptr, compressedRecord.length));
+      } else {
+        buckets.add(Arrays.copyOfRange(compressedRecord, ptr, ptr + _bucketSize));
+      }
+      ptr += _bucketSize;
+      counter++;
+    }
+
+    // 3. Include the metadata in the batch write
+    Map<String, String> metadata = ImmutableMap.of(BUCKET_SIZE_KEY, Integer.toString(_bucketSize),
+        DATA_SIZE_KEY, Integer.toString(compressedRecord.length));
+    byte[] binaryMetadata = OBJECT_MAPPER.writeValueAsBytes(metadata);
+    paths.add(versionedDataPath + "/" + METADATA_KEY);
+    buckets.add(binaryMetadata);
+
+    // Do an async set to ZK
+    boolean[] success = _zkBaseDataAccessor.setChildren(paths, buckets, AccessOption.PERSISTENT);
+    // Exception and fail the write if any failed
+    for (boolean s : success) {
+      if (!s) {
+        throw new HelixException(
+            String.format("Failed to write the data buckets for path: %s", rootPath));
+      }
+    }
+
+    // 4. Update lastSuccessfulWriteVersion using Updater
+    DataUpdater<byte[]> lastSuccessfulWriteVersionUpdater = dataInZk -> {
+      if (dataInZk == null || dataInZk.length == 0) {
+        // No last write version exists, so write version from this write
+        return versionStr.getBytes();
+      }
+      // Last successful write exists so check if it's smaller than my number
+      String lastWriteVersionStr = new String(dataInZk);
+      long lastWriteVersion = Long.parseLong(lastWriteVersionStr);
+      if (lastWriteVersion < version) {
+        // Smaller, so I can overwrite
+        return versionStr.getBytes();
+      } else {
+        // Greater, I have lagged behind. Return null and do not write
+        return null;
+      }
+    };
+    if (!_zkBaseDataAccessor.update(rootPath + "/" + LAST_SUCCESSFUL_WRITE_KEY,
+        lastSuccessfulWriteVersionUpdater, AccessOption.PERSISTENT)) {
+      throw new HelixException(String
+          .format("Failed to write the last successful write metadata at path: %s!", rootPath));
+    }
+
+    // 5. Update the timer for GC
+    updateGCTimer(rootPath, versionStr);
+    return true;
+  }
+
+  @Override
+  public <T extends HelixProperty> HelixProperty compressedBucketRead(String path,
+      Class<T> helixPropertySubType) {
+    return helixPropertySubType.cast(compressedBucketRead(path));
+  }
+
+  @Override
+  public void compressedBucketDelete(String path) {
+    if (!_zkBaseDataAccessor.remove(path, AccessOption.PERSISTENT)) {
+      throw new HelixException(String.format("Failed to delete the bucket data! Path: %s", path));
+    }
+  }
+
+  @Override
+  public void disconnect() {
+    if (!_zkClient.isClosed()) {
+      _zkClient.close();
+    }
+  }
+
+  private HelixProperty compressedBucketRead(String path) {
+    // 1. Get the version to read
+    byte[] binaryVersionToRead = _zkBaseDataAccessor.get(path + "/" + LAST_SUCCESSFUL_WRITE_KEY,
+        null, AccessOption.PERSISTENT);
+    if (binaryVersionToRead == null) {
+      throw new ZkNoNodeException(
+          String.format("Last successful write ZNode does not exist for path: %s", path));
+    }
+    String versionToRead = new String(binaryVersionToRead);
+
+    // 2. Get the metadata map
+    byte[] binaryMetadata = _zkBaseDataAccessor.get(path + "/" + versionToRead + "/" + METADATA_KEY,
+        null, AccessOption.PERSISTENT);
+    if (binaryMetadata == null) {
+      throw new ZkNoNodeException(
+          String.format("Metadata ZNode does not exist for path: %s", path));
+    }
+    Map metadata;
+    try {
+      metadata = OBJECT_MAPPER.readValue(binaryMetadata, Map.class);
+    } catch (IOException e) {
+      throw new HelixException(String.format("Failed to deserialize path metadata: %s!", path), e);
+    }
+
+    // 3. Read the data
+    Object bucketSizeObj = metadata.get(BUCKET_SIZE_KEY);
+    Object dataSizeObj = metadata.get(DATA_SIZE_KEY);
+    if (bucketSizeObj == null) {
+      throw new HelixException(
+          String.format("Metadata ZNRecord does not have %s! Path: %s", BUCKET_SIZE_KEY, path));
+    }
+    if (dataSizeObj == null) {
+      throw new HelixException(
+          String.format("Metadata ZNRecord does not have %s! Path: %s", DATA_SIZE_KEY, path));
+    }
+    int bucketSize = Integer.parseInt((String) bucketSizeObj);
+    int dataSize = Integer.parseInt((String) dataSizeObj);
+
+    // Compute N - number of buckets, using the bucket size recorded in the metadata since the
+    // data may have been written with a different bucket size than this accessor's.
+    int numBuckets = (dataSize + bucketSize - 1) / bucketSize;
+    byte[] compressedRecord = new byte[dataSize];
+    String dataPath = path + "/" + versionToRead;
+
+    List<String> paths = new ArrayList<>();
+    for (int i = 0; i < numBuckets; i++) {
+      paths.add(dataPath + "/" + i);
+    }
+
+    // Async get
+    List<byte[]> buckets = _zkBaseDataAccessor.get(paths, null, AccessOption.PERSISTENT, true);
+
+    // Combine buckets into one byte array
+    int copyPtr = 0;
+    for (int i = 0; i < numBuckets; i++) {
+      if (i == numBuckets - 1) {
+        // The last bucket may only be partially filled, so copy just the remaining bytes;
+        // dataSize % bucketSize would be 0 when dataSize is an exact multiple of bucketSize
+        System.arraycopy(buckets.get(i), 0, compressedRecord, copyPtr, dataSize - copyPtr);
+      } else {
+        System.arraycopy(buckets.get(i), 0, compressedRecord, copyPtr, bucketSize);
+        copyPtr += bucketSize;
+      }
+    }
+
+    // Decompress the byte array
+    ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(compressedRecord);
+    byte[] serializedRecord;
+    try {
+      serializedRecord = GZipCompressionUtil.uncompress(byteArrayInputStream);
+    } catch (IOException e) {
+      throw new HelixException(String.format("Failed to decompress path: %s!", path), e);
+    }
+
+    // Deserialize the record to retrieve the original
+    ZNRecord originalRecord = (ZNRecord) _zkSerializer.deserialize(serializedRecord);
+    return new HelixProperty(originalRecord);
+  }
+
+  @Override
+  public void close() {
+    disconnect();
+  }
+
+  private void updateGCTimer(String rootPath, String currentVersion) {
+    TimerTask gcTask = new TimerTask() {
+      @Override
+      public void run() {
+        deleteStaleVersions(rootPath, currentVersion);
+      }
+    };
+
+    // Schedule the gc task with TTL
+    GC_THREAD.schedule(gcTask, _versionTTL, TimeUnit.MILLISECONDS);
+  }
+
+  /**
+   * Deletes all stale versions under the given root path.
+   * @param rootPath the root path of the bucketed data
+   * @param currentVersion the version whose older siblings should be deleted
+   */
+  private void deleteStaleVersions(String rootPath, String currentVersion) {
+    // Get all children names under path
+    List<String> children = _zkBaseDataAccessor.getChildNames(rootPath, AccessOption.PERSISTENT);
+    if (children == null || children.isEmpty()) {
+      // The whole path has been deleted so return immediately
+      return;
+    }
+    filterChildrenNames(children, currentVersion);
+    List<String> pathsToDelete = getPathsToDelete(rootPath, children);
+    for (String pathToDelete : pathsToDelete) {
+      // TODO: Should be batch delete but it doesn't work. It's okay since this runs async
+      _zkBaseDataAccessor.remove(pathToDelete, AccessOption.PERSISTENT);
+    }
+  }
+
+  /**
+   * Filters out non-version child names and versions that are not yet stale.
+   * @param children the list of child names, filtered in place
+   * @param currentVersion the current version; it and any newer versions are kept
+   */
+  private void filterChildrenNames(List<String> children, String currentVersion) {
+    // Leave out metadata
+    children.remove(LAST_SUCCESSFUL_WRITE_KEY);
+    children.remove(LAST_WRITE_KEY);
+
+    // Leave out currentVersion and above
+    // This is because we want to honor the TTL for newer versions
+    children.remove(currentVersion);
+    long currentVer = Long.parseLong(currentVersion);
+    // Use removeIf to avoid a ConcurrentModificationException while filtering
+    children.removeIf(child -> {
+      try {
+        return Long.parseLong(child) >= currentVer;
+      } catch (NumberFormatException e) {
+        // Remove ZNode names that aren't parseable as versions
+        LOG.debug("Found an invalid ZNode: {}", child);
+        return true;
+      }
+    });
+  }
+
+  /**
+   * Generates all stale paths to delete.
+   * @param path the root path of the bucketed data
+   * @param staleVersions the names of the stale versions
+   * @return the full paths of all stale versions
+   */
+  private List<String> getPathsToDelete(String path, List<String> staleVersions) {
+    List<String> pathsToDelete = new ArrayList<>();
+    staleVersions.forEach(ver -> pathsToDelete.add(path + "/" + ver));
+    return pathsToDelete;
+  }
+}
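
For reference, below is a minimal, self-contained sketch (not part of the patch) of the bucket
split/combine arithmetic the accessor above relies on; the class and method names are illustrative:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class BucketMathSketch {
      // Split data into fixed-size buckets; the last bucket may be partially filled.
      static List<byte[]> split(byte[] data, int bucketSize) {
        int numBuckets = (data.length + bucketSize - 1) / bucketSize;
        List<byte[]> buckets = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
          int from = i * bucketSize;
          buckets.add(Arrays.copyOfRange(data, from, Math.min(from + bucketSize, data.length)));
        }
        return buckets;
      }

      // Combine the buckets back into one array. Copy only the remaining bytes for the last
      // bucket: dataSize % bucketSize is 0 when dataSize is an exact multiple of bucketSize.
      static byte[] combine(List<byte[]> buckets, int dataSize) {
        byte[] data = new byte[dataSize];
        int copyPtr = 0;
        for (byte[] bucket : buckets) {
          int length = Math.min(bucket.length, dataSize - copyPtr);
          System.arraycopy(bucket, 0, data, copyPtr, length);
          copyPtr += length;
        }
        return data;
      }

      public static void main(String[] args) {
        byte[] original = "0123456789".getBytes(); // 10 bytes, bucketSize 4 -> buckets of 4, 4, 2
        List<byte[]> buckets = split(original, 4);
        System.out.println(Arrays.equals(original, combine(buckets, original.length))); // true
      }
    }
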
diff --git a/helix-core/src/main/java/org/apache/helix/model/ClusterConfig.java b/helix-core/src/main/java/org/apache/helix/model/ClusterConfig.java
index bb478c3..f88d2f5 100644
--- a/helix-core/src/main/java/org/apache/helix/model/ClusterConfig.java
+++ b/helix-core/src/main/java/org/apache/helix/model/ClusterConfig.java
@@ -24,7 +24,9 @@
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
+import java.util.stream.Collectors;
 
+import com.google.common.collect.ImmutableMap;
 import com.google.common.collect.Maps;
 import org.apache.helix.HelixException;
 import org.apache.helix.HelixProperty;
@@ -81,7 +83,38 @@
     DISABLED_INSTANCES,
 
     // Specifies job types and used for quota allocation
-    QUOTA_TYPES
+    QUOTA_TYPES,
+
+    /**
+     * Configurable characteristics of the WAGED rebalancer.
+     * TODO: Split the WAGED rebalancer configuration items to the other config file.
+     */
+    // The required instance capacity keys for resource partition assignment calculation.
+    INSTANCE_CAPACITY_KEYS,
+    // The default instance capacity if no capacity is configured in the Instance Config node.
+    DEFAULT_INSTANCE_CAPACITY_MAP,
+    // The default partition weights if no weight is configured in the Resource Config node.
+    DEFAULT_PARTITION_WEIGHT_MAP,
+    // The preference of the rebalance result.
+    // EVENNESS - Evenness of the resource utilization, partition, and top state distribution.
+    // LESS_MOVEMENT - The tendency to keep the current assignment rather than moving partitions for a more optimal assignment.
+    REBALANCE_PREFERENCE,
+    // Specifies whether the WAGED rebalancer should perform the global rebalance asynchronously.
+    // The global rebalance is in general slower than the partial rebalance.
+    // Note that asynchronous global rebalance calculation reduces the controller rebalance
+    // delay, but it may cause more partition movements. This is because the partial rebalance
+    // will be performed with a stale baseline. The rebalance result would be an intermediate one
+    // and could change again once a new baseline is calculated.
+    // For more details, please refer to
+    // https://github.com/apache/helix/wiki/Weight-aware-Globally-Evenly-distributed-Rebalancer#rebalance-coordinator
+    //
+    // Defaults to true.
+    GLOBAL_REBALANCE_ASYNC_MODE
+  }
+
+  public enum GlobalRebalancePreferenceKey {
+    EVENNESS,
+    LESS_MOVEMENT
   }
 
   private final static int DEFAULT_MAX_CONCURRENT_TASK_PER_INSTANCE = 40;
@@ -95,6 +128,16 @@
 
   public final static String TASK_QUOTA_RATIO_NOT_SET = "-1";
 
+  // The default preference assigns the same weight to all aspects to ensure a balanced setup.
+  public final static Map<GlobalRebalancePreferenceKey, Integer>
+      DEFAULT_GLOBAL_REBALANCE_PREFERENCE =
+      ImmutableMap.<GlobalRebalancePreferenceKey, Integer>builder()
+          .put(GlobalRebalancePreferenceKey.EVENNESS, 1)
+          .put(GlobalRebalancePreferenceKey.LESS_MOVEMENT, 1).build();
+  private final static int MAX_REBALANCE_PREFERENCE = 10;
+  private final static int MIN_REBALANCE_PREFERENCE = 0;
+  public final static boolean DEFAULT_GLOBAL_REBALANCE_ASYNC_MODE_ENABLED = true;
+
   /**
    * Instantiate for a specific cluster
    * @param cluster the cluster identifier
@@ -113,21 +156,21 @@
 
   /**
    * Set task quota type with the ratio of this quota.
-   * @param quotaType String
+   * @param quotaType  String
    * @param quotaRatio int
    */
   public void setTaskQuotaRatio(String quotaType, int quotaRatio) {
     if (_record.getMapField(ClusterConfigProperty.QUOTA_TYPES.name()) == null) {
       _record.setMapField(ClusterConfigProperty.QUOTA_TYPES.name(), new HashMap<String, String>());
     }
-    _record.getMapField(ClusterConfigProperty.QUOTA_TYPES.name()).put(quotaType,
-        Integer.toString(quotaRatio));
+    _record.getMapField(ClusterConfigProperty.QUOTA_TYPES.name())
+        .put(quotaType, Integer.toString(quotaRatio));
   }
 
   /**
    * Set task quota type with the ratio of this quota. Quota ratio must be a String that is
    * parse-able into an int.
-   * @param quotaType String
+   * @param quotaType  String
    * @param quotaRatio String
    */
   public void setTaskQuotaRatio(String quotaType, String quotaRatio) {
@@ -210,8 +253,8 @@
    * @return
    */
   public Boolean isPersistIntermediateAssignment() {
-    return _record.getBooleanField(ClusterConfigProperty.PERSIST_INTERMEDIATE_ASSIGNMENT.toString(),
-        false);
+    return _record
+        .getBooleanField(ClusterConfigProperty.PERSIST_INTERMEDIATE_ASSIGNMENT.toString(), false);
   }
 
   /**
@@ -233,8 +276,8 @@
   }
 
   public Boolean isPipelineTriggersDisabled() {
-    return _record.getBooleanField(ClusterConfigProperty.HELIX_DISABLE_PIPELINE_TRIGGERS.toString(),
-        false);
+    return _record
+        .getBooleanField(ClusterConfigProperty.HELIX_DISABLE_PIPELINE_TRIGGERS.toString(), false);
   }
 
   /**
@@ -403,8 +446,8 @@
    * @return
    */
   public int getNumOfflineInstancesForAutoExit() {
-    return _record.getIntField(ClusterConfigProperty.NUM_OFFLINE_INSTANCES_FOR_AUTO_EXIT.name(),
-        -1);
+    return _record
+        .getIntField(ClusterConfigProperty.NUM_OFFLINE_INSTANCES_FOR_AUTO_EXIT.name(), -1);
   }
 
   /**
@@ -444,9 +487,7 @@
     if (obj instanceof ClusterConfig) {
       ClusterConfig that = (ClusterConfig) obj;
 
-      if (this.getId().equals(that.getId())) {
-        return true;
-      }
+      return this.getId().equals(that.getId());
     }
     return false;
   }
@@ -490,8 +531,8 @@
     }
 
     if (!configStrs.isEmpty()) {
-      _record.setListField(ClusterConfigProperty.STATE_TRANSITION_THROTTLE_CONFIGS.name(),
-          configStrs);
+      _record
+          .setListField(ClusterConfigProperty.STATE_TRANSITION_THROTTLE_CONFIGS.name(), configStrs);
     }
   }
 
@@ -579,7 +620,7 @@
   public int getErrorPartitionThresholdForLoadBalance() {
     return _record.getIntField(
         ClusterConfigProperty.ERROR_PARTITION_THRESHOLD_FOR_LOAD_BALANCE.name(),
-        DEFAULT_ERROR_PARTITION_THRESHOLD_FOR_LOAD_BALANCE);
+            DEFAULT_ERROR_PARTITION_THRESHOLD_FOR_LOAD_BALANCE);
   }
 
   /**
@@ -658,6 +699,159 @@
   }
 
   /**
+   * Set the required Instance Capacity Keys.
+   * @param capacityKeys
+   */
+  public void setInstanceCapacityKeys(List<String> capacityKeys) {
+    if (capacityKeys == null || capacityKeys.isEmpty()) {
+      throw new IllegalArgumentException("The input instance capacity key list is empty.");
+    }
+    _record.setListField(ClusterConfigProperty.INSTANCE_CAPACITY_KEYS.name(), capacityKeys);
+  }
+
+  /**
+   * @return The required Instance Capacity Keys. If not configured, return an empty list.
+   */
+  public List<String> getInstanceCapacityKeys() {
+    List<String> capacityKeys = _record.getListField(ClusterConfigProperty.INSTANCE_CAPACITY_KEYS.name());
+    if (capacityKeys == null) {
+      return Collections.emptyList();
+    }
+    return capacityKeys;
+  }
+
+  /**
+   * Get the default instance capacity information from the map fields.
+   *
+   * @return data map if it exists, or empty map
+   */
+  public Map<String, Integer> getDefaultInstanceCapacityMap() {
+    return getDefaultCapacityMap(ClusterConfigProperty.DEFAULT_INSTANCE_CAPACITY_MAP);
+  }
+
+  /**
+   * Set the default instance capacity information with an Integer mapping.
+   * This information is required by the global rebalancer.
+   * @see <a href="Rebalance Algorithm">
+   * https://github.com/apache/helix/wiki/Design-Proposal---Weight-Aware-Globally-Even-Distribute-Rebalancer#rebalance-algorithm-adapter
+   * </a>
+   * If the instance capacity is not configured in either Instance Config nor Cluster Config, the
+   * cluster topology is considered invalid. So the rebalancer may stop working.
+   * @param capacityDataMap - map of instance capacity data
+   * @throws IllegalArgumentException - when any of the data value is a negative number or when the map is empty
+   */
+  public void setDefaultInstanceCapacityMap(Map<String, Integer> capacityDataMap)
+      throws IllegalArgumentException {
+    setDefaultCapacityMap(ClusterConfigProperty.DEFAULT_INSTANCE_CAPACITY_MAP, capacityDataMap);
+  }
+
+  /**
+   * Get the default partition weight information from the map fields.
+   *
+   * @return data map if it exists, or empty map
+   */
+  public Map<String, Integer> getDefaultPartitionWeightMap() {
+    return getDefaultCapacityMap(ClusterConfigProperty.DEFAULT_PARTITION_WEIGHT_MAP);
+  }
+
+  /**
+   * Set the default partition weight information with an Integer mapping.
+   * This information is required by the global rebalancer.
+   * @see <a href="Rebalance Algorithm">
+   * https://github.com/apache/helix/wiki/Design-Proposal---Weight-Aware-Globally-Even-Distribute-Rebalancer#rebalance-algorithm-adapter
+   * </a>
+   * If the partition weight is not configured in either Resource Config nor Cluster Config, the
+   * cluster topology is considered invalid. So the rebalancer may stop working.
+   * @param weightDataMap - map of partition weight data
+   * @throws IllegalArgumentException - when any of the data value is a negative number or when the map is empty
+   */
+  public void setDefaultPartitionWeightMap(Map<String, Integer> weightDataMap)
+      throws IllegalArgumentException {
+    setDefaultCapacityMap(ClusterConfigProperty.DEFAULT_PARTITION_WEIGHT_MAP, weightDataMap);
+  }
+
+  private Map<String, Integer> getDefaultCapacityMap(ClusterConfigProperty capacityPropertyType) {
+    Map<String, String> capacityData = _record.getMapField(capacityPropertyType.name());
+    if (capacityData != null) {
+      return capacityData.entrySet().stream().collect(
+          Collectors.toMap(entry -> entry.getKey(), entry -> Integer.parseInt(entry.getValue())));
+    }
+    return Collections.emptyMap();
+  }
+
+  private void setDefaultCapacityMap(ClusterConfigProperty capacityPropertyType,
+      Map<String, Integer> capacityDataMap) throws IllegalArgumentException {
+    if (capacityDataMap == null) {
+      throw new IllegalArgumentException("Default capacity data is null");
+    }
+    Map<String, String> data = new HashMap<>();
+    capacityDataMap.entrySet().stream().forEach(entry -> {
+      if (entry.getValue() < 0) {
+        throw new IllegalArgumentException(String
+            .format("Default capacity data contains a negative value: %s = %d", entry.getKey(),
+                entry.getValue()));
+      }
+      data.put(entry.getKey(), Integer.toString(entry.getValue()));
+    });
+    _record.setMapField(capacityPropertyType.name(), data);
+  }
+
+  /**
+   * Set the global rebalancer's assignment preference.
+   * @param preference A map of the GlobalRebalancePreferenceKey and the corresponding weight.
+   *                   The ratio of the configured weights will determine the rebalancer's behavior.
+   */
+  public void setGlobalRebalancePreference(Map<GlobalRebalancePreferenceKey, Integer> preference) {
+    Map<String, String> preferenceMap = new HashMap<>();
+
+    preference.entrySet().stream().forEach(entry -> {
+      if (entry.getValue() > MAX_REBALANCE_PREFERENCE
+          || entry.getValue() < MIN_REBALANCE_PREFERENCE) {
+        throw new IllegalArgumentException(String
+            .format("Invalid global rebalance preference configuration. Key %s, Value %d.",
+                entry.getKey().name(), entry.getValue()));
+      }
+      preferenceMap.put(entry.getKey().name(), Integer.toString(entry.getValue()));
+    });
+
+    _record.setMapField(ClusterConfigProperty.REBALANCE_PREFERENCE.name(), preferenceMap);
+  }
+
+  /**
+   * Get the global rebalancer's assignment preference.
+   */
+  public Map<GlobalRebalancePreferenceKey, Integer> getGlobalRebalancePreference() {
+    Map<String, String> preferenceStrMap =
+        _record.getMapField(ClusterConfigProperty.REBALANCE_PREFERENCE.name());
+    if (preferenceStrMap != null && !preferenceStrMap.isEmpty()) {
+      Map<GlobalRebalancePreferenceKey, Integer> preference = new HashMap<>();
+      for (GlobalRebalancePreferenceKey key : GlobalRebalancePreferenceKey.values()) {
+        if (!preferenceStrMap.containsKey(key.name())) {
+          // If any key is not configured with a value, return the default config.
+          return DEFAULT_GLOBAL_REBALANCE_PREFERENCE;
+        }
+        preference.put(key, Integer.parseInt(preferenceStrMap.get(key.name())));
+      }
+      return preference;
+    }
+    // If configuration is not complete, return the default one.
+    return DEFAULT_GLOBAL_REBALANCE_PREFERENCE;
+  }
+
+  /**
+   * Set the asynchronous global rebalance mode.
+   * @param isAsync true if the global rebalance should be performed asynchronously
+   */
+  public void setGlobalRebalanceAsyncMode(boolean isAsync) {
+    _record.setBooleanField(ClusterConfigProperty.GLOBAL_REBALANCE_ASYNC_MODE.name(), isAsync);
+  }
+
+  public boolean isGlobalRebalanceAsyncModeEnabled() {
+    return _record.getBooleanField(ClusterConfigProperty.GLOBAL_REBALANCE_ASYNC_MODE.name(),
+        DEFAULT_GLOBAL_REBALANCE_ASYNC_MODE_ENABLED);
+  }
+
+  /**
    * Get IdealState rules defined in the cluster config.
    * @return
    */
diff --git a/helix-core/src/main/java/org/apache/helix/model/InstanceConfig.java b/helix-core/src/main/java/org/apache/helix/model/InstanceConfig.java
index 4d01766..b55ba83 100644
--- a/helix-core/src/main/java/org/apache/helix/model/InstanceConfig.java
+++ b/helix-core/src/main/java/org/apache/helix/model/InstanceConfig.java
@@ -27,6 +27,7 @@
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
+import java.util.stream.Collectors;
 
 import com.google.common.base.Splitter;
 import org.apache.helix.HelixException;
@@ -54,7 +55,8 @@
     INSTANCE_WEIGHT,
     DOMAIN,
     DELAY_REBALANCE_ENABLED,
-    MAX_CONCURRENT_TASK
+    MAX_CONCURRENT_TASK,
+    INSTANCE_CAPACITY_MAP
   }
 
   public static final int WEIGHT_NOT_SET = -1;
@@ -504,6 +506,54 @@
     _record.setIntField(InstanceConfigProperty.MAX_CONCURRENT_TASK.name(), maxConcurrentTask);
   }
 
+  /**
+   * Get the instance capacity information from the map fields.
+   * @return data map if it exists, or empty map
+   */
+  public Map<String, Integer> getInstanceCapacityMap() {
+    Map<String, String> capacityData =
+        _record.getMapField(InstanceConfigProperty.INSTANCE_CAPACITY_MAP.name());
+
+    if (capacityData != null) {
+      return capacityData.entrySet().stream().collect(
+          Collectors.toMap(entry -> entry.getKey(), entry -> Integer.parseInt(entry.getValue())));
+    }
+    return Collections.emptyMap();
+  }
+
+  /**
+   * Set the instance capacity information with an Integer mapping.
+   * This information is required by the global rebalancer.
+   * @see <a href="https://github.com/apache/helix/wiki/Design-Proposal---Weight-Aware-Globally-Even-Distribute-Rebalancer#rebalance-algorithm-adapter">
+   *   Rebalance Algorithm</a>
+   * If the instance capacity is configured in neither the Instance Config nor the Cluster Config,
+   * the cluster topology is considered invalid, so the rebalancer may stop working.
+   * Note that when a rebalancer requires this capacity information, it will ignore INSTANCE_WEIGHT.
+   * @param capacityDataMap - map of instance capacity data
+   * @throws IllegalArgumentException - when any data value is a negative number or the map is incomplete
+   */
+  public void setInstanceCapacityMap(Map<String, Integer> capacityDataMap)
+      throws IllegalArgumentException {
+    if (capacityDataMap == null) {
+      throw new IllegalArgumentException("Capacity Data is null");
+    }
+
+    Map<String, String> capacityData = new HashMap<>();
+
+    capacityDataMap.entrySet().stream().forEach(entry -> {
+      if (entry.getValue() < 0) {
+        throw new IllegalArgumentException(String
+            .format("Capacity Data contains a negative value: %s = %d", entry.getKey(),
+                entry.getValue()));
+      }
+      capacityData.put(entry.getKey(), Integer.toString(entry.getValue()));
+    });
+
+    _record.setMapField(InstanceConfigProperty.INSTANCE_CAPACITY_MAP.name(), capacityData);
+  }
+
   @Override
   public boolean equals(Object obj) {
     if (obj instanceof InstanceConfig) {
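
A corresponding sketch for the instance side; the instance name and numbers are illustrative:

    import com.google.common.collect.ImmutableMap;
    import org.apache.helix.model.InstanceConfig;

    public class InstanceCapacitySketch {
      public static void main(String[] args) {
        InstanceConfig instanceConfig = new InstanceConfig("localhost_12918");
        // The keys should cover the INSTANCE_CAPACITY_KEYS declared in the ClusterConfig;
        // negative values are rejected with an IllegalArgumentException.
        instanceConfig.setInstanceCapacityMap(ImmutableMap.of("CPU", 80, "DISK", 500));
        System.out.println(instanceConfig.getInstanceCapacityMap()); // {CPU=80, DISK=500}
      }
    }
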
diff --git a/helix-core/src/main/java/org/apache/helix/model/ResourceConfig.java b/helix-core/src/main/java/org/apache/helix/model/ResourceConfig.java
index c37a594..9cdb673 100644
--- a/helix-core/src/main/java/org/apache/helix/model/ResourceConfig.java
+++ b/helix-core/src/main/java/org/apache/helix/model/ResourceConfig.java
@@ -19,7 +19,9 @@
  * under the License.
  */
 
+import java.io.IOException;
 import java.util.Collections;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.TreeMap;
@@ -29,6 +31,8 @@
 import org.apache.helix.api.config.HelixConfigProperty;
 import org.apache.helix.api.config.RebalanceConfig;
 import org.apache.helix.api.config.StateTransitionTimeoutConfig;
+import org.codehaus.jackson.map.ObjectMapper;
+import org.codehaus.jackson.type.TypeReference;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -53,7 +57,8 @@
     RESOURCE_TYPE,
     GROUP_ROUTING_ENABLED,
     EXTERNAL_VIEW_DISABLED,
-    DELAY_REBALANCE_ENABLED
+    DELAY_REBALANCE_ENABLED,
+    PARTITION_CAPACITY_MAP
   }
 
   public enum ResourceConfigConstants {
@@ -61,6 +66,10 @@
   }
 
   private static final Logger _logger = LoggerFactory.getLogger(ResourceConfig.class.getName());
+  private static final ObjectMapper _objectMapper = new ObjectMapper();
+
+  public static final String DEFAULT_PARTITION_KEY = "DEFAULT";
+
   /**
    * Instantiate for a specific instance
    *
@@ -92,10 +101,24 @@
       String stateModelDefRef, String stateModelFactoryName, String numReplica,
       int minActiveReplica, int maxPartitionsPerInstance, String instanceGroupTag,
       Boolean helixEnabled, String resourceGroupName, String resourceType,
-      Boolean groupRoutingEnabled, Boolean externalViewDisabled,
-      RebalanceConfig rebalanceConfig, StateTransitionTimeoutConfig stateTransitionTimeoutConfig,
+      Boolean groupRoutingEnabled, Boolean externalViewDisabled, RebalanceConfig rebalanceConfig,
+      StateTransitionTimeoutConfig stateTransitionTimeoutConfig,
       Map<String, List<String>> listFields, Map<String, Map<String, String>> mapFields,
       Boolean p2pMessageEnabled) {
+    this(resourceId, monitorDisabled, numPartitions, stateModelDefRef, stateModelFactoryName,
+        numReplica, minActiveReplica, maxPartitionsPerInstance, instanceGroupTag, helixEnabled,
+        resourceGroupName, resourceType, groupRoutingEnabled, externalViewDisabled, rebalanceConfig,
+        stateTransitionTimeoutConfig, listFields, mapFields, p2pMessageEnabled, null);
+  }
+
+  private ResourceConfig(String resourceId, Boolean monitorDisabled, int numPartitions,
+      String stateModelDefRef, String stateModelFactoryName, String numReplica,
+      int minActiveReplica, int maxPartitionsPerInstance, String instanceGroupTag,
+      Boolean helixEnabled, String resourceGroupName, String resourceType,
+      Boolean groupRoutingEnabled, Boolean externalViewDisabled, RebalanceConfig rebalanceConfig,
+      StateTransitionTimeoutConfig stateTransitionTimeoutConfig,
+      Map<String, List<String>> listFields, Map<String, Map<String, String>> mapFields,
+      Boolean p2pMessageEnabled, Map<String, Map<String, Integer>> partitionCapacityMap) {
     super(resourceId);
 
     if (monitorDisabled != null) {
@@ -172,6 +195,15 @@
     if (mapFields != null) {
       _record.setMapFields(mapFields);
     }
+
+    if (partitionCapacityMap != null) {
+      try {
+        setPartitionCapacityMap(partitionCapacityMap);
+      } catch (IOException e) {
+        throw new IllegalArgumentException(
+            "Failed to set partition capacity. Invalid capacity configuration.", e);
+      }
+    }
   }
 
 
@@ -350,6 +382,64 @@
   }
 
   /**
+   * Get the partition capacity information, stored as JSON strings in the map fields.
+   * <PartitionName or DEFAULT_PARTITION_KEY, <Capacity Key, Capacity Number>>
+   *
+   * @return data map if it exists, or empty map
+   * @throws IOException - when JSON conversion fails
+   */
+  public Map<String, Map<String, Integer>> getPartitionCapacityMap() throws IOException {
+    Map<String, String> partitionCapacityData =
+        _record.getMapField(ResourceConfigProperty.PARTITION_CAPACITY_MAP.name());
+    Map<String, Map<String, Integer>> partitionCapacityMap = new HashMap<>();
+    if (partitionCapacityData != null) {
+      for (String partition : partitionCapacityData.keySet()) {
+        Map<String, Integer> capacities = _objectMapper
+            .readValue(partitionCapacityData.get(partition),
+                new TypeReference<Map<String, Integer>>() {
+                });
+        partitionCapacityMap.put(partition, capacities);
+      }
+    }
+    return partitionCapacityMap;
+  }
+
+  /**
+   * Set the partition capacity information with a map <PartitionName or DEFAULT_PARTITION_KEY, <Capacity Key, Capacity Number>>
+   *
+   * @param partitionCapacityMap - map of partition capacity data
+   * @throws IllegalArgumentException - when any of the data value is a negative number or map is incomplete
+   * @throws IOException              - when JSON parsing fails
+   */
+  public void setPartitionCapacityMap(Map<String, Map<String, Integer>> partitionCapacityMap)
+      throws IllegalArgumentException, IOException {
+    if (partitionCapacityMap == null) {
+      throw new IllegalArgumentException("Capacity Map is null");
+    }
+    if (!partitionCapacityMap.containsKey(DEFAULT_PARTITION_KEY)) {
+      throw new IllegalArgumentException(String
+          .format("The default partition capacity with the default key %s is required.",
+              DEFAULT_PARTITION_KEY));
+    }
+
+    Map<String, String> newCapacityRecord = new HashMap<>();
+    for (String partition : partitionCapacityMap.keySet()) {
+      Map<String, Integer> capacities = partitionCapacityMap.get(partition);
+      // Verify the input is valid
+      if (capacities.isEmpty()) {
+        throw new IllegalArgumentException("Capacity Data is empty");
+      }
+      if (capacities.entrySet().stream().anyMatch(entry -> entry.getValue() < 0)) {
+        throw new IllegalArgumentException(
+            String.format("Capacity Data contains a negative value:%s", capacities.toString()));
+      }
+      newCapacityRecord.put(partition, _objectMapper.writeValueAsString(capacities));
+    }
+
+    _record.setMapField(ResourceConfigProperty.PARTITION_CAPACITY_MAP.name(), newCapacityRecord);
+  }
+
+  /**
    * Put a set of simple configs.
    *
    * @param configsMap
@@ -476,6 +566,7 @@
     private StateTransitionTimeoutConfig _stateTransitionTimeoutConfig;
     private Map<String, List<String>> _preferenceLists;
     private Map<String, Map<String, String>> _mapFields;
+    private Map<String, Map<String, Integer>> _partitionCapacityMap;
 
     public Builder(String resourceId) {
       _resourceId = resourceId;
@@ -664,6 +755,23 @@
       return _preferenceLists;
     }
 
+    public Builder setPartitionCapacity(Map<String, Integer> defaultCapacity) {
+      setPartitionCapacity(DEFAULT_PARTITION_KEY, defaultCapacity);
+      return this;
+    }
+
+    public Builder setPartitionCapacity(String partition, Map<String, Integer> capacity) {
+      if (_partitionCapacityMap == null) {
+        _partitionCapacityMap = new HashMap<>();
+      }
+      _partitionCapacityMap.put(partition, capacity);
+      return this;
+    }
+
+    public Map<String, Integer> getPartitionCapacity(String partition) {
+      return _partitionCapacityMap.get(partition);
+    }
+
     public Builder setMapField(String key, Map<String, String> fields) {
       if (_mapFields == null) {
         _mapFields = new TreeMap<>();
@@ -708,6 +816,19 @@
           }
         }
       }
+
+      if (_partitionCapacityMap != null) {
+        if (_partitionCapacityMap.keySet().stream()
+            .noneMatch(partition -> partition.equals(DEFAULT_PARTITION_KEY))) {
+          throw new IllegalArgumentException(
+              "Partition capacity is configured without the DEFAULT capacity!");
+        }
+        if (_partitionCapacityMap.values().stream()
+            .anyMatch(capacity -> capacity.values().stream().anyMatch(value -> value < 0))) {
+          throw new IllegalArgumentException(
+              "Partition capacity is configured with negative capacity value!");
+        }
+      }
     }
 
     public ResourceConfig build() {
@@ -718,7 +839,7 @@
           _stateModelFactoryName, _numReplica, _minActiveReplica, _maxPartitionsPerInstance,
           _instanceGroupTag, _helixEnabled, _resourceGroupName, _resourceType, _groupRoutingEnabled,
           _externalViewDisabled, _rebalanceConfig, _stateTransitionTimeoutConfig, _preferenceLists,
-          _mapFields, _p2pMessageEnabled);
+          _mapFields, _p2pMessageEnabled, _partitionCapacityMap);
     }
   }
 }
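
And for partition weights on a resource, a hedged sketch of the new Builder API (assuming any
other fields required by validate() are set elsewhere; the resource and partition names are made
up):

    import com.google.common.collect.ImmutableMap;
    import org.apache.helix.model.ResourceConfig;

    public class PartitionCapacitySketch {
      public static void main(String[] args) throws Exception {
        ResourceConfig.Builder builder = new ResourceConfig.Builder("TestDB");
        // The DEFAULT entry is mandatory and applies to partitions without an explicit entry.
        builder.setPartitionCapacity(ImmutableMap.of("CPU", 2, "DISK", 20));
        // A per-partition override for a hot partition.
        builder.setPartitionCapacity("TestDB_0", ImmutableMap.of("CPU", 5, "DISK", 50));
        ResourceConfig resourceConfig = builder.build();
        // The map is stored as one JSON string per partition and parsed back on read.
        System.out.println(resourceConfig.getPartitionCapacityMap());
      }
    }
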
diff --git a/helix-core/src/main/java/org/apache/helix/model/StateModelDefinition.java b/helix-core/src/main/java/org/apache/helix/model/StateModelDefinition.java
index ae59522..0a40331 100644
--- a/helix-core/src/main/java/org/apache/helix/model/StateModelDefinition.java
+++ b/helix-core/src/main/java/org/apache/helix/model/StateModelDefinition.java
@@ -46,6 +46,8 @@
     STATE_PRIORITY_LIST
   }
 
+  public static final int TOP_STATE_PRIORITY = 1;
+
   /**
    * state model's initial state
    */
@@ -98,7 +100,7 @@
     _stateTransitionTable = new HashMap<>();
     _statesCountMap = new HashMap<>();
     if (_statesPriorityList != null) {
-      int priority = 1;
+      int priority = TOP_STATE_PRIORITY;
       for (String state : _statesPriorityList) {
         Map<String, String> metaData = record.getMapField(state + ".meta");
         if (metaData != null) {
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/ClusterStatusMonitor.java b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/ClusterStatusMonitor.java
index d6c3bb2..fc0b19d 100644
--- a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/ClusterStatusMonitor.java
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/ClusterStatusMonitor.java
@@ -236,27 +236,28 @@
       // Unregister beans for instances that are no longer configured
       Set<String> toUnregister = Sets.newHashSet(_instanceMonitorMap.keySet());
       toUnregister.removeAll(instanceSet);
-      try {
-        unregisterInstances(toUnregister);
-      } catch (MalformedObjectNameException e) {
-        LOG.error("Could not unregister instances from MBean server: " + toUnregister, e);
-      }
+      unregisterInstances(toUnregister);
 
       // Register beans for instances that are newly configured
       Set<String> toRegister = Sets.newHashSet(instanceSet);
       toRegister.removeAll(_instanceMonitorMap.keySet());
       Set<InstanceMonitor> monitorsToRegister = Sets.newHashSet();
       for (String instanceName : toRegister) {
-        InstanceMonitor bean = new InstanceMonitor(_clusterName, instanceName);
-        bean.updateInstance(tags.get(instanceName), disabledPartitions.get(instanceName),
-            oldDisabledPartitions.get(instanceName), liveInstanceSet.contains(instanceName),
-            !disabledInstanceSet.contains(instanceName));
-        monitorsToRegister.add(bean);
+        try {
+          ObjectName objectName = getObjectName(getInstanceBeanName(instanceName));
+          InstanceMonitor bean = new InstanceMonitor(_clusterName, instanceName, objectName);
+          bean.updateInstance(tags.get(instanceName), disabledPartitions.get(instanceName),
+              oldDisabledPartitions.get(instanceName), liveInstanceSet.contains(instanceName),
+              !disabledInstanceSet.contains(instanceName));
+          monitorsToRegister.add(bean);
+        } catch (MalformedObjectNameException ex) {
+          LOG.error("Failed to create instance monitor for instance: {}.", instanceName);
+        }
       }
       try {
         registerInstances(monitorsToRegister);
-      } catch (MalformedObjectNameException e) {
-        LOG.error("Could not register instances with MBean server: " + toRegister, e);
+      } catch (JMException e) {
+        LOG.error("Could not register instances with MBean server: {}.", toRegister, e);
       }
 
       // Update all the sets
@@ -282,8 +283,8 @@
             try {
               unregisterInstances(Arrays.asList(instanceName));
               registerInstances(Arrays.asList(bean));
-            } catch (MalformedObjectNameException e) {
-              LOG.error("Could not refresh registration with MBean server: " + instanceName, e);
+            } catch (JMException e) {
+              LOG.error("Could not refresh registration with MBean server: {}", instanceName, e);
             }
           }
         }
@@ -366,6 +367,28 @@
   }
 
   /**
+   * Updates the capacity status for an instance, including the max usage and the capacity of each
+   * capacity key. Before calling this API, we assume the instance monitors are already registered
+   * in ReadClusterDataStage. If the monitor is not registered, this instance capacity status update
+   * will fail.
+   *
+   * @param instanceName This instance name
+   * @param maxUsage Max capacity usage of this instance
+   * @param capacityMap A map of this instance capacity, {capacity key: capacity value}
+   */
+  public void updateInstanceCapacityStatus(String instanceName, double maxUsage,
+      Map<String, Integer> capacityMap) {
+    InstanceMonitor monitor = _instanceMonitorMap.get(instanceName);
+    if (monitor == null) {
+      LOG.warn("Failed to update instance capacity status because instance monitor is not found, "
+          + "instance: {}.", instanceName);
+      return;
+    }
+    monitor.updateMaxCapacityUsage(maxUsage);
+    monitor.updateCapacity(capacityMap);
+  }
+
+  /**
    * Update gauges for resource at instance level
    * @param bestPossibleStates
    * @param resourceMap
@@ -474,6 +497,25 @@
     }
   }
 
+  /**
+   * Updates metrics of average partition weight per capacity key for a resource. If a resource
+   * monitor does not yet exist for this resource, a new one will be created.
+   *
+   *
+   * @param resourceName The resource name for which partition weight is updated
+   * @param averageWeightMap A map of average partition weight of each capacity key:
+   *                         capacity key -> average partition weight
+   */
+  public void updatePartitionWeight(String resourceName, Map<String, Integer> averageWeightMap) {
+    ResourceMonitor monitor = getOrCreateResourceMonitor(resourceName);
+    if (monitor == null) {
+      LOG.warn("Failed to update partition weight metric for resource: {} because resource monitor"
+          + " is not created.", resourceName);
+      return;
+    }
+    monitor.updatePartitionWeightStats(averageWeightMap);
+  }
+
   public void updateMissingTopStateDurationStats(String resourceName, long totalDuration,
       long helixLatency, boolean isGraceful, boolean succeeded) {
     ResourceMonitor resourceMonitor = getOrCreateResourceMonitor(resourceName);
@@ -694,31 +736,35 @@
   }
 
   private void registerInstances(Collection<InstanceMonitor> instances)
-      throws MalformedObjectNameException {
+      throws JMException {
     synchronized (_instanceMonitorMap) {
       for (InstanceMonitor monitor : instances) {
         String instanceName = monitor.getInstanceName();
-        String beanName = getInstanceBeanName(instanceName);
-        register(monitor, getObjectName(beanName));
+        // If this instance MBean is already registered, unregister it.
+        InstanceMonitor removedMonitor = _instanceMonitorMap.remove(instanceName);
+        if (removedMonitor != null) {
+          removedMonitor.unregister();
+        }
+        monitor.register();
         _instanceMonitorMap.put(instanceName, monitor);
       }
     }
   }
 
-  private void unregisterAllInstances() throws MalformedObjectNameException {
+  private void unregisterAllInstances() {
     synchronized (_instanceMonitorMap) {
       unregisterInstances(_instanceMonitorMap.keySet());
     }
   }
 
-  private void unregisterInstances(Collection<String> instances)
-      throws MalformedObjectNameException {
+  private void unregisterInstances(Collection<String> instances) {
     synchronized (_instanceMonitorMap) {
       for (String instanceName : instances) {
-        String beanName = getInstanceBeanName(instanceName);
-        unregister(getObjectName(beanName));
+        InstanceMonitor monitor = _instanceMonitorMap.remove(instanceName);
+        if (monitor != null) {
+          monitor.unregister();
+        }
       }
-      _instanceMonitorMap.keySet().removeAll(instances);
     }
   }
 
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/InstanceMonitor.java b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/InstanceMonitor.java
index dc43d48..e0c0f89 100644
--- a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/InstanceMonitor.java
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/InstanceMonitor.java
@@ -23,36 +23,105 @@
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import javax.management.JMException;
+import javax.management.ObjectName;
 
 import com.google.common.base.Joiner;
 import com.google.common.collect.ImmutableList;
 import com.google.common.collect.Lists;
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMBeanProvider;
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric;
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.SimpleDynamicMetric;
+
 
 /**
  * Implementation of the instance status bean
  */
-public class InstanceMonitor implements InstanceMonitorMBean {
+public class InstanceMonitor extends DynamicMBeanProvider {
+  /**
+   * Metric names for instance capacity.
+   */
+  public enum InstanceMonitorMetric {
+    // TODO: change the metric names with Counter and Gauge suffix and deprecate old names.
+    TOTAL_MESSAGE_RECEIVED_COUNTER("TotalMessageReceived"),
+    ENABLED_STATUS_GAUGE("Enabled"),
+    ONLINE_STATUS_GAUGE("Online"),
+    DISABLED_PARTITIONS_GAUGE("DisabledPartitions"),
+    MAX_CAPACITY_USAGE_GAUGE("MaxCapacityUsageGauge");
+
+    private final String metricName;
+
+    InstanceMonitorMetric(String name) {
+      metricName = name;
+    }
+
+    public String metricName() {
+      return metricName;
+    }
+  }
+
   private final String _clusterName;
   private final String _participantName;
+  private final ObjectName _initObjectName;
+
   private List<String> _tags;
-  private long _disabledPartitions;
-  private boolean _isUp;
-  private boolean _isEnabled;
-  private long _totalMessageReceived;
+
+  // Counters
+  private SimpleDynamicMetric<Long> _totalMessagedReceivedCounter;
+
+  // Gauges
+  private SimpleDynamicMetric<Long> _enabledStatusGauge;
+  private SimpleDynamicMetric<Long> _disabledPartitionsGauge;
+  private SimpleDynamicMetric<Long> _onlineStatusGauge;
+  private SimpleDynamicMetric<Double> _maxCapacityUsageGauge;
+
+  // A map of dynamic capacity Gauges. The map's keys could change.
+  private final Map<String, SimpleDynamicMetric<Long>> _dynamicCapacityMetricsMap;
 
   /**
    * Initialize the bean
    * @param clusterName the cluster to monitor
    * @param participantName the instance whose statistics this holds
    */
-  public InstanceMonitor(String clusterName, String participantName) {
+  public InstanceMonitor(String clusterName, String participantName, ObjectName objectName) {
     _clusterName = clusterName;
     _participantName = participantName;
     _tags = ImmutableList.of(ClusterStatusMonitor.DEFAULT_TAG);
-    _disabledPartitions = 0L;
-    _isUp = false;
-    _isEnabled = false;
-    _totalMessageReceived = 0;
+    _initObjectName = objectName;
+    _dynamicCapacityMetricsMap = new ConcurrentHashMap<>();
+
+    createMetrics();
+  }
+
+  private void createMetrics() {
+    _totalMessagedReceivedCounter = new SimpleDynamicMetric<>(
+        InstanceMonitorMetric.TOTAL_MESSAGE_RECEIVED_COUNTER.metricName(), 0L);
+
+    _disabledPartitionsGauge =
+        new SimpleDynamicMetric<>(InstanceMonitorMetric.DISABLED_PARTITIONS_GAUGE.metricName(),
+            0L);
+    _enabledStatusGauge =
+        new SimpleDynamicMetric<>(InstanceMonitorMetric.ENABLED_STATUS_GAUGE.metricName(), 0L);
+    _onlineStatusGauge =
+        new SimpleDynamicMetric<>(InstanceMonitorMetric.ONLINE_STATUS_GAUGE.metricName(), 0L);
+    _maxCapacityUsageGauge =
+        new SimpleDynamicMetric<>(InstanceMonitorMetric.MAX_CAPACITY_USAGE_GAUGE.metricName(),
+            0.0d);
+  }
+
+  private List<DynamicMetric<?, ?>> buildAttributeList() {
+    List<DynamicMetric<?, ?>> attributeList = Lists.newArrayList(
+        _totalMessagedReceivedCounter,
+        _disabledPartitionsGauge,
+        _enabledStatusGauge,
+        _onlineStatusGauge,
+        _maxCapacityUsageGauge
+    );
+
+    attributeList.addAll(_dynamicCapacityMetricsMap.values());
+
+    return attributeList;
   }
 
   @Override
@@ -61,44 +130,32 @@
         serializedTags(), _participantName);
   }
 
-  @Override
-  public long getOnline() {
-    return _isUp ? 1 : 0;
+  protected long getOnline() {
+    return _onlineStatusGauge.getValue();
   }
 
-  @Override
-  public long getEnabled() {
-    return _isEnabled ? 1 : 0;
+  protected long getEnabled() {
+    return _enabledStatusGauge.getValue();
   }
 
-  @Override
-  public long getTotalMessageReceived() {
-    return _totalMessageReceived;
+  protected long getTotalMessageReceived() {
+    return _totalMessagedReceivedCounter.getValue();
   }
 
-  @Override
-  public long getDisabledPartitions() {
-    return _disabledPartitions;
-  }
-
-  /**
-   * Get all the tags currently on this instance
-   * @return list of tags
-   */
-  public List<String> getTags() {
-    return _tags;
+  protected long getDisabledPartitions() {
+    return _disabledPartitionsGauge.getValue();
   }
 
   /**
    * Get the name of the monitored instance
    * @return instance name as a string
    */
-  public String getInstanceName() {
+  protected String getInstanceName() {
     return _participantName;
   }
 
   private String serializedTags() {
-    return Joiner.on('|').skipNulls().join(_tags).toString();
+    return Joiner.on('|').skipNulls().join(_tags);
   }
 
   /**
@@ -117,20 +174,22 @@
       _tags = Lists.newArrayList(tags);
       Collections.sort(_tags);
     }
-    _disabledPartitions = 0L;
+    long numDisabledPartitions = 0L;
     if (disabledPartitions != null) {
       for (List<String> partitions : disabledPartitions.values()) {
         if (partitions != null) {
-          _disabledPartitions += partitions.size();
+          numDisabledPartitions += partitions.size();
         }
       }
     }
     // TODO : Get rid of this when old API removed.
     if (oldDisabledPartitions != null) {
-      _disabledPartitions += oldDisabledPartitions.size();
+      numDisabledPartitions += oldDisabledPartitions.size();
     }
-    _isUp = isLive;
-    _isEnabled = isEnabled;
+
+    _onlineStatusGauge.updateValue(isLive ? 1L : 0L);
+    _enabledStatusGauge.updateValue(isEnabled ? 1L : 0L);
+    _disabledPartitionsGauge.updateValue(numDisabledPartitions);
   }
 
   /**
@@ -138,7 +197,64 @@
    * @param messageReceived received message numbers
    */
   public synchronized void increaseMessageCount(long messageReceived) {
-    _totalMessageReceived += messageReceived;
+    _totalMessagedReceivedCounter
+        .updateValue(_totalMessagedReceivedCounter.getValue() + messageReceived);
   }
 
+  /**
+   * Updates max capacity usage for this instance.
+   * @param maxUsage max capacity usage of this instance
+   */
+  public synchronized void updateMaxCapacityUsage(double maxUsage) {
+    _maxCapacityUsageGauge.updateValue(maxUsage);
+  }
+
+  /**
+   * Gets max capacity usage of this instance.
+   * @return Max capacity usage of this instance.
+   */
+  protected synchronized double getMaxCapacityUsageGauge() {
+    return _maxCapacityUsageGauge.getValue();
+  }
+
+  /**
+   * Updates instance capacity metrics.
+   * @param capacity A map of instance capacity.
+   */
+  public void updateCapacity(Map<String, Integer> capacity) {
+    synchronized (_dynamicCapacityMetricsMap) {
+      // If capacity keys don't have any change, we just update the metric values.
+      if (_dynamicCapacityMetricsMap.keySet().equals(capacity.keySet())) {
+        for (Map.Entry<String, Integer> entry : capacity.entrySet()) {
+          _dynamicCapacityMetricsMap.get(entry.getKey()).updateValue((long) entry.getValue());
+        }
+        return;
+      }
+
+      // If the capacity keys have changed, retain only the metrics whose keys are still present,
+      // then add metrics for any new keys and update all metric values.
+      _dynamicCapacityMetricsMap.keySet().retainAll(capacity.keySet());
+      for (Map.Entry<String, Integer> entry : capacity.entrySet()) {
+        String capacityName = entry.getKey();
+        if (_dynamicCapacityMetricsMap.containsKey(capacityName)) {
+          _dynamicCapacityMetricsMap.get(capacityName).updateValue((long) entry.getValue());
+        } else {
+          _dynamicCapacityMetricsMap.put(capacityName,
+              new SimpleDynamicMetric<>(capacityName + "Gauge", (long) entry.getValue()));
+        }
+      }
+    }
+
+    // Update MBean's all attributes.
+    updateAttributesInfo(buildAttributeList(),
+        "Instance monitor for instance: " + getInstanceName());
+  }
+
+  @Override
+  public DynamicMBeanProvider register() throws JMException {
+    doRegister(buildAttributeList(), _initObjectName);
+
+    return this;
+  }
 }
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/InstanceMonitorMBean.java b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/InstanceMonitorMBean.java
deleted file mode 100644
index a3221d8..0000000
--- a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/InstanceMonitorMBean.java
+++ /dev/null
@@ -1,51 +0,0 @@
-package org.apache.helix.monitoring.mbeans;
-
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
-import org.apache.helix.monitoring.SensorNameProvider;
-
-/**
- * A basic bean describing the status of a single instance
- */
-public interface InstanceMonitorMBean extends SensorNameProvider {
-  /**
-   * Check if this instance is live
-   * @return 1 if running, 0 otherwise
-   */
-  public long getOnline();
-
-  /**
-   * Check if this instance is enabled
-   * @return 1 if enabled, 0 if disabled
-   */
-  public long getEnabled();
-
-  /**
-   * Get total message received for this instances
-   * @return The total number of messages sent to this instance
-   */
-  public long getTotalMessageReceived();
-
-  /**
-   * Get the total disabled partitions number for this instance
-   * @return The total number of disabled partitions
-   */
-  public long getDisabledPartitions();
-}
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/MonitorDomainNames.java b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/MonitorDomainNames.java
index 73bf057..fee9099 100644
--- a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/MonitorDomainNames.java
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/MonitorDomainNames.java
@@ -28,5 +28,6 @@
   HelixThreadPoolExecutor,
   HelixCallback,
   RoutingTableProvider,
-  CLMParticipantReport
+  CLMParticipantReport,
+  Rebalancer
 }
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/ResourceMonitor.java b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/ResourceMonitor.java
index d7a368e..af9c318 100644
--- a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/ResourceMonitor.java
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/ResourceMonitor.java
@@ -19,18 +19,19 @@
  * under the License.
  */
 
-import java.util.ArrayList;
 import java.util.Collections;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.TimeUnit;
 import javax.management.JMException;
 import javax.management.ObjectName;
 
 import com.codahale.metrics.Histogram;
 import com.codahale.metrics.SlidingTimeWindowArrayReservoir;
+import com.google.common.collect.Lists;
 import org.apache.helix.HelixDefinedState;
 import org.apache.helix.model.ExternalView;
 import org.apache.helix.model.IdealState;
@@ -49,6 +50,8 @@
     INTERMEDIATE_STATE_CAL_FAILED
   }
 
+  private static final String GAUGE_METRIC_SUFFIX = "Gauge";
+
   // Gauges
   private SimpleDynamicMetric<Long> _numOfPartitions;
   private SimpleDynamicMetric<Long> _numOfPartitionsInExternalView;
@@ -83,31 +86,13 @@
   private final String _clusterName;
   private final ObjectName _initObjectName;
 
+  // A map of dynamic capacity Gauges. The map's keys could change.
+  private final Map<String, SimpleDynamicMetric<Long>> _dynamicCapacityMetricsMap;
+
   @Override
-  public ResourceMonitor register() throws JMException {
-    List<DynamicMetric<?, ?>> attributeList = new ArrayList<>();
-    attributeList.add(_numOfPartitions);
-    attributeList.add(_numOfPartitionsInExternalView);
-    attributeList.add(_numOfErrorPartitions);
-    attributeList.add(_numNonTopStatePartitions);
-    attributeList.add(_numLessMinActiveReplicaPartitions);
-    attributeList.add(_numLessReplicaPartitions);
-    attributeList.add(_numPendingRecoveryRebalancePartitions);
-    attributeList.add(_numPendingLoadRebalancePartitions);
-    attributeList.add(_numRecoveryRebalanceThrottledPartitions);
-    attributeList.add(_numLoadRebalanceThrottledPartitions);
-    attributeList.add(_externalViewIdealStateDiff);
-    attributeList.add(_successfulTopStateHandoffDurationCounter);
-    attributeList.add(_successTopStateHandoffCounter);
-    attributeList.add(_failedTopStateHandoffCounter);
-    attributeList.add(_maxSinglePartitionTopStateHandoffDuration);
-    attributeList.add(_partitionTopStateHandoffDurationGauge);
-    attributeList.add(_partitionTopStateHandoffHelixLatencyGauge);
-    attributeList.add(_partitionTopStateNonGracefulHandoffDurationGauge);
-    attributeList.add(_totalMessageReceived);
-    attributeList.add(_numPendingStateTransitions);
-    attributeList.add(_rebalanceState);
-    doRegister(attributeList, _initObjectName);
+  public DynamicMBeanProvider register() throws JMException {
+    doRegister(buildAttributeList(), _initObjectName);
+
     return this;
   }
 
@@ -116,10 +101,12 @@
   }
 
   @SuppressWarnings("unchecked")
-  public ResourceMonitor(String clusterName, String resourceName, ObjectName objectName) {
+  public ResourceMonitor(String clusterName, String resourceName, ObjectName objectName)
+      throws JMException {
     _clusterName = clusterName;
     _resourceName = resourceName;
     _initObjectName = objectName;
+    _dynamicCapacityMetricsMap = new ConcurrentHashMap<>();
 
     _externalViewIdealStateDiff = new SimpleDynamicMetric("DifferenceWithIdealStateGauge", 0L);
     _numLoadRebalanceThrottledPartitions =
@@ -382,6 +369,36 @@
     _numLoadRebalanceThrottledPartitions.updateValue(numLoadRebalanceThrottledPartitions);
   }
 
+  /**
+   * Updates partition weight metric. If the partition capacity keys are changed, all MBean
+   * attributes will be updated accordingly: old capacity keys will be replaced with new capacity
+   * keys in the MBean server.
+   *
+   * @param partitionWeightMap A map of partition weight: capacity key -> partition weight
+   */
+  void updatePartitionWeightStats(Map<String, Integer> partitionWeightMap) {
+    synchronized (_dynamicCapacityMetricsMap) {
+      if (_dynamicCapacityMetricsMap.keySet().equals(partitionWeightMap.keySet())) {
+        for (Map.Entry<String, Integer> entry : partitionWeightMap.entrySet()) {
+          _dynamicCapacityMetricsMap.get(entry.getKey()).updateValue((long) entry.getValue());
+        }
+        return;
+      }
+
+      // Capacity keys are changed, so capacity attribute map needs to be updated.
+      _dynamicCapacityMetricsMap.clear();
+      for (Map.Entry<String, Integer> entry : partitionWeightMap.entrySet()) {
+        String capacityKey = entry.getKey();
+        _dynamicCapacityMetricsMap.put(capacityKey,
+            new SimpleDynamicMetric<>(capacityKey + GAUGE_METRIC_SUFFIX, (long) entry.getValue()));
+      }
+    }
+
+    // Update all MBean attributes.
+    updateAttributesInfo(buildAttributeList(),
+        "Resource monitor for resource: " + getResourceName());
+  }
+
   public void setRebalanceState(RebalanceStatus state) {
     _rebalanceState.updateValue(state.name());
   }
@@ -428,4 +445,34 @@
       _lastResetTime = System.currentTimeMillis();
     }
   }
+
+  private List<DynamicMetric<?, ?>> buildAttributeList() {
+    List<DynamicMetric<?, ?>> attributeList = Lists.newArrayList(
+        _numOfPartitions,
+        _numOfPartitionsInExternalView,
+        _numOfErrorPartitions,
+        _numNonTopStatePartitions,
+        _numLessMinActiveReplicaPartitions,
+        _numLessReplicaPartitions,
+        _numPendingRecoveryRebalancePartitions,
+        _numPendingLoadRebalancePartitions,
+        _numRecoveryRebalanceThrottledPartitions,
+        _numLoadRebalanceThrottledPartitions,
+        _externalViewIdealStateDiff,
+        _successfulTopStateHandoffDurationCounter,
+        _successTopStateHandoffCounter,
+        _failedTopStateHandoffCounter,
+        _maxSinglePartitionTopStateHandoffDuration,
+        _partitionTopStateHandoffDurationGauge,
+        _partitionTopStateHandoffHelixLatencyGauge,
+        _partitionTopStateNonGracefulHandoffDurationGauge,
+        _totalMessageReceived,
+        _numPendingStateTransitions,
+        _rebalanceState
+    );
+
+    attributeList.addAll(_dynamicCapacityMetricsMap.values());
+
+    return attributeList;
+  }
 }
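
ResourceMonitor (and InstanceMonitor above) follow the same pattern whenever the dynamic metric
set changes. A hypothetical helper, not in this patch, that condenses it using the class members
shown above:

    // Hypothetical helper illustrating the pattern: update or create the gauge for a
    // capacity key, then rebuild the full attribute list so the MBean info stays current.
    private void upsertCapacityGauge(String capacityKey, long value) {
      synchronized (_dynamicCapacityMetricsMap) {
        _dynamicCapacityMetricsMap
            .computeIfAbsent(capacityKey, k -> new SimpleDynamicMetric<>(k + GAUGE_METRIC_SUFFIX, 0L))
            .updateValue(value);
      }
      updateAttributesInfo(buildAttributeList(),
          "Resource monitor for resource: " + getResourceName());
    }
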
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/dynamicMBeans/DynamicMBeanProvider.java b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/dynamicMBeans/DynamicMBeanProvider.java
index 0ce0b44..407a714 100644
--- a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/dynamicMBeans/DynamicMBeanProvider.java
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/dynamicMBeans/DynamicMBeanProvider.java
@@ -22,23 +22,19 @@
 import java.util.ArrayList;
 import java.util.Collection;
 import java.util.HashMap;
-import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
 import javax.management.Attribute;
 import javax.management.AttributeList;
 import javax.management.AttributeNotFoundException;
 import javax.management.DynamicMBean;
-import javax.management.InvalidAttributeValueException;
 import javax.management.JMException;
 import javax.management.MBeanAttributeInfo;
 import javax.management.MBeanConstructorInfo;
-import javax.management.MBeanException;
 import javax.management.MBeanInfo;
 import javax.management.MBeanNotificationInfo;
 import javax.management.MBeanOperationInfo;
 import javax.management.ObjectName;
-import javax.management.ReflectionException;
 
 import org.apache.helix.SystemPropertyKeys;
 import org.apache.helix.monitoring.SensorNameProvider;
@@ -53,12 +49,12 @@
 public abstract class DynamicMBeanProvider implements DynamicMBean, SensorNameProvider {
   protected final Logger _logger = LoggerFactory.getLogger(getClass());
   protected static final long DEFAULT_RESET_INTERVAL_MS = 60 * 60 * 1000; // Reset time every hour
-  private static String SENSOR_NAME_TAG = "SensorName";
-  private static String DEFAULT_DESCRIPTION =
+  private static final String SENSOR_NAME_TAG = "SensorName";
+  private static final String DEFAULT_DESCRIPTION =
       "Information on the management interface of the MBean";
 
   // Attribute name to the DynamicMetric object mapping
-  private final Map<String, DynamicMetric> _attributeMap = new HashMap<>();
+  private Map<String, DynamicMetric> _attributeMap = new HashMap<>();
   private ObjectName _objectName = null;
   private MBeanInfo _mBeanInfo;
 
@@ -88,7 +84,7 @@
           objectName.getCanonicalName());
       return false;
     }
-    updateAttributtInfos(dynamicMetrics, description);
+    updateAttributesInfo(dynamicMetrics, description);
     _objectName = MBeanRegistrar.register(this, objectName);
     return true;
   }
@@ -99,26 +95,30 @@
   }
 
   /**
-   * Update the Dynamic MBean provider with new metric list.
+   * Updates the Dynamic MBean provider with a new metric list.
+   * If the passed-in metrics collection is empty, the original attributes will be removed.
+   *
    * @param description description of the MBean
-   * @param dynamicMetrics the DynamicMetrics
+   * @param dynamicMetrics the DynamicMetrics; an empty collection removes the metric attributes.
    */
-  private void updateAttributtInfos(Collection<DynamicMetric<?, ?>> dynamicMetrics,
+  protected void updateAttributesInfo(Collection<DynamicMetric<?, ?>> dynamicMetrics,
       String description) {
-    _attributeMap.clear();
+    if (dynamicMetrics == null) {
+      _logger.warn("Cannot update attributes info because dynamicMetrics is null.");
+      return;
+    }
 
-    // get all attributes that can be emit by the dynamicMetrics.
     List<MBeanAttributeInfo> attributeInfoList = new ArrayList<>();
-    if (dynamicMetrics != null) {
-      for (DynamicMetric dynamicMetric : dynamicMetrics) {
-        Iterator<MBeanAttributeInfo> iter = dynamicMetric.getAttributeInfos().iterator();
-        while (iter.hasNext()) {
-          MBeanAttributeInfo attributeInfo = iter.next();
-          // Info list to create MBean info
-          attributeInfoList.add(attributeInfo);
-          // Attribute mapping for getting attribute value when getAttribute() is called
-          _attributeMap.put(attributeInfo.getName(), dynamicMetric);
-        }
+    // Use a new attribute map to avoid concurrency issues.
+    Map<String, DynamicMetric> newAttributeMap = new HashMap<>();
+
+    // Get all attributes that can be emitted by the dynamicMetrics.
+    for (DynamicMetric<?, ?> dynamicMetric : dynamicMetrics) {
+      for (MBeanAttributeInfo attributeInfo : dynamicMetric.getAttributeInfos()) {
+        // Info list to create MBean info
+        attributeInfoList.add(attributeInfo);
+        // Attribute mapping for getting attribute value when getAttribute() is called
+        newAttributeMap.put(attributeInfo.getName(), dynamicMetric);
       }
     }
 
@@ -130,17 +130,19 @@
         String.format("Default %s Constructor", getClass().getSimpleName()),
         getClass().getConstructors()[0]);
 
-    MBeanAttributeInfo[] attributeInfos = new MBeanAttributeInfo[attributeInfoList.size()];
-    attributeInfos = attributeInfoList.toArray(attributeInfos);
+    MBeanAttributeInfo[] attributesInfo = new MBeanAttributeInfo[attributeInfoList.size()];
+    attributesInfo = attributeInfoList.toArray(attributesInfo);
 
     if (description == null) {
       description = DEFAULT_DESCRIPTION;
     }
 
-    _mBeanInfo = new MBeanInfo(getClass().getName(), description, attributeInfos,
-        new MBeanConstructorInfo[] {
-            constructorInfo
-        }, new MBeanOperationInfo[0], new MBeanNotificationInfo[0]);
+    _mBeanInfo = new MBeanInfo(getClass().getName(), description, attributesInfo,
+        new MBeanConstructorInfo[]{constructorInfo}, new MBeanOperationInfo[0],
+        new MBeanNotificationInfo[0]);
+
+    // Update _attributeMap reference.
+    _attributeMap = newAttributeMap;
   }
 
   /**
@@ -158,17 +160,17 @@
   }
 
   @Override
-  public Object getAttribute(String attribute)
-      throws AttributeNotFoundException, MBeanException, ReflectionException {
+  public Object getAttribute(String attribute) throws AttributeNotFoundException {
     if (SENSOR_NAME_TAG.equals(attribute)) {
       return getSensorName();
     }
 
-    if (!_attributeMap.containsKey(attribute)) {
-      return null;
+    DynamicMetric metric = _attributeMap.get(attribute);
+    if (metric == null) {
+      throw new AttributeNotFoundException("Attribute[" + attribute + "] is not found.");
     }
 
-    return _attributeMap.get(attribute).getAttributeValue(attribute);
+    return metric.getAttributeValue(attribute);
   }
 
   @Override
@@ -178,7 +180,7 @@
       try {
         Object value = getAttribute(attributeName);
         attributeList.add(new Attribute(attributeName, value));
-      } catch (AttributeNotFoundException | MBeanException | ReflectionException ex) {
+      } catch (AttributeNotFoundException ex) {
         _logger.error("Failed to get attribute: " + attributeName, ex);
       }
     }
@@ -191,8 +193,7 @@
   }
 
   @Override
-  public void setAttribute(Attribute attribute) throws AttributeNotFoundException,
-      InvalidAttributeValueException, MBeanException, ReflectionException {
+  public void setAttribute(Attribute attribute) {
     // All MBeans are readonly
     return;
   }
@@ -204,8 +205,7 @@
   }
 
   @Override
-  public Object invoke(String actionName, Object[] params, String[] signature)
-      throws MBeanException, ReflectionException {
+  public Object invoke(String actionName, Object[] params, String[] signature) {
     // No operation supported
     return null;
   }
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/dynamicMBeans/SimpleDynamicMetric.java b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/dynamicMBeans/SimpleDynamicMetric.java
index 1be6a21..2b0f1db 100644
--- a/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/dynamicMBeans/SimpleDynamicMetric.java
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/mbeans/dynamicMBeans/SimpleDynamicMetric.java
@@ -25,7 +25,7 @@
  * @param <T> the type of the metric value
  */
 public class SimpleDynamicMetric<T> extends DynamicMetric<T, T> {
-  private final String _metricName;
+  protected final String _metricName;
 
   /**
    * Instantiates a new Simple dynamic metric.
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/metrics/MetricCollector.java b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/MetricCollector.java
new file mode 100644
index 0000000..b08a840
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/MetricCollector.java
@@ -0,0 +1,99 @@
+package org.apache.helix.monitoring.metrics;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import javax.management.JMException;
+import javax.management.ObjectName;
+import org.apache.helix.HelixException;
+import org.apache.helix.monitoring.metrics.model.Metric;
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMBeanProvider;
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric;
+
+/**
+ * Collects and manages all metrics that implement the {@link Metric} interface.
+ */
+public abstract class MetricCollector extends DynamicMBeanProvider {
+  private static final String CLUSTER_NAME_KEY = "ClusterName";
+  private static final String ENTITY_NAME_KEY = "EntityName";
+  private final String _monitorDomainName;
+  private final String _clusterName;
+  private final String _entityName;
+  private Map<String, Metric> _metricMap;
+
+  public MetricCollector(String monitorDomainName, String clusterName, String entityName) {
+    _monitorDomainName = monitorDomainName;
+    _clusterName = clusterName;
+    _entityName = entityName;
+    _metricMap = new HashMap<>();
+  }
+
+  @Override
+  public DynamicMBeanProvider register() throws JMException {
+    // First cast all Metric objects to DynamicMetrics
+    Collection<DynamicMetric<?, ?>> dynamicMetrics = new HashSet<>();
+    _metricMap.values().forEach(metric -> dynamicMetrics.add(metric.getDynamicMetric()));
+
+    // Define MBeanName and ObjectName
+    // MBean name has two key-value pairs:
+    // ------ 1) ClusterName KV pair (first %s=%s)
+    // ------ 2) EntityName KV pair (second %s=%s)
+    String mbeanName =
+        String.format("%s=%s, %s=%s", CLUSTER_NAME_KEY, _clusterName, ENTITY_NAME_KEY, _entityName);
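+    // e.g. "ClusterName=TestCluster, EntityName=WagedRebalancer" (hypothetical values)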
+
+    // ObjectName has one key-value pair:
+    // ------ 1) Monitor domain name KV pair where value is the MBean name
+    doRegister(dynamicMetrics,
+        new ObjectName(String.format("%s:%s", _monitorDomainName, mbeanName)));
+    return this;
+  }
+
+  @Override
+  public String getSensorName() {
+    return String.format("%s.%s.%s", _monitorDomainName, _clusterName,
+        _entityName);
+  }
+
+  void addMetric(Metric metric) {
+    if (metric instanceof DynamicMetric) {
+      _metricMap.putIfAbsent(metric.getMetricName(), metric);
+    } else {
+      throw new HelixException("MetricCollector only supports Metrics that are DynamicMetric!");
+    }
+  }
+
+  /**
+   * Returns the metric cast to the desired type.
+   * @param metricName the name of the metric to look up
+   * @param metricClass the desired metric type
+   * @param <T> the type the metric is cast to
+   * @return the metric cast to the given type, or null if no metric with that name exists
+   */
+  public <T extends DynamicMetric> T getMetric(String metricName, Class<T> metricClass) {
+    return metricClass.cast(_metricMap.get(metricName));
+  }
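+
+  // Usage sketch (illustrative): a concrete collector can expose typed metrics, e.g.
+  //   RebalanceFailureCount failures =
+  //       collector.getMetric("RebalanceFailureCounter", RebalanceFailureCount.class);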
+
+  public Map<String, Metric> getMetricMap() {
+    return _metricMap;
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/metrics/WagedRebalancerMetricCollector.java b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/WagedRebalancerMetricCollector.java
new file mode 100644
index 0000000..df8b60f
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/WagedRebalancerMetricCollector.java
@@ -0,0 +1,125 @@
+package org.apache.helix.monitoring.metrics;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import javax.management.JMException;
+
+import org.apache.helix.HelixException;
+import org.apache.helix.monitoring.mbeans.MonitorDomainNames;
+import org.apache.helix.monitoring.metrics.implementation.BaselineDivergenceGauge;
+import org.apache.helix.monitoring.metrics.implementation.RebalanceCounter;
+import org.apache.helix.monitoring.metrics.implementation.RebalanceFailureCount;
+import org.apache.helix.monitoring.metrics.implementation.RebalanceLatencyGauge;
+import org.apache.helix.monitoring.metrics.model.CountMetric;
+import org.apache.helix.monitoring.metrics.model.LatencyMetric;
+import org.apache.helix.monitoring.metrics.model.RatioMetric;
+
+
+public class WagedRebalancerMetricCollector extends MetricCollector {
+  private static final String WAGED_REBALANCER_ENTITY_NAME = "WagedRebalancer";
+
+  /**
+   * This enum class contains all metric names defined for WagedRebalancer. Note that all enums are
+   * in camel case for readability.
+   */
+  public enum WagedRebalancerMetricNames {
+    // Per-stage latency metrics
+    GlobalBaselineCalcLatencyGauge,
+    PartialRebalanceLatencyGauge,
+
+    // The following latency metrics are related to AssignmentMetadataStore
+    StateReadLatencyGauge,
+    StateWriteLatencyGauge,
+
+    /*
+     * Gauge of the difference (state and partition allocation) between the baseline and the best
+     * possible assignment.
+     */
+    BaselineDivergenceGauge,
+
+    // Count of any rebalance compute failure.
+    // Note that the rebalancer may still be able to return the last known-good assignment on a
+    // rebalance compute failure; this fallback logic does not affect the count.
+    RebalanceFailureCounter,
+
+    // Waged rebalance counters.
+    GlobalBaselineCalcCounter,
+    PartialRebalanceCounter
+  }
+
+  public WagedRebalancerMetricCollector(String clusterName) {
+    super(MonitorDomainNames.Rebalancer.name(), clusterName, WAGED_REBALANCER_ENTITY_NAME);
+    createMetrics();
+    if (clusterName != null) {
+      try {
+        register();
+      } catch (JMException e) {
+        throw new HelixException("Failed to register MBean for the WagedRebalancerMetricCollector.",
+            e);
+      }
+    }
+  }
+
+  /**
+   * Creates the metrics but does not register them. This constructor is used when a JMException
+   * has been thrown, so that the rebalancer can proceed without registering and emitting metrics.
+   */
+  public WagedRebalancerMetricCollector() {
+    this(null);
+  }
+
+  /**
+   * Creates all metrics and adds them to this MetricCollector for WagedRebalancer.
+   */
+  private void createMetrics() {
+    // Define all metrics
+    LatencyMetric globalBaselineCalcLatencyGauge =
+        new RebalanceLatencyGauge(WagedRebalancerMetricNames.GlobalBaselineCalcLatencyGauge.name(),
+            getResetIntervalInMs());
+    LatencyMetric partialRebalanceLatencyGauge =
+        new RebalanceLatencyGauge(WagedRebalancerMetricNames.PartialRebalanceLatencyGauge.name(),
+            getResetIntervalInMs());
+    LatencyMetric stateReadLatencyGauge =
+        new RebalanceLatencyGauge(WagedRebalancerMetricNames.StateReadLatencyGauge.name(),
+            getResetIntervalInMs());
+    LatencyMetric stateWriteLatencyGauge =
+        new RebalanceLatencyGauge(WagedRebalancerMetricNames.StateWriteLatencyGauge.name(),
+            getResetIntervalInMs());
+    RatioMetric baselineDivergenceGauge =
+        new BaselineDivergenceGauge(WagedRebalancerMetricNames.BaselineDivergenceGauge.name());
+    CountMetric calcFailureCount =
+        new RebalanceFailureCount(WagedRebalancerMetricNames.RebalanceFailureCounter.name());
+    CountMetric globalBaselineCalcCounter =
+        new RebalanceCounter(WagedRebalancerMetricNames.GlobalBaselineCalcCounter.name());
+    CountMetric partialRebalanceCounter =
+        new RebalanceCounter(WagedRebalancerMetricNames.PartialRebalanceCounter.name());
+
+    // Add metrics to WagedRebalancerMetricCollector
+    addMetric(globalBaselineCalcLatencyGauge);
+    addMetric(partialRebalanceLatencyGauge);
+    addMetric(stateReadLatencyGauge);
+    addMetric(stateWriteLatencyGauge);
+    addMetric(baselineDivergenceGauge);
+    addMetric(calcFailureCount);
+    addMetric(globalBaselineCalcCounter);
+    addMetric(partialRebalanceCounter);
+  }
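+
+  // Usage sketch (illustrative variable names): measure one global baseline calculation.
+  //   WagedRebalancerMetricCollector collector = new WagedRebalancerMetricCollector(clusterName);
+  //   LatencyMetric latency = collector.getMetric(
+  //       WagedRebalancerMetricNames.GlobalBaselineCalcLatencyGauge.name(),
+  //       RebalanceLatencyGauge.class);
+  //   latency.startMeasuringLatency();
+  //   // ... run the global baseline calculation ...
+  //   latency.endMeasuringLatency();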
+}
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/BaselineDivergenceGauge.java b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/BaselineDivergenceGauge.java
new file mode 100644
index 0000000..8e6d49b
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/BaselineDivergenceGauge.java
@@ -0,0 +1,68 @@
+package org.apache.helix.monitoring.metrics.implementation;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Map;
+import java.util.concurrent.ExecutorService;
+
+import org.apache.helix.controller.pipeline.AbstractBaseStage;
+import org.apache.helix.controller.rebalancer.util.ResourceUsageCalculator;
+import org.apache.helix.model.ResourceAssignment;
+import org.apache.helix.monitoring.metrics.model.RatioMetric;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * Gauge of the difference (state and partition allocation) between the baseline and the best
+ * possible assignment. Its value range is [0.0, 1.0].
+ */
+public class BaselineDivergenceGauge extends RatioMetric {
+  private static final Logger LOG = LoggerFactory.getLogger(BaselineDivergenceGauge.class);
+
+  /**
+   * Instantiates a new baseline divergence gauge.
+   * @param metricName the metric name
+   */
+  public BaselineDivergenceGauge(String metricName) {
+    super(metricName, 0.0d);
+  }
+
+  /**
+   * Asynchronously measure and update metric value.
+   * @param threadPool an executor service to asynchronously run the task
+   * @param baseline baseline assignment
+   * @param bestPossibleAssignment best possible assignment
+   */
+  public void asyncMeasureAndUpdateValue(ExecutorService threadPool,
+      Map<String, ResourceAssignment> baseline,
+      Map<String, ResourceAssignment> bestPossibleAssignment) {
+    AbstractBaseStage.asyncExecute(threadPool, () -> {
+      try {
+        double baselineDivergence =
+            ResourceUsageCalculator.measureBaselineDivergence(baseline, bestPossibleAssignment);
+        updateValue(baselineDivergence);
+      } catch (Exception e) {
+        LOG.error("Failed to report BaselineDivergenceGauge metric.", e);
+      }
+      return null;
+    });
+  }
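+
+  // Usage sketch (hypothetical executor): the divergence computation runs off the caller's thread.
+  //   gauge.asyncMeasureAndUpdateValue(Executors.newSingleThreadExecutor(), baseline, bestPossible);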
+}
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceCounter.java b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceCounter.java
new file mode 100644
index 0000000..8ecce7c
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceCounter.java
@@ -0,0 +1,36 @@
+package org.apache.helix.monitoring.metrics.implementation;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.monitoring.metrics.model.CountMetric;
+
+
+/**
+ * Reports counter-type metrics related to rebalance. The counter value increases monotonically.
+ */
+public class RebalanceCounter extends CountMetric {
+  /**
+   * Instantiates a new rebalance count metric.
+   * @param metricName the metric name
+   */
+  public RebalanceCounter(String metricName) {
+    super(metricName, 0L);
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceFailureCount.java b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceFailureCount.java
new file mode 100644
index 0000000..fd335f2
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceFailureCount.java
@@ -0,0 +1,34 @@
+package org.apache.helix.monitoring.metrics.implementation;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.monitoring.metrics.model.CountMetric;
+
+
+public class RebalanceFailureCount extends CountMetric {
+  /**
+   * Instantiates a new rebalance failure count metric.
+   *
+   * @param metricName the metric name
+   */
+  public RebalanceFailureCount(String metricName) {
+    super(metricName, 0L);
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceLatencyGauge.java b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceLatencyGauge.java
new file mode 100644
index 0000000..b0c563b
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceLatencyGauge.java
@@ -0,0 +1,89 @@
+package org.apache.helix.monitoring.metrics.implementation;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.concurrent.TimeUnit;
+
+import com.codahale.metrics.Histogram;
+import com.codahale.metrics.SlidingTimeWindowArrayReservoir;
+import org.apache.helix.monitoring.metrics.model.LatencyMetric;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class RebalanceLatencyGauge extends LatencyMetric {
+  private static final Logger LOG = LoggerFactory.getLogger(RebalanceLatencyGauge.class);
+  private static final long VALUE_NOT_SET = -1;
+  private long _lastEmittedMetricValue = VALUE_NOT_SET;
+  // Use a ThreadLocal here so the start time can be updated and recorded by multiple threads.
+  private final ThreadLocal<Long> _startTime;
+
+  /**
+   * Instantiates a new latency gauge backed by a sliding-time-window histogram.
+   * @param metricName the metric name
+   * @param slidingTimeWindow the length of the sliding time window in milliseconds
+   */
+  public RebalanceLatencyGauge(String metricName, long slidingTimeWindow) {
+    super(metricName, new Histogram(
+        new SlidingTimeWindowArrayReservoir(slidingTimeWindow, TimeUnit.MILLISECONDS)));
+    _metricName = metricName;
+    _startTime = ThreadLocal.withInitial(() -> VALUE_NOT_SET);
+  }
+
+  /**
+   * Calling this method multiple times simply overwrites the previous state. This is deliberate:
+   * the rebalancer could fail at any point, and we want it to recover gracefully by resetting the
+   * internal state of this metric.
+   */
+  @Override
+  public void startMeasuringLatency() {
+    reset();
+    _startTime.set(System.currentTimeMillis());
+  }
+
+  @Override
+  public void endMeasuringLatency() {
+    if (_startTime.get() == VALUE_NOT_SET) {
+      LOG.error(
+          "Needs to call startMeasuringLatency first! Ignoring and resetting the metric. Metric name: {}",
+          _metricName);
+      return;
+    }
+    synchronized (this) {
+      _lastEmittedMetricValue = System.currentTimeMillis() - _startTime.get();
+      updateValue(_lastEmittedMetricValue);
+    }
+    reset();
+  }
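+
+  // Expected call pattern (per thread): startMeasuringLatency(); <do work>; endMeasuringLatency();
+  // an end without a matching start is logged and ignored, as shown above.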
+
+  /**
+   * Returns the most recently emitted metric value at the time of the call.
+   * @return the last emitted latency in milliseconds, or -1 if no value has been emitted yet
+   */
+  @Override
+  public Long getLastEmittedMetricValue() {
+    return _lastEmittedMetricValue;
+  }
+
+  /**
+   * Resets the internal state of this metric.
+   */
+  private void reset() {
+    _startTime.set(VALUE_NOT_SET);
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/CountMetric.java b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/CountMetric.java
new file mode 100644
index 0000000..c64f761
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/CountMetric.java
@@ -0,0 +1,69 @@
+package org.apache.helix.monitoring.metrics.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric;
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.SimpleDynamicMetric;
+
+/**
+ * Represents a count metric and defines methods to help with calculation. A count metric reports
+ * the accumulated count of a certain property.
+ */
+public abstract class CountMetric extends SimpleDynamicMetric<Long> implements Metric<Long> {
+
+  /**
+   * Instantiates a new count metric.
+   *
+   * @param metricName the metric name
+   * @param initCount the initial count
+   */
+  public CountMetric(String metricName, long initCount) {
+    super(metricName, initCount);
+  }
+
+  /**
+   * Increments the metric by the input count.
+   *
+   * @param count the amount to add to the current count
+   */
+  public void increment(long count) {
+    updateValue(getValue() + count);
+  }
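+
+  // e.g. increment(1L) once per counted event, such as a completed rebalance (illustrative).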
+
+  @Override
+  public String getMetricName() {
+    return _metricName;
+  }
+
+  @Override
+  public String toString() {
+    return String.format("Metric %s's count is %d", getMetricName(), getValue());
+  }
+
+  @Override
+  public Long getLastEmittedMetricValue() {
+    return getValue();
+  }
+
+  @Override
+  public DynamicMetric getDynamicMetric() {
+    return this;
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/LatencyMetric.java b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/LatencyMetric.java
new file mode 100644
index 0000000..733635e
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/LatencyMetric.java
@@ -0,0 +1,67 @@
+package org.apache.helix.monitoring.metrics.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import com.codahale.metrics.Histogram;
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric;
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.HistogramDynamicMetric;
+
+/**
+ * Represents a latency metric and defines methods to help with calculation. A latency metric
+ * reports how long a particular stage in the logic takes, in milliseconds.
+ */
+public abstract class LatencyMetric extends HistogramDynamicMetric implements Metric<Long> {
+  protected String _metricName;
+
+  /**
+   * Instantiates a new Histogram dynamic metric.
+   * @param metricName the metric name
+   * @param metricObject the metric object
+   */
+  public LatencyMetric(String metricName, Histogram metricObject) {
+    super(metricName, metricObject);
+    _metricName = metricName;
+  }
+
+  /**
+   * Starts measuring the latency.
+   */
+  public abstract void startMeasuringLatency();
+
+  /**
+   * Ends measuring the latency.
+   */
+  public abstract void endMeasuringLatency();
+
+  @Override
+  public String getMetricName() {
+    return _metricName;
+  }
+
+  @Override
+  public String toString() {
+    return String.format("Metric %s's latency is %d", getMetricName(), getLastEmittedMetricValue());
+  }
+
+  @Override
+  public DynamicMetric getDynamicMetric() {
+    return this;
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/Metric.java b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/Metric.java
new file mode 100644
index 0000000..be7ea80
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/Metric.java
@@ -0,0 +1,50 @@
+package org.apache.helix.monitoring.metrics.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric;
+
+/**
+ * Defines a generic metric interface.
+ * @param <T> type of input value for the metric
+ */
+public interface Metric<T> {
+
+  /**
+   * Gets the name of the metric.
+   */
+  String getMetricName();
+
+  /**
+   * Returns a string representation of the metric along with its name.
+   */
+  String toString();
+
+  /**
+   * Returns the most recently emitted value for the metric at the time of the call.
+   * @return metric value
+   */
+  T getLastEmittedMetricValue();
+
+  /**
+   * Returns the underlying DynamicMetric.
+   */
+  DynamicMetric getDynamicMetric();
+}
diff --git a/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/RatioMetric.java b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/RatioMetric.java
new file mode 100644
index 0000000..d321e51
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/monitoring/metrics/model/RatioMetric.java
@@ -0,0 +1,58 @@
+package org.apache.helix.monitoring.metrics.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric;
+import org.apache.helix.monitoring.mbeans.dynamicMBeans.SimpleDynamicMetric;
+
+
+/**
+ * A gauge which defines the ratio of one value to another.
+ */
+public abstract class RatioMetric extends SimpleDynamicMetric<Double> implements Metric<Double> {
+  /**
+   * Instantiates a new ratio metric.
+   * @param metricName the metric name
+   * @param metricObject the initial metric value
+   */
+  public RatioMetric(String metricName, double metricObject) {
+    super(metricName, metricObject);
+  }
+
+  @Override
+  public DynamicMetric getDynamicMetric() {
+    return this;
+  }
+
+  @Override
+  public String getMetricName() {
+    return _metricName;
+  }
+
+  @Override
+  public Double getLastEmittedMetricValue() {
+    return getValue();
+  }
+
+  @Override
+  public String toString() {
+    return String.format("Metric name: %s, metric value: %f", getMetricName(), getValue());
+  }
+}
diff --git a/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/BestPossibleExternalViewVerifier.java b/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/BestPossibleExternalViewVerifier.java
index d190976..66143fe 100644
--- a/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/BestPossibleExternalViewVerifier.java
+++ b/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/BestPossibleExternalViewVerifier.java
@@ -27,27 +27,37 @@
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
+import java.util.Optional;
 import java.util.Set;
 
 import org.apache.helix.HelixDefinedState;
+import org.apache.helix.HelixRebalanceException;
 import org.apache.helix.PropertyKey;
 import org.apache.helix.controller.common.PartitionStateMap;
 import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
 import org.apache.helix.controller.pipeline.Stage;
 import org.apache.helix.controller.pipeline.StageContext;
+import org.apache.helix.controller.rebalancer.waged.AssignmentMetadataStore;
+import org.apache.helix.controller.rebalancer.waged.RebalanceAlgorithm;
+import org.apache.helix.controller.rebalancer.waged.WagedRebalancer;
+import org.apache.helix.controller.rebalancer.waged.constraints.ConstraintBasedAlgorithmFactory;
 import org.apache.helix.controller.stages.AttributeName;
 import org.apache.helix.controller.stages.BestPossibleStateCalcStage;
 import org.apache.helix.controller.stages.BestPossibleStateOutput;
 import org.apache.helix.controller.stages.ClusterEvent;
 import org.apache.helix.controller.stages.ClusterEventType;
 import org.apache.helix.controller.stages.CurrentStateComputationStage;
+import org.apache.helix.controller.stages.CurrentStateOutput;
 import org.apache.helix.controller.stages.ResourceComputationStage;
+import org.apache.helix.manager.zk.ZkBucketDataAccessor;
 import org.apache.helix.manager.zk.ZkClient;
 import org.apache.helix.manager.zk.client.HelixZkClient;
+import org.apache.helix.model.ClusterConfig;
 import org.apache.helix.model.ExternalView;
 import org.apache.helix.model.IdealState;
 import org.apache.helix.model.Partition;
 import org.apache.helix.model.Resource;
+import org.apache.helix.model.ResourceAssignment;
 import org.apache.helix.model.StateModelDefinition;
 import org.apache.helix.task.TaskConstants;
 import org.slf4j.Logger;
@@ -377,8 +387,16 @@
     }
 
     runStage(event, new CurrentStateComputationStage());
-    // TODO: be caution here, should be handled statelessly.
-    runStage(event, new BestPossibleStateCalcStage());
+    // Note the dryrunWagedRebalancer is just for one-time use
+    DryrunWagedRebalancer dryrunWagedRebalancer =
+        new DryrunWagedRebalancer(_zkClient.getServers(), cache.getClusterName(),
+            cache.getClusterConfig().getGlobalRebalancePreference());
+    event.addAttribute(AttributeName.STATEFUL_REBALANCER.name(), dryrunWagedRebalancer);
+    try {
+      runStage(event, new BestPossibleStateCalcStage());
+    } finally {
+      dryrunWagedRebalancer.close();
+    }
 
     BestPossibleStateOutput output = event.getAttribute(AttributeName.BEST_POSSIBLE_STATE.name());
     return output;
@@ -398,4 +416,55 @@
     return verifierName + "(" + _clusterName + "@" + _zkClient + "@resources["
        + (_resources != null ? Arrays.toString(_resources.toArray()) : "") + "])";
   }
+
+  /**
+   * A dry-run WAGED rebalancer that only calculates the assignment based on the cluster status
+   * but never updates the rebalancer assignment metadata.
+   * This rebalancer is used in verifiers and tests.
+   */
+  private class DryrunWagedRebalancer extends WagedRebalancer {
+    DryrunWagedRebalancer(String metadataStoreAddrs, String clusterName,
+        Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> preferences) {
+      super(new ReadOnlyAssignmentMetadataStore(metadataStoreAddrs, clusterName),
+          ConstraintBasedAlgorithmFactory.getInstance(preferences), Optional.empty());
+    }
+
+    @Override
+    protected Map<String, ResourceAssignment> computeBestPossibleAssignment(
+        ResourceControllerDataProvider clusterData, Map<String, Resource> resourceMap,
+        Set<String> activeNodes, CurrentStateOutput currentStateOutput, RebalanceAlgorithm algorithm)
+        throws HelixRebalanceException {
+      return getBestPossibleAssignment(getAssignmentMetadataStore(), currentStateOutput,
+          resourceMap.keySet());
+    }
+  }
+
+  private class ReadOnlyAssignmentMetadataStore extends AssignmentMetadataStore {
+    ReadOnlyAssignmentMetadataStore(String metadataStoreAddrs, String clusterName) {
+      super(new ZkBucketDataAccessor(metadataStoreAddrs), clusterName);
+    }
+
+    @Override
+    public boolean persistBaseline(Map<String, ResourceAssignment> globalBaseline) {
+      // If baseline hasn't changed, skip writing to metadata store
+      if (compareAssignments(_globalBaseline, globalBaseline)) {
+        return false;
+      }
+      // Update the in-memory reference only
+      _globalBaseline = globalBaseline;
+      return true;
+    }
+
+    @Override
+    public boolean persistBestPossibleAssignment(
+        Map<String, ResourceAssignment> bestPossibleAssignment) {
+      // If bestPossibleAssignment hasn't changed, skip writing to metadata store
+      if (compareAssignments(_bestPossibleAssignment, bestPossibleAssignment)) {
+        return false;
+      }
+      // Update the in-memory reference only
+      _bestPossibleAssignment = bestPossibleAssignment;
+      return true;
+    }
+  }
 }
diff --git a/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/StrictMatchExternalViewVerifier.java b/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/StrictMatchExternalViewVerifier.java
index 13cc260..0b3c97e 100644
--- a/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/StrictMatchExternalViewVerifier.java
+++ b/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/StrictMatchExternalViewVerifier.java
@@ -23,7 +23,6 @@
 import java.util.Arrays;
 import java.util.Collections;
 import java.util.HashMap;
-import java.util.HashSet;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
@@ -56,19 +55,34 @@
 
   private final Set<String> _resources;
   private final Set<String> _expectLiveInstances;
+  private final boolean _isDeactivatedNodeAware;
 
+  @Deprecated
   public StrictMatchExternalViewVerifier(String zkAddr, String clusterName, Set<String> resources,
       Set<String> expectLiveInstances) {
+    this(zkAddr, clusterName, resources, expectLiveInstances, false);
+  }
+
+  @Deprecated
+  public StrictMatchExternalViewVerifier(HelixZkClient zkClient, String clusterName,
+      Set<String> resources, Set<String> expectLiveInstances) {
+    this(zkClient, clusterName, resources, expectLiveInstances, false);
+  }
+
+  private StrictMatchExternalViewVerifier(String zkAddr, String clusterName, Set<String> resources,
+      Set<String> expectLiveInstances, boolean isDeactivatedNodeAware) {
     super(zkAddr, clusterName);
     _resources = resources;
     _expectLiveInstances = expectLiveInstances;
+    _isDeactivatedNodeAware = isDeactivatedNodeAware;
   }
 
-  public StrictMatchExternalViewVerifier(HelixZkClient zkClient, String clusterName,
-      Set<String> resources, Set<String> expectLiveInstances) {
+  private StrictMatchExternalViewVerifier(HelixZkClient zkClient, String clusterName,
+      Set<String> resources, Set<String> expectLiveInstances, boolean isDeactivatedNodeAware) {
     super(zkClient, clusterName);
     _resources = resources;
     _expectLiveInstances = expectLiveInstances;
+    _isDeactivatedNodeAware = isDeactivatedNodeAware;
   }
 
   public static class Builder {
@@ -77,6 +91,8 @@
     private Set<String> _expectLiveInstances;
     private String _zkAddr;
     private HelixZkClient _zkClient;
+    // For backward compatibility, the default isDeactivatedNodeAware is false.
+    private boolean _isDeactivatedNodeAware = false;
 
     public StrictMatchExternalViewVerifier build() {
       if (_clusterName == null || (_zkAddr == null && _zkClient == null)) {
@@ -85,10 +101,10 @@
 
       if (_zkClient != null) {
         return new StrictMatchExternalViewVerifier(_zkClient, _clusterName, _resources,
-            _expectLiveInstances);
+            _expectLiveInstances, _isDeactivatedNodeAware);
       }
       return new StrictMatchExternalViewVerifier(_zkAddr, _clusterName, _resources,
-          _expectLiveInstances);
+          _expectLiveInstances, _isDeactivatedNodeAware);
     }
 
     public Builder(String clusterName) {
@@ -139,6 +155,15 @@
       _zkClient = zkClient;
       return this;
     }
+
+    public boolean getDeactivatedNodeAwareness() {
+      return _isDeactivatedNodeAware;
+    }
+
+    public Builder setDeactivatedNodeAwareness(boolean isDeactivatedNodeAware) {
+      _isDeactivatedNodeAware = isDeactivatedNodeAware;
+      return this;
+    }
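+
+    // Usage sketch (illustrative; a ZK address or client must also be provided before build()):
+    //   new StrictMatchExternalViewVerifier.Builder(clusterName)
+    //       .setDeactivatedNodeAwareness(true).build();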
   }
 
   @Override
@@ -278,17 +303,21 @@
     String stateModelDefName = idealState.getStateModelDefRef();
     StateModelDefinition stateModelDef = cache.getStateModelDef(stateModelDefName);
 
-    Map<String, Map<String, String>> idealPartitionState =
-        new HashMap<String, Map<String, String>>();
-
-    Set<String> liveEnabledInstances = new HashSet<String>(cache.getLiveInstances().keySet());
-    liveEnabledInstances.removeAll(cache.getDisabledInstances());
+    Map<String, Map<String, String>> idealPartitionState = new HashMap<>();
 
     for (String partition : idealState.getPartitionSet()) {
       List<String> preferenceList = AbstractRebalancer
-          .getPreferenceList(new Partition(partition), idealState, liveEnabledInstances);
-      Map<String, String> idealMapping =
-          HelixUtil.computeIdealMapping(preferenceList, stateModelDef, liveEnabledInstances);
+          .getPreferenceList(new Partition(partition), idealState, cache.getEnabledLiveInstances());
+      Map<String, String> idealMapping;
+      if (_isDeactivatedNodeAware) {
+        idealMapping = HelixUtil
+            .computeIdealMapping(preferenceList, stateModelDef, cache.getLiveInstances().keySet(),
+                cache.getDisabledInstancesForPartition(idealState.getResourceName(), partition));
+      } else {
+        idealMapping = HelixUtil
+            .computeIdealMapping(preferenceList, stateModelDef, cache.getEnabledLiveInstances(),
+                Collections.emptySet());
+      }
       idealPartitionState.put(partition, idealMapping);
     }
 
diff --git a/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/ZkHelixClusterVerifier.java b/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/ZkHelixClusterVerifier.java
index 020acbc..6efdff5 100644
--- a/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/ZkHelixClusterVerifier.java
+++ b/helix-core/src/main/java/org/apache/helix/tools/ClusterVerifiers/ZkHelixClusterVerifier.java
@@ -20,7 +20,6 @@
  */
 
 import java.util.List;
-import java.util.UUID;
 import java.util.concurrent.CountDownLatch;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;
@@ -37,7 +36,6 @@
 import org.apache.helix.manager.zk.ZkClient;
 import org.apache.helix.manager.zk.client.DedicatedZkClientFactory;
 import org.apache.helix.manager.zk.client.HelixZkClient;
-import org.apache.helix.model.ResourceConfig;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -168,10 +166,6 @@
       long start = System.currentTimeMillis();
       boolean success;
       do {
-        // Add a rebalance invoker in case some callbacks got buried - sometimes callbacks get
-        // processed even before changes get fully written to ZK.
-        invokeRebalance(_accessor);
-
         success = verifyState();
         if (success) {
           return true;
@@ -307,15 +301,4 @@
   public String getClusterName() {
     return _clusterName;
   }
-
-  /**
-   * Invoke a cluster rebalance in case some callbacks get ignored. This is for Helix integration
-   * testing purposes only.
-   */
-  public static synchronized void invokeRebalance(HelixDataAccessor accessor) {
-    String dummyName = UUID.randomUUID().toString();
-    ResourceConfig dummyConfig = new ResourceConfig(dummyName);
-    accessor.updateProperty(accessor.keyBuilder().resourceConfig(dummyName), dummyConfig);
-    accessor.removeProperty(accessor.keyBuilder().resourceConfig(dummyName));
-  }
 }
diff --git a/helix-core/src/main/java/org/apache/helix/util/HelixUtil.java b/helix-core/src/main/java/org/apache/helix/util/HelixUtil.java
index a31c3fe..bfda60e 100644
--- a/helix-core/src/main/java/org/apache/helix/util/HelixUtil.java
+++ b/helix-core/src/main/java/org/apache/helix/util/HelixUtil.java
@@ -201,23 +201,42 @@
    */
   public static Map<String, String> computeIdealMapping(List<String> preferenceList,
       StateModelDefinition stateModelDef, Set<String> liveAndEnabled) {
+    return computeIdealMapping(preferenceList, stateModelDef, liveAndEnabled,
+        Collections.emptySet());
+  }
+
+  /**
+   * Compute the ideal mapping for a resource in Full-Auto and Semi-Auto based on its preference
+   * list, taking into account the instances that are disabled for the partition.
+   */
+  public static Map<String, String> computeIdealMapping(List<String> preferenceList,
+      StateModelDefinition stateModelDef, Set<String> liveInstanceSet,
+      Set<String> disabledInstancesForPartition) {
     Map<String, String> idealStateMap = new HashMap<String, String>();
 
     if (preferenceList == null) {
       return idealStateMap;
     }
 
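+    // Keep disabled-but-live instances in the mapping at the state model's initial state.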
+    for (String instance : preferenceList) {
+      if (disabledInstancesForPartition.contains(instance) && liveInstanceSet.contains(instance)) {
+        idealStateMap.put(instance, stateModelDef.getInitialState());
+      }
+    }
+
+    Set<String> liveAndEnabledInstances = new HashSet<>(liveInstanceSet);
+    liveAndEnabledInstances.removeAll(disabledInstancesForPartition);
+
     List<String> statesPriorityList = stateModelDef.getStatesPriorityList();
     Set<String> assigned = new HashSet<String>();
 
     for (String state : statesPriorityList) {
-      int stateCount = AbstractRebalancer.getStateCount(state, stateModelDef, liveAndEnabled.size(),
-          preferenceList.size());
+      int stateCount = AbstractRebalancer
+          .getStateCount(state, stateModelDef, liveAndEnabledInstances.size(), preferenceList.size());
       for (String instance : preferenceList) {
         if (stateCount <= 0) {
           break;
         }
-        if (!assigned.contains(instance)) {
+        if (!assigned.contains(instance) && liveAndEnabledInstances.contains(instance)) {
           idealStateMap.put(instance, state);
           assigned.add(instance);
           stateCount--;
diff --git a/helix-core/src/main/java/org/apache/helix/util/RebalanceUtil.java b/helix-core/src/main/java/org/apache/helix/util/RebalanceUtil.java
index 18163bd..050762d 100644
--- a/helix-core/src/main/java/org/apache/helix/util/RebalanceUtil.java
+++ b/helix-core/src/main/java/org/apache/helix/util/RebalanceUtil.java
@@ -143,6 +143,11 @@
   }
 
   public static void scheduleOnDemandPipeline(String clusterName, long delay) {
+    scheduleOnDemandPipeline(clusterName, delay, true);
+  }
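+
+  // Illustrative usage: trigger an immediate pipeline run without refreshing the cached cluster
+  // status: scheduleOnDemandPipeline(clusterName, 0L, false);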
+
+  public static void scheduleOnDemandPipeline(String clusterName, long delay,
+      boolean shouldRefreshCache) {
     if (clusterName == null) {
       LOG.error("Failed to issue a pipeline run. ClusterName is null.");
       return;
@@ -153,7 +158,7 @@
     }
     GenericHelixController controller = GenericHelixController.getController(clusterName);
     if (controller != null) {
-      controller.scheduleOnDemandRebalance(delay);
+      controller.scheduleOnDemandRebalance(delay, shouldRefreshCache);
     } else {
       LOG.error("Failed to issue a pipeline. Controller for cluster {} does not exist.",
           clusterName);
diff --git a/helix-core/src/main/resources/soft-constraint-weight.properties b/helix-core/src/main/resources/soft-constraint-weight.properties
new file mode 100644
index 0000000..c3c7931
--- /dev/null
+++ b/helix-core/src/main/resources/soft-constraint-weight.properties
@@ -0,0 +1,26 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+# Define the constraint weights for the WAGED rebalancer in this file.
+#
+# PartitionMovementConstraint=1f
+# InstancePartitionsCountConstraint=0.15f
+# ResourcePartitionAntiAffinityConstraint=0.05f
+# ResourceTopStateAntiAffinityConstraint=0.3f
+# MaxCapacityUsageInstanceConstraint=0.6f
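+#
+# The commented entries above illustrate the expected keys; uncomment and adjust a line to set
+# that constraint's weight (the listed values are presumed defaults).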
diff --git a/helix-core/src/test/java/org/apache/helix/common/ZkTestBase.java b/helix-core/src/test/java/org/apache/helix/common/ZkTestBase.java
index 50c36ee..61c2544 100644
--- a/helix-core/src/test/java/org/apache/helix/common/ZkTestBase.java
+++ b/helix-core/src/test/java/org/apache/helix/common/ZkTestBase.java
@@ -53,6 +53,7 @@
 import org.apache.helix.controller.pipeline.StageContext;
 import org.apache.helix.controller.rebalancer.DelayedAutoRebalancer;
 import org.apache.helix.controller.rebalancer.strategy.AutoRebalanceStrategy;
+import org.apache.helix.controller.rebalancer.waged.WagedRebalancer;
 import org.apache.helix.controller.stages.AttributeName;
 import org.apache.helix.controller.stages.ClusterEvent;
 import org.apache.helix.manager.zk.ZKHelixAdmin;
@@ -347,6 +348,19 @@
   protected IdealState createResourceWithDelayedRebalance(String clusterName, String db,
       String stateModel, int numPartition, int replica, int minActiveReplica, long delay,
       String rebalanceStrategy) {
+    return createResource(clusterName, db, stateModel, numPartition, replica, minActiveReplica,
+        delay, DelayedAutoRebalancer.class.getName(), rebalanceStrategy);
+  }
+
+  protected IdealState createResourceWithWagedRebalance(String clusterName, String db,
+      String stateModel, int numPartition, int replica, int minActiveReplica) {
+    return createResource(clusterName, db, stateModel, numPartition, replica, minActiveReplica,
+        -1, WagedRebalancer.class.getName(), null);
+  }
+
+  private IdealState createResource(String clusterName, String db, String stateModel,
+      int numPartition, int replica, int minActiveReplica, long delay, String rebalancerClassName,
+      String rebalanceStrategy) {
     IdealState idealState =
         _gSetupTool.getClusterManagementTool().getResourceIdealState(clusterName, db);
     if (idealState == null) {
@@ -362,7 +376,7 @@
     if (delay > 0) {
       idealState.setRebalanceDelay(delay);
     }
-    idealState.setRebalancerClassName(DelayedAutoRebalancer.class.getName());
+    idealState.setRebalancerClassName(rebalancerClassName);
     _gSetupTool.getClusterManagementTool().setResourceIdealState(clusterName, db, idealState);
     _gSetupTool.rebalanceStorageCluster(clusterName, db, replica);
     idealState = _gSetupTool.getClusterManagementTool().getResourceIdealState(clusterName, db);
diff --git a/helix-core/src/test/java/org/apache/helix/controller/changedetector/TestResourceChangeDetector.java b/helix-core/src/test/java/org/apache/helix/controller/changedetector/TestResourceChangeDetector.java
new file mode 100644
index 0000000..bac9842
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/changedetector/TestResourceChangeDetector.java
@@ -0,0 +1,441 @@
+package org.apache.helix.controller.changedetector;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+
+import org.apache.helix.AccessOption;
+import org.apache.helix.HelixConstants.ChangeType;
+import org.apache.helix.HelixDataAccessor;
+import org.apache.helix.PropertyKey;
+import org.apache.helix.TestHelper;
+import org.apache.helix.common.ZkTestBase;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.integration.manager.ClusterControllerManager;
+import org.apache.helix.integration.manager.MockParticipantManager;
+import org.apache.helix.manager.zk.ZKHelixDataAccessor;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.ResourceConfig;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+/**
+ * This test contains a series of unit tests for ResourceChangeDetector.
+ */
+public class TestResourceChangeDetector extends ZkTestBase {
+
+  // All possible change types for ResourceChangeDetector. Note that for ClusterConfig,
+  // the names of changed fields are not provided by the detector.
+  private static final ChangeType[] RESOURCE_CHANGE_TYPES = {
+      ChangeType.IDEAL_STATE, ChangeType.INSTANCE_CONFIG, ChangeType.LIVE_INSTANCE,
+      ChangeType.RESOURCE_CONFIG, ChangeType.CLUSTER_CONFIG
+  };
+
+  private static final String CLUSTER_NAME = TestHelper.getTestClassName();
+  private static final String RESOURCE_NAME = "TestDB";
+  private static final String NEW_RESOURCE_NAME = "TestDB2";
+  private static final String STATE_MODEL = "MasterSlave";
+  // There are 5 possible change types for ResourceChangeDetector
+  private static final int NUM_CHANGE_TYPES = 5;
+  private static final int NUM_RESOURCES = 1;
+  private static final int NUM_PARTITIONS = 10;
+  private static final int NUM_REPLICAS = 3;
+  private static final int NUM_NODES = 5;
+
+  // Create a mock of ResourceControllerDataProvider so that we can manipulate it
+  private ResourceControllerDataProvider _dataProvider;
+  private ResourceChangeDetector _resourceChangeDetector;
+  private ClusterControllerManager _controller;
+  private MockParticipantManager[] _participants = new MockParticipantManager[NUM_NODES];
+  private HelixDataAccessor _dataAccessor;
+  private PropertyKey.Builder _keyBuilder;
+
+  @BeforeClass
+  public void beforeClass() throws Exception {
+    super.beforeClass();
+
+    // Set up a mock cluster
+    TestHelper.setupCluster(CLUSTER_NAME, ZK_ADDR, 12918, // participant port
+        "localhost", // participant name prefix
+        RESOURCE_NAME, // resource name prefix
+        NUM_RESOURCES, // resources
+        NUM_PARTITIONS, // partitions per resource
+        NUM_NODES, // nodes
+        NUM_REPLICAS, // replicas
+        STATE_MODEL, true); // do rebalance
+
+    // Start a controller
+    _controller = new ClusterControllerManager(ZK_ADDR, CLUSTER_NAME, "controller_0");
+    _controller.syncStart();
+
+    // Start Participants
+    for (int i = 0; i < NUM_NODES; i++) {
+      String instanceName = "localhost_" + (12918 + i);
+      _participants[i] = new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, instanceName);
+      _participants[i].syncStart();
+    }
+
+    _dataAccessor = new ZKHelixDataAccessor(CLUSTER_NAME, _baseAccessor);
+    _keyBuilder = _dataAccessor.keyBuilder();
+    _resourceChangeDetector = new ResourceChangeDetector();
+
+    // Create a custom data provider
+    _dataProvider = new ResourceControllerDataProvider(CLUSTER_NAME);
+  }
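+
+  // Each test below follows the same detection flow (a sketch of the assumed usage pattern):
+  //   1. Mutate cluster metadata in ZK.
+  //   2. Call _dataProvider.notifyDataChange(<type>) and _dataProvider.refresh(_dataAccessor).
+  //   3. Call _resourceChangeDetector.updateSnapshots(_dataProvider).
+  //   4. Query getChangeTypes() / getAdditionsByType() / getChangesByType() / getRemovalsByType().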
+
+  @AfterClass
+  public void afterClass() throws Exception {
+    for (MockParticipantManager participant : _participants) {
+      if (participant != null && participant.isConnected()) {
+        participant.syncStop();
+      }
+    }
+    _controller.syncStop();
+    deleteCluster(CLUSTER_NAME);
+    Assert.assertFalse(TestHelper.verify(() -> _dataAccessor.getBaseDataAccessor()
+        .exists("/" + CLUSTER_NAME, AccessOption.PERSISTENT), 20000L));
+  }
+
+  /**
+   * Tests the initialization of the change detector. It should report changes for every change
+   * type and for all items of each type.
+   */
+  @Test
+  public void testResourceChangeDetectorInit() {
+    _dataProvider.refresh(_dataAccessor);
+    _resourceChangeDetector.updateSnapshots(_dataProvider);
+
+    Collection<ChangeType> changeTypes = _resourceChangeDetector.getChangeTypes();
+    Assert.assertEquals(changeTypes.size(), NUM_CHANGE_TYPES,
+        "Not all change types have been detected for ResourceChangeDetector!");
+
+    // Check that the right number of resources shows up as added
+    checkDetectionCounts(ChangeType.IDEAL_STATE, NUM_RESOURCES, 0, 0);
+
+    // Check that the right number of instances shows up as added
+    checkDetectionCounts(ChangeType.LIVE_INSTANCE, NUM_NODES, 0, 0);
+    checkDetectionCounts(ChangeType.INSTANCE_CONFIG, NUM_NODES, 0, 0);
+
+    // Check that the cluster config shows up as added
+    checkDetectionCounts(ChangeType.CLUSTER_CONFIG, 1, 0, 0);
+  }
+
+  /**
+   * Add a resource (IS and ResourceConfig) and see if the detector detects it.
+   */
+  @Test(dependsOnMethods = "testResourceChangeDetectorInit")
+  public void testAddResource() {
+    // Create an IS and ResourceConfig
+    _gSetupTool.getClusterManagementTool().addResource(CLUSTER_NAME, NEW_RESOURCE_NAME,
+        NUM_PARTITIONS, STATE_MODEL);
+    ResourceConfig resourceConfig = new ResourceConfig(NEW_RESOURCE_NAME);
+    _dataAccessor.setProperty(_keyBuilder.resourceConfig(NEW_RESOURCE_NAME), resourceConfig);
+    // Manually notify dataProvider
+    _dataProvider.notifyDataChange(ChangeType.IDEAL_STATE);
+    _dataProvider.notifyDataChange(ChangeType.RESOURCE_CONFIG);
+
+    // Refresh the data provider
+    _dataProvider.refresh(_dataAccessor);
+
+    // Update the detector
+    _resourceChangeDetector.updateSnapshots(_dataProvider);
+
+    checkChangeTypes(ChangeType.IDEAL_STATE, ChangeType.RESOURCE_CONFIG);
+    // Check the counts
+    for (ChangeType type : RESOURCE_CHANGE_TYPES) {
+      if (type == ChangeType.IDEAL_STATE || type == ChangeType.RESOURCE_CONFIG) {
+        checkDetectionCounts(type, 1, 0, 0);
+      } else {
+        checkDetectionCounts(type, 0, 0, 0);
+      }
+    }
+    // Check that detector gives the right item
+    Assert.assertTrue(_resourceChangeDetector.getAdditionsByType(ChangeType.RESOURCE_CONFIG)
+        .contains(NEW_RESOURCE_NAME));
+  }
+
+  /**
+   * Modify a resource config for the new resource and test that detector detects it.
+   */
+  @Test(dependsOnMethods = "testAddResource")
+  public void testModifyResource() {
+    // Modify resource config
+    ResourceConfig resourceConfig =
+        _dataAccessor.getProperty(_keyBuilder.resourceConfig(NEW_RESOURCE_NAME));
+    resourceConfig.getRecord().setSimpleField("Did I change?", "Yes!");
+    _dataAccessor.updateProperty(_keyBuilder.resourceConfig(NEW_RESOURCE_NAME), resourceConfig);
+
+    // Notify data provider and check
+    _dataProvider.notifyDataChange(ChangeType.RESOURCE_CONFIG);
+    _dataProvider.refresh(_dataAccessor);
+    _resourceChangeDetector.updateSnapshots(_dataProvider);
+
+    checkChangeTypes(ChangeType.RESOURCE_CONFIG);
+    // Check the counts
+    for (ChangeType type : RESOURCE_CHANGE_TYPES) {
+      if (type == ChangeType.RESOURCE_CONFIG) {
+        checkDetectionCounts(type, 0, 1, 0);
+      } else {
+        checkDetectionCounts(type, 0, 0, 0);
+      }
+    }
+    Assert.assertTrue(_resourceChangeDetector.getChangesByType(ChangeType.RESOURCE_CONFIG)
+        .contains(NEW_RESOURCE_NAME));
+  }
+
+  /**
+   * Delete the new resource and test that detector detects it.
+   */
+  @Test(dependsOnMethods = "testModifyResource")
+  public void testDeleteResource() {
+    // Delete the newly added resource
+    _dataAccessor.removeProperty(_keyBuilder.idealStates(NEW_RESOURCE_NAME));
+    _dataAccessor.removeProperty(_keyBuilder.resourceConfig(NEW_RESOURCE_NAME));
+
+    // Notify data provider and check
+    _dataProvider.notifyDataChange(ChangeType.IDEAL_STATE);
+    _dataProvider.notifyDataChange(ChangeType.RESOURCE_CONFIG);
+    _dataProvider.refresh(_dataAccessor);
+    _resourceChangeDetector.updateSnapshots(_dataProvider);
+
+    checkChangeTypes(ChangeType.RESOURCE_CONFIG, ChangeType.IDEAL_STATE);
+    // Check the counts
+    for (ChangeType type : RESOURCE_CHANGE_TYPES) {
+      if (type == ChangeType.IDEAL_STATE || type == ChangeType.RESOURCE_CONFIG) {
+        checkDetectionCounts(type, 0, 0, 1);
+      } else {
+        checkDetectionCounts(type, 0, 0, 0);
+      }
+    }
+  }
+
+  /**
+   * Disconnect and reconnect a Participant and see if detector detects.
+   */
+  @Test(dependsOnMethods = "testDeleteResource")
+  public void testDisconnectReconnectInstance() {
+    // Disconnect a Participant
+    _participants[0].syncStop();
+    _dataProvider.notifyDataChange(ChangeType.LIVE_INSTANCE);
+    _dataProvider.refresh(_dataAccessor);
+    _resourceChangeDetector.updateSnapshots(_dataProvider);
+
+    checkChangeTypes(ChangeType.LIVE_INSTANCE);
+    // Check the counts
+    for (ChangeType type : RESOURCE_CHANGE_TYPES) {
+      if (type == ChangeType.LIVE_INSTANCE) {
+        checkDetectionCounts(type, 0, 0, 1);
+      } else {
+        checkDetectionCounts(type, 0, 0, 0);
+      }
+    }
+
+    // Reconnect the Participant
+    _participants[0] = new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, "localhost_12918");
+    _participants[0].syncStart();
+    _dataProvider.notifyDataChange(ChangeType.LIVE_INSTANCE);
+    _dataProvider.refresh(_dataAccessor);
+    _resourceChangeDetector.updateSnapshots(_dataProvider);
+
+    checkChangeTypes(ChangeType.LIVE_INSTANCE);
+    // Check the counts
+    for (ChangeType type : RESOURCE_CHANGE_TYPES) {
+      if (type == ChangeType.LIVE_INSTANCE) {
+        checkDetectionCounts(type, 1, 0, 0);
+      } else {
+        checkDetectionCounts(type, 0, 0, 0);
+      }
+    }
+  }
+
+  /**
+   * Remove an instance completely and see if detector detects.
+   */
+  @Test(dependsOnMethods = "testDisconnectReconnectInstance")
+  public void testRemoveInstance() {
+    _participants[0].syncStop();
+    InstanceConfig instanceConfig =
+        _dataAccessor.getProperty(_keyBuilder.instanceConfig(_participants[0].getInstanceName()));
+    _gSetupTool.getClusterManagementTool().dropInstance(CLUSTER_NAME, instanceConfig);
+
+    _dataProvider.notifyDataChange(ChangeType.LIVE_INSTANCE);
+    _dataProvider.notifyDataChange(ChangeType.INSTANCE_CONFIG);
+    _dataProvider.refresh(_dataAccessor);
+    _resourceChangeDetector.updateSnapshots(_dataProvider);
+
+    checkChangeTypes(ChangeType.LIVE_INSTANCE, ChangeType.INSTANCE_CONFIG);
+    // Check the counts
+    for (ChangeType type : RESOURCE_CHANGE_TYPES) {
+      if (type == ChangeType.LIVE_INSTANCE || type == ChangeType.INSTANCE_CONFIG) {
+        checkDetectionCounts(type, 0, 0, 1);
+      } else {
+        checkDetectionCounts(type, 0, 0, 0);
+      }
+    }
+  }
+
+  /**
+   * Modify cluster config and see if detector detects.
+   */
+  @Test(dependsOnMethods = "testRemoveInstance")
+  public void testModifyClusterConfig() {
+    // Modify cluster config
+    ClusterConfig clusterConfig = _dataAccessor.getProperty(_keyBuilder.clusterConfig());
+    clusterConfig.setTopology("Change");
+    _dataAccessor.updateProperty(_keyBuilder.clusterConfig(), clusterConfig);
+
+    _dataProvider.notifyDataChange(ChangeType.CLUSTER_CONFIG);
+    _dataProvider.refresh(_dataAccessor);
+    _resourceChangeDetector.updateSnapshots(_dataProvider);
+
+    checkChangeTypes(ChangeType.CLUSTER_CONFIG);
+    // Check the counts
+    for (ChangeType type : RESOURCE_CHANGE_TYPES) {
+      if (type == ChangeType.CLUSTER_CONFIG) {
+        checkDetectionCounts(type, 0, 1, 0);
+      } else {
+        checkDetectionCounts(type, 0, 0, 0);
+      }
+    }
+  }
+
+  /**
+   * Test that change detector gives correct results when there are no changes after updating
+   * snapshots.
+   */
+  @Test(dependsOnMethods = "testModifyClusterConfig")
+  public void testNoChange() {
+    // Run twice to make sure the no-change result is stable across runs
+    for (int i = 0; i < 2; i++) {
+      _dataProvider.refresh(_dataAccessor);
+      _resourceChangeDetector.updateSnapshots(_dataProvider);
+
+      Assert.assertEquals(_resourceChangeDetector.getChangeTypes().size(), 0);
+      // Check the counts for all change types
+      for (ChangeType type : RESOURCE_CHANGE_TYPES) {
+        checkDetectionCounts(type, 0, 0, 0);
+      }
+    }
+  }
+
+  /**
+   * Modify IdealState mapping fields for a FULL_AUTO resource and see if detector detects.
+   */
+  @Test(dependsOnMethods = "testNoChange")
+  public void testIgnoreControllerGeneratedFields() {
+    // Modify the ClusterConfig and the IdealState so that the mapping fields of the IdealState
+    // are treated as fields modified by the Helix controller logic.
+    ClusterConfig clusterConfig = _dataAccessor.getProperty(_keyBuilder.clusterConfig());
+    clusterConfig.setPersistBestPossibleAssignment(true);
+    _dataAccessor.updateProperty(_keyBuilder.clusterConfig(), clusterConfig);
+
+    // Create a new IdealState
+    String resourceName = "Resource" + TestHelper.getTestMethodName();
+    _gSetupTool.getClusterManagementTool()
+        .addResource(CLUSTER_NAME, resourceName, NUM_PARTITIONS, STATE_MODEL);
+    IdealState idealState = _dataAccessor.getProperty(_keyBuilder.idealStates(resourceName));
+    idealState.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);
+    idealState.getRecord().getMapFields().put("Partition1", new HashMap<>());
+    _dataAccessor.updateProperty(_keyBuilder.idealStates(resourceName), idealState);
+    _dataProvider.notifyDataChange(ChangeType.CLUSTER_CONFIG);
+    _dataProvider.notifyDataChange(ChangeType.IDEAL_STATE);
+    _dataProvider.refresh(_dataAccessor);
+
+    // Test with ignore option to be true
+    ResourceChangeDetector changeDetector = new ResourceChangeDetector(true);
+    changeDetector.updateSnapshots(_dataProvider);
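+    // With the ignore option on, changes confined to mapping fields (which the controller itself
+    // writes back when PersistBestPossibleAssignment is enabled) should not be reported.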
+    // Now, modify the field
+    idealState.getRecord().getMapFields().put("Partition1", Collections.singletonMap("foo", "bar"));
+    _dataAccessor.updateProperty(_keyBuilder.idealStates(resourceName), idealState);
+    _dataProvider.notifyDataChange(ChangeType.IDEAL_STATE);
+    _dataProvider.refresh(_dataAccessor);
+    changeDetector.updateSnapshots(_dataProvider);
+    Assert.assertEquals(changeDetector.getChangeTypes(),
+        Collections.singleton(ChangeType.IDEAL_STATE));
+    Assert.assertEquals(
+        changeDetector.getAdditionsByType(ChangeType.IDEAL_STATE).size() + changeDetector
+            .getChangesByType(ChangeType.IDEAL_STATE).size() + changeDetector
+            .getRemovalsByType(ChangeType.IDEAL_STATE).size(), 0);
+  }
+
+  @Test(dependsOnMethods = "testIgnoreControllerGeneratedFields")
+  public void testResetSnapshots() {
+    // Initialize a new detector with the existing data
+    ResourceChangeDetector changeDetector = new ResourceChangeDetector();
+    _dataProvider.notifyDataChange(ChangeType.IDEAL_STATE);
+    _dataProvider.refresh(_dataAccessor);
+    changeDetector.updateSnapshots(_dataProvider);
+    Assert.assertEquals(
+        changeDetector.getAdditionsByType(ChangeType.IDEAL_STATE).size() + changeDetector
+            .getChangesByType(ChangeType.IDEAL_STATE).size() + changeDetector
+            .getRemovalsByType(ChangeType.IDEAL_STATE).size(), 2);
+
+    // Update the detector with the same data; since nothing has changed, the result will be empty.
+    _dataProvider.notifyDataChange(ChangeType.IDEAL_STATE);
+    _dataProvider.refresh(_dataAccessor);
+    changeDetector.updateSnapshots(_dataProvider);
+    Assert.assertEquals(
+        changeDetector.getAdditionsByType(ChangeType.IDEAL_STATE).size() + changeDetector
+            .getChangesByType(ChangeType.IDEAL_STATE).size() + changeDetector
+            .getRemovalsByType(ChangeType.IDEAL_STATE).size(), 0);
+
+    // Reset the snapshots
+    changeDetector.resetSnapshots();
+    // After reset, all the data in the data provider will be treated as new changes
+    _dataProvider.notifyDataChange(ChangeType.IDEAL_STATE);
+    _dataProvider.refresh(_dataAccessor);
+    changeDetector.updateSnapshots(_dataProvider);
+    Assert.assertEquals(
+        changeDetector.getAdditionsByType(ChangeType.IDEAL_STATE).size() + changeDetector
+            .getChangesByType(ChangeType.IDEAL_STATE).size() + changeDetector
+            .getRemovalsByType(ChangeType.IDEAL_STATE).size(), 2);
+  }
+
+  /**
+   * Check that the given change types appear in the detector's detected change types.
+   * @param types the change types expected to be present
+   */
+  private void checkChangeTypes(ChangeType... types) {
+    for (ChangeType type : types) {
+      Assert.assertTrue(_resourceChangeDetector.getChangeTypes().contains(type));
+    }
+  }
+
+  /**
+   * Convenience method for checking the three types of detections.
+   * @param changeType the change type to check
+   * @param additions the expected number of additions
+   * @param changes the expected number of changes
+   * @param deletions the expected number of deletions
+   */
+  private void checkDetectionCounts(ChangeType changeType, int additions, int changes,
+      int deletions) {
+    Assert.assertEquals(_resourceChangeDetector.getAdditionsByType(changeType).size(), additions);
+    Assert.assertEquals(_resourceChangeDetector.getChangesByType(changeType).size(), changes);
+    Assert.assertEquals(_resourceChangeDetector.getRemovalsByType(changeType).size(), deletions);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/util/TestResourceUsageCalculator.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/util/TestResourceUsageCalculator.java
new file mode 100644
index 0000000..ef1737f
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/util/TestResourceUsageCalculator.java
@@ -0,0 +1,103 @@
+package org.apache.helix.controller.rebalancer.util;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+import com.google.common.collect.ImmutableMap;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceAssignment;
+import org.apache.helix.util.TestInputLoader;
+import org.testng.Assert;
+import org.testng.annotations.DataProvider;
+import org.testng.annotations.Test;
+
+
+public class TestResourceUsageCalculator {
+  @Test(dataProvider = "TestMeasureBaselineDivergenceInput")
+  public void testMeasureBaselineDivergence(Map<String, Map<String, Map<String, String>>> baseline,
+      Map<String, Map<String, Map<String, String>>> someMatchBestPossible,
+      Map<String, Map<String, Map<String, String>>> noMatchBestPossible) {
+    Map<String, ResourceAssignment> baselineAssignment = buildResourceAssignment(baseline);
+    Map<String, ResourceAssignment> someMatchBestPossibleAssignment =
+        buildResourceAssignment(someMatchBestPossible);
+    Map<String, ResourceAssignment> noMatchBestPossibleAssignment =
+        buildResourceAssignment(noMatchBestPossible);
+
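+    // Presumed semantics of the divergence metric, inferred from the expected values below:
+    // divergence = 1 - (matching placements / total placements), so identical assignments give
+    // 0.0, completely different ones give 1.0, and an empty input is treated as fully divergent.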
+    // Empty best possible assignment.
+    Assert.assertEquals(ResourceUsageCalculator
+        .measureBaselineDivergence(baselineAssignment, Collections.emptyMap()), 1.0d);
+    // Empty baseline assignment.
+    Assert.assertEquals(ResourceUsageCalculator
+        .measureBaselineDivergence(Collections.emptyMap(), noMatchBestPossibleAssignment), 1.0d);
+
+    Assert.assertEquals(ResourceUsageCalculator
+        .measureBaselineDivergence(baselineAssignment, noMatchBestPossibleAssignment), 1.0d);
+    Assert.assertEquals(ResourceUsageCalculator
+            .measureBaselineDivergence(baselineAssignment, someMatchBestPossibleAssignment),
+        (1.0d - (double) 1 / (double) 3));
+    Assert.assertEquals(
+        ResourceUsageCalculator.measureBaselineDivergence(baselineAssignment, baselineAssignment),
+        0.0d);
+  }
+
+  @Test
+  public void testCalculateAveragePartitionWeight() {
+    Map<String, Map<String, Integer>> partitionCapacityMap = ImmutableMap.of(
+        "partition1", ImmutableMap.of("capacity1", 20, "capacity2", 40),
+        "partition2", ImmutableMap.of("capacity1", 30, "capacity2", 50),
+        "partition3", ImmutableMap.of("capacity1", 16, "capacity2", 30));
+
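+    // Expected per-key averages over the three partitions (integer arithmetic):
+    // capacity1 = (20 + 30 + 16) / 3 = 22, capacity2 = (40 + 50 + 30) / 3 = 40.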
+    Map<String, Integer> averageCapacityWeightMap =
+        ResourceUsageCalculator.calculateAveragePartitionWeight(partitionCapacityMap);
+    Map<String, Integer> expectedAverageWeightMap =
+        ImmutableMap.of("capacity1", 22, "capacity2", 40);
+
+    Assert.assertNotNull(averageCapacityWeightMap);
+    Assert.assertEquals(averageCapacityWeightMap, expectedAverageWeightMap);
+  }
+
+  private Map<String, ResourceAssignment> buildResourceAssignment(
+      Map<String, Map<String, Map<String, String>>> resourceMap) {
+    Map<String, ResourceAssignment> assignment = new HashMap<>();
+    for (Map.Entry<String, Map<String, Map<String, String>>> resourceEntry
+        : resourceMap.entrySet()) {
+      ResourceAssignment resource = new ResourceAssignment(resourceEntry.getKey());
+      Map<String, Map<String, String>> partitionMap = resourceEntry.getValue();
+      for (Map.Entry<String, Map<String, String>> partitionEntry : partitionMap.entrySet()) {
+        resource.addReplicaMap(new Partition(partitionEntry.getKey()), partitionEntry.getValue());
+      }
+
+      assignment.put(resourceEntry.getKey(), resource);
+    }
+
+    return assignment;
+  }
+
+  @DataProvider(name = "TestMeasureBaselineDivergenceInput")
+  private Object[][] loadTestMeasureBaselineDivergenceInput() {
+    final String[] params =
+        new String[]{"baseline", "someMatchBestPossible", "noMatchBestPossible"};
+    return TestInputLoader
+        .loadTestInputs("TestResourceUsageCalculator.MeasureBaselineDivergence.json", params);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/MockAssignmentMetadataStore.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/MockAssignmentMetadataStore.java
new file mode 100644
index 0000000..6e7f896
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/MockAssignmentMetadataStore.java
@@ -0,0 +1,60 @@
+package org.apache.helix.controller.rebalancer.waged;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collections;
+import java.util.Map;
+
+import org.apache.helix.BucketDataAccessor;
+import org.apache.helix.model.ResourceAssignment;
+import org.mockito.Mockito;
+
+/**
+ * A mock metadata store for unit tests.
+ * This mock datastore persists assignments in memory only.
+ */
+public class MockAssignmentMetadataStore extends AssignmentMetadataStore {
+  MockAssignmentMetadataStore() {
+    super(Mockito.mock(BucketDataAccessor.class), "");
+  }
+
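+  // The mocked BucketDataAccessor ensures no ZooKeeper access ever happens; the overrides below
+  // keep the baseline and the best possible assignment in the base class' in-memory fields.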
+  public Map<String, ResourceAssignment> getBaseline() {
+    return _globalBaseline == null ? Collections.emptyMap() : _globalBaseline;
+  }
+
+  public boolean persistBaseline(Map<String, ResourceAssignment> globalBaseline) {
+    _globalBaseline = globalBaseline;
+    return true;
+  }
+
+  public Map<String, ResourceAssignment> getBestPossibleAssignment() {
+    return _bestPossibleAssignment == null ? Collections.emptyMap() : _bestPossibleAssignment;
+  }
+
+  public boolean persistBestPossibleAssignment(
+      Map<String, ResourceAssignment> bestPossibleAssignment) {
+    _bestPossibleAssignment = bestPossibleAssignment;
+    return true;
+  }
+
+  public void close() {
+    // do nothing
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestAssignmentMetadataStore.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestAssignmentMetadataStore.java
new file mode 100644
index 0000000..3237420
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestAssignmentMetadataStore.java
@@ -0,0 +1,186 @@
+package org.apache.helix.controller.rebalancer.waged;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.helix.AccessOption;
+import org.apache.helix.HelixManager;
+import org.apache.helix.HelixManagerFactory;
+import org.apache.helix.InstanceType;
+import org.apache.helix.common.ZkTestBase;
+import org.apache.helix.integration.manager.ClusterControllerManager;
+import org.apache.helix.integration.manager.MockParticipantManager;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceAssignment;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+public class TestAssignmentMetadataStore extends ZkTestBase {
+  protected static final int NODE_NR = 5;
+  protected static final int START_PORT = 12918;
+  protected static final String STATE_MODEL = "MasterSlave";
+  protected static final String TEST_DB = "TestDB";
+  protected static final int _PARTITIONS = 20;
+
+  protected HelixManager _manager;
+  protected final String CLASS_NAME = getShortClassName();
+  protected final String CLUSTER_NAME = CLUSTER_PREFIX + "_" + CLASS_NAME;
+
+  protected MockParticipantManager[] _participants = new MockParticipantManager[NODE_NR];
+  protected ClusterControllerManager _controller;
+  protected int _replica = 3;
+
+  private AssignmentMetadataStore _store;
+
+  @BeforeClass
+  public void beforeClass()
+      throws Exception {
+    super.beforeClass();
+
+    // setup storage cluster
+    _gSetupTool.addCluster(CLUSTER_NAME, true);
+    _gSetupTool.addResourceToCluster(CLUSTER_NAME, TEST_DB, _PARTITIONS, STATE_MODEL);
+    for (int i = 0; i < NODE_NR; i++) {
+      String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      _gSetupTool.addInstanceToCluster(CLUSTER_NAME, storageNodeName);
+    }
+    _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, TEST_DB, _replica);
+
+    // start dummy participants
+    for (int i = 0; i < NODE_NR; i++) {
+      String instanceName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      _participants[i] = new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, instanceName);
+      _participants[i].syncStart();
+    }
+
+    // start controller
+    String controllerName = CONTROLLER_PREFIX + "_0";
+    _controller = new ClusterControllerManager(ZK_ADDR, CLUSTER_NAME, controllerName);
+    _controller.syncStart();
+
+    // create cluster manager
+    _manager = HelixManagerFactory
+        .getZKHelixManager(CLUSTER_NAME, "Admin", InstanceType.ADMINISTRATOR, ZK_ADDR);
+    _manager.connect();
+
+    // create AssignmentMetadataStore
+    _store = new AssignmentMetadataStore(_manager.getMetadataStoreConnectionString(),
+        _manager.getClusterName());
+  }
+
+  @AfterClass
+  public void afterClass() {
+    if (_store != null) {
+      _store.close();
+    }
+  }
+
+  /**
+   * TODO: Reading the baseline will return empty because AssignmentMetadataStore isn't being used
+   * yet by the new rebalancer. Modify this integration test once the WAGED rebalancer
+   * starts using AssignmentMetadataStore's persist APIs.
+   * TODO: The WAGED rebalancer currently does NOT work with ZKClusterVerifier because the
+   * verifier's HelixManager is null, which causes an NPE when instantiating AssignmentMetadataStore.
+   */
+  @Test
+  public void testReadEmptyBaseline() {
+    Map<String, ResourceAssignment> baseline = _store.getBaseline();
+    Assert.assertTrue(baseline.isEmpty());
+  }
+
+  /**
+   * Test that if the old assignment and new assignment are the same, no new version is written
+   * to the metadata store.
+   */
+  @Test(dependsOnMethods = "testReadEmptyBaseline")
+  public void testAvoidingRedundantWrite() {
+    String baselineKey = "BASELINE";
+    String bestPossibleKey = "BEST_POSSIBLE";
+
+    Map<String, ResourceAssignment> dummyAssignment = getDummyAssignment();
+
+    // Call persist functions
+    _store.persistBaseline(dummyAssignment);
+    _store.persistBestPossibleAssignment(dummyAssignment);
+
+    // Check that only one version exists
+    List<String> baselineVersions = getExistingVersionNumbers(baselineKey);
+    List<String> bestPossibleVersions = getExistingVersionNumbers(bestPossibleKey);
+    Assert.assertEquals(baselineVersions.size(), 1);
+    Assert.assertEquals(bestPossibleVersions.size(), 1);
+
+    // Call persist functions again
+    _store.persistBaseline(dummyAssignment);
+    _store.persistBestPossibleAssignment(dummyAssignment);
+
+    // Check that there is still only one version
+    baselineVersions = getExistingVersionNumbers(baselineKey);
+    bestPossibleVersions = getExistingVersionNumbers(bestPossibleKey);
+    Assert.assertEquals(baselineVersions.size(), 1);
+    Assert.assertEquals(bestPossibleVersions.size(), 1);
+  }
+
+  @Test
+  public void testAssignmentCache() {
+    Map<String, ResourceAssignment> dummyAssignment = getDummyAssignment();
+    // Call persist functions
+    _store.persistBaseline(dummyAssignment);
+    _store.persistBestPossibleAssignment(dummyAssignment);
+
+    Assert.assertEquals(_store._bestPossibleAssignment, dummyAssignment);
+    Assert.assertEquals(_store._globalBaseline, dummyAssignment);
+
+    _store.reset();
+
+    Assert.assertNull(_store._bestPossibleAssignment);
+    Assert.assertNull(_store._globalBaseline);
+  }
+
+  private Map<String, ResourceAssignment> getDummyAssignment() {
+    // Generate a dummy assignment
+    Map<String, ResourceAssignment> dummyAssignment = new HashMap<>();
+    ResourceAssignment assignment = new ResourceAssignment(TEST_DB);
+    Partition partition = new Partition(TEST_DB);
+    Map<String, String> replicaMap = new HashMap<>();
+    replicaMap.put(TEST_DB, TEST_DB);
+    assignment.addReplicaMap(partition, replicaMap);
+    dummyAssignment.put(TEST_DB, assignment);
+    return dummyAssignment;
+  }
+
+  /**
+   * Returns a list of existing version numbers only.
+   * @param metadataType BASELINE or BEST_POSSIBLE
+   * @return the names of the existing version nodes
+   */
+  private List<String> getExistingVersionNumbers(String metadataType) {
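+    // Assumed ZK layout: /<cluster>/ASSIGNMENT_METADATA/<metadataType> holds one child node per
+    // version, plus the LAST_WRITE and LAST_SUCCESSFUL_WRITE pointer nodes filtered out below.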
+    List<String> children = _baseAccessor
+        .getChildNames("/" + CLUSTER_NAME + "/ASSIGNMENT_METADATA/" + metadataType,
+            AccessOption.PERSISTENT);
+    children.remove("LAST_SUCCESSFUL_WRITE");
+    children.remove("LAST_WRITE");
+    return children;
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestWagedRebalancer.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestWagedRebalancer.java
new file mode 100644
index 0000000..8efa66b
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestWagedRebalancer.java
@@ -0,0 +1,524 @@
+package org.apache.helix.controller.rebalancer.waged;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import org.apache.helix.HelixConstants;
+import org.apache.helix.HelixRebalanceException;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.controller.rebalancer.strategy.CrushRebalanceStrategy;
+import org.apache.helix.controller.rebalancer.waged.constraints.MockRebalanceAlgorithm;
+import org.apache.helix.controller.rebalancer.waged.model.AbstractTestClusterModel;
+import org.apache.helix.controller.stages.CurrentStateOutput;
+import org.apache.helix.model.CurrentState;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.LiveInstance;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.Resource;
+import org.apache.helix.model.ResourceAssignment;
+import org.apache.helix.model.ResourceConfig;
+import org.apache.helix.monitoring.metrics.WagedRebalancerMetricCollector;
+import org.apache.helix.monitoring.metrics.model.CountMetric;
+import org.mockito.Mockito;
+import org.mockito.stubbing.Answer;
+import org.testng.Assert;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+import static org.mockito.Matchers.any;
+import static org.mockito.Matchers.anyString;
+import static org.mockito.Mockito.when;
+
+public class TestWagedRebalancer extends AbstractTestClusterModel {
+  private Set<String> _instances;
+  private MockRebalanceAlgorithm _algorithm;
+  private MockAssignmentMetadataStore _metadataStore;
+
+  @BeforeClass
+  public void initialize() {
+    super.initialize();
+    _instances = new HashSet<>();
+    _instances.add(_testInstanceId);
+    _algorithm = new MockRebalanceAlgorithm();
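+    // MockRebalanceAlgorithm is assumed to expose its last calculated assignment via
+    // getRebalanceResult(); the tests below use that as the expected rebalancer output.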
+
+    // Initialize a mock assignment metadata store
+    _metadataStore = new MockAssignmentMetadataStore();
+  }
+
+  @Override
+  protected ResourceControllerDataProvider setupClusterDataCache() throws IOException {
+    ResourceControllerDataProvider testCache = super.setupClusterDataCache();
+
+    // Set up mock IdealStates
+    Map<String, IdealState> isMap = new HashMap<>();
+    for (String resource : _resourceNames) {
+      IdealState is = new IdealState(resource);
+      is.setNumPartitions(_partitionNames.size());
+      is.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);
+      is.setStateModelDefRef("MasterSlave");
+      is.setReplicas("3");
+      is.setRebalancerClassName(WagedRebalancer.class.getName());
+      _partitionNames.stream()
+          .forEach(partition -> is.setPreferenceList(partition, Collections.emptyList()));
+      isMap.put(resource, is);
+    }
+    when(testCache.getIdealState(anyString())).thenAnswer(
+        (Answer<IdealState>) invocationOnMock -> isMap.get(invocationOnMock.getArguments()[0]));
+    when(testCache.getIdealStates()).thenReturn(isMap);
+
+    // Set up 2 more instances
+    for (int i = 1; i < 3; i++) {
+      String instanceName = _testInstanceId + i;
+      _instances.add(instanceName);
+      // 1. Set up the default instance information with capacity configuration.
+      InstanceConfig testInstanceConfig = createMockInstanceConfig(instanceName);
+      Map<String, InstanceConfig> instanceConfigMap = testCache.getInstanceConfigMap();
+      instanceConfigMap.put(instanceName, testInstanceConfig);
+      when(testCache.getInstanceConfigMap()).thenReturn(instanceConfigMap);
+      // 2. Mock the live instance node for the default instance.
+      LiveInstance testLiveInstance = createMockLiveInstance(instanceName);
+      Map<String, LiveInstance> liveInstanceMap = testCache.getLiveInstances();
+      liveInstanceMap.put(instanceName, testLiveInstance);
+      when(testCache.getLiveInstances()).thenReturn(liveInstanceMap);
+      when(testCache.getEnabledInstances()).thenReturn(liveInstanceMap.keySet());
+      when(testCache.getEnabledLiveInstances()).thenReturn(liveInstanceMap.keySet());
+      when(testCache.getAllInstances()).thenReturn(_instances);
+    }
+
+    return testCache;
+  }
+
+  @Test
+  public void testRebalance() throws IOException, HelixRebalanceException {
+    _metadataStore.reset();
+    WagedRebalancer rebalancer = new WagedRebalancer(_metadataStore, _algorithm, Optional.empty());
+
+    // Generate the input for the rebalancer.
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
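+    // Convert each mocked IdealState into a Resource carrying the same partitions; this map is
+    // the rebalancer's input and is rebuilt the same way throughout the tests below.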
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().entrySet().stream()
+        .collect(Collectors.toMap(entry -> entry.getKey(), entry -> {
+          Resource resource = new Resource(entry.getKey());
+          entry.getValue().getPartitionSet().stream()
+              .forEach(partition -> resource.addPartition(partition));
+          return resource;
+        }));
+    // Mocking the change types for triggering a baseline rebalance.
+    when(clusterData.getRefreshedChangeTypes())
+        .thenReturn(Collections.singleton(HelixConstants.ChangeType.CLUSTER_CONFIG));
+
+    Map<String, IdealState> newIdealStates =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+    Map<String, ResourceAssignment> algorithmResult = _algorithm.getRebalanceResult();
+    // Since there is no special condition, the calculated IdealStates should be exactly the same
+    // as the mock algorithm result.
+    validateRebalanceResult(resourceMap, newIdealStates, algorithmResult);
+  }
+
+  @Test(dependsOnMethods = "testRebalance")
+  public void testPartialRebalance() throws IOException, HelixRebalanceException {
+    _metadataStore.reset();
+    WagedRebalancer rebalancer = new WagedRebalancer(_metadataStore, _algorithm, Optional.empty());
+
+    // Generate the input for the rebalancer.
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().entrySet().stream()
+        .collect(Collectors.toMap(entry -> entry.getKey(), entry -> {
+          Resource resource = new Resource(entry.getKey());
+          entry.getValue().getPartitionSet().stream()
+              .forEach(partition -> resource.addPartition(partition));
+          return resource;
+        }));
+    // Mocking the change types for triggering a baseline rebalance.
+    when(clusterData.getRefreshedChangeTypes())
+        .thenReturn(Collections.singleton(HelixConstants.ChangeType.CLUSTER_CONFIG));
+
+    // Test with partial resources listed in the resourceMap input.
+    // Remove the first resource from the input. Note it still exists in the cluster data cache.
+    _metadataStore.reset();
+    resourceMap.remove(_resourceNames.get(0));
+    Map<String, IdealState> newIdealStates =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+    Map<String, ResourceAssignment> algorithmResult = _algorithm.getRebalanceResult();
+    validateRebalanceResult(resourceMap, newIdealStates, algorithmResult);
+  }
+
+  @Test(dependsOnMethods = "testRebalance")
+  public void testRebalanceWithCurrentState() throws IOException, HelixRebalanceException {
+    _metadataStore.reset();
+    WagedRebalancer rebalancer = new WagedRebalancer(_metadataStore, _algorithm, Optional.empty());
+
+    // Generate the input for the rebalancer.
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().entrySet().stream()
+        .collect(Collectors.toMap(entry -> entry.getKey(), entry -> {
+          Resource resource = new Resource(entry.getKey());
+          entry.getValue().getPartitionSet().stream()
+              .forEach(partition -> resource.addPartition(partition));
+          return resource;
+        }));
+    // Mocking the change types for triggering a baseline rebalance.
+    when(clusterData.getRefreshedChangeTypes())
+        .thenReturn(Collections.singleton(HelixConstants.ChangeType.CLUSTER_CONFIG));
+
+    // Test with existing current states, so the rebalancer should calculate the intermediate state.
+    // Create current state based on the cluster data cache.
+    CurrentStateOutput currentStateOutput = new CurrentStateOutput();
+    for (String instanceName : _instances) {
+      for (Map.Entry<String, CurrentState> csEntry : clusterData
+          .getCurrentState(instanceName, _sessionId).entrySet()) {
+        String resourceName = csEntry.getKey();
+        CurrentState cs = csEntry.getValue();
+        for (Map.Entry<String, String> partitionStateEntry : cs.getPartitionStateMap().entrySet()) {
+          currentStateOutput.setCurrentState(resourceName,
+              new Partition(partitionStateEntry.getKey()), instanceName,
+              partitionStateEntry.getValue());
+        }
+      }
+    }
+
+    // The state calculation will be adjusted based on the current state.
+    // So test the following cases:
+    // 1.1. Disable a resource, and the partitions in CS will be offline.
+    String disabledResourceName = _resourceNames.get(0);
+    clusterData.getIdealState(disabledResourceName).enable(false);
+    // 1.2. Add an unknown partition to the CS so that it will be dropped.
+    String droppingResourceName = _resourceNames.get(1);
+    String droppingPartitionName = "UnknownPartition";
+    String droppingFromInstance = _testInstanceId;
+    currentStateOutput.setCurrentState(droppingResourceName, new Partition(droppingPartitionName),
+        droppingFromInstance, "SLAVE");
+    resourceMap.get(droppingResourceName).addPartition(droppingPartitionName);
+
+    Map<String, IdealState> newIdealStates =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, currentStateOutput);
+    // All the replica states should be OFFLINE.
+    IdealState disabledIdealState = newIdealStates.get(disabledResourceName);
+    for (String partition : disabledIdealState.getPartitionSet()) {
+      Assert.assertTrue(disabledIdealState.getInstanceStateMap(partition).values().stream()
+          .allMatch(state -> state.equals("OFFLINE")));
+    }
+    // The unknown partition should be marked as DROPPED.
+    IdealState droppedIdealState = newIdealStates.get(droppingResourceName);
+    Assert.assertEquals(
+        droppedIdealState.getInstanceStateMap(droppingPartitionName).get(droppingFromInstance),
+        "DROPPED");
+  }
+
+  @Test(dependsOnMethods = "testRebalance", expectedExceptions = HelixRebalanceException.class, expectedExceptionsMessageRegExp = "Input contains invalid resource\\(s\\) that cannot be rebalanced by the WAGED rebalancer. \\[Resource1\\] Failure Type: INVALID_INPUT")
+  public void testNonCompatibleConfiguration()
+      throws IOException, HelixRebalanceException {
+    _metadataStore.reset();
+    WagedRebalancer rebalancer = new WagedRebalancer(_metadataStore, _algorithm, Optional.empty());
+
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
+    String nonCompatibleResourceName = _resourceNames.get(0);
+    clusterData.getIdealState(nonCompatibleResourceName)
+        .setRebalancerClassName(CrushRebalanceStrategy.class.getName());
+    // The input resource Map shall contain all the valid resources.
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().entrySet().stream()
+        .collect(Collectors.toMap(entry -> entry.getKey(), entry -> {
+          Resource resource = new Resource(entry.getKey());
+          entry.getValue().getPartitionSet().stream()
+              .forEach(partition -> resource.addPartition(partition));
+          return resource;
+        }));
+    rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+  }
+
+  // TODO: test with an invalid capacity configuration, which will fail the cluster model construction.
+  @Test(dependsOnMethods = "testRebalance")
+  public void testInvalidClusterStatus() throws IOException, HelixRebalanceException {
+    _metadataStore.reset();
+    WagedRebalancer rebalancer = new WagedRebalancer(_metadataStore, _algorithm, Optional.empty());
+
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
+    String invalidResource = _resourceNames.get(0);
+    // The state model does not exist
+    clusterData.getIdealState(invalidResource).setStateModelDefRef("foobar");
+    // The input resource Map shall contain all the valid resources.
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().keySet().stream().collect(
+        Collectors.toMap(resourceName -> resourceName, resourceName -> new Resource(resourceName)));
+    try {
+      rebalancer.computeBestPossibleAssignment(clusterData, resourceMap,
+          clusterData.getEnabledLiveInstances(), new CurrentStateOutput(), _algorithm);
+      Assert.fail("Rebalance shall fail.");
+    } catch (HelixRebalanceException ex) {
+      Assert.assertEquals(ex.getFailureType(), HelixRebalanceException.Type.INVALID_CLUSTER_STATUS);
+      Assert.assertEquals(ex.getMessage(),
+          "Failed to generate cluster model for partial rebalance. Failure Type: INVALID_CLUSTER_STATUS");
+    }
+
+    // The rebalance will be done with an empty mapping result since there is no previously
+    // calculated assignment.
+    Assert.assertTrue(
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput())
+            .isEmpty());
+  }
+
+  @Test(dependsOnMethods = "testRebalance")
+  public void testInvalidRebalancerStatus() throws IOException {
+    // Mock a metadata store that will fail on all the calls.
+    AssignmentMetadataStore metadataStore = Mockito.mock(AssignmentMetadataStore.class);
+    when(metadataStore.getBaseline())
+        .thenThrow(new RuntimeException("Mock Error. Metadata store fails."));
+    WagedRebalancer rebalancer = new WagedRebalancer(metadataStore, _algorithm, Optional.empty());
+
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
+    // The input resource Map shall contain all the valid resources.
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().keySet().stream().collect(
+        Collectors.toMap(resourceName -> resourceName, resourceName -> new Resource(resourceName)));
+    try {
+      rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+      Assert.fail("Rebalance shall fail.");
+    } catch (HelixRebalanceException ex) {
+      Assert.assertEquals(ex.getFailureType(),
+          HelixRebalanceException.Type.INVALID_REBALANCER_STATUS);
+      Assert.assertEquals(ex.getMessage(),
+          "Failed to get the current baseline assignment because of unexpected error. Failure Type: INVALID_REBALANCER_STATUS");
+    }
+  }
+
+  @Test(dependsOnMethods = "testRebalance")
+  public void testAlgorithmException()
+      throws IOException, HelixRebalanceException {
+    _metadataStore.reset();
+    WagedRebalancer rebalancer = new WagedRebalancer(_metadataStore, _algorithm, Optional.empty());
+
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().entrySet().stream()
+        .collect(Collectors.toMap(entry -> entry.getKey(), entry -> {
+          Resource resource = new Resource(entry.getKey());
+          entry.getValue().getPartitionSet().stream()
+              .forEach(partition -> resource.addPartition(partition));
+          return resource;
+        }));
+    // Rebalance with a normal configuration so that the assignment is persisted in the metadata store.
+    Map<String, IdealState> result =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+
+    // Recreate a rebalancer with the same metadata store but a bad algorithm instance.
+    RebalanceAlgorithm badAlgorithm = Mockito.mock(RebalanceAlgorithm.class);
+    when(badAlgorithm.calculate(any())).thenThrow(new HelixRebalanceException("Algorithm fails.",
+        HelixRebalanceException.Type.FAILED_TO_CALCULATE));
+    rebalancer = new WagedRebalancer(_metadataStore, badAlgorithm, Optional.empty());
+
+    // Calculation will fail
+    try {
+      rebalancer.computeBestPossibleAssignment(clusterData, resourceMap,
+          clusterData.getEnabledLiveInstances(), new CurrentStateOutput(), badAlgorithm);
+      Assert.fail("Rebalance shall fail.");
+    } catch (HelixRebalanceException ex) {
+      Assert.assertEquals(ex.getFailureType(), HelixRebalanceException.Type.FAILED_TO_CALCULATE);
+      Assert.assertEquals(ex.getMessage(), "Algorithm fails. Failure Type: FAILED_TO_CALCULATE");
+    }
+    // But if called through the public method computeNewIdealStates(), the rebalancer will
+    // return the previous rebalance result.
+    Map<String, IdealState> newResult =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+    Assert.assertEquals(newResult, result);
+    // Ensure failure has been recorded
+    Assert.assertEquals(rebalancer.getMetricCollector().getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.RebalanceFailureCounter.name(),
+        CountMetric.class).getValue().longValue(), 1L);
+  }
+
+  @Test(dependsOnMethods = "testRebalance")
+  public void testRebalanceOnChanges() throws IOException, HelixRebalanceException {
+    // Test continuous rebalances with the same rebalancer as its internal state evolves. Ensure
+    // that the rebalancer handles different inputs (different cluster changes) correctly based
+    // on that internal state.
+
+    // Note that this test relies on the MockRebalanceAlgorithm implementation. The mock algorithm
+    // won't propagate any existing assignment from the cluster model.
+    _metadataStore.reset();
+    WagedRebalancer rebalancer = new WagedRebalancer(_metadataStore, _algorithm, Optional.empty());
+
+    // 1. rebalance with baseline calculation done
+    // Generate the input for the rebalancer.
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
+    // Cluster config change will trigger baseline to be recalculated.
+    when(clusterData.getRefreshedChangeTypes())
+        .thenReturn(Collections.singleton(HelixConstants.ChangeType.CLUSTER_CONFIG));
+    // Update the config so the cluster config will be marked as changed.
+    clusterData.getClusterConfig().getRecord().setSimpleField("foo", "bar");
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().entrySet().stream()
+        .collect(Collectors.toMap(entry -> entry.getKey(), entry -> {
+          Resource resource = new Resource(entry.getKey());
+          entry.getValue().getPartitionSet().stream()
+              .forEach(partition -> resource.addPartition(partition));
+          return resource;
+        }));
+    Map<String, IdealState> newIdealStates =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+    Map<String, ResourceAssignment> algorithmResult = _algorithm.getRebalanceResult();
+    // Since there is no special condition, the calculated IdealStates should be exactly the same
+    // as the mock algorithm result.
+    validateRebalanceResult(resourceMap, newIdealStates, algorithmResult);
+    Map<String, ResourceAssignment> baseline = _metadataStore.getBaseline();
+    Assert.assertEquals(algorithmResult, baseline);
+    Map<String, ResourceAssignment> bestPossibleAssignment =
+        _metadataStore.getBestPossibleAssignment();
+    Assert.assertEquals(algorithmResult, bestPossibleAssignment);
+
+    // 2. rebalance with one resource changed in the Resource Config znode only
+    String changedResourceName = _resourceNames.get(0);
+    when(clusterData.getRefreshedChangeTypes())
+        .thenReturn(Collections.singleton(HelixConstants.ChangeType.RESOURCE_CONFIG));
+    ResourceConfig config = new ResourceConfig(clusterData.getResourceConfig(changedResourceName).getRecord());
+    // Update the config so the resource will be marked as changed.
+    config.putSimpleConfig("foo", "bar");
+    when(clusterData.getResourceConfig(changedResourceName)).thenReturn(config);
+    clusterData.getResourceConfigMap().put(changedResourceName, config);
+
+    // Although the input contains 2 resources, the rebalancer shall only call the algorithm to
+    // rebalance the changed one.
+    newIdealStates =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+    Map<String, ResourceAssignment> partialAlgorithmResult = _algorithm.getRebalanceResult();
+
+    // Verify that only the changed resource has been included in the calculation.
+    validateRebalanceResult(
+        Collections.singletonMap(changedResourceName, new Resource(changedResourceName)),
+        newIdealStates, partialAlgorithmResult);
+    // The baseline contains the new assignment of only one resource.
+    baseline = _metadataStore.getBaseline();
+    Assert.assertEquals(baseline, partialAlgorithmResult);
+    // Best possible assignment contains the new assignment of only one resource.
+    bestPossibleAssignment = _metadataStore.getBestPossibleAssignment();
+    Assert.assertEquals(bestPossibleAssignment, partialAlgorithmResult);
+
+    // * Before the next step, restore the baseline and best possible assignment records.
+    _metadataStore.persistBestPossibleAssignment(algorithmResult);
+    _metadataStore.persistBaseline(algorithmResult);
+
+    // 3. rebalance with current state change only
+    // Create a new cluster data cache to simulate cluster change
+    clusterData = setupClusterDataCache();
+    when(clusterData.getRefreshedChangeTypes())
+        .thenReturn(Collections.singleton(HelixConstants.ChangeType.CURRENT_STATE));
+    // Modify any current state
+    CurrentState cs =
+        clusterData.getCurrentState(_testInstanceId, _sessionId).get(_resourceNames.get(0));
+    // Update the info field so the current state will be marked as changed.
+    cs.setInfo(_partitionNames.get(0), "mock update");
+
+    // Although the input contains 2 resources, the rebalancer shall not try to recalculate
+    // assignment since there is only current state change.
+    newIdealStates =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+    Map<String, ResourceAssignment> newAlgorithmResult = _algorithm.getRebalanceResult();
+
+    // Verify that no resource assignment has been recalculated.
+    validateRebalanceResult(Collections.emptyMap(), newIdealStates, newAlgorithmResult);
+    // There should be no changes in the baseline since only the currentStates changed
+    baseline = _metadataStore.getBaseline();
+    Assert.assertEquals(baseline, algorithmResult);
+    // The best possible assignment should have been updated since computeNewIdealStates() was called.
+    bestPossibleAssignment = _metadataStore.getBestPossibleAssignment();
+    Assert.assertEquals(bestPossibleAssignment, newAlgorithmResult);
+
+    // 4. rebalance with no change but best possible state record missing.
+    // This usually happens when the persisted assignment state is gone.
+    clusterData = setupClusterDataCache(); // Note this mock data cache won't report any change.
+    // Even with no change, since the previous assignment is empty, the rebalancer will still
+    // calculate the assignment for both resources.
+    newIdealStates =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+    newAlgorithmResult = _algorithm.getRebalanceResult();
+    // Verify that both resources have been included in the calculation.
+    validateRebalanceResult(resourceMap, newIdealStates, newAlgorithmResult);
+    // There should not be any changes in the baseline.
+    baseline = _metadataStore.getBaseline();
+    Assert.assertEquals(baseline, algorithmResult);
+    // The best possible assignment should have been updated, since computeNewIdealStates() was
+    // called.
+    bestPossibleAssignment = _metadataStore.getBestPossibleAssignment();
+    Assert.assertEquals(bestPossibleAssignment, newAlgorithmResult);
+  }
+
+  @Test(dependsOnMethods = "testRebalance")
+  public void testReset() throws IOException, HelixRebalanceException {
+    _metadataStore.reset();
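+    // No metric collector is attached here (Optional.empty()); metric emission is verified
+    // separately in TestWagedRebalancerMetrics.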
+    WagedRebalancer rebalancer = new WagedRebalancer(_metadataStore, _algorithm, Optional.empty());
+    // Generate the input for the rebalancer.
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().entrySet().stream()
+        .collect(Collectors.toMap(Map.Entry::getKey, entry -> {
+          Resource resource = new Resource(entry.getKey());
+          entry.getValue().getPartitionSet().forEach(resource::addPartition);
+          return resource;
+        }));
+    // Mocking the change types for triggering a baseline rebalance.
+    when(clusterData.getRefreshedChangeTypes())
+        .thenReturn(Collections.singleton(HelixConstants.ChangeType.CLUSTER_CONFIG));
+    Map<String, IdealState> newIdealStates =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+    Map<String, ResourceAssignment> algorithmResult = _algorithm.getRebalanceResult();
+    validateRebalanceResult(resourceMap, newIdealStates, algorithmResult);
+
+    // Clean up algorithm result for the next test step
+    algorithmResult.clear();
+    // Trigger the rebalance again. Since nothing has changed, there will be no new calculation.
+    when(clusterData.getRefreshedChangeTypes())
+        .thenReturn(Collections.singleton(HelixConstants.ChangeType.CLUSTER_CONFIG));
+    Assert.assertEquals(
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput()),
+        newIdealStates);
+    algorithmResult = _algorithm.getRebalanceResult();
+    Assert.assertEquals(algorithmResult, Collections.emptyMap());
+
+    // Reset the rebalancer and repeat the same operation. Without any cached info, the
+    // rebalancer will perform a complete rebalance.
+    rebalancer.reset();
+    algorithmResult.clear();
+    // Trigger the rebalance again. Although nothing has changed, the reset cleared the cached
+    // state, so a complete recalculation is expected.
+    when(clusterData.getRefreshedChangeTypes())
+        .thenReturn(Collections.singleton(HelixConstants.ChangeType.CLUSTER_CONFIG));
+    newIdealStates =
+        rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+    algorithmResult = _algorithm.getRebalanceResult();
+    validateRebalanceResult(resourceMap, newIdealStates, algorithmResult);
+  }
+
+  private void validateRebalanceResult(Map<String, Resource> resourceMap,
+      Map<String, IdealState> newIdealStates, Map<String, ResourceAssignment> expectedResult) {
+    Assert.assertEquals(newIdealStates.keySet(), resourceMap.keySet());
+    for (String resourceName : expectedResult.keySet()) {
+      Assert.assertTrue(newIdealStates.containsKey(resourceName));
+      IdealState is = newIdealStates.get(resourceName);
+      ResourceAssignment assignment = expectedResult.get(resourceName);
+      Assert.assertEquals(is.getPartitionSet(), assignment.getMappedPartitions().stream()
+          .map(Partition::getPartitionName).collect(Collectors.toSet()));
+      for (String partitionName : is.getPartitionSet()) {
+        Assert.assertEquals(is.getInstanceStateMap(partitionName),
+            assignment.getReplicaMap(new Partition(partitionName)));
+      }
+    }
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestWagedRebalancerMetrics.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestWagedRebalancerMetrics.java
new file mode 100644
index 0000000..a38753a
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestWagedRebalancerMetrics.java
@@ -0,0 +1,190 @@
+package org.apache.helix.controller.rebalancer.waged;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.concurrent.Executors;
+import java.util.stream.Collectors;
+import javax.management.JMException;
+
+import org.apache.helix.HelixConstants;
+import org.apache.helix.HelixRebalanceException;
+import org.apache.helix.TestHelper;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.controller.rebalancer.waged.constraints.MockRebalanceAlgorithm;
+import org.apache.helix.controller.rebalancer.waged.model.AbstractTestClusterModel;
+import org.apache.helix.controller.stages.CurrentStateOutput;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.LiveInstance;
+import org.apache.helix.model.Resource;
+import org.apache.helix.monitoring.metrics.MetricCollector;
+import org.apache.helix.monitoring.metrics.WagedRebalancerMetricCollector;
+import org.apache.helix.monitoring.metrics.model.CountMetric;
+import org.apache.helix.monitoring.metrics.model.RatioMetric;
+import org.mockito.stubbing.Answer;
+import org.testng.Assert;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+import static org.mockito.Matchers.anyString;
+import static org.mockito.Mockito.when;
+
+public class TestWagedRebalancerMetrics extends AbstractTestClusterModel {
+  private static final String TEST_STRING = "TEST";
+  private MetricCollector _metricCollector;
+  private Set<String> _instances;
+  private MockRebalanceAlgorithm _algorithm;
+  private MockAssignmentMetadataStore _metadataStore;
+
+  @BeforeClass
+  public void initialize() {
+    super.initialize();
+    _instances = new HashSet<>();
+    _instances.add(_testInstanceId);
+    _algorithm = new MockRebalanceAlgorithm();
+
+    // Initialize a mock assignment metadata store
+    _metadataStore = new MockAssignmentMetadataStore();
+  }
+
+  @Test
+  public void testMetricValuePropagation()
+      throws JMException, HelixRebalanceException, IOException {
+    _metadataStore.reset();
+    _metricCollector = new WagedRebalancerMetricCollector(TEST_STRING);
+    WagedRebalancer rebalancer =
+        new WagedRebalancer(_metadataStore, _algorithm, Optional.of(_metricCollector));
+
+    // Generate the input for the rebalancer.
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().entrySet().stream()
+        .collect(Collectors.toMap(Map.Entry::getKey, entry -> {
+          Resource resource = new Resource(entry.getKey());
+          entry.getValue().getPartitionSet().forEach(resource::addPartition);
+          return resource;
+        }));
+    rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+
+    // Check that there exists a non-zero value in the metrics
+    Assert.assertTrue(_metricCollector.getMetricMap().values().stream()
+        .anyMatch(metric -> (long) metric.getLastEmittedMetricValue() > 0L));
+  }
+
+  @Test
+  public void testWagedRebalanceMetrics()
+      throws Exception {
+    _metadataStore.reset();
+    MetricCollector metricCollector = new WagedRebalancerMetricCollector(TEST_STRING);
+    WagedRebalancer rebalancer =
+        new WagedRebalancer(_metadataStore, _algorithm, Optional.of(metricCollector));
+    // Generate the input for the rebalancer.
+    ResourceControllerDataProvider clusterData = setupClusterDataCache();
+    Map<String, Resource> resourceMap = clusterData.getIdealStates().entrySet().stream()
+        .collect(Collectors.toMap(Map.Entry::getKey, entry -> {
+          Resource resource = new Resource(entry.getKey());
+          entry.getValue().getPartitionSet().forEach(resource::addPartition);
+          return resource;
+        }));
+
+    Assert.assertEquals((long) metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.GlobalBaselineCalcCounter.name(),
+        CountMetric.class).getLastEmittedMetricValue(), 0L);
+    Assert.assertEquals((long) metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.PartialRebalanceCounter.name(),
+        CountMetric.class).getLastEmittedMetricValue(), 0L);
+    Assert.assertEquals((double) metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.BaselineDivergenceGauge.name(),
+        RatioMetric.class).getLastEmittedMetricValue(), 0.0d);
+
+    // Cluster config change will trigger baseline recalculation and partial rebalance.
+    when(clusterData.getRefreshedChangeTypes())
+        .thenReturn(Collections.singleton(HelixConstants.ChangeType.CLUSTER_CONFIG));
+    // Add a field to the cluster config so the cluster config will be marked as changed in the change detector.
+    clusterData.getClusterConfig().getRecord().setSimpleField("foo", "bar");
+
+    rebalancer.computeNewIdealStates(clusterData, resourceMap, new CurrentStateOutput());
+
+    Assert.assertEquals((long) metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.GlobalBaselineCalcCounter.name(),
+        CountMetric.class).getLastEmittedMetricValue(), 1L);
+    Assert.assertEquals((long) metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.PartialRebalanceCounter.name(),
+        CountMetric.class).getLastEmittedMetricValue(), 1L);
+
+    // Wait for asyncReportBaselineDivergenceGauge to complete and verify.
+    Assert.assertTrue(TestHelper.verify(() -> (double) metricCollector.getMetric(
+        WagedRebalancerMetricCollector.WagedRebalancerMetricNames.BaselineDivergenceGauge.name(),
+        RatioMetric.class).getLastEmittedMetricValue() == 0.0d, TestHelper.WAIT_DURATION));
+  }
+
+  @Override
+  protected ResourceControllerDataProvider setupClusterDataCache() throws IOException {
+    ResourceControllerDataProvider testCache = super.setupClusterDataCache();
+
+    // Set up mock ideal states
+    Map<String, IdealState> isMap = new HashMap<>();
+    for (String resource : _resourceNames) {
+      IdealState is = new IdealState(resource);
+      is.setNumPartitions(_partitionNames.size());
+      is.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);
+      is.setStateModelDefRef("MasterSlave");
+      is.setReplicas("100");
+      is.setRebalancerClassName(WagedRebalancer.class.getName());
+      _partitionNames
+          .forEach(partition -> is.setPreferenceList(partition, Collections.emptyList()));
+      isMap.put(resource, is);
+    }
+    when(testCache.getIdealState(anyString())).thenAnswer(
+        (Answer<IdealState>) invocationOnMock -> isMap.get(invocationOnMock.getArguments()[0]));
+    when(testCache.getIdealStates()).thenReturn(isMap);
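+    // The baseline divergence gauge is reported asynchronously, so the mocked cache needs to
+    // provide a real executor as the async task thread pool.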
+    when(testCache.getAsyncTasksThreadPool()).thenReturn(Executors.newSingleThreadExecutor());
+
+    // Set up 2 more instances
+    for (int i = 1; i < 3; i++) {
+      String instanceName = _testInstanceId + i;
+      _instances.add(instanceName);
+      // 1. Set up the default instance information with capacity configuration.
+      InstanceConfig testInstanceConfig = createMockInstanceConfig(instanceName);
+      Map<String, InstanceConfig> instanceConfigMap = testCache.getInstanceConfigMap();
+      instanceConfigMap.put(instanceName, testInstanceConfig);
+      when(testCache.getInstanceConfigMap()).thenReturn(instanceConfigMap);
+      // 2. Mock the live instance node for the default instance.
+      LiveInstance testLiveInstance = createMockLiveInstance(instanceName);
+      Map<String, LiveInstance> liveInstanceMap = testCache.getLiveInstances();
+      liveInstanceMap.put(instanceName, testLiveInstance);
+      when(testCache.getLiveInstances()).thenReturn(liveInstanceMap);
+      when(testCache.getEnabledInstances()).thenReturn(liveInstanceMap.keySet());
+      when(testCache.getEnabledLiveInstances()).thenReturn(liveInstanceMap.keySet());
+      when(testCache.getAllInstances()).thenReturn(_instances);
+    }
+
+    return testCache;
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/MockRebalanceAlgorithm.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/MockRebalanceAlgorithm.java
new file mode 100644
index 0000000..e3b9523
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/MockRebalanceAlgorithm.java
@@ -0,0 +1,84 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.RebalanceAlgorithm;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModel;
+import org.apache.helix.controller.rebalancer.waged.model.OptimalAssignment;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceAssignment;
+import org.mockito.Mockito;
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.Map;
+
+import static org.mockito.Mockito.when;
+
+/**
+ * A mock rebalance algorithm for unit tests.
+ * Note that, unlike a real algorithm, this mock does not propagate the existing assignment to
+ * the output. This keeps the test expectations simple.
+ */
+public class MockRebalanceAlgorithm implements RebalanceAlgorithm {
+  Map<String, ResourceAssignment> _resultHistory = Collections.emptyMap();
+
+  @Override
+  public OptimalAssignment calculate(ClusterModel clusterModel) {
+    // Deal the replicas to the sorted nodes in a simple round-robin ("card dealing") manner.
+    Map<String, ResourceAssignment> result = new HashMap<>();
+    Iterator<AssignableNode> nodeIterator =
+        clusterModel.getAssignableNodes().values().stream().sorted().iterator();
+    for (String resource : clusterModel.getAssignableReplicaMap().keySet()) {
+      Iterator<AssignableReplica> replicaIterator =
+          clusterModel.getAssignableReplicaMap().get(resource).stream().sorted().iterator();
+      while (replicaIterator.hasNext()) {
+        AssignableReplica replica = replicaIterator.next();
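+        // All nodes have been dealt one replica; wrap around and start from the first node again.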
+        if (!nodeIterator.hasNext()) {
+          nodeIterator = clusterModel.getAssignableNodes().values().stream().sorted().iterator();
+        }
+        AssignableNode node = nodeIterator.next();
+
+        // Put the assignment
+        ResourceAssignment assignment = result.computeIfAbsent(replica.getResourceName(),
+            ResourceAssignment::new);
+        Partition partition = new Partition(replica.getPartitionName());
+        if (assignment.getReplicaMap(partition).isEmpty()) {
+          assignment.addReplicaMap(partition, new HashMap<>());
+        }
+        assignment.getReplicaMap(partition).put(node.getInstanceName(), replica.getReplicaState());
+      }
+    }
+
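+    // Keep the latest result so tests can retrieve it through getRebalanceResult().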
+    _resultHistory = result;
+
+    // Mock the OptimalAssignment return value to wrap the calculated result.
+    OptimalAssignment optimalAssignment = Mockito.mock(OptimalAssignment.class);
+    when(optimalAssignment.getOptimalResourceAssignment()).thenReturn(result);
+    return optimalAssignment;
+  }
+
+  public Map<String, ResourceAssignment> getRebalanceResult() {
+    return new HashMap<>(_resultHistory);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestConstraintBasedAlgorithm.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestConstraintBasedAlgorithm.java
new file mode 100644
index 0000000..e0e2eb3
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestConstraintBasedAlgorithm.java
@@ -0,0 +1,72 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+import java.io.IOException;
+
+import org.apache.helix.HelixRebalanceException;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModel;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModelTestHelper;
+import org.apache.helix.controller.rebalancer.waged.model.OptimalAssignment;
+import org.testng.Assert;
+import org.testng.annotations.BeforeMethod;
+import org.testng.annotations.Test;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+
+public class TestConstraintBasedAlgorithm {
+  private ConstraintBasedAlgorithm _algorithm;
+
+  @BeforeMethod
+  public void beforeMethod() {
+    HardConstraint mockHardConstraint = mock(HardConstraint.class);
+    SoftConstraint mockSoftConstraint = mock(SoftConstraint.class);
+    when(mockHardConstraint.isAssignmentValid(any(), any(), any())).thenReturn(false);
+    when(mockSoftConstraint.getAssignmentNormalizedScore(any(), any(), any())).thenReturn(1.0);
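+    // The mocked hard constraint rejects every assignment, so calculate() is expected to throw.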
+
+    _algorithm = new ConstraintBasedAlgorithm(ImmutableList.of(mockHardConstraint),
+        ImmutableMap.of(mockSoftConstraint, 1f));
+  }
+
+  @Test(expectedExceptions = HelixRebalanceException.class)
+  public void testCalculateNoValidAssignment() throws IOException, HelixRebalanceException {
+    ClusterModel clusterModel = new ClusterModelTestHelper().getDefaultClusterModel();
+    _algorithm.calculate(clusterModel);
+  }
+
+  @Test
+  public void testCalculateWithValidAssignment() throws IOException, HelixRebalanceException {
+    HardConstraint mockHardConstraint = mock(HardConstraint.class);
+    SoftConstraint mockSoftConstraint = mock(SoftConstraint.class);
+    when(mockHardConstraint.isAssignmentValid(any(), any(), any())).thenReturn(true);
+    when(mockSoftConstraint.getAssignmentNormalizedScore(any(), any(), any())).thenReturn(1.0);
+    _algorithm = new ConstraintBasedAlgorithm(ImmutableList.of(mockHardConstraint),
+        ImmutableMap.of(mockSoftConstraint, 1f));
+    ClusterModel clusterModel = new ClusterModelTestHelper().getDefaultClusterModel();
+    OptimalAssignment optimalAssignment = _algorithm.calculate(clusterModel);
+
+    Assert.assertFalse(optimalAssignment.hasAnyFailure());
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestFaultZoneAwareConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestFaultZoneAwareConstraint.java
new file mode 100644
index 0000000..9d2cb14
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestFaultZoneAwareConstraint.java
@@ -0,0 +1,79 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Mockito.when;
+
+import java.util.Collections;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.mockito.Mockito;
+import org.testng.Assert;
+import org.testng.annotations.BeforeMethod;
+import org.testng.annotations.Test;
+
+import com.google.common.collect.ImmutableSet;
+
+public class TestFaultZoneAwareConstraint {
+  private static final String TEST_PARTITION = "testPartition";
+  private static final String TEST_ZONE = "testZone";
+  private static final String TEST_RESOURCE = "testResource";
+  private final AssignableReplica _testReplica = Mockito.mock(AssignableReplica.class);
+  private final AssignableNode _testNode = Mockito.mock(AssignableNode.class);
+  private final ClusterContext _clusterContext = Mockito.mock(ClusterContext.class);
+
+  private final HardConstraint _faultZoneAwareConstraint = new FaultZoneAwareConstraint();
+
+  @BeforeMethod
+  public void init() {
+    when(_testReplica.getResourceName()).thenReturn(TEST_RESOURCE);
+    when(_testReplica.getPartitionName()).thenReturn(TEST_PARTITION);
+    when(_testNode.getFaultZone()).thenReturn(TEST_ZONE);
+  }
+
+  @Test
+  public void invalidWhenFaultZoneAlreadyAssigned() {
+    when(_testNode.hasFaultZone()).thenReturn(true);
+    when(_clusterContext.getPartitionsForResourceAndFaultZone(TEST_RESOURCE, TEST_ZONE)).thenReturn(
+            ImmutableSet.of(TEST_PARTITION));
+
+    Assert.assertFalse(
+        _faultZoneAwareConstraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+
+  @Test
+  public void validWhenEmptyAssignment() {
+    when(_testNode.hasFaultZone()).thenReturn(true);
+    when(_clusterContext.getPartitionsForResourceAndFaultZone(TEST_RESOURCE, TEST_ZONE)).thenReturn(Collections.emptySet());
+
+    Assert.assertTrue(
+        _faultZoneAwareConstraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+
+  @Test
+  public void validWhenNoFaultZone() {
+    when(_testNode.hasFaultZone()).thenReturn(false);
+
+    Assert.assertTrue(
+        _faultZoneAwareConstraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestInstancePartitionsCountConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestInstancePartitionsCountConstraint.java
new file mode 100644
index 0000000..a54379e
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestInstancePartitionsCountConstraint.java
@@ -0,0 +1,63 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Mockito.when;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.mockito.Mockito;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+public class TestInstancePartitionsCountConstraint {
+  private final AssignableReplica _testReplica = Mockito.mock(AssignableReplica.class);
+  private final AssignableNode _testNode = Mockito.mock(AssignableNode.class);
+  private final ClusterContext _clusterContext = Mockito.mock(ClusterContext.class);
+
+  private final SoftConstraint _constraint = new InstancePartitionsCountConstraint();
+
+  @Test
+  public void testWhenInstanceIsIdle() {
+    when(_testNode.getAssignedReplicaCount()).thenReturn(0);
+    double score =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 1.0);
+  }
+
+  @Test
+  public void testWhenInstanceIsFull() {
+    when(_testNode.getAssignedReplicaCount()).thenReturn(10);
+    when(_clusterContext.getEstimatedMaxPartitionCount()).thenReturn(10);
+    double score =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.5);
+  }
+
+  @Test
+  public void testWhenInstanceHalfOccupied() {
+    when(_testNode.getAssignedReplicaCount()).thenReturn(10);
+    when(_clusterContext.getEstimatedMaxPartitionCount()).thenReturn(20);
+    double score =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertTrue(score > 0.99);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestMaxCapacityUsageInstanceConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestMaxCapacityUsageInstanceConstraint.java
new file mode 100644
index 0000000..5d52cb7
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestMaxCapacityUsageInstanceConstraint.java
@@ -0,0 +1,57 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.testng.Assert;
+import org.testng.annotations.BeforeMethod;
+import org.testng.annotations.Test;
+
+import static org.mockito.Matchers.anyMap;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+public class TestMaxCapacityUsageInstanceConstraint {
+  private AssignableReplica _testReplica;
+  private AssignableNode _testNode;
+  private ClusterContext _clusterContext;
+  private final SoftConstraint _constraint = new MaxCapacityUsageInstanceConstraint();
+
+  @BeforeMethod
+  public void setUp() {
+    _testNode = mock(AssignableNode.class);
+    _testReplica = mock(AssignableReplica.class);
+    _clusterContext = mock(ClusterContext.class);
+  }
+
+  @Test
+  public void testGetNormalizedScore() {
+    when(_testNode.getProjectedHighestUtilization(anyMap())).thenReturn(0.8f);
+    when(_clusterContext.getEstimatedMaxUtilization()).thenReturn(1f);
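+    // The projected utilization is 0.8 out of an estimated max utilization of 1.0, so the raw
+    // score is expected to be 0.8.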
+    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    // Compare as float to avoid double-precision rounding issues.
+    Assert.assertEquals((float) score, 0.8f);
+    double normalizedScore =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertTrue(normalizedScore > 0.99);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestNodeCapacityConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestNodeCapacityConstraint.java
new file mode 100644
index 0000000..4365a42
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestNodeCapacityConstraint.java
@@ -0,0 +1,54 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Mockito.when;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.mockito.Mockito;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+import com.google.common.collect.ImmutableMap;
+
+public class TestNodeCapacityConstraint {
+  private final AssignableReplica _testReplica = Mockito.mock(AssignableReplica.class);
+  private final AssignableNode _testNode = Mockito.mock(AssignableNode.class);
+  private final ClusterContext _clusterContext = Mockito.mock(ClusterContext.class);
+  private final HardConstraint _constraint = new NodeCapacityConstraint();
+
+  @Test
+  public void testConstraintValidWhenNodeHasEnoughSpace() {
+    String key = "testKey";
+    when(_testNode.getRemainingCapacity()).thenReturn(ImmutableMap.of(key, 10));
+    when(_testReplica.getCapacity()).thenReturn(ImmutableMap.of(key, 5));
+    Assert.assertTrue(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+
+  @Test
+  public void testConstraintInvalidWhenNodeHasInsufficientSpace() {
+    String key = "testKey";
+    when(_testNode.getRemainingCapacity()).thenReturn(ImmutableMap.of(key, 1));
+    when(_testReplica.getCapacity()).thenReturn(ImmutableMap.of(key, 5));
+    Assert.assertFalse(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestNodeMaxPartitionLimitConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestNodeMaxPartitionLimitConstraint.java
new file mode 100644
index 0000000..4cb7466
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestNodeMaxPartitionLimitConstraint.java
@@ -0,0 +1,56 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Mockito.when;
+
+import java.util.Collections;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.mockito.Mockito;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+public class TestNodeMaxPartitionLimitConstraint {
+  private static final String TEST_RESOURCE = "TestResource";
+  private final AssignableReplica _testReplica = Mockito.mock(AssignableReplica.class);
+  private final AssignableNode _testNode = Mockito.mock(AssignableNode.class);
+  private final ClusterContext _clusterContext = Mockito.mock(ClusterContext.class);
+  private final HardConstraint _constraint = new NodeMaxPartitionLimitConstraint();
+
+  @Test
+  public void testConstraintValid() {
+    when(_testNode.getAssignedReplicaCount()).thenReturn(0);
+    when(_testNode.getMaxPartition()).thenReturn(10);
+    when(_testNode.getAssignedPartitionsByResource(TEST_RESOURCE))
+        .thenReturn(Collections.emptySet());
+    when(_testReplica.getResourceMaxPartitionsPerInstance()).thenReturn(5);
+    Assert.assertTrue(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+
+  @Test
+  public void testConstraintInvalid() {
+    when(_testNode.getAssignedReplicaCount()).thenReturn(10);
+    when(_testNode.getMaxPartition()).thenReturn(5);
+    Assert.assertFalse(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestPartitionActivateConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestPartitionActivateConstraint.java
new file mode 100644
index 0000000..ecfdaa2
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestPartitionActivateConstraint.java
@@ -0,0 +1,64 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Mockito.when;
+
+import java.util.Collections;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.mockito.Mockito;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+
+public class TestPartitionActivateConstraint {
+  private static final String TEST_PARTITION = "TestPartition";
+  private static final String TEST_RESOURCE = "TestResource";
+  private final AssignableReplica _testReplica = Mockito.mock(AssignableReplica.class);
+  private final AssignableNode _testNode = Mockito.mock(AssignableNode.class);
+  private final ClusterContext _clusterContext = Mockito.mock(ClusterContext.class);
+  private final HardConstraint _constraint = new ReplicaActivateConstraint();
+
+  @Test
+  public void testConstraintValid() {
+    when(_testReplica.getResourceName()).thenReturn(TEST_RESOURCE);
+    when(_testReplica.getPartitionName()).thenReturn(TEST_PARTITION);
+    // Key the disabled partitions map by resource name; neither entry disables the replica's
+    // partition, so the assignment stays valid.
+    when(_testNode.getDisabledPartitionsMap())
+        .thenReturn(ImmutableMap.of(TEST_RESOURCE, Collections.emptyList()));
+    Assert.assertTrue(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+    when(_testNode.getDisabledPartitionsMap())
+        .thenReturn(ImmutableMap.of(TEST_RESOURCE, ImmutableList.of("dummy")));
+    Assert.assertTrue(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+
+  @Test
+  public void testConstraintInvalidWhenReplicaIsDisabled() {
+    when(_testReplica.getResourceName()).thenReturn(TEST_RESOURCE);
+    when(_testReplica.getPartitionName()).thenReturn(TEST_PARTITION);
+    when(_testNode.getDisabledPartitionsMap())
+        .thenReturn(ImmutableMap.of(TEST_RESOURCE, ImmutableList.of(TEST_PARTITION)));
+    // The replica's partition is disabled for its resource on this node, so the assignment
+    // should be rejected.
+    Assert.assertFalse(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestPartitionMovementConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestPartitionMovementConstraint.java
new file mode 100644
index 0000000..2629c25
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestPartitionMovementConstraint.java
@@ -0,0 +1,127 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+import java.util.Collections;
+import java.util.Map;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceAssignment;
+import org.testng.Assert;
+import org.testng.annotations.BeforeMethod;
+import org.testng.annotations.Test;
+
+import com.google.common.collect.ImmutableMap;
+
+public class TestPartitionMovementConstraint {
+  private static final String INSTANCE = "TestInstance";
+  private static final String RESOURCE = "TestResource";
+  private static final String PARTITION = "TestPartition";
+  private AssignableNode _testNode;
+  private AssignableReplica _testReplica;
+  private ClusterContext _clusterContext;
+  private SoftConstraint _constraint = new PartitionMovementConstraint();
+
+  @BeforeMethod
+  public void init() {
+    _testNode = mock(AssignableNode.class);
+    _testReplica = mock(AssignableReplica.class);
+    _clusterContext = mock(ClusterContext.class);
+    when(_testReplica.getResourceName()).thenReturn(RESOURCE);
+    when(_testReplica.getPartitionName()).thenReturn(PARTITION);
+    when(_testNode.getInstanceName()).thenReturn(INSTANCE);
+  }
+
+  @Test
+  public void testGetAssignmentScoreWhenBestPossibleBaselineMissing() {
+    when(_clusterContext.getBaselineAssignment()).thenReturn(Collections.emptyMap());
+    when(_clusterContext.getBestPossibleAssignment()).thenReturn(Collections.emptyMap());
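+    // With neither a baseline nor a best possible assignment to compare against, the constraint
+    // is expected to return the minimum score.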
+    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double normalizedScore =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.0);
+    Assert.assertEquals(normalizedScore, 0.0);
+  }
+
+  @Test
+  public void testGetAssignmentScoreWhenBestPossibleBaselineSame() {
+    ResourceAssignment mockResourceAssignment = mock(ResourceAssignment.class);
+    when(mockResourceAssignment.getReplicaMap(new Partition(PARTITION)))
+        .thenReturn(ImmutableMap.of(INSTANCE, "Master"));
+    Map<String, ResourceAssignment> assignmentMap =
+        ImmutableMap.of(RESOURCE, mockResourceAssignment);
+    when(_clusterContext.getBaselineAssignment()).thenReturn(assignmentMap);
+    when(_clusterContext.getBestPossibleAssignment()).thenReturn(assignmentMap);
+    // when the calculated states are both equal to the replica's current state
+    when(_testReplica.getReplicaState()).thenReturn("Master");
+    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double normalizedScore =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+
+    Assert.assertEquals(score, 1.0);
+    Assert.assertEquals(normalizedScore, 1.0);
+    // when the calculated states are both different from the replica's current state
+    when(_testReplica.getReplicaState()).thenReturn("Slave");
+    score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    normalizedScore =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+
+    Assert.assertEquals(score, 0.5);
+    Assert.assertEquals(normalizedScore, 0.5);
+  }
+
+  @Test
+  public void testGetAssignmentScoreWhenBestPossibleBaselineOpposite() {
+    ResourceAssignment bestPossibleResourceAssignment = mock(ResourceAssignment.class);
+    when(bestPossibleResourceAssignment.getReplicaMap(new Partition(PARTITION)))
+        .thenReturn(ImmutableMap.of(INSTANCE, "Master"));
+    ResourceAssignment baselineResourceAssignment = mock(ResourceAssignment.class);
+    when(baselineResourceAssignment.getReplicaMap(new Partition(PARTITION)))
+        .thenReturn(ImmutableMap.of(INSTANCE, "Slave"));
+    when(_clusterContext.getBaselineAssignment())
+        .thenReturn(ImmutableMap.of(RESOURCE, baselineResourceAssignment));
+    when(_clusterContext.getBestPossibleAssignment())
+        .thenReturn(ImmutableMap.of(RESOURCE, bestPossibleResourceAssignment));
+    // when the replica's state matches with best possible only
+    when(_testReplica.getReplicaState()).thenReturn("Master");
+    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double normalizedScore =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+
+    Assert.assertEquals(score, 1.0);
+    Assert.assertEquals(normalizedScore, 1.0);
+    // when the replica's state matches with baseline only
+    when(_testReplica.getReplicaState()).thenReturn("Slave");
+    score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    normalizedScore =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+
+    // The calculated score is lower than the previous value because matching the best possible
+    // assignment is preferred over matching the baseline.
+    Assert.assertEquals(score, 0.5);
+    Assert.assertEquals(normalizedScore, 0.5);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestResourcePartitionAntiAffinityConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestResourcePartitionAntiAffinityConstraint.java
new file mode 100644
index 0000000..30bd630
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestResourcePartitionAntiAffinityConstraint.java
@@ -0,0 +1,67 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Mockito.when;
+
+import java.util.Collections;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.mockito.Mockito;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+import com.google.common.collect.ImmutableSet;
+
+public class TestResourcePartitionAntiAffinityConstraint {
+  private static final String TEST_PARTITION = "TestPartition";
+  private static final String TEST_RESOURCE = "TestResource";
+  private final AssignableReplica _testReplica = Mockito.mock(AssignableReplica.class);
+  private final AssignableNode _testNode = Mockito.mock(AssignableNode.class);
+  private final ClusterContext _clusterContext = Mockito.mock(ClusterContext.class);
+  private final SoftConstraint _constraint = new ResourcePartitionAntiAffinityConstraint();
+
+  @Test
+  public void testGetAssignmentScore() {
+    when(_testReplica.getResourceName()).thenReturn(TEST_RESOURCE);
+    when(_testNode.getAssignedPartitionsByResource(TEST_RESOURCE)).thenReturn(
+        ImmutableSet.of(TEST_PARTITION + "1", TEST_PARTITION + "2", TEST_PARTITION + "3"));
+    when(_clusterContext.getEstimatedMaxPartitionByResource(TEST_RESOURCE)).thenReturn(10);
+
+    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double normalizedScore = _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.3);
+    Assert.assertTrue(normalizedScore > 0.99);
+  }
+
+  @Test
+  public void testGetAssignmentScoreMaxScore() {
+    when(_testReplica.getResourceName()).thenReturn(TEST_RESOURCE);
+    when(_testNode.getAssignedPartitionsByResource(TEST_RESOURCE)).thenReturn(Collections.emptySet());
+    when(_clusterContext.getEstimatedMaxPartitionByResource(TEST_RESOURCE)).thenReturn(10);
+
+    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double normalizedScore = _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.0);
+    Assert.assertEquals(normalizedScore, 1.0);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestResourceTopStateAntiAffinityConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestResourceTopStateAntiAffinityConstraint.java
new file mode 100644
index 0000000..2a26030
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestResourceTopStateAntiAffinityConstraint.java
@@ -0,0 +1,82 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Mockito.verifyZeroInteractions;
+import static org.mockito.Mockito.when;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.mockito.Mockito;
+import org.testng.Assert;
+import org.testng.annotations.BeforeMethod;
+import org.testng.annotations.Test;
+
+public class TestResourceTopStateAntiAffinityConstraint {
+  private AssignableReplica _testReplica;
+  private AssignableNode _testNode;
+  private ClusterContext _clusterContext;
+
+  private final SoftConstraint _constraint = new ResourceTopStateAntiAffinityConstraint();
+
+  @BeforeMethod
+  public void init() {
+    _testReplica = Mockito.mock(AssignableReplica.class);
+    _testNode = Mockito.mock(AssignableNode.class);
+    _clusterContext = Mockito.mock(ClusterContext.class);
+  }
+
+  @Test
+  public void testGetAssignmentScoreWhenReplicaNotTopState() {
+    when(_testReplica.isReplicaTopState()).thenReturn(false);
+    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double normalizedScore =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.0);
+    Assert.assertEquals(normalizedScore, 1.0);
+    verifyZeroInteractions(_testNode);
+    verifyZeroInteractions(_clusterContext);
+  }
+
+  @Test
+  public void testGetAssignmentScoreWhenReplicaIsTopStateHeavyLoad() {
+    when(_testReplica.isReplicaTopState()).thenReturn(true);
+    when(_testNode.getAssignedTopStatePartitionsCount()).thenReturn(20);
+    when(_clusterContext.getEstimatedMaxTopStateCount()).thenReturn(20);
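+    // The node already hosts the estimated maximum count of top-state partitions.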
+    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double normalizedScore =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 1.0);
+    Assert.assertEquals(normalizedScore, 0.5);
+  }
+
+  @Test
+  public void testGetAssignmentScoreWhenReplicaIsTopStateLightLoad() {
+    when(_testReplica.isReplicaTopState()).thenReturn(true);
+    when(_testNode.getAssignedTopStatePartitionsCount()).thenReturn(0);
+    when(_clusterContext.getEstimatedMaxTopStateCount()).thenReturn(20);
+    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double normalizedScore =
+        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.0);
+    Assert.assertEquals(normalizedScore, 1.0);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestSamePartitionOnInstanceConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestSamePartitionOnInstanceConstraint.java
new file mode 100644
index 0000000..50b0c03
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestSamePartitionOnInstanceConstraint.java
@@ -0,0 +1,59 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Mockito.when;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.mockito.Mockito;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+import com.google.common.collect.ImmutableSet;
+
+public class TestSamePartitionOnInstanceConstraint {
+  private static final String TEST_RESOURCE = "TestResource";
+  private static final String TEST_PARTITION = TEST_RESOURCE + "0";
+  private final AssignableReplica _testReplica = Mockito.mock(AssignableReplica.class);
+  private final AssignableNode _testNode = Mockito.mock(AssignableNode.class);
+  private final ClusterContext _clusterContext = Mockito.mock(ClusterContext.class);
+  private final HardConstraint _constraint = new SamePartitionOnInstanceConstraint();
+
+  @Test
+  public void testConstraintValid() {
+    when(_testNode.getAssignedPartitionsByResource(TEST_RESOURCE))
+        .thenReturn(ImmutableSet.of("dummy"));
+    when(_testReplica.getResourceName()).thenReturn(TEST_RESOURCE);
+    when(_testReplica.getPartitionName()).thenReturn(TEST_PARTITION);
+
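+    // The node hosts a different partition of the same resource, so the assignment shall be valid.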
+    Assert.assertTrue(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+
+  @Test
+  public void testConstraintInvalid() {
+    when(_testNode.getAssignedPartitionsByResource(TEST_RESOURCE))
+        .thenReturn(ImmutableSet.of(TEST_PARTITION));
+    when(_testReplica.getResourceName()).thenReturn(TEST_RESOURCE);
+    when(_testReplica.getPartitionName()).thenReturn(TEST_PARTITION);
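+    // The exact same partition is already assigned to this node, so the hard constraint shall reject it.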
+    Assert.assertFalse(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestSoftConstraintNormalizeFunction.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestSoftConstraintNormalizeFunction.java
new file mode 100644
index 0000000..ad34705
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestSoftConstraintNormalizeFunction.java
@@ -0,0 +1,47 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+public class TestSoftConstraintNormalizeFunction {
+  @Test
+  public void testDefaultNormalizeFunction() {
+    int maxScore = 100;
+    int minScore = 0;
+    SoftConstraint softConstraint = new SoftConstraint(maxScore, minScore) {
+      @Override
+      protected double getAssignmentScore(AssignableNode node, AssignableReplica replica,
+          ClusterContext clusterContext) {
+        return 0;
+      }
+    };
+
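+    // The default normalize function shall map every raw score in [minScore, maxScore] into [0, 1].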
+    for (int i = minScore; i <= maxScore; i++) {
+      double normalized = softConstraint.getNormalizeFunction().scale(i);
+      Assert.assertTrue(normalized <= 1 && normalized >= 0,
+          String.format("input: %s, output: %s", i, normalized));
+    }
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestValidGroupTagConstraint.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestValidGroupTagConstraint.java
new file mode 100644
index 0000000..8d02b3d
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestValidGroupTagConstraint.java
@@ -0,0 +1,66 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import static org.mockito.Mockito.when;
+
+import java.util.Collections;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.mockito.Mockito;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+import com.google.common.collect.ImmutableSet;
+
+public class TestValidGroupTagConstraint {
+  private static final String TEST_TAG = "testTag";
+  private final AssignableReplica _testReplica = Mockito.mock(AssignableReplica.class);
+  private final AssignableNode _testNode = Mockito.mock(AssignableNode.class);
+  private final ClusterContext _clusterContext = Mockito.mock(ClusterContext.class);
+  private final HardConstraint _constraint = new ValidGroupTagConstraint();
+
+  @Test
+  public void testConstraintValid() {
+    when(_testReplica.hasResourceInstanceGroupTag()).thenReturn(true);
+    when(_testReplica.getResourceInstanceGroupTag()).thenReturn(TEST_TAG);
+    when(_testNode.getInstanceTags()).thenReturn(ImmutableSet.of(TEST_TAG));
+
+    Assert.assertTrue(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+
+  @Test
+  public void testConstraintInvalid() {
+    when(_testReplica.hasResourceInstanceGroupTag()).thenReturn(true);
+    when(_testReplica.getResourceInstanceGroupTag()).thenReturn(TEST_TAG);
+    when(_testNode.getInstanceTags()).thenReturn(Collections.emptySet());
+
+    Assert.assertFalse(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+
+  @Test
+  public void testConstraintWhenReplicaHasNoTag() {
+    when(_testReplica.hasResourceInstanceGroupTag()).thenReturn(false);
+
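+    // A replica without a resource instance group tag shall not be restricted by this constraint.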
+    Assert.assertTrue(_constraint.isAssignmentValid(_testNode, _testReplica, _clusterContext));
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/AbstractTestClusterModel.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/AbstractTestClusterModel.java
new file mode 100644
index 0000000..0fec67b
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/AbstractTestClusterModel.java
@@ -0,0 +1,204 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.model.BuiltInStateModelDefinitions;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.CurrentState;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.LiveInstance;
+import org.apache.helix.model.ResourceConfig;
+import org.mockito.Mockito;
+import org.testng.annotations.BeforeClass;
+
+import static org.mockito.Mockito.when;
+
+public abstract class AbstractTestClusterModel {
+  protected static String _sessionId = "testSessionId";
+  protected String _testInstanceId;
+  protected List<String> _resourceNames;
+  protected List<String> _partitionNames;
+  protected Map<String, Integer> _capacityDataMap;
+  protected Map<String, List<String>> _disabledPartitionsMap;
+  protected List<String> _testInstanceTags;
+  protected String _testFaultZoneId;
+
+  @BeforeClass
+  public void initialize() {
+    _testInstanceId = "testInstanceId";
+    _resourceNames = new ArrayList<>();
+    _resourceNames.add("Resource1");
+    _resourceNames.add("Resource2");
+    _partitionNames = new ArrayList<>();
+    _partitionNames.add("Partition1");
+    _partitionNames.add("Partition2");
+    _partitionNames.add("Partition3");
+    _partitionNames.add("Partition4");
+    _capacityDataMap = new HashMap<>();
+    _capacityDataMap.put("item1", 20);
+    _capacityDataMap.put("item2", 40);
+    _capacityDataMap.put("item3", 30);
+    List<String> disabledPartitions = new ArrayList<>();
+    disabledPartitions.add("TestPartition");
+    _disabledPartitionsMap = new HashMap<>();
+    _disabledPartitionsMap.put("TestResource", disabledPartitions);
+    _testInstanceTags = new ArrayList<>();
+    _testInstanceTags.add("TestTag");
+    _testFaultZoneId = "testZone";
+  }
+
+  protected InstanceConfig createMockInstanceConfig(String instanceId) {
+    InstanceConfig testInstanceConfig = new InstanceConfig(instanceId);
+    testInstanceConfig.setInstanceCapacityMap(_capacityDataMap);
+    testInstanceConfig.addTag(_testInstanceTags.get(0));
+    testInstanceConfig.setInstanceEnabled(true);
+    testInstanceConfig.setZoneId(_testFaultZoneId);
+    return testInstanceConfig;
+  }
+
+  protected LiveInstance createMockLiveInstance(String instanceId) {
+    LiveInstance testLiveInstance = new LiveInstance(instanceId);
+    testLiveInstance.setSessionId(_sessionId);
+    return testLiveInstance;
+  }
+
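+  /**
+   * Set up a mock data cache that contains one enabled instance with capacity configuration, the
+   * basic cluster config, the live instance, the current states of two resources, the resource
+   * configs with partition weights, and the built-in state model definitions.
+   */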
+  protected ResourceControllerDataProvider setupClusterDataCache() throws IOException {
+    ResourceControllerDataProvider testCache = Mockito.mock(ResourceControllerDataProvider.class);
+
+    // 1. Set up the default instance information with capacity configuration.
+    InstanceConfig testInstanceConfig = createMockInstanceConfig(_testInstanceId);
+    testInstanceConfig.setInstanceEnabledForPartition("TestResource", "TestPartition", false);
+    Map<String, InstanceConfig> instanceConfigMap = new HashMap<>();
+    instanceConfigMap.put(_testInstanceId, testInstanceConfig);
+    when(testCache.getInstanceConfigMap()).thenReturn(instanceConfigMap);
+
+    // 2. Set up the basic cluster configuration.
+    ClusterConfig testClusterConfig = new ClusterConfig("testClusterConfigId");
+    testClusterConfig.setMaxPartitionsPerInstance(5);
+    testClusterConfig.setDisabledInstances(Collections.emptyMap());
+    testClusterConfig.setInstanceCapacityKeys(new ArrayList<>(_capacityDataMap.keySet()));
+    testClusterConfig.setDefaultPartitionWeightMap(
+        _capacityDataMap.keySet().stream().collect(Collectors.toMap(key -> key, key -> 0)));
+    testClusterConfig.setTopologyAwareEnabled(true);
+    when(testCache.getClusterConfig()).thenReturn(testClusterConfig);
+
+    // 3. Mock the live instance node for the default instance.
+    LiveInstance testLiveInstance = createMockLiveInstance(_testInstanceId);
+    Map<String, LiveInstance> liveInstanceMap = new HashMap<>();
+    liveInstanceMap.put(_testInstanceId, testLiveInstance);
+    when(testCache.getLiveInstances()).thenReturn(liveInstanceMap);
+
+    // 4. Mock two resources, each with 2 partitions on the default instance.
+    // The instance will have the following partitions assigned
+    // Resource 1:
+    // -------------- partition 1 - MASTER
+    // -------------- partition 2 - SLAVE
+    // Resource 2:
+    // -------------- partition 3 - MASTER
+    // -------------- partition 4 - SLAVE
+    CurrentState testCurrentStateResource1 = Mockito.mock(CurrentState.class);
+    Map<String, String> partitionStateMap1 = new HashMap<>();
+    partitionStateMap1.put(_partitionNames.get(0), "MASTER");
+    partitionStateMap1.put(_partitionNames.get(1), "SLAVE");
+    when(testCurrentStateResource1.getResourceName()).thenReturn(_resourceNames.get(0));
+    when(testCurrentStateResource1.getPartitionStateMap()).thenReturn(partitionStateMap1);
+    when(testCurrentStateResource1.getStateModelDefRef()).thenReturn("MasterSlave");
+    when(testCurrentStateResource1.getState(_partitionNames.get(0))).thenReturn("MASTER");
+    when(testCurrentStateResource1.getState(_partitionNames.get(1))).thenReturn("SLAVE");
+    CurrentState testCurrentStateResource2 = Mockito.mock(CurrentState.class);
+    Map<String, String> partitionStateMap2 = new HashMap<>();
+    partitionStateMap2.put(_partitionNames.get(2), "MASTER");
+    partitionStateMap2.put(_partitionNames.get(3), "SLAVE");
+    when(testCurrentStateResource2.getResourceName()).thenReturn(_resourceNames.get(1));
+    when(testCurrentStateResource2.getPartitionStateMap()).thenReturn(partitionStateMap2);
+    when(testCurrentStateResource2.getStateModelDefRef()).thenReturn("MasterSlave");
+    when(testCurrentStateResource2.getState(_partitionNames.get(2))).thenReturn("MASTER");
+    when(testCurrentStateResource2.getState(_partitionNames.get(3))).thenReturn("SLAVE");
+    Map<String, CurrentState> currentStateMap = new HashMap<>();
+    currentStateMap.put(_resourceNames.get(0), testCurrentStateResource1);
+    currentStateMap.put(_resourceNames.get(1), testCurrentStateResource2);
+    when(testCache.getCurrentState(_testInstanceId, _sessionId)).thenReturn(currentStateMap);
+
+    // 5. Set up the resource config for the two resources with the partition weight.
+    Map<String, Integer> capacityDataMapResource1 = new HashMap<>();
+    capacityDataMapResource1.put("item1", 3);
+    capacityDataMapResource1.put("item2", 6);
+    ResourceConfig testResourceConfigResource1 = new ResourceConfig("Resource1");
+    testResourceConfigResource1.setPartitionCapacityMap(
+        Collections.singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY, capacityDataMapResource1));
+    when(testCache.getResourceConfig("Resource1")).thenReturn(testResourceConfigResource1);
+    Map<String, Integer> capacityDataMapResource2 = new HashMap<>();
+    capacityDataMapResource2.put("item1", 5);
+    capacityDataMapResource2.put("item2", 10);
+    ResourceConfig testResourceConfigResource2 = new ResourceConfig("Resource2");
+    testResourceConfigResource2.setPartitionCapacityMap(
+        Collections.singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY, capacityDataMapResource2));
+    when(testCache.getResourceConfig("Resource2")).thenReturn(testResourceConfigResource2);
+    Map<String, ResourceConfig> configMap = new HashMap<>();
+    configMap.put("Resource1", testResourceConfigResource1);
+    configMap.put("Resource2", testResourceConfigResource2);
+    when(testCache.getResourceConfigMap()).thenReturn(configMap);
+
+    // 6. Define mock state model
+    for (BuiltInStateModelDefinitions bsmd : BuiltInStateModelDefinitions.values()) {
+      when(testCache.getStateModelDef(bsmd.name())).thenReturn(bsmd.getStateModelDefinition());
+    }
+
+    return testCache;
+  }
+
+  /**
+   * Generate the replica objects based on the current states in the given data provider.
+   */
+  protected Set<AssignableReplica> generateReplicas(ResourceControllerDataProvider dataProvider) {
+    // Create assignable replica based on the current state.
+    Map<String, CurrentState> currentStateMap =
+        dataProvider.getCurrentState(_testInstanceId, _sessionId);
+    Set<AssignableReplica> assignmentSet = new HashSet<>();
+    for (CurrentState cs : currentStateMap.values()) {
+      ResourceConfig resourceConfig = dataProvider.getResourceConfig(cs.getResourceName());
+      // Construct one AssignableReplica for each partition in the current state.
+      cs.getPartitionStateMap().forEach((partition, state) -> assignmentSet
+          .add(new AssignableReplica(dataProvider.getClusterConfig(), resourceConfig, partition,
+              state, "MASTER".equals(state) ? 1 : 2)));
+    }
+    return assignmentSet;
+  }
+
+  protected Set<AssignableNode> generateNodes(ResourceControllerDataProvider testCache) {
+    Set<AssignableNode> nodeSet = new HashSet<>();
+    testCache.getInstanceConfigMap().values().forEach(config -> nodeSet
+        .add(new AssignableNode(testCache.getClusterConfig(), config, config.getInstanceName())));
+    return nodeSet;
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModelTestHelper.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModelTestHelper.java
new file mode 100644
index 0000000..131d92a
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModelTestHelper.java
@@ -0,0 +1,40 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.Set;
+
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+
+public class ClusterModelTestHelper extends AbstractTestClusterModel {
+
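+  /**
+   * Build a ClusterModel from the mock data cache defined in AbstractTestClusterModel, using
+   * empty maps for the recorded assignment states.
+   */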
+  public ClusterModel getDefaultClusterModel() throws IOException {
+    initialize();
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+    Set<AssignableReplica> assignableReplicas = generateReplicas(testCache);
+    Set<AssignableNode> assignableNodes = generateNodes(testCache);
+
+    ClusterContext context =
+        new ClusterContext(assignableReplicas, assignableNodes, Collections.emptyMap(), Collections.emptyMap());
+    return new ClusterModel(context, assignableReplicas, assignableNodes);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestAssignableNode.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestAssignableNode.java
new file mode 100644
index 0000000..2b93353
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestAssignableNode.java
@@ -0,0 +1,280 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.helix.HelixException;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.InstanceConfig;
+import org.testng.Assert;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+import static org.mockito.Mockito.when;
+
+public class TestAssignableNode extends AbstractTestClusterModel {
+  @BeforeClass
+  public void initialize() {
+    super.initialize();
+  }
+
+  @Test
+  public void testNormalUsage() throws IOException {
+    // Test 1 - initialize based on the data cache and check with the expected result
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+    Set<AssignableReplica> assignmentSet = generateReplicas(testCache);
+
+    Set<String> expectedTopStateAssignmentSet1 = new HashSet<>(_partitionNames.subList(0, 1));
+    Set<String> expectedTopStateAssignmentSet2 = new HashSet<>(_partitionNames.subList(2, 3));
+    Set<String> expectedAssignmentSet1 = new HashSet<>(_partitionNames.subList(0, 2));
+    Set<String> expectedAssignmentSet2 = new HashSet<>(_partitionNames.subList(2, 4));
+    Map<String, Set<String>> expectedAssignment = new HashMap<>();
+    expectedAssignment.put("Resource1", expectedAssignmentSet1);
+    expectedAssignment.put("Resource2", expectedAssignmentSet2);
+    Map<String, Integer> expectedCapacityMap = new HashMap<>();
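+    // The expected remaining capacity is the instance capacity minus the assigned partition
+    // weights: item1: 20 - (3 + 3 + 5 + 5) = 4; item2: 40 - (6 + 6 + 10 + 10) = 8; item3 has no
+    // resource-level weight, so it stays at 30.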
+    expectedCapacityMap.put("item1", 4);
+    expectedCapacityMap.put("item2", 8);
+    expectedCapacityMap.put("item3", 30);
+
+    AssignableNode assignableNode = new AssignableNode(testCache.getClusterConfig(),
+        testCache.getInstanceConfigMap().get(_testInstanceId), _testInstanceId);
+    assignableNode.assignInitBatch(assignmentSet);
+    Assert.assertEquals(assignableNode.getAssignedPartitionsMap(), expectedAssignment);
+    Assert.assertEquals(assignableNode.getAssignedReplicaCount(), 4);
+    Assert.assertEquals(assignableNode.getProjectedHighestUtilization(Collections.emptyMap()),
+        16.0 / 20.0, 0.005);
+    Assert.assertEquals(assignableNode.getMaxCapacity(), _capacityDataMap);
+    Assert.assertEquals(assignableNode.getMaxPartition(), 5);
+    Assert.assertEquals(assignableNode.getInstanceTags(), _testInstanceTags);
+    Assert.assertEquals(assignableNode.getFaultZone(), _testFaultZoneId);
+    Assert.assertEquals(assignableNode.getDisabledPartitionsMap(), _disabledPartitionsMap);
+    Assert.assertEquals(assignableNode.getRemainingCapacity(), expectedCapacityMap);
+    Assert.assertEquals(assignableNode.getAssignedReplicas(), assignmentSet);
+    Assert.assertEquals(assignableNode.getAssignedPartitionsByResource(_resourceNames.get(0)),
+        expectedAssignmentSet1);
+    Assert.assertEquals(assignableNode.getAssignedPartitionsByResource(_resourceNames.get(1)),
+        expectedAssignmentSet2);
+    Assert
+        .assertEquals(assignableNode.getAssignedTopStatePartitionsByResource(_resourceNames.get(0)),
+            expectedTopStateAssignmentSet1);
+    Assert
+        .assertEquals(assignableNode.getAssignedTopStatePartitionsByResource(_resourceNames.get(1)),
+            expectedTopStateAssignmentSet2);
+    Assert.assertEquals(assignableNode.getAssignedTopStatePartitionsCount(),
+        expectedTopStateAssignmentSet1.size() + expectedTopStateAssignmentSet2.size());
+
+    // Test 2 - release assignment from the AssignableNode
+    AssignableReplica removingReplica = new AssignableReplica(testCache.getClusterConfig(),
+        testCache.getResourceConfig(_resourceNames.get(1)), _partitionNames.get(2), "MASTER", 1);
+    expectedAssignment.get(_resourceNames.get(1)).remove(_partitionNames.get(2));
+    expectedCapacityMap.put("item1", 9);
+    expectedCapacityMap.put("item2", 18);
+    assignmentSet.removeIf(replica -> replica.equals(removingReplica));
+    expectedTopStateAssignmentSet2.remove(_partitionNames.get(2));
+
+    assignableNode.release(removingReplica);
+
+    Assert.assertEquals(assignableNode.getAssignedPartitionsMap(), expectedAssignment);
+    Assert.assertEquals(assignableNode.getAssignedReplicaCount(), 3);
+    Assert.assertEquals(assignableNode.getProjectedHighestUtilization(Collections.emptyMap()),
+        11.0 / 20.0, 0.005);
+    Assert.assertEquals(assignableNode.getMaxCapacity(), _capacityDataMap);
+    Assert.assertEquals(assignableNode.getMaxPartition(), 5);
+    Assert.assertEquals(assignableNode.getInstanceTags(), _testInstanceTags);
+    Assert.assertEquals(assignableNode.getFaultZone(), _testFaultZoneId);
+    Assert.assertEquals(assignableNode.getDisabledPartitionsMap(), _disabledPartitionsMap);
+    Assert.assertEquals(assignableNode.getRemainingCapacity(), expectedCapacityMap);
+    Assert.assertEquals(assignableNode.getAssignedReplicas(), assignmentSet);
+    Assert.assertEquals(assignableNode.getAssignedPartitionsByResource(_resourceNames.get(0)),
+        expectedAssignmentSet1);
+    Assert.assertEquals(assignableNode.getAssignedPartitionsByResource(_resourceNames.get(1)),
+        expectedAssignmentSet2);
+    Assert
+        .assertEquals(assignableNode.getAssignedTopStatePartitionsByResource(_resourceNames.get(0)),
+            expectedTopStateAssignmentSet1);
+    Assert
+        .assertEquals(assignableNode.getAssignedTopStatePartitionsByResource(_resourceNames.get(1)),
+            expectedTopStateAssignmentSet2);
+    Assert.assertEquals(assignableNode.getAssignedTopStatePartitionsCount(),
+        expectedTopStateAssignmentSet1.size() + expectedTopStateAssignmentSet2.size());
+
+    // Test 3 - add assignment to the AssignableNode
+    AssignableReplica addingReplica = new AssignableReplica(testCache.getClusterConfig(),
+        testCache.getResourceConfig(_resourceNames.get(1)), _partitionNames.get(2), "SLAVE", 2);
+    expectedAssignment.get(_resourceNames.get(1)).add(_partitionNames.get(2));
+    expectedCapacityMap.put("item1", 4);
+    expectedCapacityMap.put("item2", 8);
+    assignmentSet.add(addingReplica);
+
+    assignableNode.assign(addingReplica);
+
+    Assert.assertEquals(assignableNode.getAssignedPartitionsMap(), expectedAssignment);
+    Assert.assertEquals(assignableNode.getAssignedReplicaCount(), 4);
+    Assert.assertEquals(assignableNode.getProjectedHighestUtilization(Collections.emptyMap()),
+        16.0 / 20.0, 0.005);
+    Assert.assertEquals(assignableNode.getMaxCapacity(), _capacityDataMap);
+    Assert.assertEquals(assignableNode.getMaxPartition(), 5);
+    Assert.assertEquals(assignableNode.getInstanceTags(), _testInstanceTags);
+    Assert.assertEquals(assignableNode.getFaultZone(), _testFaultZoneId);
+    Assert.assertEquals(assignableNode.getDisabledPartitionsMap(), _disabledPartitionsMap);
+    Assert.assertEquals(assignableNode.getRemainingCapacity(), expectedCapacityMap);
+    Assert.assertEquals(assignableNode.getAssignedReplicas(), assignmentSet);
+    Assert.assertEquals(assignableNode.getAssignedPartitionsByResource(_resourceNames.get(0)),
+        expectedAssignmentSet1);
+    Assert.assertEquals(assignableNode.getAssignedPartitionsByResource(_resourceNames.get(1)),
+        expectedAssignmentSet2);
+    Assert
+        .assertEquals(assignableNode.getAssignedTopStatePartitionsByResource(_resourceNames.get(0)),
+            expectedTopStateAssignmentSet1);
+    Assert
+        .assertEquals(assignableNode.getAssignedTopStatePartitionsByResource(_resourceNames.get(1)),
+            expectedTopStateAssignmentSet2);
+    Assert.assertEquals(assignableNode.getAssignedTopStatePartitionsCount(),
+        expectedTopStateAssignmentSet1.size() + expectedTopStateAssignmentSet2.size());
+  }
+
+  @Test
+  public void testReleaseNoPartition() throws IOException {
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+
+    AssignableNode assignableNode = new AssignableNode(testCache.getClusterConfig(),
+        testCache.getInstanceConfigMap().get(_testInstanceId), _testInstanceId);
+    AssignableReplica removingReplica = new AssignableReplica(testCache.getClusterConfig(),
+        testCache.getResourceConfig(_resourceNames.get(1)), _partitionNames.get(2) + "non-exist",
+        "MASTER", 1);
+
+    // Releasing a replica that was never assigned shall pass without error.
+    assignableNode.release(removingReplica);
+  }
+
+  @Test(expectedExceptions = HelixException.class, expectedExceptionsMessageRegExp = "Resource Resource1 already has a replica with state SLAVE from partition Partition1 on node testInstanceId")
+  public void testAssignDuplicateReplica() throws IOException {
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+    Set<AssignableReplica> assignmentSet = generateReplicas(testCache);
+
+    AssignableNode assignableNode = new AssignableNode(testCache.getClusterConfig(),
+        testCache.getInstanceConfigMap().get(_testInstanceId), _testInstanceId);
+    assignableNode.assignInitBatch(assignmentSet);
+    AssignableReplica duplicateReplica = new AssignableReplica(testCache.getClusterConfig(),
+        testCache.getResourceConfig(_resourceNames.get(0)), _partitionNames.get(0), "SLAVE", 2);
+    assignableNode.assign(duplicateReplica);
+  }
+
+  @Test
+  public void testParseFaultZoneNotFound() throws IOException {
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+
+    ClusterConfig testClusterConfig = new ClusterConfig("testClusterConfigId");
+    testClusterConfig.setFaultZoneType("zone");
+    testClusterConfig.setTopologyAwareEnabled(true);
+    testClusterConfig.setTopology("/zone/");
+    when(testCache.getClusterConfig()).thenReturn(testClusterConfig);
+
+    InstanceConfig testInstanceConfig = new InstanceConfig("testInstanceConfigId");
+    testInstanceConfig.setDomain("instance=testInstance");
+    Map<String, InstanceConfig> instanceConfigMap = new HashMap<>();
+    instanceConfigMap.put(_testInstanceId, testInstanceConfig);
+    when(testCache.getInstanceConfigMap()).thenReturn(instanceConfigMap);
+
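+    // The instance domain does not contain the configured fault zone type "zone", so parsing
+    // shall fall back to the default zone.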
+    AssignableNode node = new AssignableNode(testCache.getClusterConfig(),
+        testCache.getInstanceConfigMap().get(_testInstanceId), _testInstanceId);
+    Assert.assertEquals(node.getFaultZone(), "Default_zone");
+  }
+
+  @Test
+  public void testParseFaultZone() throws IOException {
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+
+    ClusterConfig testClusterConfig = new ClusterConfig("testClusterConfigId");
+    testClusterConfig.setFaultZoneType("zone");
+    testClusterConfig.setTopologyAwareEnabled(true);
+    testClusterConfig.setTopology("/zone/instance");
+    when(testCache.getClusterConfig()).thenReturn(testClusterConfig);
+
+    InstanceConfig testInstanceConfig = new InstanceConfig("testInstanceConfigId");
+    testInstanceConfig.setDomain("zone=2, instance=testInstance");
+    Map<String, InstanceConfig> instanceConfigMap = new HashMap<>();
+    instanceConfigMap.put(_testInstanceId, testInstanceConfig);
+    when(testCache.getInstanceConfigMap()).thenReturn(instanceConfigMap);
+
+    AssignableNode assignableNode = new AssignableNode(testCache.getClusterConfig(),
+        testCache.getInstanceConfigMap().get(_testInstanceId), _testInstanceId);
+
+    Assert.assertEquals(assignableNode.getFaultZone(), "2");
+
+    testClusterConfig = new ClusterConfig("testClusterConfigId");
+    testClusterConfig.setFaultZoneType("instance");
+    testClusterConfig.setTopologyAwareEnabled(true);
+    testClusterConfig.setTopology("/zone/instance");
+    when(testCache.getClusterConfig()).thenReturn(testClusterConfig);
+
+    testInstanceConfig = new InstanceConfig("testInstanceConfigId");
+    testInstanceConfig.setDomain("zone=2, instance=testInstance");
+    instanceConfigMap = new HashMap<>();
+    instanceConfigMap.put(_testInstanceId, testInstanceConfig);
+    when(testCache.getInstanceConfigMap()).thenReturn(instanceConfigMap);
+
+    assignableNode = new AssignableNode(testCache.getClusterConfig(),
+        testCache.getInstanceConfigMap().get(_testInstanceId), _testInstanceId);
+
+    Assert.assertEquals(assignableNode.getFaultZone(), "2/testInstance");
+  }
+
+  @Test
+  public void testDefaultInstanceCapacity() {
+    ClusterConfig testClusterConfig = new ClusterConfig("testClusterConfigId");
+    testClusterConfig.setDefaultInstanceCapacityMap(_capacityDataMap);
+
+    InstanceConfig testInstanceConfig = new InstanceConfig("testInstanceConfigId");
+
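+    // The InstanceConfig defines no capacity, so the cluster-level default capacity map shall apply.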
+    AssignableNode assignableNode =
+        new AssignableNode(testClusterConfig, testInstanceConfig, _testInstanceId);
+    Assert.assertEquals(assignableNode.getMaxCapacity(), _capacityDataMap);
+  }
+
+  @Test(expectedExceptions = HelixException.class, expectedExceptionsMessageRegExp = "The required capacity keys: \\[item2, item1, item3, AdditionalCapacityKey\\] are not fully configured in the instance: testInstanceId, capacity map: \\{item2=40, item1=20, item3=30\\}.")
+  public void testIncompleteInstanceCapacity() {
+    ClusterConfig testClusterConfig = new ClusterConfig("testClusterConfigId");
+    List<String> requiredCapacityKeys = new ArrayList<>(_capacityDataMap.keySet());
+    requiredCapacityKeys.add("AdditionalCapacityKey");
+    testClusterConfig.setInstanceCapacityKeys(requiredCapacityKeys);
+
+    InstanceConfig testInstanceConfig = new InstanceConfig(_testInstanceId);
+    testInstanceConfig.setInstanceCapacityMap(_capacityDataMap);
+
+    new AssignableNode(testClusterConfig, testInstanceConfig, _testInstanceId);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestAssignableReplica.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestAssignableReplica.java
new file mode 100644
index 0000000..00d392b
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestAssignableReplica.java
@@ -0,0 +1,167 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.helix.HelixException;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.ResourceConfig;
+import org.apache.helix.model.StateModelDefinition;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+public class TestAssignableReplica {
+  String resourceName = "Resource";
+  String partitionNamePrefix = "partition";
+  String masterState = "Master";
+  int masterPriority = StateModelDefinition.TOP_STATE_PRIORITY;
+  String slaveState = "Slave";
+  int slavePriority = 2;
+
+  @Test
+  public void testConstructReplicaWithResourceConfig() throws IOException {
+    // Init assignable replica with a basic config object
+    Map<String, Integer> capacityDataMapResource1 = new HashMap<>();
+    capacityDataMapResource1.put("item1", 3);
+    capacityDataMapResource1.put("item2", 6);
+    ResourceConfig testResourceConfigResource = new ResourceConfig(resourceName);
+    testResourceConfigResource.setPartitionCapacityMap(
+        Collections.singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY, capacityDataMapResource1));
+    ClusterConfig testClusterConfig = new ClusterConfig("testCluster");
+    testClusterConfig.setInstanceCapacityKeys(new ArrayList<>(capacityDataMapResource1.keySet()));
+
+    String partitionName = partitionNamePrefix + 1;
+    AssignableReplica replica =
+        new AssignableReplica(testClusterConfig, testResourceConfigResource, partitionName,
+            masterState, masterPriority);
+    Assert.assertEquals(replica.getResourceName(), resourceName);
+    Assert.assertEquals(replica.getPartitionName(), partitionName);
+    Assert.assertEquals(replica.getReplicaState(), masterState);
+    Assert.assertEquals(replica.getStatePriority(), masterPriority);
+    Assert.assertTrue(replica.isReplicaTopState());
+    Assert.assertEquals(replica.getCapacity(), capacityDataMapResource1);
+    Assert.assertNull(replica.getResourceInstanceGroupTag());
+    Assert.assertEquals(replica.getResourceMaxPartitionsPerInstance(), Integer.MAX_VALUE);
+
+    // Modify the config and initialize more replicas.
+    // 1. update capacity
+    Map<String, Integer> capacityDataMapResource2 = new HashMap<>();
+    capacityDataMapResource2.put("item1", 5);
+    capacityDataMapResource2.put("item2", 10);
+    Map<String, Map<String, Integer>> capacityMap =
+        testResourceConfigResource.getPartitionCapacityMap();
+    String partitionName2 = partitionNamePrefix + 2;
+    capacityMap.put(partitionName2, capacityDataMapResource2);
+    testResourceConfigResource.setPartitionCapacityMap(capacityMap);
+    // 2. update instance group tag and max partitions per instance
+    String group = "DEFAULT";
+    int maxPartition = 10;
+    testResourceConfigResource.getRecord()
+        .setSimpleField(ResourceConfig.ResourceConfigProperty.INSTANCE_GROUP_TAG.toString(), group);
+    testResourceConfigResource.getRecord()
+        .setIntField(ResourceConfig.ResourceConfigProperty.MAX_PARTITIONS_PER_INSTANCE.name(),
+            maxPartition);
+
+    replica = new AssignableReplica(testClusterConfig, testResourceConfigResource, partitionName,
+        masterState, masterPriority);
+    Assert.assertEquals(replica.getCapacity(), capacityDataMapResource1);
+    Assert.assertEquals(replica.getResourceInstanceGroupTag(), group);
+    Assert.assertEquals(replica.getResourceMaxPartitionsPerInstance(), maxPartition);
+
+    replica = new AssignableReplica(testClusterConfig, testResourceConfigResource, partitionName2,
+        slaveState, slavePriority);
+    Assert.assertEquals(replica.getResourceName(), resourceName);
+    Assert.assertEquals(replica.getPartitionName(), partitionName2);
+    Assert.assertEquals(replica.getReplicaState(), slaveState);
+    Assert.assertEquals(replica.getStatePriority(), slavePriority);
+    Assert.assertFalse(replica.isReplicaTopState());
+    Assert.assertEquals(replica.getCapacity(), capacityDataMapResource2);
+    Assert.assertEquals(replica.getResourceInstanceGroupTag(), group);
+    Assert.assertEquals(replica.getResourceMaxPartitionsPerInstance(), maxPartition);
+  }
+
+  /**
+   *  Tests that when the default partition weight map is configured in ClusterConfig but NOT in
+   *  ResourceConfig, AssignableReplica falls back to the default weight from ClusterConfig.
+   */
+  @Test
+  public void testDefaultPartitionWeight() {
+    Map<String, Integer> defaultWeightDataMapResource = new HashMap<>();
+    defaultWeightDataMapResource.put("item1", 3);
+    defaultWeightDataMapResource.put("item2", 6);
+    ClusterConfig testClusterConfig = new ClusterConfig("testClusterConfigId");
+    testClusterConfig
+        .setInstanceCapacityKeys(new ArrayList<>(defaultWeightDataMapResource.keySet()));
+    testClusterConfig.setDefaultPartitionWeightMap(defaultWeightDataMapResource);
+
+    ResourceConfig testResourceConfigResource = new ResourceConfig(resourceName);
+    AssignableReplica replica = new AssignableReplica(testClusterConfig, testResourceConfigResource,
+        partitionNamePrefix + 1, masterState, masterPriority);
+
+    Assert.assertEquals(replica.getCapacity().size(), defaultWeightDataMapResource.size());
+    Assert.assertEquals(replica.getCapacity(), defaultWeightDataMapResource);
+  }
+
+  @Test
+  public void testIncompletePartitionWeightConfig() throws IOException {
+    // Init assignable replica with a basic config object
+    Map<String, Integer> capacityDataMapResource = new HashMap<>();
+    capacityDataMapResource.put("item1", 3);
+    capacityDataMapResource.put("item2", 6);
+    ResourceConfig testResourceConfigResource = new ResourceConfig(resourceName);
+    testResourceConfigResource.setPartitionCapacityMap(
+        Collections.singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY, capacityDataMapResource));
+    ClusterConfig testClusterConfig = new ClusterConfig("testCluster");
+    List<String> requiredCapacityKeys = new ArrayList<>(capacityDataMapResource.keySet());
+    // Remove one required key so that it becomes an unnecessary item.
+    String unnecessaryCapacityKey = requiredCapacityKeys.remove(0);
+    // Add one new required key, so it does not exist in the resource config.
+    String newCapacityKey = "newCapacityKey";
+    requiredCapacityKeys.add(newCapacityKey);
+    testClusterConfig.setInstanceCapacityKeys(requiredCapacityKeys);
+
+    try {
+      new AssignableReplica(testClusterConfig, testResourceConfigResource,
+          partitionNamePrefix + 1, masterState, masterPriority);
+      Assert.fail("Creating new replica should fail because of incomplete partition weight.");
+    } catch (HelixException ex) {
+      // expected
+    }
+
+    Map<String, Integer> defaultCapacityDataMap = new HashMap<>();
+    for (String key : requiredCapacityKeys) {
+      defaultCapacityDataMap.put(key, 0);
+    }
+    testClusterConfig.setDefaultPartitionWeightMap(defaultCapacityDataMap);
+
+    AssignableReplica replica = new AssignableReplica(testClusterConfig, testResourceConfigResource,
+        partitionNamePrefix + 1, masterState, masterPriority);
+    Assert.assertTrue(replica.getCapacity().keySet().containsAll(requiredCapacityKeys));
+    Assert.assertEquals(replica.getCapacity().get(newCapacityKey).intValue(), 0);
+    Assert.assertFalse(replica.getCapacity().containsKey(unnecessaryCapacityKey));
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestClusterContext.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestClusterContext.java
new file mode 100644
index 0000000..732ae8f
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestClusterContext.java
@@ -0,0 +1,93 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import org.apache.helix.HelixException;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.testng.Assert;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+public class TestClusterContext extends AbstractTestClusterModel {
+  @BeforeClass
+  public void initialize() {
+    super.initialize();
+  }
+
+  @Test
+  public void testNormalUsage() throws IOException {
+    // Test 1 - initialize the cluster context based on the data cache.
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+    Set<AssignableReplica> assignmentSet = generateReplicas(testCache);
+
+    ClusterContext context =
+        new ClusterContext(assignmentSet, generateNodes(testCache), new HashMap<>(),
+            new HashMap<>());
+
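+    // The mock cache contains 4 replicas (2 of them top-state) and a single assignable node,
+    // hence the estimates below.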
+    Assert.assertEquals(context.getEstimatedMaxPartitionCount(), 4);
+    Assert.assertEquals(context.getEstimatedMaxTopStateCount(), 2);
+    Assert.assertEquals(context.getAssignmentForFaultZoneMap(), Collections.emptyMap());
+    for (String resourceName : _resourceNames) {
+      Assert.assertEquals(context.getEstimatedMaxPartitionByResource(resourceName), 2);
+      Assert.assertEquals(
+          context.getPartitionsForResourceAndFaultZone(_testFaultZoneId, resourceName),
+          Collections.emptySet());
+    }
+
+    // Assign
+    Map<String, Map<String, Set<String>>> expectedFaultZoneMap = Collections
+        .singletonMap(_testFaultZoneId, assignmentSet.stream().collect(Collectors
+            .groupingBy(AssignableReplica::getResourceName,
+                Collectors.mapping(AssignableReplica::getPartitionName, Collectors.toSet()))));
+
+    assignmentSet.stream().forEach(replica -> context
+        .addPartitionToFaultZone(_testFaultZoneId, replica.getResourceName(),
+            replica.getPartitionName()));
+    Assert.assertEquals(context.getAssignmentForFaultZoneMap(), expectedFaultZoneMap);
+
+    // Release
+    expectedFaultZoneMap.get(_testFaultZoneId).get(_resourceNames.get(0))
+        .remove(_partitionNames.get(0));
+    Assert.assertTrue(context.removePartitionFromFaultZone(_testFaultZoneId, _resourceNames.get(0),
+        _partitionNames.get(0)));
+
+    Assert.assertEquals(context.getAssignmentForFaultZoneMap(), expectedFaultZoneMap);
+  }
+
+  @Test(expectedExceptions = HelixException.class, expectedExceptionsMessageRegExp = "Resource Resource1 already has a replica from partition Partition1 in fault zone testZone")
+  public void testDuplicateAssign() throws IOException {
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+    Set<AssignableReplica> assignmentSet = generateReplicas(testCache);
+    ClusterContext context =
+        new ClusterContext(assignmentSet, generateNodes(testCache), new HashMap<>(),
+            new HashMap<>());
+    context.addPartitionToFaultZone(_testFaultZoneId, _resourceNames.get(0), _partitionNames.get(0));
+    // Insert again and trigger the error.
+    context.addPartitionToFaultZone(_testFaultZoneId, _resourceNames.get(0), _partitionNames.get(0));
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestClusterModel.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestClusterModel.java
new file mode 100644
index 0000000..60967ca
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestClusterModel.java
@@ -0,0 +1,101 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.Set;
+
+import org.apache.helix.HelixException;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.testng.Assert;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+public class TestClusterModel extends AbstractTestClusterModel {
+  @BeforeClass
+  public void initialize() {
+    super.initialize();
+  }
+
+  @Test
+  public void testNormalUsage() throws IOException {
+    // Test 1 - initialize the cluster model based on the data cache.
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+    Set<AssignableReplica> assignableReplicas = generateReplicas(testCache);
+    Set<AssignableNode> assignableNodes = generateNodes(testCache);
+
+    ClusterContext context =
+        new ClusterContext(assignableReplicas, assignableNodes, Collections.emptyMap(),
+            Collections.emptyMap());
+    ClusterModel clusterModel = new ClusterModel(context, assignableReplicas, assignableNodes);
+
+    Assert.assertTrue(clusterModel.getContext().getAssignmentForFaultZoneMap().values().stream()
+        .allMatch(resourceMap -> resourceMap.values().isEmpty()));
+    Assert.assertFalse(clusterModel.getAssignableNodes().values().stream()
+        .anyMatch(node -> node.getAssignedReplicaCount() != 0));
+
+    // The initialization of the context, node, and replica has been tested separately, so the
+    // cluster model tests focus on assignment and release.
+
+    // Assign
+    AssignableReplica replica = assignableReplicas.iterator().next();
+    AssignableNode assignableNode = assignableNodes.iterator().next();
+    clusterModel
+        .assign(replica.getResourceName(), replica.getPartitionName(), replica.getReplicaState(),
+            assignableNode.getInstanceName());
+
+    Assert.assertTrue(
+        clusterModel.getContext().getAssignmentForFaultZoneMap().get(assignableNode.getFaultZone())
+            .get(replica.getResourceName()).contains(replica.getPartitionName()));
+    Assert.assertTrue(assignableNode.getAssignedPartitionsMap().get(replica.getResourceName())
+        .contains(replica.getPartitionName()));
+
+    // Assign a replica of a nonexistent resource
+    try {
+      clusterModel.assign("NOT-EXIST", replica.getPartitionName(), replica.getReplicaState(),
+          assignableNode.getInstanceName());
+      Assert.fail("Assigning a non existing resource partition shall fail.");
+    } catch (HelixException ex) {
+      // expected
+    }
+
+    // Assign a replica to a nonexistent instance
+    try {
+      clusterModel
+          .assign(replica.getResourceName(), replica.getPartitionName(), replica.getReplicaState(),
+              "NON-EXIST");
+      Assert.fail("Assigning a resource partition to a non existing instance shall fail.");
+    } catch (HelixException ex) {
+      // expected
+    }
+
+    // Release
+    clusterModel
+        .release(replica.getResourceName(), replica.getPartitionName(), replica.getReplicaState(),
+            assignableNode.getInstanceName());
+
+    Assert.assertTrue(clusterModel.getContext().getAssignmentForFaultZoneMap().values().stream()
+        .allMatch(resourceMap -> resourceMap.values().stream()
+            .allMatch(partitions -> partitions.isEmpty())));
+    Assert.assertFalse(clusterModel.getAssignableNodes().values().stream()
+        .anyMatch(node -> node.getAssignedReplicaCount() != 0));
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestClusterModelProvider.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestClusterModelProvider.java
new file mode 100644
index 0000000..7a3ff22
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestClusterModelProvider.java
@@ -0,0 +1,376 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import org.apache.helix.HelixConstants;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.controller.rebalancer.waged.WagedRebalancer;
+import org.apache.helix.model.CurrentState;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.LiveInstance;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.Resource;
+import org.apache.helix.model.ResourceAssignment;
+import org.mockito.stubbing.Answer;
+import org.testng.Assert;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+import static org.mockito.Matchers.anyString;
+import static org.mockito.Mockito.when;
+
+public class TestClusterModelProvider extends AbstractTestClusterModel {
+  Set<String> _instances;
+
+  @BeforeClass
+  public void initialize() {
+    super.initialize();
+    _instances = new HashSet<>();
+    _instances.add(_testInstanceId);
+  }
+
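+  /**
+   * Extend the base mock data cache with FULL_AUTO IdealStates for the test resources and two
+   * additional live instances, so the provider sees a multi-instance cluster.
+   */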
+  @Override
+  protected ResourceControllerDataProvider setupClusterDataCache() throws IOException {
+    ResourceControllerDataProvider testCache = super.setupClusterDataCache();
+
+    // Set up mock IdealStates for the test resources
+    Map<String, IdealState> isMap = new HashMap<>();
+    for (String resource : _resourceNames) {
+      IdealState is = new IdealState(resource);
+      is.setNumPartitions(_partitionNames.size());
+      is.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);
+      is.setStateModelDefRef("MasterSlave");
+      is.setReplicas("3");
+      is.setRebalancerClassName(WagedRebalancer.class.getName());
+      _partitionNames.stream()
+          .forEach(partition -> is.setPreferenceList(partition, Collections.emptyList()));
+      isMap.put(resource, is);
+    }
+    when(testCache.getIdealState(anyString())).thenAnswer(
+        (Answer<IdealState>) invocationOnMock -> isMap.get(invocationOnMock.getArguments()[0]));
+
+    // Set up 2 more instances
+    for (int i = 1; i < 3; i++) {
+      String instanceName = _testInstanceId + i;
+      _instances.add(instanceName);
+      // 1. Set up the default instance information with capacity configuration.
+      InstanceConfig testInstanceConfig = createMockInstanceConfig(instanceName);
+      Map<String, InstanceConfig> instanceConfigMap = testCache.getInstanceConfigMap();
+      instanceConfigMap.put(instanceName, testInstanceConfig);
+      when(testCache.getInstanceConfigMap()).thenReturn(instanceConfigMap);
+      // 2. Mock the live instance node for the default instance.
+      LiveInstance testLiveInstance = createMockLiveInstance(instanceName);
+      Map<String, LiveInstance> liveInstanceMap = testCache.getLiveInstances();
+      liveInstanceMap.put(instanceName, testLiveInstance);
+      when(testCache.getLiveInstances()).thenReturn(liveInstanceMap);
+    }
+
+    return testCache;
+  }
+
+  @Test
+  public void testGenerateClusterModel() throws IOException {
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+    // 1. test generating a cluster model with empty assignment
+    ClusterModel clusterModel = ClusterModelProvider.generateClusterModelForBaseline(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        _instances, Collections.emptyMap(), Collections.emptyMap());
+    // There should be no existing assignment.
+    Assert.assertFalse(clusterModel.getContext().getAssignmentForFaultZoneMap().values().stream()
+        .anyMatch(resourceMap -> !resourceMap.isEmpty()));
+    Assert.assertFalse(clusterModel.getAssignableNodes().values().stream()
+        .anyMatch(node -> node.getAssignedReplicaCount() != 0));
+    // Have all 3 instances
+    Assert.assertEquals(
+        clusterModel.getAssignableNodes().values().stream().map(AssignableNode::getInstanceName)
+            .collect(Collectors.toSet()), _instances);
+    // Shall have 2 resources and 4 replicas, since all nodes are in the same fault zone.
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().size(), 2);
+    Assert.assertTrue(clusterModel.getAssignableReplicaMap().values().stream()
+        .allMatch(replicaSet -> replicaSet.size() == 4));
+
+    // Adjust instance fault zone, so they have different fault zones.
+    testCache.getInstanceConfigMap().values()
+        .forEach(config -> config.setZoneId(config.getInstanceName()));
+    clusterModel = ClusterModelProvider.generateClusterModelForBaseline(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        _instances, Collections.emptyMap(), Collections.emptyMap());
+    // Shall have 2 resources and 12 replicas after fault zone adjusted.
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().size(), 2);
+    Assert.assertTrue(clusterModel.getAssignableReplicaMap().values().stream()
+        .allMatch(replicaSet -> replicaSet.size() == 12));
+
+    // 2. test with only one active node
+    clusterModel = ClusterModelProvider.generateClusterModelForBaseline(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        Collections.singleton(_testInstanceId), Collections.emptyMap(), Collections.emptyMap());
+    // Have only one instance
+    Assert.assertEquals(
+        clusterModel.getAssignableNodes().values().stream().map(AssignableNode::getInstanceName)
+            .collect(Collectors.toSet()), Collections.singleton(_testInstanceId));
+    // Shall have 4 assignable replicas because there is only one valid node.
+    Assert.assertTrue(clusterModel.getAssignableReplicaMap().values().stream()
+        .allMatch(replicaSet -> replicaSet.size() == 4));
+
+    // 3. test with no active instance
+    clusterModel = ClusterModelProvider.generateClusterModelForBaseline(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        Collections.emptySet(), Collections.emptyMap(), Collections.emptyMap());
+    // Have no assignable instances
+    Assert.assertEquals(clusterModel.getAssignableNodes().size(), 0);
+    // Shall have 0 assignable replicas because there are 0 valid nodes.
+    Assert.assertTrue(clusterModel.getAssignableReplicaMap().values().stream()
+        .allMatch(replicaSet -> replicaSet.isEmpty()));
+
+    // 4. test with baseline assignment
+    // Mock a baseline assignment based on the current states.
+    Map<String, ResourceAssignment> baselineAssignment = new HashMap<>();
+    for (String resource : _resourceNames) {
+      // <partition, <instance, state>>
+      Map<String, Map<String, String>> assignmentMap = new HashMap<>();
+      CurrentState cs = testCache.getCurrentState(_testInstanceId, _sessionId).get(resource);
+      if (cs != null) {
+        for (Map.Entry<String, String> stateEntry : cs.getPartitionStateMap().entrySet()) {
+          assignmentMap.computeIfAbsent(stateEntry.getKey(), k -> new HashMap<>())
+              .put(_testInstanceId, stateEntry.getValue());
+        }
+        ResourceAssignment assignment = new ResourceAssignment(resource);
+        assignmentMap.keySet().forEach(partition -> assignment
+            .addReplicaMap(new Partition(partition), assignmentMap.get(partition)));
+        baselineAssignment.put(resource, assignment);
+      }
+    }
+
+    // Generate a cluster model based on the mocked baseline assignment
+    clusterModel = ClusterModelProvider.generateClusterModelForBaseline(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        _instances, Collections.emptyMap(), baselineAssignment);
+    // There should be 4 existing assignments in total (each resource has 2) in the specified instance
+    Assert.assertTrue(clusterModel.getContext().getAssignmentForFaultZoneMap().values().stream()
+        .allMatch(resourceMap -> resourceMap.values().stream()
+            .allMatch(partitionSet -> partitionSet.size() == 2)));
+    Assert.assertEquals(
+        clusterModel.getAssignableNodes().get(_testInstanceId).getAssignedReplicaCount(), 4);
+    // Since each resource has 2 replicas assigned, the assignable replica count should be 10.
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().size(), 2);
+    Assert.assertTrue(clusterModel.getAssignableReplicaMap().values().stream()
+        .allMatch(replicaSet -> replicaSet.size() == 10));
+
+    // 5. test with best possible assignment but cluster topology is changed
+    clusterModel = ClusterModelProvider.generateClusterModelForBaseline(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        _instances,
+        Collections.singletonMap(HelixConstants.ChangeType.CLUSTER_CONFIG, Collections.emptySet()),
+        baselineAssignment);
+    // There should be no existing assignment since the topology change invalidates all existing assignments
+    Assert.assertTrue(clusterModel.getContext().getAssignmentForFaultZoneMap().values().stream()
+        .allMatch(resourceMap -> resourceMap.isEmpty()));
+    Assert.assertFalse(clusterModel.getAssignableNodes().values().stream()
+        .anyMatch(node -> node.getAssignedReplicaCount() != 0));
+    // Shall have 2 resources and 12 replicas
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().size(), 2);
+    Assert.assertTrue(clusterModel.getAssignableReplicaMap().values().stream()
+        .allMatch(replicaSet -> replicaSet.size() == 12));
+
+    // 6. test with best possible assignment and one resource config change
+    // Generate a cluster model based on the same best possible assignment, but resource1 config is changed
+    String changedResourceName = _resourceNames.get(0);
+    clusterModel = ClusterModelProvider.generateClusterModelForBaseline(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        _instances, Collections.singletonMap(HelixConstants.ChangeType.RESOURCE_CONFIG,
+            Collections.singleton(changedResourceName)), baselineAssignment);
+    // There should be no existing assignment for any resource except resource2
+    Assert.assertEquals(clusterModel.getContext().getAssignmentForFaultZoneMap().size(), 1);
+    Map<String, Set<String>> resourceAssignmentMap =
+        clusterModel.getContext().getAssignmentForFaultZoneMap().get(_testInstanceId);
+    // Only resource2 should remain in the map
+    Assert.assertEquals(resourceAssignmentMap.size(), 1);
+    for (String resource : _resourceNames) {
+      Assert
+          .assertEquals(resourceAssignmentMap.getOrDefault(resource, Collections.emptySet()).size(),
+              resource.equals(changedResourceName) ? 0 : 2);
+    }
+    // Only the first instance will have 2 assignments from resource2.
+    for (String instance : _instances) {
+      Assert.assertEquals(clusterModel.getAssignableNodes().get(instance).getAssignedReplicaCount(),
+          instance.equals(_testInstanceId) ? 2 : 0);
+    }
+    // Shall have 2 resources and 12 replicas
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().keySet().size(), 2);
+    for (String resource : _resourceNames) {
+      Assert.assertEquals(clusterModel.getAssignableReplicaMap().get(resource).size(),
+          resource.equals(changedResourceName) ? 12 : 10);
+    }
+
+    // 7. test with best possible assignment but the instance becomes inactive
+    // Generate a cluster model based on the best possible assignment, but the assigned node is
+    // excluded from the active instance set
+    Set<String> limitedActiveInstances = new HashSet<>(_instances);
+    limitedActiveInstances.remove(_testInstanceId);
+    clusterModel = ClusterModelProvider.generateClusterModelForBaseline(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        limitedActiveInstances, Collections.emptyMap(), baselineAssignment);
+    // There should be no existing assignment.
+    Assert.assertFalse(clusterModel.getContext().getAssignmentForFaultZoneMap().values().stream()
+        .anyMatch(resourceMap -> !resourceMap.isEmpty()));
+    Assert.assertFalse(clusterModel.getAssignableNodes().values().stream()
+        .anyMatch(node -> node.getAssignedReplicaCount() != 0));
+    // Have only 2 instances
+    Assert.assertEquals(
+        clusterModel.getAssignableNodes().values().stream().map(AssignableNode::getInstanceName)
+            .collect(Collectors.toSet()), limitedActiveInstances);
+    // Since only 2 instances are active, we shall have 8 assignable replicas in each resource.
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().size(), 2);
+    Assert.assertTrue(clusterModel.getAssignableReplicaMap().values().stream()
+        .allMatch(replicaSet -> replicaSet.size() == 8));
+  }
+
+  @Test(dependsOnMethods = "testGenerateClusterModel")
+  public void testGenerateClusterModelForPartialRebalance() throws IOException {
+    ResourceControllerDataProvider testCache = setupClusterDataCache();
+    // 1. test generating a cluster model with empty assignment
+    ClusterModel clusterModel = ClusterModelProvider
+        .generateClusterModelForPartialRebalance(testCache, _resourceNames.stream()
+                .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+            _instances, Collections.emptyMap(), Collections.emptyMap());
+    // There should be no existing assignment.
+    Assert.assertFalse(clusterModel.getContext().getAssignmentForFaultZoneMap().values().stream()
+        .anyMatch(resourceMap -> !resourceMap.isEmpty()));
+    Assert.assertFalse(clusterModel.getAssignableNodes().values().stream()
+        .anyMatch(node -> node.getAssignedReplicaCount() != 0));
+    // Have all 3 instances
+    Assert.assertEquals(
+        clusterModel.getAssignableNodes().values().stream().map(AssignableNode::getInstanceName)
+            .collect(Collectors.toSet()), _instances);
+    // Shall have 0 resources and 0 replicas since the baseline is empty. The partial rebalance
+    // should not rebalance any replica.
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().size(), 0);
+
+    // Adjust instance fault zone, so they have different fault zones.
+    testCache.getInstanceConfigMap().values()
+        .forEach(config -> config.setZoneId(config.getInstanceName()));
+
+    // 2. test with a pair of identical best possible assignment and baseline assignment
+    // Mock a best possible assignment based on the current states.
+    Map<String, ResourceAssignment> bestPossibleAssignment = new HashMap<>();
+    for (String resource : _resourceNames) {
+      // <partition, <instance, state>>
+      Map<String, Map<String, String>> assignmentMap = new HashMap<>();
+      CurrentState cs = testCache.getCurrentState(_testInstanceId, _sessionId).get(resource);
+      if (cs != null) {
+        for (Map.Entry<String, String> stateEntry : cs.getPartitionStateMap().entrySet()) {
+          assignmentMap.computeIfAbsent(stateEntry.getKey(), k -> new HashMap<>())
+              .put(_testInstanceId, stateEntry.getValue());
+        }
+        ResourceAssignment assignment = new ResourceAssignment(resource);
+        assignmentMap.keySet().forEach(partition -> assignment
+            .addReplicaMap(new Partition(partition), assignmentMap.get(partition)));
+        bestPossibleAssignment.put(resource, assignment);
+      }
+    }
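+    // Use an identical copy of the best possible assignment as the baseline for this step.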
+    Map<String, ResourceAssignment> baseline = new HashMap<>(bestPossibleAssignment);
+    // Generate a cluster model for partial rebalance
+    clusterModel = ClusterModelProvider.generateClusterModelForPartialRebalance(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        _instances, baseline, bestPossibleAssignment);
+    // There should be 4 existing assignments in total (each resource has 2) in the specified instance
+    Assert.assertTrue(clusterModel.getContext().getAssignmentForFaultZoneMap().values().stream()
+        .allMatch(resourceMap -> resourceMap.values().stream()
+            .allMatch(partitionSet -> partitionSet.size() == 2)));
+    Assert.assertEquals(
+        clusterModel.getAssignableNodes().get(_testInstanceId).getAssignedReplicaCount(), 4);
+    // Since the best possible matches the baseline, no replica needs to be reassigned.
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().size(), 0);
+
+    // 3. test with inactive instance in the baseline and the best possible assignment
+    Set<String> partialInstanceList = new HashSet<>(_instances);
+    partialInstanceList.remove(_testInstanceId);
+    clusterModel = ClusterModelProvider.generateClusterModelForPartialRebalance(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        partialInstanceList, baseline, bestPossibleAssignment);
+    // Have the other 2 active instances
+    Assert.assertEquals(clusterModel.getAssignableNodes().size(), 2);
+    // All the replicas in the existing assignment should be rebalanced.
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().size(), 2);
+    Assert.assertTrue(clusterModel.getAssignableReplicaMap().values().stream()
+        .allMatch(replicaSet -> replicaSet.size() == 2));
+    // Shall have 0 assigned replicas
+    Assert.assertTrue(clusterModel.getAssignableNodes().values().stream()
+        .allMatch(assignableNode -> assignableNode.getAssignedReplicaCount() == 0));
+
+    // 4. test with one resource that is only in the baseline
+    String resourceInBaselineOnly = _resourceNames.get(0);
+    Map<String, ResourceAssignment> partialBestPossibleAssignment =
+        new HashMap<>(bestPossibleAssignment);
+    partialBestPossibleAssignment.remove(resourceInBaselineOnly);
+    // Generate a cluster model with the adjusted best possible assignment
+    clusterModel = ClusterModelProvider.generateClusterModelForPartialRebalance(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        _instances, baseline, partialBestPossibleAssignment);
+    // There should be 2 existing assignments in total in the specified instance
+    Assert.assertTrue(clusterModel.getContext().getAssignmentForFaultZoneMap().values().stream()
+        .allMatch(resourceMap -> resourceMap.values().stream()
+            .allMatch(partitionSet -> partitionSet.size() == 2)));
+    // Only the replicas of one resource require rebalance
+    Assert.assertEquals(
+        clusterModel.getAssignableNodes().get(_testInstanceId).getAssignedReplicaCount(), 2);
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().size(), 1);
+    Assert.assertTrue(clusterModel.getAssignableReplicaMap().containsKey(resourceInBaselineOnly));
+    Assert.assertTrue(clusterModel.getAssignableReplicaMap().values().stream()
+        .allMatch(replicaSet -> replicaSet.size() == 2));
+
+    // 5. test with one resource only in the best possible assignment
+    String resourceInBestPossibleOnly = _resourceNames.get(1);
+    Map<String, ResourceAssignment> partialBaseline = new HashMap<>(baseline);
+    partialBaseline.remove(resourceInBestPossibleOnly);
+    // Generate a cluster model with the adjusted baseline
+    clusterModel = ClusterModelProvider.generateClusterModelForPartialRebalance(testCache,
+        _resourceNames.stream()
+            .collect(Collectors.toMap(resource -> resource, resource -> new Resource(resource))),
+        _instances, partialBaseline, bestPossibleAssignment);
+    // There should be 2 existing assignments in total in the specified instance.
+    Assert.assertTrue(clusterModel.getContext().getAssignmentForFaultZoneMap().values().stream()
+        .allMatch(resourceMap -> resourceMap.values().stream()
+            .allMatch(partitionSet -> partitionSet.size() == 2)));
+    Assert.assertEquals(
+        clusterModel.getAssignableNodes().get(_testInstanceId).getAssignedReplicaCount(), 2);
+    // No need to rebalance the replicas that are not in the baseline yet.
+    Assert.assertEquals(clusterModel.getAssignableReplicaMap().size(), 0);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestOptimalAssignment.java b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestOptimalAssignment.java
new file mode 100644
index 0000000..bd820a9
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/TestOptimalAssignment.java
@@ -0,0 +1,91 @@
+package org.apache.helix.controller.rebalancer.waged.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Map;
+
+import org.apache.helix.HelixException;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceAssignment;
+import org.testng.Assert;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+public class TestOptimalAssignment extends ClusterModelTestHelper {
+
+  @BeforeClass
+  public void initialize() {
+    super.initialize();
+  }
+
+  @Test
+  public void testUpdateAssignment() throws IOException {
+    OptimalAssignment assignment = new OptimalAssignment();
+
+    // update with empty cluster model
+    assignment.updateAssignments(getDefaultClusterModel());
+    Map<String, ResourceAssignment> optimalAssignmentMap =
+        assignment.getOptimalResourceAssignment();
+    Assert.assertEquals(optimalAssignmentMap, Collections.emptyMap());
+
+    // update with valid assignment
+    ClusterModel model = getDefaultClusterModel();
+    model.assign(_resourceNames.get(0), _partitionNames.get(1), "SLAVE", _testInstanceId);
+    model.assign(_resourceNames.get(0), _partitionNames.get(0), "MASTER", _testInstanceId);
+    assignment.updateAssignments(model);
+    optimalAssignmentMap = assignment.getOptimalResourceAssignment();
+    Assert.assertEquals(optimalAssignmentMap.get(_resourceNames.get(0)).getMappedPartitions(),
+        Arrays
+            .asList(new Partition(_partitionNames.get(0)), new Partition(_partitionNames.get(1))));
+    Assert.assertEquals(optimalAssignmentMap.get(_resourceNames.get(0))
+            .getReplicaMap(new Partition(_partitionNames.get(1))),
+        Collections.singletonMap(_testInstanceId, "SLAVE"));
+    Assert.assertEquals(optimalAssignmentMap.get(_resourceNames.get(0))
+            .getReplicaMap(new Partition(_partitionNames.get(0))),
+        Collections.singletonMap(_testInstanceId, "MASTER"));
+  }
+
+  @Test(dependsOnMethods = "testUpdateAssignment")
+  public void testAssignmentFailure() throws IOException {
+    OptimalAssignment assignment = new OptimalAssignment();
+    ClusterModel model = getDefaultClusterModel();
+
+    // Record an assignment failure for one replica on the test node.
+    AssignableReplica targetFailureReplica =
+        model.getAssignableReplicaMap().get(_resourceNames.get(0)).iterator().next();
+    AssignableNode targetFailureNode = model.getAssignableNodes().get(_testInstanceId);
+    assignment.recordAssignmentFailure(targetFailureReplica, Collections
+        .singletonMap(targetFailureNode, Collections.singletonList("Assignment Failure!")));
+
+    Assert.assertTrue(assignment.hasAnyFailure());
+
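+    // With a failure recorded, reading the optimal assignment is expected to throw.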
+    assignment.updateAssignments(getDefaultClusterModel());
+    try {
+      assignment.getOptimalResourceAssignment();
+      Assert.fail("Get optimal assignment shall fail because of the failure record.");
+    } catch (HelixException ex) {
+      Assert.assertTrue(ex.getMessage().startsWith(
+          "Cannot get the optimal resource assignment since a calculation failure is recorded."));
+    }
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/CrushRebalancers/TestCrushAutoRebalanceNonRack.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/CrushRebalancers/TestCrushAutoRebalanceNonRack.java
index a1baf03..5b7d083 100644
--- a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/CrushRebalancers/TestCrushAutoRebalanceNonRack.java
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/CrushRebalancers/TestCrushAutoRebalanceNonRack.java
@@ -150,7 +150,7 @@
 
     HelixClusterVerifier _clusterVerifier =
         new StrictMatchExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR)
-            .setResources(_allDBs).build();
+            .setDeactivatedNodeAwareness(true).setResources(_allDBs).build();
     Assert.assertTrue(_clusterVerifier.verify(5000));
     for (String db : _allDBs) {
       IdealState is =
@@ -182,7 +182,7 @@
 
     HelixClusterVerifier _clusterVerifier =
         new StrictMatchExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR)
-            .setResources(_allDBs).build();
+            .setDeactivatedNodeAwareness(true).setResources(_allDBs).build();
     Assert.assertTrue(_clusterVerifier.verify(5000));
     for (String db : _allDBs) {
       IdealState is =
@@ -218,7 +218,7 @@
 
     HelixClusterVerifier _clusterVerifier =
         new StrictMatchExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR)
-            .setResources(_allDBs).build();
+            .setDeactivatedNodeAwareness(true).setResources(_allDBs).build();
     Assert.assertTrue(_clusterVerifier.verify(5000));
     for (String db : _allDBs) {
       IdealState is =
@@ -262,7 +262,7 @@
     Thread.sleep(300);
     ZkHelixClusterVerifier _clusterVerifier =
         new StrictMatchExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR)
-            .setResources(_allDBs).build();
+            .setDeactivatedNodeAwareness(true).setResources(_allDBs).build();
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
     for (String db : _allDBs) {
       IdealState is =
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/CrushRebalancers/TestNodeSwap.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/CrushRebalancers/TestNodeSwap.java
index b71580e..7f4c669 100644
--- a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/CrushRebalancers/TestNodeSwap.java
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/CrushRebalancers/TestNodeSwap.java
@@ -148,7 +148,7 @@
 
     HelixClusterVerifier _clusterVerifier =
         new StrictMatchExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR)
-            .setResources(_allDBs).build();
+            .setDeactivatedNodeAwareness(true).setResources(_allDBs).build();
     Assert.assertTrue(_clusterVerifier.verify(5000));
 
     Map<String, ExternalView> record = new HashMap<>();
@@ -192,7 +192,7 @@
     Thread.sleep(2000);
 
     _clusterVerifier = new StrictMatchExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR)
-        .setResources(_allDBs).build();
+        .setDeactivatedNodeAwareness(true).setResources(_allDBs).build();
     Assert.assertTrue(_clusterVerifier.verify(5000));
 
     for (String db : _allDBs) {
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalance.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalance.java
index 5ae022e..afcabdf 100644
--- a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalance.java
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalance.java
@@ -44,19 +44,22 @@
 import org.testng.annotations.Test;
 
 public class TestDelayedAutoRebalance extends ZkTestBase {
-  final int NUM_NODE = 5;
+  static final int NUM_NODE = 5;
   protected static final int START_PORT = 12918;
-  protected static final int _PARTITIONS = 5;
+  protected static final int PARTITIONS = 5;
+  // TODO: Remove this wait time once we have a better way to determine whether the rebalance
+  // triggered by the test operations has completed.
+  protected static final int DEFAULT_REBALANCE_PROCESSING_WAIT_TIME = 1000;
 
   protected final String CLASS_NAME = getShortClassName();
   protected final String CLUSTER_NAME = CLUSTER_PREFIX + "_" + CLASS_NAME;
   protected ClusterControllerManager _controller;
 
-  List<MockParticipantManager> _participants = new ArrayList<>();
-  int _replica = 3;
-  int _minActiveReplica = _replica - 1;
-  ZkHelixClusterVerifier _clusterVerifier;
-  List<String> _testDBs = new ArrayList<String>();
+  protected List<MockParticipantManager> _participants = new ArrayList<>();
+  protected int _replica = 3;
+  protected int _minActiveReplica = _replica - 1;
+  protected ZkHelixClusterVerifier _clusterVerifier;
+  protected List<String> _testDBs = new ArrayList<>();
 
   @BeforeClass
   public void beforeClass() throws Exception {
@@ -123,7 +126,8 @@
 
     // bring down another node, the minimal active replica for each partition should be maintained.
     _participants.get(3).syncStop();
-    Thread.sleep(500);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
     for (String db : _testDBs) {
       ExternalView ev =
           _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
@@ -141,10 +145,11 @@
     enablePersistBestPossibleAssignment(_gZkClient, CLUSTER_NAME, true);
 
     long delay = 4000;
-    Map<String, ExternalView> externalViewsBefore = createTestDBs(delay);
+    setDelayTimeInCluster(_gZkClient, CLUSTER_NAME, delay);
+    Map<String, ExternalView> externalViewsBefore = createTestDBs(-1);
     validateDelayedMovements(externalViewsBefore);
 
-    Thread.sleep(delay + 200);
+    Thread.sleep(delay + DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
     // after delay time, it should maintain required number of replicas.
     for (String db : _testDBs) {
@@ -157,7 +162,8 @@
 
   @Test (dependsOnMethods = {"testMinimalActiveReplicaMaintain"})
   public void testDisableDelayRebalanceInResource() throws Exception {
-    Map<String, ExternalView> externalViewsBefore = createTestDBs(1000000);
+    setDelayTimeInCluster(_gZkClient, CLUSTER_NAME, 1000000);
+    Map<String, ExternalView> externalViewsBefore = createTestDBs(-1);
     validateDelayedMovements(externalViewsBefore);
 
     // disable delay rebalance for one db, partition should be moved immediately
@@ -166,7 +172,7 @@
         CLUSTER_NAME, testDb);
     idealState.setDelayRebalanceEnabled(false);
     _gSetupTool.getClusterManagementTool().setResourceIdealState(CLUSTER_NAME, testDb, idealState);
-
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     // once delay rebalance is disabled, it should maintain required number of replicas for that db.
@@ -190,13 +196,13 @@
   @Test (dependsOnMethods = {"testDisableDelayRebalanceInResource"})
   public void testDisableDelayRebalanceInCluster() throws Exception {
     enableDelayRebalanceInCluster(_gZkClient, CLUSTER_NAME, true);
-
-    Map<String, ExternalView> externalViewsBefore = createTestDBs(1000000);
+    setDelayTimeInCluster(_gZkClient, CLUSTER_NAME, 1000000);
+    Map<String, ExternalView> externalViewsBefore = createTestDBs(-1);
     validateDelayedMovements(externalViewsBefore);
 
     // disable delay rebalance for the entire cluster.
     enableDelayRebalanceInCluster(_gZkClient, CLUSTER_NAME, false);
-    Thread.sleep(100);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
     for (String db : _testDBs) {
       ExternalView ev =
@@ -210,13 +216,14 @@
 
   @Test (dependsOnMethods = {"testDisableDelayRebalanceInCluster"})
   public void testDisableDelayRebalanceInInstance() throws Exception {
-    Map<String, ExternalView> externalViewsBefore = createTestDBs(1000000);
+    setDelayTimeInCluster(_gZkClient, CLUSTER_NAME, 1000000);
+    Map<String, ExternalView> externalViewsBefore = createTestDBs(-1);
     validateDelayedMovements(externalViewsBefore);
 
     String disabledInstanceName = _participants.get(0).getInstanceName();
     enableDelayRebalanceInInstance(_gZkClient, CLUSTER_NAME, disabledInstanceName, false);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
-
     for (String db : _testDBs) {
       IdealState is = _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
       Map<String, List<String>> preferenceLists = is.getPreferenceLists();
@@ -234,7 +241,7 @@
       _gSetupTool.dropResourceFromCluster(CLUSTER_NAME, db);
     }
     _testDBs.clear();
-    Thread.sleep(50);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
   }
 
   @BeforeMethod
@@ -255,11 +262,11 @@
     int i = 0;
     for (String stateModel : TestStateModels) {
       String db = "Test-DB-" + i++;
-      createResourceWithDelayedRebalance(CLUSTER_NAME, db, stateModel, _PARTITIONS, _replica,
+      createResourceWithDelayedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
           _minActiveReplica, delayTime, CrushRebalanceStrategy.class.getName());
       _testDBs.add(db);
     }
-    Thread.sleep(100);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
     for (String db : _testDBs) {
       ExternalView ev =
@@ -302,7 +309,7 @@
   private void validateDelayedMovements(Map<String, ExternalView> externalViewsBefore)
       throws InterruptedException {
     _participants.get(0).syncStop();
-    Thread.sleep(100);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     for (String db : _testDBs) {
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalanceWithDisabledInstance.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalanceWithDisabledInstance.java
index 80141c9..bc5a024 100644
--- a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalanceWithDisabledInstance.java
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalanceWithDisabledInstance.java
@@ -56,7 +56,7 @@
     String instance = _participants.get(0).getInstanceName();
     enableInstance(instance, false);
 
-    Thread.sleep(300);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     for (String db : _testDBs) {
@@ -79,7 +79,7 @@
     String instance = _participants.get(0).getInstanceName();
     enableInstance(instance, false);
 
-    Thread.sleep(100);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     for (String db : _testDBs) {
@@ -106,7 +106,7 @@
 
     // disable one node, no partition should be moved.
     enableInstance(_participants.get(0).getInstanceName(), false);
-    Thread.sleep(100);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     for (String db : _testDBs) {
@@ -120,7 +120,7 @@
 
     // disable another node, the minimal active replica for each partition should be maintained.
     enableInstance(_participants.get(3).getInstanceName(), false);
-    Thread.sleep(1000);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     for (String db : _testDBs) {
@@ -143,7 +143,7 @@
 
     // disable one node, no partition should be moved.
     enableInstance(_participants.get(0).getInstanceName(), false);
-    Thread.sleep(100);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     for (String db : _testDBs) {
@@ -157,7 +157,7 @@
 
     // bring down another node, the minimal active replica for each partition should be maintained.
     _participants.get(3).syncStop();
-    Thread.sleep(100);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     for (String db : _testDBs) {
@@ -178,11 +178,12 @@
     enablePersistBestPossibleAssignment(_gZkClient, CLUSTER_NAME, true);
 
     long delay = 10000;
-    Map<String, ExternalView> externalViewsBefore = createTestDBs(delay);
+    setDelayTimeInCluster(_gZkClient, CLUSTER_NAME, delay);
+    Map<String, ExternalView> externalViewsBefore = createTestDBs(-1);
 
     // disable one node, no partition should be moved.
     enableInstance(_participants.get(0).getInstanceName(), false);
-    Thread.sleep(100);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
     for (String db : _testDBs) {
       ExternalView ev =
@@ -193,7 +194,8 @@
           _participants.get(0).getInstanceName(), true);
     }
 
-    Thread.sleep(delay + 500);
+    Thread.sleep(delay + DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
     // after delay time, it should maintain required number of replicas.
     for (String db : _testDBs) {
       ExternalView ev =
@@ -210,7 +212,7 @@
 
     // disable one node, no partition should be moved.
     enableInstance(_participants.get(0).getInstanceName(), false);
-    Thread.sleep(100);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     for (String db : _testDBs) {
@@ -228,7 +230,7 @@
         CLUSTER_NAME, testDb);
     idealState.setDelayRebalanceEnabled(false);
     _gSetupTool.getClusterManagementTool().setResourceIdealState(CLUSTER_NAME, testDb, idealState);
-    Thread.sleep(2000);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     // once delay rebalance is disabled, it should maintain required number of replicas for that db.
@@ -253,12 +255,12 @@
   @Override
   public void testDisableDelayRebalanceInCluster() throws Exception {
     enableDelayRebalanceInCluster(_gZkClient, CLUSTER_NAME, true);
-
-    Map<String, ExternalView> externalViewsBefore = createTestDBs(1000000);
+    setDelayTimeInCluster(_gZkClient, CLUSTER_NAME, 1000000);
+    Map<String, ExternalView> externalViewsBefore = createTestDBs(-1);
 
     // disable one node, no partition should be moved.
     enableInstance(_participants.get(0).getInstanceName(), false);
-    Thread.sleep(100);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
 
     for (String db : _testDBs) {
@@ -272,7 +274,7 @@
 
     // disable delay rebalance for the entire cluster.
     enableDelayRebalanceInCluster(_gZkClient, CLUSTER_NAME, false);
-    Thread.sleep(2000);
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
     Assert.assertTrue(_clusterVerifier.verifyByPolling());
     for (String db : _testDBs) {
       ExternalView ev =
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalanceWithRackaware.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalanceWithRackaware.java
index c285449..d8840f0 100644
--- a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalanceWithRackaware.java
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/DelayedAutoRebalancer/TestDelayedAutoRebalanceWithRackaware.java
@@ -30,7 +30,7 @@
 import org.testng.annotations.Test;
 
 public class TestDelayedAutoRebalanceWithRackaware extends TestDelayedAutoRebalance {
-  final int NUM_NODE = 9;
+  static final int NUM_NODE = 9;
 
   @BeforeClass
   public void beforeClass() throws Exception {
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestExpandCluster.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestExpandCluster.java
index d9dc1bc..385e2b5 100644
--- a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestExpandCluster.java
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestExpandCluster.java
@@ -31,14 +31,14 @@
 
 
 public class TestExpandCluster extends TestPartitionMigrationBase {
-
   Map<String, IdealState> _resourceMap;
 
-
   @BeforeClass
   public void beforeClass() throws Exception {
     super.beforeClass();
     _resourceMap = createTestDBs(1000000);
+    // TODO: Remove this sleep once https://github.com/apache/helix/issues/526 is fixed.
+    Thread.sleep(1000);
     _migrationVerifier = new MigrationStateVerifier(_resourceMap, _manager);
   }
 
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestPartitionMigrationBase.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestPartitionMigrationBase.java
index bfc80e1..754e6c0 100644
--- a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestPartitionMigrationBase.java
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestPartitionMigrationBase.java
@@ -49,7 +49,7 @@
 
 
 public class TestPartitionMigrationBase extends ZkTestBase {
-  final int NUM_NODE = 6;
+  protected final int NUM_NODE = 6;
   protected static final int START_PORT = 12918;
   protected static final int _PARTITIONS = 50;
 
@@ -57,15 +57,15 @@
   protected final String CLUSTER_NAME = CLUSTER_PREFIX + "_" + CLASS_NAME;
   protected ClusterControllerManager _controller;
 
-  List<MockParticipantManager> _participants = new ArrayList<>();
-  int _replica = 3;
-  int _minActiveReplica = _replica - 1;
-  ZkHelixClusterVerifier _clusterVerifier;
-  List<String> _testDBs = new ArrayList<>();
+  protected List<MockParticipantManager> _participants = new ArrayList<>();
+  protected int _replica = 3;
+  protected int _minActiveReplica = _replica - 1;
+  protected ZkHelixClusterVerifier _clusterVerifier;
+  protected List<String> _testDBs = new ArrayList<>();
 
-  MigrationStateVerifier _migrationVerifier;
-  HelixManager _manager;
-  ConfigAccessor _configAccessor;
+  protected MigrationStateVerifier _migrationVerifier;
+  protected HelixManager _manager;
+  protected ConfigAccessor _configAccessor;
 
 
   @BeforeClass
@@ -90,8 +90,8 @@
 
     enablePersistIntermediateAssignment(_gZkClient, CLUSTER_NAME, true);
 
-    _manager =
-        HelixManagerFactory.getZKHelixManager(CLUSTER_NAME, "admin", InstanceType.ADMINISTRATOR, ZK_ADDR);
+    _manager = HelixManagerFactory
+        .getZKHelixManager(CLUSTER_NAME, "admin", InstanceType.ADMINISTRATOR, ZK_ADDR);
     _manager.connect();
     _configAccessor = new ConfigAccessor(_gZkClient);
   }
@@ -134,7 +134,7 @@
     return idealStateMap;
   }
 
-  class MigrationStateVerifier implements IdealStateChangeListener, ExternalViewChangeListener {
+  protected class MigrationStateVerifier implements IdealStateChangeListener, ExternalViewChangeListener {
     static final int EXTRA_REPLICA = 1;
 
     boolean _hasMoreReplica = false;
@@ -144,7 +144,6 @@
     boolean trackEnabled = false;
     Map<String, IdealState> _resourceMap;
 
-
     public MigrationStateVerifier(Map<String, IdealState> resourceMap, HelixManager manager) {
       _resourceMap = resourceMap;
       _manager = manager;
@@ -243,7 +242,6 @@
     }
   }
 
-
   @AfterClass
   public void afterClass() throws Exception {
     /**
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestWagedRebalancerMigration.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestWagedRebalancerMigration.java
new file mode 100644
index 0000000..52def54
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/PartitionMigration/TestWagedRebalancerMigration.java
@@ -0,0 +1,111 @@
+package org.apache.helix.integration.rebalancer.PartitionMigration;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collections;
+
+import org.apache.helix.ConfigAccessor;
+import org.apache.helix.controller.rebalancer.strategy.CrushRebalanceStrategy;
+import org.apache.helix.controller.rebalancer.waged.WagedRebalancer;
+import org.apache.helix.integration.manager.MockParticipantManager;
+import org.apache.helix.model.BuiltInStateModelDefinitions;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.tools.ClusterVerifiers.BestPossibleExternalViewVerifier;
+import org.apache.helix.tools.ClusterVerifiers.ZkHelixClusterVerifier;
+import org.testng.Assert;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.DataProvider;
+import org.testng.annotations.Test;
+
+public class TestWagedRebalancerMigration extends TestPartitionMigrationBase {
+  ConfigAccessor _configAccessor;
+
+  @BeforeClass
+  public void beforeClass() throws Exception {
+    super.beforeClass();
+    _configAccessor = new ConfigAccessor(_gZkClient);
+  }
+
+  @DataProvider(name = "stateModels")
+  public static Object[][] stateModels() {
+    return new Object[][] { { BuiltInStateModelDefinitions.MasterSlave.name(), true },
+        { BuiltInStateModelDefinitions.OnlineOffline.name(), true },
+        { BuiltInStateModelDefinitions.LeaderStandby.name(), true },
+        { BuiltInStateModelDefinitions.MasterSlave.name(), false },
+        { BuiltInStateModelDefinitions.OnlineOffline.name(), false },
+        { BuiltInStateModelDefinitions.LeaderStandby.name(), false },
+    };
+  }
+
+  // TODO: Verify the partition movements during the migration.
+  @Test(dataProvider = "stateModels")
+  public void testMigrateToWagedRebalancerWhileExpandCluster(String stateModel,
+      boolean delayEnabled) throws Exception {
+    String db = "Test-DB-" + stateModel;
+    if (delayEnabled) {
+      createResourceWithDelayedRebalance(CLUSTER_NAME, db, stateModel, _PARTITIONS, _replica,
+          _replica - 1, 3000000, CrushRebalanceStrategy.class.getName());
+    } else {
+      createResourceWithDelayedRebalance(CLUSTER_NAME, db, stateModel, _PARTITIONS, _replica,
+          _replica, 0, CrushRebalanceStrategy.class.getName());
+    }
+    IdealState idealState =
+        _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
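+    // Mirror the delay settings in the cluster config so they apply during the migration.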
+    ClusterConfig config = _configAccessor.getClusterConfig(CLUSTER_NAME);
+    config.setDelayRebalaceEnabled(delayEnabled);
+    config.setRebalanceDelayTime(3000000);
+    _configAccessor.setClusterConfig(CLUSTER_NAME, config);
+
+    // add new instance to the cluster
+    int numNodes = _participants.size();
+    for (int i = numNodes; i < numNodes + NUM_NODE; i++) {
+      String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      MockParticipantManager participant = createAndStartParticipant(storageNodeName);
+      _participants.add(participant);
+      Thread.sleep(100);
+    }
+    Thread.sleep(2000);
+    ZkHelixClusterVerifier clusterVerifier =
+        new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR).build();
+    Assert.assertTrue(clusterVerifier.verifyByPolling());
+
+    _migrationVerifier =
+        new MigrationStateVerifier(Collections.singletonMap(db, idealState), _manager);
+
+    _migrationVerifier.reset();
+    _migrationVerifier.start();
+
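+    // Switch the resource over to the WAGED rebalancer in the middle of the cluster expansion.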
+    IdealState currentIdealState =
+        _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
+    currentIdealState.setRebalancerClassName(WagedRebalancer.class.getName());
+    _gSetupTool.getClusterManagementTool()
+        .setResourceIdealState(CLUSTER_NAME, db, currentIdealState);
+    Thread.sleep(2000);
+    Assert.assertTrue(clusterVerifier.verifyByPolling());
+
+    Assert.assertFalse(_migrationVerifier.hasLessReplica());
+    Assert.assertFalse(_migrationVerifier.hasMoreReplica());
+
+    _migrationVerifier.stop();
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestMixedModeAutoRebalance.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestMixedModeAutoRebalance.java
index fb8399c..db51fd5 100644
--- a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestMixedModeAutoRebalance.java
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestMixedModeAutoRebalance.java
@@ -27,14 +27,13 @@
 import java.util.Set;
 
 import org.apache.helix.ConfigAccessor;
-import org.apache.helix.HelixDataAccessor;
 import org.apache.helix.NotificationContext;
+import org.apache.helix.TestHelper;
 import org.apache.helix.common.ZkTestBase;
 import org.apache.helix.controller.rebalancer.strategy.CrushEdRebalanceStrategy;
 import org.apache.helix.controller.rebalancer.strategy.CrushRebalanceStrategy;
 import org.apache.helix.integration.manager.ClusterControllerManager;
 import org.apache.helix.integration.manager.MockParticipantManager;
-import org.apache.helix.manager.zk.ZKHelixDataAccessor;
 import org.apache.helix.mock.participant.MockMSModelFactory;
 import org.apache.helix.mock.participant.MockMSStateModel;
 import org.apache.helix.mock.participant.MockTransition;
@@ -60,13 +59,12 @@
 
   private final String CLASS_NAME = getShortClassName();
   private final String CLUSTER_NAME = CLUSTER_PREFIX + "_" + CLASS_NAME;
-  private ClusterControllerManager _controller;
 
+  private ClusterControllerManager _controller;
   private List<MockParticipantManager> _participants = new ArrayList<>();
   private int _replica = 3;
   private ZkHelixClusterVerifier _clusterVerifier;
   private ConfigAccessor _configAccessor;
-  private HelixDataAccessor _dataAccessor;
 
   @BeforeClass
   public void beforeClass() throws Exception {
@@ -96,7 +94,6 @@
     enablePersistBestPossibleAssignment(_gZkClient, CLUSTER_NAME, true);
 
     _configAccessor = new ConfigAccessor(_gZkClient);
-    _dataAccessor = new ZKHelixDataAccessor(CLUSTER_NAME, _baseAccessor);
   }
 
   @DataProvider(name = "stateModels")
@@ -112,89 +109,125 @@
     };
   }
 
-  @Test(dataProvider = "stateModels")
-  public void testUserDefinedPreferenceListsInFullAuto(
-      String stateModel, boolean delayEnabled, String rebalanceStrateyName) throws Exception {
-    String db = "Test-DB-" + stateModel;
+  protected void createResource(String dbName, String stateModel, int numPartition, int replica,
+      boolean delayEnabled, String rebalanceStrategy) {
     if (delayEnabled) {
-      createResourceWithDelayedRebalance(CLUSTER_NAME, db, stateModel, _PARTITIONS, _replica,
-          _replica - 1, 200, rebalanceStrateyName);
+      createResourceWithDelayedRebalance(CLUSTER_NAME, dbName, stateModel, numPartition, replica,
+          replica - 1, 200, rebalanceStrategy);
     } else {
-      createResourceWithDelayedRebalance(CLUSTER_NAME, db, stateModel, _PARTITIONS, _replica,
-          _replica, 0, rebalanceStrateyName);
+      createResourceWithDelayedRebalance(CLUSTER_NAME, dbName, stateModel, numPartition, replica,
+          replica, 0, rebalanceStrategy);
     }
-    IdealState idealState =
-        _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
-    Map<String, List<String>> userDefinedPreferenceLists = idealState.getPreferenceLists();
-    List<String> userDefinedPartitions = new ArrayList<>();
-    for (String partition : userDefinedPreferenceLists.keySet()) {
-      List<String> preferenceList = new ArrayList<>();
-      for (int k = _replica; k >= 0; k--) {
-        String instance = _participants.get(k).getInstanceName();
-        preferenceList.add(instance);
+  }
+
+  @Test(dataProvider = "stateModels")
+  public void testUserDefinedPreferenceListsInFullAuto(String stateModel, boolean delayEnabled,
+      String rebalanceStrategyName) throws Exception {
+    String dbName = "Test-DB-" + stateModel;
+    createResource(dbName, stateModel, _PARTITIONS, _replica, delayEnabled,
+        rebalanceStrategyName);
+    try {
+      IdealState idealState =
+          _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, dbName);
+      Map<String, List<String>> userDefinedPreferenceLists = idealState.getPreferenceLists();
+      List<String> userDefinedPartitions = new ArrayList<>();
+      for (String partition : userDefinedPreferenceLists.keySet()) {
+        List<String> preferenceList = new ArrayList<>();
+        for (int k = _replica; k >= 0; k--) {
+          String instance = _participants.get(k).getInstanceName();
+          preferenceList.add(instance);
+        }
+        userDefinedPreferenceLists.put(partition, preferenceList);
+        userDefinedPartitions.add(partition);
       }
-      userDefinedPreferenceLists.put(partition, preferenceList);
-      userDefinedPartitions.add(partition);
-    }
 
-    ResourceConfig resourceConfig =
-        new ResourceConfig.Builder(db).setPreferenceLists(userDefinedPreferenceLists).build();
-    _configAccessor.setResourceConfig(CLUSTER_NAME, db, resourceConfig);
+      ResourceConfig resourceConfig =
+          new ResourceConfig.Builder(dbName).setPreferenceLists(userDefinedPreferenceLists).build();
+      _configAccessor.setResourceConfig(CLUSTER_NAME, dbName, resourceConfig);
 
-    Assert.assertTrue(_clusterVerifier.verify(1000));
-    verifyUserDefinedPreferenceLists(db, userDefinedPreferenceLists, userDefinedPartitions);
+      // TODO: Remove this sleep once https://github.com/apache/helix/issues/526 is fixed.
+      Thread.sleep(500);
 
-    while (userDefinedPartitions.size() > 0) {
-      IdealState originIS = _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
-      Set<String> nonUserDefinedPartitions = new HashSet<>(originIS.getPartitionSet());
-      nonUserDefinedPartitions.removeAll(userDefinedPartitions);
+      Assert.assertTrue(_clusterVerifier.verify(3000));
+      verifyUserDefinedPreferenceLists(dbName, userDefinedPreferenceLists,
+          userDefinedPartitions);
 
-      removePartitionFromUserDefinedList(db, userDefinedPartitions);
-      Assert.assertTrue(_clusterVerifier.verify(1000));
-      verifyUserDefinedPreferenceLists(db, userDefinedPreferenceLists, userDefinedPartitions);
-      verifyNonUserDefinedAssignment(db, originIS, nonUserDefinedPartitions);
+      while (userDefinedPartitions.size() > 0) {
+        IdealState originIS =
+            _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, dbName);
+        Set<String> nonUserDefinedPartitions = new HashSet<>(originIS.getPartitionSet());
+        nonUserDefinedPartitions.removeAll(userDefinedPartitions);
+
+        removePartitionFromUserDefinedList(dbName, userDefinedPartitions);
+        // TODO: Remove wait once we enable the BestPossibleExternalViewVerifier for the WAGED rebalancer.
+        Thread.sleep(1000);
+        Assert.assertTrue(_clusterVerifier.verify(3000));
+        verifyUserDefinedPreferenceLists(dbName, userDefinedPreferenceLists,
+            userDefinedPartitions);
+        verifyNonUserDefinedAssignment(dbName, originIS, nonUserDefinedPartitions);
+      }
+    } finally {
+      _gSetupTool.getClusterManagementTool().dropResource(CLUSTER_NAME, dbName);
+      _clusterVerifier.verify(5000);
     }
   }
 
   @Test
   public void testUserDefinedPreferenceListsInFullAutoWithErrors() throws Exception {
-    String db = "Test-DB-1";
-    createResourceWithDelayedRebalance(CLUSTER_NAME, db,
-        BuiltInStateModelDefinitions.MasterSlave.name(), 5, _replica, _replica, 0,
+    String dbName = "Test-DB-withErrors";
+    createResource(dbName, BuiltInStateModelDefinitions.MasterSlave.name(), 5, _replica, false,
         CrushRebalanceStrategy.class.getName());
+    try {
+      IdealState idealState =
+          _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, dbName);
+      Map<String, List<String>> userDefinedPreferenceLists = idealState.getPreferenceLists();
 
-    IdealState idealState =
-        _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
-    Map<String, List<String>> userDefinedPreferenceLists = idealState.getPreferenceLists();
+      List<String> newNodes = new ArrayList<>();
+      for (int i = NUM_NODE; i < NUM_NODE + _replica; i++) {
+        String instance = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+        _gSetupTool.addInstanceToCluster(CLUSTER_NAME, instance);
 
-    List<String> newNodes = new ArrayList<>();
-    for (int i = NUM_NODE; i < NUM_NODE + _replica; i++) {
-      String instance = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
-      _gSetupTool.addInstanceToCluster(CLUSTER_NAME, instance);
+        // start dummy participants
+        MockParticipantManager participant =
+            new TestMockParticipantManager(ZK_ADDR, CLUSTER_NAME, instance);
+        participant.syncStart();
+        _participants.add(participant);
+        newNodes.add(instance);
+      }
 
-      // start dummy participants
-      MockParticipantManager participant =
-          new TestMockParticipantManager(ZK_ADDR, CLUSTER_NAME, instance);
-      participant.syncStart();
-      _participants.add(participant);
-      newNodes.add(instance);
+      List<String> userDefinedPartitions = new ArrayList<>();
+      for (String partition : userDefinedPreferenceLists.keySet()) {
+        userDefinedPreferenceLists.put(partition, newNodes);
+        userDefinedPartitions.add(partition);
+      }
+
+      ResourceConfig resourceConfig =
+          new ResourceConfig.Builder(dbName).setPreferenceLists(userDefinedPreferenceLists).build();
+      _configAccessor.setResourceConfig(CLUSTER_NAME, dbName, resourceConfig);
+
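+      // Poll for up to 2 seconds until any replica reports ERROR in the ExternalView.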
+      TestHelper.verify(() -> {
+        ExternalView ev =
+            _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, dbName);
+        if (ev != null) {
+          for (String partition : ev.getPartitionSet()) {
+            Map<String, String> stateMap = ev.getStateMap(partition);
+            if (stateMap.values().contains("ERROR")) {
+              return true;
+            }
+          }
+        }
+        return false;
+      }, 2000);
+
+      ExternalView ev =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, dbName);
+      IdealState is =
+          _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, dbName);
+      validateMinActiveAndTopStateReplica(is, ev, _replica, NUM_NODE);
+    } finally {
+      _gSetupTool.getClusterManagementTool().dropResource(CLUSTER_NAME, dbName);
+      _clusterVerifier.verify(5000);
     }
-
-    List<String> userDefinedPartitions = new ArrayList<>();
-    for (String partition : userDefinedPreferenceLists.keySet()) {
-      userDefinedPreferenceLists.put(partition, newNodes);
-      userDefinedPartitions.add(partition);
-    }
-
-    ResourceConfig resourceConfig =
-        new ResourceConfig.Builder(db).setPreferenceLists(userDefinedPreferenceLists).build();
-    _configAccessor.setResourceConfig(CLUSTER_NAME, db, resourceConfig);
-
-    Thread.sleep(1000);
-    ExternalView ev =
-        _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
-    IdealState is = _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
-    validateMinActiveAndTopStateReplica(is, ev, _replica, NUM_NODE);
   }
 
   private void verifyUserDefinedPreferenceLists(String db,
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestZeroReplicaAvoidance.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestZeroReplicaAvoidance.java
index 941dc0a..1a02299 100644
--- a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestZeroReplicaAvoidance.java
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestZeroReplicaAvoidance.java
@@ -42,8 +42,8 @@
 import org.apache.helix.tools.ClusterVerifiers.BestPossibleExternalViewVerifier;
 import org.apache.helix.tools.ClusterVerifiers.ZkHelixClusterVerifier;
 import org.testng.Assert;
-import org.testng.annotations.AfterClass;
-import org.testng.annotations.BeforeClass;
+import org.testng.annotations.AfterMethod;
+import org.testng.annotations.BeforeMethod;
 import org.testng.annotations.Test;
 
 public class TestZeroReplicaAvoidance extends ZkTestBase
@@ -60,10 +60,8 @@
 
   private ClusterControllerManager _controller;
 
-  @BeforeClass
-  public void beforeClass() throws Exception {
-    System.out.println("START " + CLASS_NAME + " at " + new Date(System.currentTimeMillis()));
-
+  @BeforeMethod
+  public void beforeMethod() {
     _gSetupTool.addCluster(CLUSTER_NAME, true);
     for (int i = 0; i < NUM_NODE; i++) {
       String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
@@ -83,8 +81,9 @@
         new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR).build();
   }
 
-  @AfterClass
-  public void afterClass() {
+  @AfterMethod
+  public void afterMethod() {
+    _startListen = false;
     if (_controller != null && _controller.isConnected()) {
       _controller.syncStop();
     }
@@ -93,6 +92,7 @@
         participant.syncStop();
       }
     }
+    _participants.clear();
     deleteCluster(CLUSTER_NAME);
   }
 
@@ -103,7 +103,8 @@
   };
 
   @Test
-  public void test() throws Exception {
+  public void testDelayedRebalancer() throws Exception {
+    System.out.println("START testDelayedRebalancer at " + new Date(System.currentTimeMillis()));
     HelixManager manager =
         HelixManagerFactory.getZKHelixManager(CLUSTER_NAME, null, InstanceType.SPECTATOR, ZK_ADDR);
     manager.connect();
@@ -139,6 +140,49 @@
     if (manager.isConnected()) {
       manager.disconnect();
     }
+    System.out.println("END testDelayedRebalancer at " + new Date(System.currentTimeMillis()));
+  }
+
+  @Test
+  public void testWagedRebalancer() throws Exception {
+    System.out.println("START testWagedRebalancer at " + new Date(System.currentTimeMillis()));
+    HelixManager manager =
+        HelixManagerFactory.getZKHelixManager(CLUSTER_NAME, null, InstanceType.SPECTATOR, ZK_ADDR);
+    manager.connect();
+    manager.addExternalViewChangeListener(this);
+    manager.addIdealStateChangeListener(this);
+    enablePersistBestPossibleAssignment(_gZkClient, CLUSTER_NAME, true);
+
+    // Start half of the nodes.
+    int i = 0;
+    for (; i < NUM_NODE / 2; i++) {
+      _participants.get(i).syncStart();
+    }
+
+    int replica = 3;
+    int partition = 30;
+    for (String stateModel : TestStateModels) {
+      String db = "Test-DB-" + stateModel;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, partition, replica, replica);
+    }
+    // TODO remove this sleep after fixing https://github.com/apache/helix/issues/526
+    Thread.sleep(1000);
+    Assert.assertTrue(_clusterVerifier.verifyByPolling(50000L, 100L));
+
+    _startListen = true;
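+    // Slow each state transition slightly so the listeners can observe intermediate assignments.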
+    DelayedTransition.setDelay(5);
+
+    // Add the other half of the nodes.
+    for (; i < NUM_NODE; i++) {
+      _participants.get(i).syncStart();
+    }
+    Assert.assertTrue(_clusterVerifier.verify(70000L));
+    Assert.assertTrue(_testSuccess);
+
+    if (manager.isConnected()) {
+      manager.disconnect();
+    }
+    System.out.println("END testWagedRebalancer at " + new Date(System.currentTimeMillis()));
   }
 
   /**
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestDelayedWagedRebalance.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestDelayedWagedRebalance.java
new file mode 100644
index 0000000..e75da84
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestDelayedWagedRebalance.java
@@ -0,0 +1,89 @@
+package org.apache.helix.integration.rebalancer.WagedRebalancer;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.helix.TestHelper;
+import org.apache.helix.integration.rebalancer.DelayedAutoRebalancer.TestDelayedAutoRebalance;
+import org.apache.helix.model.ExternalView;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+/**
+ * Inherits TestDelayedAutoRebalance to ensure the test logic is the same.
+ */
+public class TestDelayedWagedRebalance extends TestDelayedAutoRebalance {
+  // Create the test DBs, wait until they converge, and return their ExternalViews.
+  protected Map<String, ExternalView> createTestDBs(long delayTime) throws InterruptedException {
+    Map<String, ExternalView> externalViews = new HashMap<>();
+    int i = 0;
+    for (String stateModel : TestStateModels) {
+      String db = "Test-DB-" + TestHelper.getTestMethodName() + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+          _minActiveReplica);
+      _testDBs.add(db);
+    }
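+    // Give the controller a moment to process the new resources before verifying convergence.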
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+    for (String db : _testDBs) {
+      ExternalView ev =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+      externalViews.put(db, ev);
+    }
+    return externalViews;
+  }
+
+  @Test
+  public void testDelayedPartitionMovement() {
+    // Waged Rebalancer takes cluster level delay config only. Skip this test.
+  }
+
+  @Test
+  public void testDisableDelayRebalanceInResource() {
+    // Waged Rebalancer takes cluster level delay config only. Skip this test.
+  }
+
+  @Test(dependsOnMethods = { "testDelayedPartitionMovement" })
+  public void testDelayedPartitionMovementWithClusterConfigedDelay() throws Exception {
+    super.testDelayedPartitionMovementWithClusterConfigedDelay();
+  }
+
+  @Test(dependsOnMethods = { "testDelayedPartitionMovementWithClusterConfigedDelay" })
+  public void testMinimalActiveReplicaMaintain() throws Exception {
+    super.testMinimalActiveReplicaMaintain();
+  }
+
+  @Test(dependsOnMethods = { "testMinimalActiveReplicaMaintain" })
+  public void testPartitionMovementAfterDelayTime() throws Exception {
+    super.testPartitionMovementAfterDelayTime();
+  }
+
+  @Test(dependsOnMethods = { "testDisableDelayRebalanceInResource" })
+  public void testDisableDelayRebalanceInCluster() throws Exception {
+    super.testDisableDelayRebalanceInCluster();
+  }
+
+  @Test(dependsOnMethods = { "testDisableDelayRebalanceInCluster" })
+  public void testDisableDelayRebalanceInInstance() throws Exception {
+    super.testDisableDelayRebalanceInInstance();
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestDelayedWagedRebalanceWithDisabledInstance.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestDelayedWagedRebalanceWithDisabledInstance.java
new file mode 100644
index 0000000..92988c4
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestDelayedWagedRebalanceWithDisabledInstance.java
@@ -0,0 +1,96 @@
+package org.apache.helix.integration.rebalancer.WagedRebalancer;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.helix.TestHelper;
+import org.apache.helix.integration.rebalancer.DelayedAutoRebalancer.TestDelayedAutoRebalanceWithDisabledInstance;
+import org.apache.helix.model.ExternalView;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+
+/**
+ * Inherits TestDelayedAutoRebalanceWithDisabledInstance to ensure the test logic is the same.
+ */
+public class TestDelayedWagedRebalanceWithDisabledInstance extends TestDelayedAutoRebalanceWithDisabledInstance {
+  // Create the test DBs, wait until they converge, and return their ExternalViews.
+  protected Map<String, ExternalView> createTestDBs(long delayTime)
+      throws InterruptedException {
+    Map<String, ExternalView> externalViews = new HashMap<>();
+    int i = 0;
+    for (String stateModel : TestStateModels) {
+      String db = "Test-DB-" + TestHelper.getTestMethodName() + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+          _minActiveReplica);
+      _testDBs.add(db);
+    }
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+    for (String db : _testDBs) {
+      ExternalView ev =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+      externalViews.put(db, ev);
+    }
+    return externalViews;
+  }
+
+  @Test
+  public void testDelayedPartitionMovement() {
+    // Waged Rebalancer takes cluster level delay config only. Skip this test.
+  }
+
+  @Test
+  public void testDisableDelayRebalanceInResource() {
+    // Waged Rebalancer takes cluster level delay config only. Skip this test.
+  }
+
+  @Test(dependsOnMethods = {"testDelayedPartitionMovement"})
+  public void testDelayedPartitionMovementWithClusterConfigedDelay()
+      throws Exception {
+    super.testDelayedPartitionMovementWithClusterConfigedDelay();
+  }
+
+  @Test(dependsOnMethods = {"testDelayedPartitionMovementWithClusterConfigedDelay"})
+  public void testMinimalActiveReplicaMaintain()
+      throws Exception {
+    super.testMinimalActiveReplicaMaintain();
+  }
+
+  @Test(dependsOnMethods = {"testMinimalActiveReplicaMaintain"})
+  public void testPartitionMovementAfterDelayTime()
+      throws Exception {
+    super.testPartitionMovementAfterDelayTime();
+  }
+
+  @Test(dependsOnMethods = {"testDisableDelayRebalanceInResource"})
+  public void testDisableDelayRebalanceInCluster()
+      throws Exception {
+    super.testDisableDelayRebalanceInCluster();
+  }
+
+  @Test(dependsOnMethods = {"testDisableDelayRebalanceInCluster"})
+  public void testDisableDelayRebalanceInInstance()
+      throws Exception {
+    super.testDisableDelayRebalanceInInstance();
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestDelayedWagedRebalanceWithRackaware.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestDelayedWagedRebalanceWithRackaware.java
new file mode 100644
index 0000000..cd1f337
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestDelayedWagedRebalanceWithRackaware.java
@@ -0,0 +1,96 @@
+package org.apache.helix.integration.rebalancer.WagedRebalancer;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.helix.TestHelper;
+import org.apache.helix.integration.rebalancer.DelayedAutoRebalancer.TestDelayedAutoRebalanceWithRackaware;
+import org.apache.helix.model.ExternalView;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+
+/**
+ * Inherits TestDelayedAutoRebalanceWithRackaware to ensure the test logic is the same.
+ */
+public class TestDelayedWagedRebalanceWithRackaware extends TestDelayedAutoRebalanceWithRackaware {
+  // Create the test DBs, wait until they converge, and return their ExternalViews.
+  protected Map<String, ExternalView> createTestDBs(long delayTime)
+      throws InterruptedException {
+    Map<String, ExternalView> externalViews = new HashMap<>();
+    int i = 0;
+    for (String stateModel : TestStateModels) {
+      String db = "Test-DB-" + TestHelper.getTestMethodName() + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+          _minActiveReplica);
+      _testDBs.add(db);
+    }
+    Thread.sleep(DEFAULT_REBALANCE_PROCESSING_WAIT_TIME);
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+    for (String db : _testDBs) {
+      ExternalView ev =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+      externalViews.put(db, ev);
+    }
+    return externalViews;
+  }
+
+  @Test
+  public void testDelayedPartitionMovement() {
+    // Waged Rebalancer takes cluster level delay config only. Skip this test.
+  }
+
+  @Test
+  public void testDisableDelayRebalanceInResource() {
+    // Waged Rebalancer takes cluster level delay config only. Skip this test.
+  }
+
+  @Test(dependsOnMethods = {"testDelayedPartitionMovement"})
+  public void testDelayedPartitionMovementWithClusterConfigedDelay()
+      throws Exception {
+    super.testDelayedPartitionMovementWithClusterConfigedDelay();
+  }
+
+  @Test(dependsOnMethods = {"testDelayedPartitionMovementWithClusterConfigedDelay"})
+  public void testMinimalActiveReplicaMaintain()
+      throws Exception {
+    super.testMinimalActiveReplicaMaintain();
+  }
+
+  @Test(dependsOnMethods = {"testMinimalActiveReplicaMaintain"})
+  public void testPartitionMovementAfterDelayTime()
+      throws Exception {
+    super.testPartitionMovementAfterDelayTime();
+  }
+
+  @Test(dependsOnMethods = {"testDisableDelayRebalanceInResource"})
+  public void testDisableDelayRebalanceInCluster()
+      throws Exception {
+    super.testDisableDelayRebalanceInCluster();
+  }
+
+  @Test(dependsOnMethods = {"testDisableDelayRebalanceInCluster"})
+  public void testDisableDelayRebalanceInInstance()
+      throws Exception {
+    super.testDisableDelayRebalanceInInstance();
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestMixedModeWagedRebalance.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestMixedModeWagedRebalance.java
new file mode 100644
index 0000000..c482e8f
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestMixedModeWagedRebalance.java
@@ -0,0 +1,58 @@
+package org.apache.helix.integration.rebalancer.WagedRebalancer;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.integration.rebalancer.TestMixedModeAutoRebalance;
+import org.apache.helix.model.BuiltInStateModelDefinitions;
+import org.testng.annotations.AfterMethod;
+import org.testng.annotations.DataProvider;
+
+public class TestMixedModeWagedRebalance extends TestMixedModeAutoRebalance {
+  private final String CLASS_NAME = getShortClassName();
+  private final String CLUSTER_NAME = CLUSTER_PREFIX + "_" + CLASS_NAME;
+
+  @DataProvider(name = "stateModels")
+  public static Object[][] stateModels() {
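+    // Each row: state model name, delayed-rebalance flag, and rebalance strategy. The strategy
+    // is null because the WAGED createResource override below ignores it.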
+    return new Object[][] { { BuiltInStateModelDefinitions.MasterSlave.name(), true, null },
+        { BuiltInStateModelDefinitions.OnlineOffline.name(), true, null },
+        { BuiltInStateModelDefinitions.LeaderStandby.name(), true, null },
+        { BuiltInStateModelDefinitions.MasterSlave.name(), false, null },
+        { BuiltInStateModelDefinitions.OnlineOffline.name(), false, null },
+        { BuiltInStateModelDefinitions.LeaderStandby.name(), false, null }
+    };
+  }
+
+  protected void createResource(String dbName, String stateModel, int numPartition, int replica,
+      boolean delayEnabled, String rebalanceStrategy) {
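+    // With delay enabled, set a short cluster-wide delay and allow one replica below the target;
+    // otherwise require the full replica count.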
+    if (delayEnabled) {
+      setDelayTimeInCluster(_gZkClient, CLUSTER_NAME, 200);
+      createResourceWithWagedRebalance(CLUSTER_NAME, dbName, stateModel, numPartition, replica,
+          replica - 1);
+    } else {
+      createResourceWithWagedRebalance(CLUSTER_NAME, dbName, stateModel, numPartition, replica,
+          replica);
+    }
+  }
+
+  @AfterMethod
+  public void afterMethod() {
+    setDelayTimeInCluster(_gZkClient, CLUSTER_NAME, -1);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedExpandCluster.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedExpandCluster.java
new file mode 100644
index 0000000..4c4119c
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedExpandCluster.java
@@ -0,0 +1,159 @@
+package org.apache.helix.integration.rebalancer.WagedRebalancer;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.helix.integration.manager.MockParticipantManager;
+import org.apache.helix.integration.rebalancer.PartitionMigration.TestPartitionMigrationBase;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
+import org.testng.Assert;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+
+public class TestWagedExpandCluster extends TestPartitionMigrationBase {
+  Map<String, IdealState> _resourceMap;
+
+  @BeforeClass
+  public void beforeClass()
+      throws Exception {
+    super.beforeClass();
+    _resourceMap = createTestDBs(1000000);
+    // TODO remove this sleep after fixing https://github.com/apache/helix/issues/526
+    Thread.sleep(1000);
+    _migrationVerifier = new MigrationStateVerifier(_resourceMap, _manager);
+  }
+
+  protected Map<String, IdealState> createTestDBs(long delayTime) {
+    Map<String, IdealState> idealStateMap = new HashMap<>();
+    int i = 0;
+    for (String stateModel : TestStateModels) {
+      String db = "Test-DB-" + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, _PARTITIONS, _replica,
+          _minActiveReplica);
+      _testDBs.add(db);
+    }
+    for (String db : _testDBs) {
+      IdealState is =
+          _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
+      idealStateMap.put(db, is);
+    }
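+    // Enable delayed rebalance cluster-wide with the requested delay window.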
+    ClusterConfig clusterConfig = _configAccessor.getClusterConfig(CLUSTER_NAME);
+    clusterConfig.setDelayRebalaceEnabled(true);
+    clusterConfig.setRebalanceDelayTime(delayTime);
+    _configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+
+    return idealStateMap;
+  }
+
+  @Test
+  public void testClusterExpansion()
+      throws Exception {
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+
+    _migrationVerifier.start();
+
+    // Expand the cluster by adding instances one by one.
+    int numNodes = _participants.size();
+    for (int i = numNodes; i < numNodes + NUM_NODE; i++) {
+      String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      MockParticipantManager participant = createAndStartParticipant(storageNodeName);
+      _participants.add(participant);
+      Thread.sleep(50);
+    }
+
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+    Assert.assertFalse(_migrationVerifier.hasLessReplica());
+    Assert.assertFalse(_migrationVerifier.hasMoreReplica());
+
+    _migrationVerifier.stop();
+  }
+
+  @Test(dependsOnMethods = {"testClusterExpansion"})
+  public void testClusterExpansionByEnableInstance()
+      throws Exception {
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+
+    _migrationVerifier.reset();
+    _migrationVerifier.start();
+
+    int numNodes = _participants.size();
+    // Add the new instances, all disabled initially.
+    for (int i = numNodes; i < numNodes + NUM_NODE; i++) {
+      String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      InstanceConfig config = InstanceConfig.toInstanceConfig(storageNodeName);
+      config.setInstanceEnabled(false);
+      config.getRecord().getSimpleFields()
+          .remove(InstanceConfig.InstanceConfigProperty.HELIX_ENABLED_TIMESTAMP.name());
+
+      _gSetupTool.getClusterManagementTool().addInstance(CLUSTER_NAME, config);
+
+      // start dummy participants
+      MockParticipantManager participant =
+          new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, storageNodeName);
+      participant.syncStart();
+      _participants.add(participant);
+    }
+
+    // Enable the new instances one by one.
+    for (int i = numNodes; i < numNodes + NUM_NODE; i++) {
+      String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      _gSetupTool.getClusterManagementTool().enableInstance(CLUSTER_NAME, storageNodeName, true);
+      Thread.sleep(100);
+    }
+
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+    Assert.assertFalse(_migrationVerifier.hasLessReplica());
+    _migrationVerifier.stop();
+  }
+
+  @Test(dependsOnMethods = {"testClusterExpansion", "testClusterExpansionByEnableInstance"})
+  public void testClusterShrink()
+      throws Exception {
+    ClusterConfig clusterConfig = _configAccessor.getClusterConfig(CLUSTER_NAME);
+    clusterConfig.setDelayRebalaceEnabled(false);
+    clusterConfig.setRebalanceDelayTime(0);
+    _configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+
+    _migrationVerifier.reset();
+    _migrationVerifier.start();
+
+    // remove instance one by one
+    for (int i = 0; i < NUM_NODE; i++) {
+      String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      MockParticipantManager participant = _participants.get(i);
+      participant.syncStop();
+      _gSetupTool.getClusterManagementTool().enableInstance(CLUSTER_NAME, storageNodeName, false);
+      Assert.assertTrue(_clusterVerifier.verifyByPolling());
+    }
+
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+    Assert.assertFalse(_migrationVerifier.hasLessMinActiveReplica());
+    Assert.assertFalse(_migrationVerifier.hasMoreReplica());
+
+    _migrationVerifier.stop();
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedNodeSwap.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedNodeSwap.java
new file mode 100644
index 0000000..369d46a
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedNodeSwap.java
@@ -0,0 +1,294 @@
+package org.apache.helix.integration.rebalancer.WagedRebalancer;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.helix.ConfigAccessor;
+import org.apache.helix.common.ZkTestBase;
+import org.apache.helix.integration.manager.ClusterControllerManager;
+import org.apache.helix.integration.manager.MockParticipantManager;
+import org.apache.helix.model.BuiltInStateModelDefinitions;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.ExternalView;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.tools.ClusterVerifiers.BestPossibleExternalViewVerifier;
+import org.apache.helix.tools.ClusterVerifiers.HelixClusterVerifier;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+public class TestWagedNodeSwap extends ZkTestBase {
+  final int NUM_NODE = 6;
+  protected static final int START_PORT = 12918;
+  protected static final int _PARTITIONS = 20;
+
+  protected final String CLASS_NAME = getShortClassName();
+  protected final String CLUSTER_NAME = CLUSTER_PREFIX + "_" + CLASS_NAME;
+  protected ClusterControllerManager _controller;
+  protected HelixClusterVerifier _clusterVerifier;
+
+  List<MockParticipantManager> _participants = new ArrayList<>();
+  Set<String> _allDBs = new HashSet<>();
+  int _replica = 3;
+
+  String[] _testModels = { BuiltInStateModelDefinitions.OnlineOffline.name(),
+      BuiltInStateModelDefinitions.MasterSlave.name(),
+      BuiltInStateModelDefinitions.LeaderStandby.name()
+  };
+
+  @BeforeClass
+  public void beforeClass() throws Exception {
+    _gSetupTool.addCluster(CLUSTER_NAME, true);
+
+    ConfigAccessor configAccessor = new ConfigAccessor(_gZkClient);
+    ClusterConfig clusterConfig = configAccessor.getClusterConfig(CLUSTER_NAME);
+    clusterConfig.setTopology("/zone/instance");
+    clusterConfig.setFaultZoneType("zone");
+    clusterConfig.setDelayRebalaceEnabled(true);
+    // Set a long enough delay to ensure delayed rebalance stays active throughout the test.
+    clusterConfig.setRebalanceDelayTime(3000000);
+
+    // TODO remove this setup once issue https://github.com/apache/helix/issues/532 is fixed
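+    // Strongly prefer less movement over evenness so a swapped-in node inherits the partitions
+    // of the node it replaces.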
+    Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> preference = new HashMap<>();
+    preference.put(ClusterConfig.GlobalRebalancePreferenceKey.EVENNESS, 0);
+    preference.put(ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 10);
+    clusterConfig.setGlobalRebalancePreference(preference);
+
+    configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+
+    Set<String> nodes = new HashSet<>();
+    for (int i = 0; i < NUM_NODE; i++) {
+      String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      _gSetupTool.addInstanceToCluster(CLUSTER_NAME, storageNodeName);
+      String zone = "zone-" + i % 3;
+      String domain = String.format("zone=%s,instance=%s", zone, storageNodeName);
+
+      InstanceConfig instanceConfig =
+          configAccessor.getInstanceConfig(CLUSTER_NAME, storageNodeName);
+      instanceConfig.setDomain(domain);
+      _gSetupTool.getClusterManagementTool()
+          .setInstanceConfig(CLUSTER_NAME, storageNodeName, instanceConfig);
+      nodes.add(storageNodeName);
+    }
+
+    // start dummy participants
+    for (String node : nodes) {
+      MockParticipantManager participant = new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, node);
+      participant.syncStart();
+      _participants.add(participant);
+    }
+
+    // start controller
+    String controllerName = CONTROLLER_PREFIX + "_0";
+    _controller = new ClusterControllerManager(ZK_ADDR, CLUSTER_NAME, controllerName);
+    _controller.syncStart();
+
+    enablePersistBestPossibleAssignment(_gZkClient, CLUSTER_NAME, true);
+    enableTopologyAwareRebalance(_gZkClient, CLUSTER_NAME, true);
+
+    int i = 0;
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-" + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, _PARTITIONS, _replica,
+          _replica - 1);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    Thread.sleep(1000);
+
+    _clusterVerifier =
+        new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR).build();
+    Assert.assertTrue(_clusterVerifier.verify(5000));
+  }
+
+  @AfterClass
+  public void afterClass() throws Exception {
+    _controller.syncStop();
+    for (MockParticipantManager p : _participants) {
+      p.syncStop();
+    }
+    deleteCluster(CLUSTER_NAME);
+  }
+
+  @Test
+  public void testNodeSwap() throws Exception {
+    Map<String, ExternalView> record = new HashMap<>();
+    for (String db : _allDBs) {
+      record.put(db,
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db));
+    }
+    ConfigAccessor configAccessor = new ConfigAccessor(_gZkClient);
+
+    // 1. disable an old node
+    MockParticipantManager oldParticipant = _participants.get(0);
+    String oldParticipantName = oldParticipant.getInstanceName();
+    final InstanceConfig instanceConfig =
+        _gSetupTool.getClusterManagementTool().getInstanceConfig(CLUSTER_NAME, oldParticipantName);
+    instanceConfig.setInstanceEnabled(false);
+    _gSetupTool.getClusterManagementTool()
+        .setInstanceConfig(CLUSTER_NAME, oldParticipantName, instanceConfig);
+    Assert.assertTrue(_clusterVerifier.verify(10000));
+
+    // 2. then entering maintenance mode and remove it from topology
+    _gSetupTool.getClusterManagementTool()
+        .manuallyEnableMaintenanceMode(CLUSTER_NAME, true, "NodeSwap", Collections.emptyMap());
+    oldParticipant.syncStop();
+    _participants.remove(oldParticipant);
+    Thread.sleep(2000);
+    _gSetupTool.getClusterManagementTool().dropInstance(CLUSTER_NAME, instanceConfig);
+
+    // 3. create new participant with same topology
+    String newParticipantName = "RandomParticipant_" + START_PORT;
+    _gSetupTool.addInstanceToCluster(CLUSTER_NAME, newParticipantName);
+    InstanceConfig newConfig = configAccessor.getInstanceConfig(CLUSTER_NAME, newParticipantName);
+    String zone = instanceConfig.getDomainAsMap().get("zone");
+    String domain = String.format("zone=%s,instance=%s", zone, newParticipantName);
+    newConfig.setDomain(domain);
+    _gSetupTool.getClusterManagementTool()
+        .setInstanceConfig(CLUSTER_NAME, newParticipantName, newConfig);
+
+    MockParticipantManager participant =
+        new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, newParticipantName);
+    participant.syncStart();
+    _participants.add(0, participant);
+
+    // 4. exit maintenance mode and rebalance
+    _gSetupTool.getClusterManagementTool()
+        .manuallyEnableMaintenanceMode(CLUSTER_NAME, false, "NodeSwapDone", Collections.emptyMap());
+
+    Thread.sleep(2000);
+    Assert.assertTrue(_clusterVerifier.verify(5000));
+
+    // Since only one node was temporarily down, the same partitions should be moved to the newly added node.
+    for (String db : _allDBs) {
+      ExternalView ev =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+      ExternalView oldEv = record.get(db);
+      for (String partition : ev.getPartitionSet()) {
+        Map<String, String> stateMap = ev.getStateMap(partition);
+        Map<String, String> oldStateMap = oldEv.getStateMap(partition);
+        Assert.assertTrue(oldStateMap != null && stateMap != null);
+        Assert.assertEquals(stateMap.size(), _replica);
+        // Note that the WAGED rebalancer won't ensure the same states, because moving the top
+        // states back to the replaced node might be unnecessary and cause overhead.
+        Set<String> instanceSet = new HashSet<>(stateMap.keySet());
+        if (instanceSet.remove(newParticipantName)) {
+          instanceSet.add(oldParticipantName);
+        }
+        Assert.assertEquals(oldStateMap.keySet(), instanceSet);
+      }
+    }
+  }
+
+  @Test(dependsOnMethods = "testNodeSwap")
+  public void testFaultZoneSwap() throws Exception {
+    Map<String, ExternalView> record = new HashMap<>();
+    for (String db : _allDBs) {
+      record.put(db,
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db));
+    }
+    ConfigAccessor configAccessor = new ConfigAccessor(_gZkClient);
+
+    // 1. disable a whole fault zone
+    Map<String, InstanceConfig> removedInstanceConfigMap = new HashMap<>();
+    for (MockParticipantManager participant : _participants) {
+      String instanceName = participant.getInstanceName();
+      InstanceConfig instanceConfig =
+          _gSetupTool.getClusterManagementTool().getInstanceConfig(CLUSTER_NAME, instanceName);
+      if (instanceConfig.getDomainAsMap().get("zone").equals("zone-0")) {
+        instanceConfig.setInstanceEnabled(false);
+        _gSetupTool.getClusterManagementTool()
+            .setInstanceConfig(CLUSTER_NAME, instanceName, instanceConfig);
+        removedInstanceConfigMap.put(instanceName, instanceConfig);
+      }
+    }
+    Assert.assertTrue(_clusterVerifier.verify(10000));
+
+    // 2. then entering maintenance mode and remove all the zone-0 nodes from topology
+    _gSetupTool.getClusterManagementTool()
+        .manuallyEnableMaintenanceMode(CLUSTER_NAME, true, "NodeSwap", Collections.emptyMap());
+    Iterator<MockParticipantManager> iter = _participants.iterator();
+    while (iter.hasNext()) {
+      MockParticipantManager participant = iter.next();
+      String instanceName = participant.getInstanceName();
+      if (removedInstanceConfigMap.containsKey(instanceName)) {
+        participant.syncStop();
+        iter.remove();
+        Thread.sleep(1000);
+        _gSetupTool.getClusterManagementTool()
+            .dropInstance(CLUSTER_NAME, removedInstanceConfigMap.get(instanceName));
+      }
+    }
+
+    // 3. create new participants with same topology
+    Set<String> newInstanceNames = new HashSet<>();
+    for (int i = 0; i < removedInstanceConfigMap.size(); i++) {
+      String newParticipantName = "NewParticipant_" + (START_PORT + i);
+      newInstanceNames.add(newParticipantName);
+      _gSetupTool.addInstanceToCluster(CLUSTER_NAME, newParticipantName);
+      InstanceConfig newConfig = configAccessor.getInstanceConfig(CLUSTER_NAME, newParticipantName);
+      String domain = String.format("zone=zone-0,instance=%s", newParticipantName);
+      newConfig.setDomain(domain);
+      _gSetupTool.getClusterManagementTool()
+          .setInstanceConfig(CLUSTER_NAME, newParticipantName, newConfig);
+
+      MockParticipantManager participant =
+          new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, newParticipantName);
+      participant.syncStart();
+      _participants.add(participant);
+    }
+
+    // 4. exit maintenance mode and rebalance
+    _gSetupTool.getClusterManagementTool()
+        .manuallyEnableMaintenanceMode(CLUSTER_NAME, false, "NodeSwapDone", Collections.emptyMap());
+
+    Thread.sleep(2000);
+    Assert.assertTrue(_clusterVerifier.verify(5000));
+
+    for (String db : _allDBs) {
+      ExternalView ev =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+      ExternalView oldEv = record.get(db);
+      for (String partition : ev.getPartitionSet()) {
+        Map<String, String> stateMap = ev.getStateMap(partition);
+        Map<String, String> oldStateMap = oldEv.getStateMap(partition);
+        Assert.assertTrue(oldStateMap != null && stateMap != null);
+        Assert.assertEquals(stateMap.size(), _replica);
+        Set<String> instanceSet = new HashSet<>(stateMap.keySet());
+        instanceSet.removeAll(oldStateMap.keySet());
+        // All the differing instances in the new mapping must be among the newly added instances.
+        Assert.assertTrue(newInstanceNames.containsAll(instanceSet));
+        instanceSet = new HashSet<>(oldStateMap.keySet());
+        instanceSet.removeAll(stateMap.keySet());
+        // All the differing instances in the old mapping must be among the removed instances.
+        Assert.assertTrue(removedInstanceConfigMap.keySet().containsAll(instanceSet));
+      }
+    }
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedRebalance.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedRebalance.java
new file mode 100644
index 0000000..96180bf
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedRebalance.java
@@ -0,0 +1,590 @@
+package org.apache.helix.integration.rebalancer.WagedRebalancer;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.ArrayList;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import com.google.common.collect.ImmutableMap;
+import org.apache.helix.ConfigAccessor;
+import org.apache.helix.TestHelper;
+import org.apache.helix.common.ZkTestBase;
+import org.apache.helix.controller.rebalancer.strategy.CrushEdRebalanceStrategy;
+import org.apache.helix.controller.rebalancer.strategy.CrushRebalanceStrategy;
+import org.apache.helix.integration.manager.ClusterControllerManager;
+import org.apache.helix.integration.manager.MockParticipantManager;
+import org.apache.helix.model.BuiltInStateModelDefinitions;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.ExternalView;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.tools.ClusterVerifiers.HelixClusterVerifier;
+import org.apache.helix.tools.ClusterVerifiers.StrictMatchExternalViewVerifier;
+import org.apache.helix.tools.ClusterVerifiers.ZkHelixClusterVerifier;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.AfterMethod;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+public class TestWagedRebalance extends ZkTestBase {
+  protected final int NUM_NODE = 6;
+  protected static final int START_PORT = 12918;
+  protected static final int PARTITIONS = 20;
+  protected static final int TAGS = 2;
+
+  protected final String CLASS_NAME = getShortClassName();
+  protected final String CLUSTER_NAME = CLUSTER_PREFIX + "_" + CLASS_NAME;
+  protected ClusterControllerManager _controller;
+
+  List<MockParticipantManager> _participants = new ArrayList<>();
+  Map<String, String> _nodeToTagMap = new HashMap<>();
+  List<String> _nodes = new ArrayList<>();
+  private Set<String> _allDBs = new HashSet<>();
+  private int _replica = 3;
+
+  private static String[] _testModels = {
+      BuiltInStateModelDefinitions.OnlineOffline.name(),
+      BuiltInStateModelDefinitions.MasterSlave.name(),
+      BuiltInStateModelDefinitions.LeaderStandby.name()
+  };
+
+  @BeforeClass
+  public void beforeClass() throws Exception {
+    System.out.println("START " + CLASS_NAME + " at " + new Date(System.currentTimeMillis()));
+
+    _gSetupTool.addCluster(CLUSTER_NAME, true);
+
+    for (int i = 0; i < NUM_NODE; i++) {
+      String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      addInstanceConfig(storageNodeName, i, TAGS);
+    }
+
+    // start dummy participants
+    for (String node : _nodes) {
+      MockParticipantManager participant = new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, node);
+      participant.syncStart();
+      _participants.add(participant);
+    }
+
+    // start controller
+    String controllerName = CONTROLLER_PREFIX + "_0";
+    _controller = new ClusterControllerManager(ZK_ADDR, CLUSTER_NAME, controllerName);
+    _controller.syncStart();
+
+    enablePersistBestPossibleAssignment(_gZkClient, CLUSTER_NAME, true);
+  }
+
+  protected void addInstanceConfig(String storageNodeName, int seqNo, int tagCount) {
+    _gSetupTool.addInstanceToCluster(CLUSTER_NAME, storageNodeName);
+    String tag = "tag-" + seqNo % tagCount;
+    _gSetupTool.getClusterManagementTool().addInstanceTag(CLUSTER_NAME, storageNodeName, tag);
+    _nodeToTagMap.put(storageNodeName, tag);
+    _nodes.add(storageNodeName);
+  }
+
+  @Test
+  public void test() throws Exception {
+    int i = 0;
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-" + TestHelper.getTestMethodName() + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica, _replica);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    Thread.sleep(300);
+
+    validate(_replica);
+
+    // Adding 3 more resources
+    i = 0;
+    for (String stateModel : _testModels) {
+      String moreDB = "More-Test-DB-" + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, moreDB, stateModel, PARTITIONS, _replica,
+          _replica);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, moreDB, _replica);
+      _allDBs.add(moreDB);
+
+      Thread.sleep(300);
+
+      validate(_replica);
+    }
+
+    // Drop the 3 additional resources
+    for (int j = 0; j < 3; j++) {
+      String moreDB = "More-Test-DB-" + j;
+      _gSetupTool.dropResourceFromCluster(CLUSTER_NAME, moreDB);
+      _allDBs.remove(moreDB);
+
+      Thread.sleep(300);
+
+      validate(_replica);
+    }
+  }
+
+  @Test(dependsOnMethods = "test")
+  public void testWithInstanceTag() throws Exception {
+    Set<String> tags = new HashSet<>(_nodeToTagMap.values());
+    int i = 3;
+    for (String tag : tags) {
+      String db = "Test-DB-" + TestHelper.getTestMethodName() + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db,
+          BuiltInStateModelDefinitions.MasterSlave.name(), PARTITIONS, _replica, _replica);
+      IdealState is =
+          _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
+      is.setInstanceGroupTag(tag);
+      _gSetupTool.getClusterManagementTool().setResourceIdealState(CLUSTER_NAME, db, is);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    Thread.sleep(300);
+    validate(_replica);
+  }
+
+  @Test(dependsOnMethods = "test")
+  public void testChangeIdealState() throws InterruptedException {
+    String dbName = "Test-DB-" + TestHelper.getTestMethodName();
+    createResourceWithWagedRebalance(CLUSTER_NAME, dbName,
+        BuiltInStateModelDefinitions.MasterSlave.name(), PARTITIONS, _replica, _replica);
+    _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, dbName, _replica);
+    _allDBs.add(dbName);
+    Thread.sleep(300);
+
+    validate(_replica);
+
+    // Adjust the replica count
+    IdealState is =
+        _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, dbName);
+    int newReplicaFactor = _replica - 1;
+    is.setReplicas("" + newReplicaFactor);
+    _gSetupTool.getClusterManagementTool().setResourceIdealState(CLUSTER_NAME, dbName, is);
+    Thread.sleep(300);
+
+    validate(newReplicaFactor);
+
+    // Adjust the partition list
+    is = _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, dbName);
+    is.setNumPartitions(PARTITIONS + 1);
+    _gSetupTool.getClusterManagementTool().setResourceIdealState(CLUSTER_NAME, dbName, is);
+    _gSetupTool.getClusterManagementTool().rebalance(CLUSTER_NAME, dbName, newReplicaFactor);
+    Thread.sleep(300);
+
+    validate(newReplicaFactor);
+    ExternalView ev =
+        _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, dbName);
+    Assert.assertEquals(ev.getPartitionSet().size(), PARTITIONS + 1);
+  }
+
+  @Test(dependsOnMethods = "test")
+  public void testDisableInstance() throws InterruptedException {
+    String dbName = "Test-DB-" + TestHelper.getTestMethodName();
+    createResourceWithWagedRebalance(CLUSTER_NAME, dbName,
+        BuiltInStateModelDefinitions.MasterSlave.name(), PARTITIONS, _replica, _replica);
+    _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, dbName, _replica);
+    _allDBs.add(dbName);
+    Thread.sleep(300);
+
+    validate(_replica);
+
+    // Disable participants, keeping only three enabled.
+    Set<String> disableParticipants = new HashSet<>();
+
+    try {
+      for (int i = 3; i < _participants.size(); i++) {
+        MockParticipantManager p = _participants.get(i);
+        disableParticipants.add(p.getInstanceName());
+        InstanceConfig config = _gSetupTool.getClusterManagementTool()
+            .getInstanceConfig(CLUSTER_NAME, p.getInstanceName());
+        config.setInstanceEnabled(false);
+        _gSetupTool.getClusterManagementTool()
+            .setInstanceConfig(CLUSTER_NAME, p.getInstanceName(), config);
+      }
+      Thread.sleep(300);
+
+      validate(_replica);
+
+      // Verify there is no assignment on the disabled participants.
+      ExternalView ev =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, dbName);
+      for (String partition : ev.getPartitionSet()) {
+        Map<String, String> replicaStateMap = ev.getStateMap(partition);
+        for (String instance : replicaStateMap.keySet()) {
+          Assert.assertFalse(disableParticipants.contains(instance));
+        }
+      }
+    } finally {
+      // recover the config
+      for (String instanceName : disableParticipants) {
+        InstanceConfig config =
+            _gSetupTool.getClusterManagementTool().getInstanceConfig(CLUSTER_NAME, instanceName);
+        config.setInstanceEnabled(true);
+        _gSetupTool.getClusterManagementTool()
+            .setInstanceConfig(CLUSTER_NAME, instanceName, config);
+      }
+    }
+  }
+
+  @Test(dependsOnMethods = "testDisableInstance")
+  public void testLackEnoughLiveInstances() throws Exception {
+    // Shut down participants, keeping only two live.
+    for (int i = 2; i < _participants.size(); i++) {
+      _participants.get(i).syncStop();
+    }
+
+    int j = 0;
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-" + TestHelper.getTestMethodName() + j++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+          _replica);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+
+    Thread.sleep(300);
+    // Verify that the partitions get assigned.
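+    // Only two instances are live, so each partition can host at most two replicas.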
+    validate(2);
+
+    // Restart the stopped participants.
+    for (int i = 2; i < _participants.size(); i++) {
+      MockParticipantManager p = _participants.get(i);
+      MockParticipantManager newNode =
+          new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, p.getInstanceName());
+      _participants.set(i, newNode);
+      newNode.syncStart();
+    }
+
+    Thread.sleep(300);
+    // Verify that the partitions get assigned.
+    validate(_replica);
+  }
+
+  @Test(dependsOnMethods = "testDisableInstance")
+  public void testLackEnoughInstances() throws Exception {
+    // Shut down and drop participants, keeping only two in the cluster.
+    for (int i = 2; i < _participants.size(); i++) {
+      MockParticipantManager p = _participants.get(i);
+      p.syncStop();
+      _gSetupTool.getClusterManagementTool()
+          .enableInstance(CLUSTER_NAME, p.getInstanceName(), false);
+      _gSetupTool.dropInstanceFromCluster(CLUSTER_NAME, p.getInstanceName());
+    }
+
+    int j = 0;
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-" + TestHelper.getTestMethodName() + j++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+          _replica);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+
+    Thread.sleep(300);
+    // Verify that the partitions get assigned.
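+    // Only two instances remain in the cluster, so expect two replicas per partition.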
+    validate(2);
+
+    // Create replacement participants for the dropped instances.
+    for (int i = 2; i < _participants.size(); i++) {
+      MockParticipantManager p = _participants.get(i);
+      String replaceNodeName = p.getInstanceName() + "-replacement_" + START_PORT;
+      addInstanceConfig(replaceNodeName, i, TAGS);
+      MockParticipantManager newNode =
+          new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, replaceNodeName);
+      _participants.set(i, newNode);
+      newNode.syncStart();
+    }
+
+    Thread.sleep(300);
+    // Verify that the partitions get assigned.
+    validate(_replica);
+  }
+
+  @Test(dependsOnMethods = "test")
+  public void testMixedRebalancerUsage() throws InterruptedException {
+    int i = 0;
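+    // Mix one CRUSH resource, one CRUSH-ED resource, and one WAGED resource in the same cluster.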
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-" + TestHelper.getTestMethodName() + i++;
+      if (i == 1) {
+        _gSetupTool.addResourceToCluster(CLUSTER_NAME, db, PARTITIONS, stateModel,
+            IdealState.RebalanceMode.FULL_AUTO + "", CrushRebalanceStrategy.class.getName());
+      } else if (i == 2) {
+        _gSetupTool.addResourceToCluster(CLUSTER_NAME, db, PARTITIONS, stateModel,
+            IdealState.RebalanceMode.FULL_AUTO + "", CrushEdRebalanceStrategy.class.getName());
+      } else {
+        createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+            _replica);
+      }
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    Thread.sleep(300);
+
+    validate(_replica);
+  }
+
+  @Test(dependsOnMethods = "test")
+  public void testMaxPartitionLimitation() throws Exception {
+    ConfigAccessor configAccessor = new ConfigAccessor(_gZkClient);
+    ClusterConfig clusterConfig = configAccessor.getClusterConfig(CLUSTER_NAME);
+    // Change the cluster level config so no assignment can be done
+    clusterConfig.setMaxPartitionsPerInstance(1);
+    configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+    try {
+      String limitedResourceName = null;
+      int i = 0;
+      for (String stateModel : _testModels) {
+        String db = "Test-DB-" + TestHelper.getTestMethodName() + i++;
+        createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+            _replica);
+        if (i == 1) {
+          // The limited resource has an additional restriction. In theory, the other resources
+          // could have been assigned if the WAGED rebalancer were not used. However, with the
+          // WAGED rebalancer, this restricted resource blocks the others as well.
+          limitedResourceName = db;
+          IdealState idealState =
+              _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
+          idealState.setMaxPartitionsPerInstance(1);
+          _gSetupTool.getClusterManagementTool().setResourceIdealState(CLUSTER_NAME, db, idealState);
+        }
+        _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+        _allDBs.add(db);
+      }
+      Thread.sleep(300);
+
+      // Since the WAGED rebalancer needs to finish rebalancing every resource, the
+      // initial assignment won't show.
+      Assert.assertFalse(TestHelper.verify(() -> _allDBs.stream().anyMatch(db -> {
+        ExternalView ev =
+            _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+        return ev != null && !ev.getPartitionSet().isEmpty();
+      }), 2000));
+
+      // Remove the cluster level limitation
+      clusterConfig.setMaxPartitionsPerInstance(-1);
+      configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+      Thread.sleep(300);
+
+      // Since the WAGED rebalancer needs to finish rebalancing every resource, the
+      // assignment still won't show even after the cluster level restriction is removed.
+      Assert.assertFalse(TestHelper.verify(() -> _allDBs.stream().anyMatch(db -> {
+        ExternalView ev =
+            _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+        return ev != null && !ev.getPartitionSet().isEmpty();
+      }), 2000));
+
+      // Remove the resource level limitation
+      IdealState idealState = _gSetupTool.getClusterManagementTool()
+          .getResourceIdealState(CLUSTER_NAME, limitedResourceName);
+      idealState.setMaxPartitionsPerInstance(Integer.MAX_VALUE);
+      _gSetupTool.getClusterManagementTool()
+          .setResourceIdealState(CLUSTER_NAME, limitedResourceName, idealState);
+
+      validate(_replica);
+    } finally {
+      // Revert the config change
+      clusterConfig = configAccessor.getClusterConfig(CLUSTER_NAME);
+      clusterConfig.setMaxPartitionsPerInstance(-1);
+      configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+    }
+  }
+
+  @Test(dependsOnMethods = "test")
+  public void testNewInstances() throws InterruptedException {
+    ConfigAccessor configAccessor = new ConfigAccessor(_gZkClient);
+    ClusterConfig clusterConfig = configAccessor.getClusterConfig(CLUSTER_NAME);
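+    // Weight LESS_MOVEMENT heavily over EVENNESS so that adding a new instance does
+    // not, by itself, pull any replicas onto it.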
+    clusterConfig.setGlobalRebalancePreference(ImmutableMap
+        .of(ClusterConfig.GlobalRebalancePreferenceKey.EVENNESS, 0,
+            ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 10));
+    configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+
+    int i = 0;
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-" + TestHelper.getTestMethodName() + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica, _replica);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    Thread.sleep(300);
+    validate(_replica);
+
+    String newNodeName = "newNode-" + TestHelper.getTestMethodName() + "_" + START_PORT;
+    MockParticipantManager participant =
+        new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, newNodeName);
+    try {
+      _gSetupTool.addInstanceToCluster(CLUSTER_NAME, newNodeName);
+      participant.syncStart();
+
+      Thread.sleep(300);
+      validate(_replica);
+
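+      // With movement de-prioritized, the new node should not host any replica yet.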
+      Assert.assertFalse(_allDBs.stream().anyMatch(db -> {
+        ExternalView ev =
+            _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+        for (String partition : ev.getPartitionSet()) {
+          if (ev.getStateMap(partition).containsKey(newNodeName)) {
+            return true;
+          }
+        }
+        return false;
+      }));
+
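+      // Restore the default preference; the rebalancer should now move some replicas
+      // onto the new node for evenness.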
+      clusterConfig.setGlobalRebalancePreference(ClusterConfig.DEFAULT_GLOBAL_REBALANCE_PREFERENCE);
+      configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+      Thread.sleep(300);
+      validate(_replica);
+
+      Assert.assertTrue(_allDBs.stream().anyMatch(db -> {
+        ExternalView ev =
+            _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+        for (String partition : ev.getPartitionSet()) {
+          if (ev.getStateMap(partition).containsKey(newNodeName)) {
+            return true;
+          }
+        }
+        return false;
+      }));
+    } finally {
+      if (participant != null && participant.isConnected()) {
+        participant.syncStop();
+      }
+    }
+  }
+
+  /**
+   * The stateful WAGED rebalancer is reset when the controller regains leadership.
+   * This test verifies that the reset happens and that the rebalancer forgets any
+   * previous status after the leadership switch.
+   */
+  @Test(dependsOnMethods = "test")
+  public void testRebalancerReset() throws Exception {
+    // Configure the rebalance preference to favor evenness so that more partition
+    // movements are triggered. This ensures the controller will try to move something
+    // once the rebalancer has been reset.
+    ConfigAccessor configAccessor = new ConfigAccessor(_gZkClient);
+    ClusterConfig clusterConfig = configAccessor.getClusterConfig(CLUSTER_NAME);
+    clusterConfig.setGlobalRebalancePreference(ImmutableMap
+        .of(ClusterConfig.GlobalRebalancePreferenceKey.EVENNESS, 10,
+            ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 0));
+    configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+
+    int i = 0;
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-" + TestHelper.getTestMethodName() + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+          _replica);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    // TODO remove this sleep after fix https://github.com/apache/helix/issues/526
+    Thread.sleep(300);
+    validate(_replica);
+
+    // Add one more resource. Since it is added after the other resources, its
+    // assignment is constrained by the existing resources' assignments.
+    String moreDB = "More-Test-DB";
+    createResourceWithWagedRebalance(CLUSTER_NAME, moreDB,
+        BuiltInStateModelDefinitions.MasterSlave.name(), PARTITIONS, _replica, _replica);
+    _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, moreDB, _replica);
+    _allDBs.add(moreDB);
+    // TODO remove this sleep after fix https://github.com/apache/helix/issues/526
+    Thread.sleep(300);
+    validate(_replica);
+    ExternalView oldEV =
+        _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, moreDB);
+
+    // Expire the controller session so it will reset the internal rebalancer's status.
+    simulateSessionExpiry(_controller.getZkClient());
+    // After the reset, the rebalancer will try to rebalance all the partitions since
+    // it has forgotten the previous state.
+    // TODO remove this sleep after fix https://github.com/apache/helix/issues/526
+    Thread.sleep(300);
+    validate(_replica);
+    ExternalView newEV =
+        _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, moreDB);
+
+    // Verify that the controller has moved some partitions.
+    Assert.assertNotEquals(newEV, oldEV);
+  }
+
+  private void validate(int expectedReplica) {
+    HelixClusterVerifier clusterVerifier =
+        new StrictMatchExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR)
+            .setDeactivatedNodeAwareness(true).setResources(_allDBs).build();
+    Assert.assertTrue(clusterVerifier.verify(5000));
+    for (String db : _allDBs) {
+      IdealState is =
+          _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
+      ExternalView ev =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+      validateIsolation(is, ev, expectedReplica);
+    }
+  }
+
+  /**
+   * Validate that the replicas of each partition are placed on different instances
+   * and, when an instance group tag is set, only on instances carrying that tag.
+   */
+  private void validateIsolation(IdealState is, ExternalView ev, int expectedReplica) {
+    String tag = is.getInstanceGroupTag();
+    for (String partition : is.getPartitionSet()) {
+      Map<String, String> assignmentMap = ev.getRecord().getMapField(partition);
+      Set<String> instancesInEV = assignmentMap.keySet();
+      Assert.assertEquals(instancesInEV.size(), expectedReplica);
+      for (String instance : instancesInEV) {
+        if (tag != null) {
+          InstanceConfig config =
+              _gSetupTool.getClusterManagementTool().getInstanceConfig(CLUSTER_NAME, instance);
+          Assert.assertTrue(config.containsTag(tag));
+        }
+      }
+    }
+  }
+
+  @AfterMethod
+  public void afterMethod() throws Exception {
+    for (String db : _allDBs) {
+      _gSetupTool.dropResourceFromCluster(CLUSTER_NAME, db);
+    }
+    _allDBs.clear();
+    // Wait for all DBs to be dropped.
+    Thread.sleep(100);
+    ZkHelixClusterVerifier clusterVerifier =
+        new StrictMatchExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR)
+            .setDeactivatedNodeAwareness(true).setResources(_allDBs).build();
+    Assert.assertTrue(clusterVerifier.verifyByPolling());
+  }
+
+  @AfterClass
+  public void afterClass() throws Exception {
+    if (_controller != null && _controller.isConnected()) {
+      _controller.syncStop();
+    }
+    for (MockParticipantManager p : _participants) {
+      if (p != null && p.isConnected()) {
+        p.syncStop();
+      }
+    }
+    deleteCluster(CLUSTER_NAME);
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedRebalanceFaultZone.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedRebalanceFaultZone.java
new file mode 100644
index 0000000..904e0bc
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedRebalanceFaultZone.java
@@ -0,0 +1,372 @@
+package org.apache.helix.integration.rebalancer.WagedRebalancer;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.ArrayList;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.helix.ConfigAccessor;
+import org.apache.helix.common.ZkTestBase;
+import org.apache.helix.integration.manager.ClusterControllerManager;
+import org.apache.helix.integration.manager.MockParticipantManager;
+import org.apache.helix.model.BuiltInStateModelDefinitions;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.ExternalView;
+import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.tools.ClusterVerifiers.BestPossibleExternalViewVerifier;
+import org.apache.helix.tools.ClusterVerifiers.ZkHelixClusterVerifier;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.AfterMethod;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+public class TestWagedRebalanceFaultZone extends ZkTestBase {
+  protected final int NUM_NODE = 6;
+  protected static final int START_PORT = 12918;
+  protected static final int PARTITIONS = 20;
+  protected static final int ZONES = 3;
+  protected static final int TAGS = 2;
+
+  protected final String CLASS_NAME = getShortClassName();
+  protected final String CLUSTER_NAME = CLUSTER_PREFIX + "_" + CLASS_NAME;
+  protected ClusterControllerManager _controller;
+
+  List<MockParticipantManager> _participants = new ArrayList<>();
+  Map<String, String> _nodeToZoneMap = new HashMap<>();
+  Map<String, String> _nodeToTagMap = new HashMap<>();
+  List<String> _nodes = new ArrayList<>();
+  Set<String> _allDBs = new HashSet<>();
+  int _replica = 3;
+
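+  // State model definitions exercised by the tests below.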
+  String[] _testModels = {
+      BuiltInStateModelDefinitions.OnlineOffline.name(),
+      BuiltInStateModelDefinitions.MasterSlave.name(),
+      BuiltInStateModelDefinitions.LeaderStandby.name()
+  };
+
+  @BeforeClass
+  public void beforeClass() throws Exception {
+    System.out.println("START " + CLASS_NAME + " at " + new Date(System.currentTimeMillis()));
+
+    _gSetupTool.addCluster(CLUSTER_NAME, true);
+
+    for (int i = 0; i < NUM_NODE; i++) {
+      String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      addInstanceConfig(storageNodeName, i, ZONES, TAGS);
+    }
+
+    // start dummy participants
+    for (String node : _nodes) {
+      MockParticipantManager participant = new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, node);
+      participant.syncStart();
+      _participants.add(participant);
+    }
+
+    // start controller
+    String controllerName = CONTROLLER_PREFIX + "_0";
+    _controller = new ClusterControllerManager(ZK_ADDR, CLUSTER_NAME, controllerName);
+    _controller.syncStart();
+
+    enablePersistBestPossibleAssignment(_gZkClient, CLUSTER_NAME, true);
+    enableTopologyAwareRebalance(_gZkClient, CLUSTER_NAME, true);
+  }
+
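+  /**
+   * Add an instance and assign it a zone and a tag in a round-robin fashion based on
+   * its sequence number.
+   */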
+  protected void addInstanceConfig(String storageNodeName, int seqNo, int zoneCount, int tagCount) {
+    _gSetupTool.addInstanceToCluster(CLUSTER_NAME, storageNodeName);
+    String zone = "zone-" + seqNo % zoneCount;
+    String tag = "tag-" + seqNo % tagCount;
+    _gSetupTool.getClusterManagementTool().setInstanceZoneId(CLUSTER_NAME, storageNodeName, zone);
+    _gSetupTool.getClusterManagementTool().addInstanceTag(CLUSTER_NAME, storageNodeName, tag);
+    _nodeToZoneMap.put(storageNodeName, zone);
+    _nodeToTagMap.put(storageNodeName, tag);
+    _nodes.add(storageNodeName);
+  }
+
+  @Test
+  public void testZoneIsolation() throws Exception {
+    int i = 0;
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-testZoneIsolation" + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+          _replica);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    Thread.sleep(300);
+
+    validate(_replica);
+  }
+
+  @Test
+  public void testZoneIsolationWithInstanceTag() throws Exception {
+    Set<String> tags = new HashSet<>(_nodeToTagMap.values());
+    int i = 0;
+    for (String tag : tags) {
+      String db = "Test-DB-testZoneIsolationWithInstanceTag" + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db,
+          BuiltInStateModelDefinitions.MasterSlave.name(), PARTITIONS, _replica, _replica);
+      IdealState is =
+          _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
+      is.setInstanceGroupTag(tag);
+      _gSetupTool.getClusterManagementTool().setResourceIdealState(CLUSTER_NAME, db, is);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    Thread.sleep(300);
+
+    validate(_replica);
+  }
+
+  @Test(dependsOnMethods = { "testZoneIsolation", "testZoneIsolationWithInstanceTag" })
+  public void testLackEnoughLiveRacks() throws Exception {
+    // shutdown participants within one zone
+    String zone = _nodeToZoneMap.values().iterator().next();
+    for (int i = 0; i < _participants.size(); i++) {
+      MockParticipantManager p = _participants.get(i);
+      if (_nodeToZoneMap.get(p.getInstanceName()).equals(zone)) {
+        p.syncStop();
+      }
+    }
+
+    int j = 0;
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-testLackEnoughLiveRacks" + j++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+          _replica);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    Thread.sleep(300);
+    validate(2);
+
+    // restart the participants within the zone
+    for (int i = 0; i < _participants.size(); i++) {
+      MockParticipantManager p = _participants.get(i);
+      if (_nodeToZoneMap.get(p.getInstanceName()).equals(zone)) {
+        MockParticipantManager newNode =
+            new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, p.getInstanceName());
+        _participants.set(i, newNode);
+        newNode.syncStart();
+      }
+    }
+
+    Thread.sleep(300);
+    // Verify that the partitions get assigned
+    validate(_replica);
+  }
+
+  @Test(dependsOnMethods = { "testLackEnoughLiveRacks" })
+  public void testLackEnoughRacks() throws Exception {
+    // shutdown participants within one zone
+    String zone = _nodeToZoneMap.values().iterator().next();
+    for (int i = 0; i < _participants.size(); i++) {
+      MockParticipantManager p = _participants.get(i);
+      if (_nodeToZoneMap.get(p.getInstanceName()).equals(zone)) {
+        p.syncStop();
+        _gSetupTool.getClusterManagementTool()
+            .enableInstance(CLUSTER_NAME, p.getInstanceName(), false);
+        Thread.sleep(50);
+        _gSetupTool.dropInstanceFromCluster(CLUSTER_NAME, p.getInstanceName());
+      }
+    }
+
+    int j = 0;
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-testLackEnoughRacks" + j++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+          _replica);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    Thread.sleep(300);
+    validate(2);
+
+    // Create new participants within the zone
+    int nodeCount = _participants.size();
+    for (int i = 0; i < nodeCount; i++) {
+      MockParticipantManager p = _participants.get(i);
+      if (_nodeToZoneMap.get(p.getInstanceName()).equals(zone)) {
+        String replaceNodeName = p.getInstanceName() + "-replacement_" + START_PORT;
+        addInstanceConfig(replaceNodeName, i, ZONES, TAGS);
+        MockParticipantManager newNode =
+            new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, replaceNodeName);
+        _participants.set(i, newNode);
+        newNode.syncStart();
+      }
+    }
+
+    Thread.sleep(300);
+    // Verify that the partitions get assigned
+    validate(_replica);
+  }
+
+  @Test(dependsOnMethods = { "testZoneIsolation", "testZoneIsolationWithInstanceTag" })
+  public void testAddZone() throws Exception {
+    int i = 0;
+    for (String stateModel : _testModels) {
+      String db = "Test-DB-testAddZone" + i++;
+      createResourceWithWagedRebalance(CLUSTER_NAME, db, stateModel, PARTITIONS, _replica,
+          _replica);
+      _gSetupTool.rebalanceStorageCluster(CLUSTER_NAME, db, _replica);
+      _allDBs.add(db);
+    }
+    Thread.sleep(300);
+
+    validate(_replica);
+
+    // Create new participants within a new zone
+    Set<MockParticipantManager> newNodes = new HashSet<>();
+    Map<String, Integer> newNodeReplicaCount = new HashMap<>();
+
+    ConfigAccessor configAccessor = new ConfigAccessor(_gZkClient);
+
+    try {
+      // Configure the preference to allow movements.
+      ClusterConfig clusterConfig = configAccessor.getClusterConfig(CLUSTER_NAME);
+      Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> preference = new HashMap<>();
+      preference.put(ClusterConfig.GlobalRebalancePreferenceKey.EVENNESS, 10);
+      preference.put(ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 0);
+      clusterConfig.setGlobalRebalancePreference(preference);
+      configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+
+      int nodeCount = 2;
+      for (int j = 0; j < nodeCount; j++) {
+        String newNodeName = "new-zone-node-" + j + "_" + START_PORT;
+        // Add all the new nodes to the new zone
+        addInstanceConfig(newNodeName, j, ZONES + 1, TAGS);
+        MockParticipantManager newNode =
+            new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, newNodeName);
+        newNode.syncStart();
+        newNodes.add(newNode);
+        newNodeReplicaCount.put(newNodeName, 0);
+      }
+      Thread.sleep(300);
+
+      validate(_replica);
+
+      // The nodes in the new zone should receive some assignments
+      for (String db : _allDBs) {
+        IdealState is =
+            _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
+        ExternalView ev =
+            _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+        validateZoneAndTagIsolation(is, ev, _replica);
+        for (String partition : ev.getPartitionSet()) {
+          Map<String, String> stateMap = ev.getStateMap(partition);
+          for (String node : stateMap.keySet()) {
+            // computeIfPresent is a no-op for nodes that are not in the new zone.
+            newNodeReplicaCount.computeIfPresent(node, (nodeName, replicaCount) -> replicaCount + 1);
+          }
+        }
+      }
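+      // Every node in the new zone should host at least one replica.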
+      Assert.assertTrue(newNodeReplicaCount.values().stream().allMatch(count -> count > 0));
+    } finally {
+      // Revert the preference
+      ClusterConfig clusterConfig = configAccessor.getClusterConfig(CLUSTER_NAME);
+      Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> preference = new HashMap<>();
+      preference.put(ClusterConfig.GlobalRebalancePreferenceKey.EVENNESS, 1);
+      preference.put(ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 1);
+      clusterConfig.setGlobalRebalancePreference(preference);
+      configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+      // Stop the new nodes
+      for (MockParticipantManager p : newNodes) {
+        if (p != null && p.isConnected()) {
+          p.syncStop();
+        }
+      }
+    }
+  }
+
+  private void validate(int expectedReplica) {
+    ZkHelixClusterVerifier clusterVerifier =
+        new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR).build();
+    Assert.assertTrue(clusterVerifier.verifyByPolling());
+
+    for (String db : _allDBs) {
+      IdealState is =
+          _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, db);
+      ExternalView ev =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, db);
+      validateZoneAndTagIsolation(is, ev, expectedReplica);
+    }
+  }
+
+  /**
+   * Validate that the instances hosting each partition are in different zones and,
+   * when an instance group tag is set, carry that tag.
+   */
+  private void validateZoneAndTagIsolation(IdealState is, ExternalView ev, int expectedReplica) {
+    String tag = is.getInstanceGroupTag();
+    for (String partition : is.getPartitionSet()) {
+      Set<String> assignedZones = new HashSet<>();
+
+      Map<String, String> assignmentMap = ev.getRecord().getMapField(partition);
+      Set<String> instancesInEV = assignmentMap.keySet();
+      // TODO: preference List is not persisted in IS.
+      // Assert.assertEquals(instancesInEV, instancesInIs);
+      for (String instance : instancesInEV) {
+        assignedZones.add(_nodeToZoneMap.get(instance));
+        if (tag != null) {
+          InstanceConfig config =
+              _gSetupTool.getClusterManagementTool().getInstanceConfig(CLUSTER_NAME, instance);
+          Assert.assertTrue(config.containsTag(tag));
+        }
+      }
+      Assert.assertEquals(assignedZones.size(), expectedReplica);
+    }
+  }
+
+  @AfterMethod
+  public void afterMethod() throws Exception {
+    for (String db : _allDBs) {
+      _gSetupTool.dropResourceFromCluster(CLUSTER_NAME, db);
+    }
+    _allDBs.clear();
+    // Wait for all DBs to be dropped.
+    Thread.sleep(100);
+    ZkHelixClusterVerifier clusterVerifier =
+        new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkAddr(ZK_ADDR).build();
+    Assert.assertTrue(clusterVerifier.verifyByPolling());
+  }
+
+  @AfterClass
+  public void afterClass() throws Exception {
+    /*
+     * shutdown order: 1) disconnect the controller 2) disconnect participants
+     */
+    if (_controller != null && _controller.isConnected()) {
+      _controller.syncStop();
+    }
+    for (MockParticipantManager p : _participants) {
+      if (p != null && p.isConnected()) {
+        p.syncStop();
+      }
+    }
+    deleteCluster(CLUSTER_NAME);
+    System.out.println("END " + CLASS_NAME + " at " + new Date(System.currentTimeMillis()));
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedRebalanceTopologyAware.java b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedRebalanceTopologyAware.java
new file mode 100644
index 0000000..412fc8c
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedRebalanceTopologyAware.java
@@ -0,0 +1,114 @@
+package org.apache.helix.integration.rebalancer.WagedRebalancer;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Date;
+
+import org.apache.helix.ConfigAccessor;
+import org.apache.helix.integration.manager.ClusterControllerManager;
+import org.apache.helix.integration.manager.MockParticipantManager;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.InstanceConfig;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+public class TestWagedRebalanceTopologyAware extends TestWagedRebalanceFaultZone {
+  private static final String TOPOLOGY_DEF = "/DOMAIN/ZONE/INSTANCE";
+  private static final String DOMAIN_NAME = "Domain";
+  private static final String FAULT_ZONE = "ZONE";
+
+  protected final String CLASS_NAME = getShortClassName();
+  protected final String CLUSTER_NAME = CLUSTER_PREFIX + "_" + CLASS_NAME;
+
+  @BeforeClass
+  public void beforeClass() throws Exception {
+    System.out.println("START " + CLASS_NAME + " at " + new Date(System.currentTimeMillis()));
+
+    _gSetupTool.addCluster(CLUSTER_NAME, true);
+
+    ConfigAccessor configAccessor = new ConfigAccessor(_gZkClient);
+    ClusterConfig clusterConfig = configAccessor.getClusterConfig(CLUSTER_NAME);
+    clusterConfig.setTopology(TOPOLOGY_DEF);
+    clusterConfig.setFaultZoneType(FAULT_ZONE);
+    configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+
+    for (int i = 0; i < NUM_NODE; i++) {
+      String storageNodeName = PARTICIPANT_PREFIX + "_" + (START_PORT + i);
+      addInstanceConfig(storageNodeName, i, ZONES, TAGS);
+    }
+
+    // start dummy participants
+    for (String node : _nodes) {
+      MockParticipantManager participant = new MockParticipantManager(ZK_ADDR, CLUSTER_NAME, node);
+      participant.syncStart();
+      _participants.add(participant);
+    }
+
+    // start controller
+    String controllerName = CONTROLLER_PREFIX + "_0";
+    _controller = new ClusterControllerManager(ZK_ADDR, CLUSTER_NAME, controllerName);
+    _controller.syncStart();
+
+    enablePersistBestPossibleAssignment(_gZkClient, CLUSTER_NAME, true);
+    enableTopologyAwareRebalance(_gZkClient, CLUSTER_NAME, true);
+  }
+
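+  /**
+   * Add an instance and, unlike the parent class, encode its zone in the DOMAIN field
+   * so that it matches the cluster's topology definition.
+   */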
+  protected void addInstanceConfig(String storageNodeName, int seqNo, int zoneCount, int tagCount) {
+    _gSetupTool.addInstanceToCluster(CLUSTER_NAME, storageNodeName);
+    String zone = "zone-" + seqNo % zoneCount;
+    String tag = "tag-" + seqNo % tagCount;
+
+    InstanceConfig config =
+        _gSetupTool.getClusterManagementTool().getInstanceConfig(CLUSTER_NAME, storageNodeName);
+    config.setDomain(
+        String.format("DOMAIN=%s,ZONE=%s,INSTANCE=%s", DOMAIN_NAME, zone, storageNodeName));
+    config.addTag(tag);
+    _gSetupTool.getClusterManagementTool().setInstanceConfig(CLUSTER_NAME, storageNodeName, config);
+
+    _nodeToZoneMap.put(storageNodeName, zone);
+    _nodeToTagMap.put(storageNodeName, tag);
+    _nodes.add(storageNodeName);
+  }
+
+  @Test
+  public void testZoneIsolation() throws Exception {
+    super.testZoneIsolation();
+  }
+
+  @Test
+  public void testZoneIsolationWithInstanceTag() throws Exception {
+    super.testZoneIsolationWithInstanceTag();
+  }
+
+  @Test(dependsOnMethods = { "testZoneIsolation", "testZoneIsolationWithInstanceTag" })
+  public void testLackEnoughLiveRacks() throws Exception {
+    super.testLackEnoughLiveRacks();
+  }
+
+  @Test(dependsOnMethods = { "testLackEnoughLiveRacks" })
+  public void testLackEnoughRacks() throws Exception {
+    super.testLackEnoughRacks();
+  }
+
+  @Test(dependsOnMethods = { "testZoneIsolation", "testZoneIsolationWithInstanceTag" })
+  public void testAddZone() throws Exception {
+    super.testAddZone();
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/manager/zk/TestZkBucketDataAccessor.java b/helix-core/src/test/java/org/apache/helix/manager/zk/TestZkBucketDataAccessor.java
new file mode 100644
index 0000000..c7b5cbf
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/manager/zk/TestZkBucketDataAccessor.java
@@ -0,0 +1,189 @@
+package org.apache.helix.manager.zk;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+import org.I0Itec.zkclient.exception.ZkMarshallingError;
+import org.I0Itec.zkclient.serialize.ZkSerializer;
+import org.apache.helix.AccessOption;
+import org.apache.helix.BaseDataAccessor;
+import org.apache.helix.BucketDataAccessor;
+import org.apache.helix.HelixException;
+import org.apache.helix.HelixProperty;
+import org.apache.helix.TestHelper;
+import org.apache.helix.ZNRecord;
+import org.apache.helix.common.ZkTestBase;
+import org.apache.helix.manager.zk.client.DedicatedZkClientFactory;
+import org.apache.helix.manager.zk.client.HelixZkClient;
+import org.testng.Assert;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+public class TestZkBucketDataAccessor extends ZkTestBase {
+  private static final String PATH = "/" + TestHelper.getTestClassName();
+  private static final String NAME_KEY = TestHelper.getTestClassName();
+  private static final String LAST_SUCCESSFUL_WRITE_KEY = "LAST_SUCCESSFUL_WRITE";
+  private static final String LAST_WRITE_KEY = "LAST_WRITE";
+
+  // Populate list and map fields for content comparison
+  private static final List<String> LIST_FIELD = ImmutableList.of("1", "2");
+  private static final Map<String, String> MAP_FIELD = ImmutableMap.of("1", "2");
+
+  private BucketDataAccessor _bucketDataAccessor;
+  private BaseDataAccessor<byte[]> _zkBaseDataAccessor;
+
+  private ZNRecord record = new ZNRecord(NAME_KEY);
+
+  @BeforeClass
+  public void beforeClass() {
+    // Initialize ZK accessors for testing
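+    // Assumption on the ZkBucketDataAccessor arguments: 50 * 1024 is the bucket size
+    // in bytes and 0L is the version TTL, so stale versions are eligible for GC right away.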
+    _bucketDataAccessor = new ZkBucketDataAccessor(ZK_ADDR, 50 * 1024, 0L);
+    HelixZkClient zkClient = DedicatedZkClientFactory.getInstance()
+        .buildZkClient(new HelixZkClient.ZkConnectionConfig(ZK_ADDR));
+    zkClient.setZkSerializer(new ZkSerializer() {
+      @Override
+      public byte[] serialize(Object data) throws ZkMarshallingError {
+        if (data instanceof byte[]) {
+          return (byte[]) data;
+        }
+        throw new HelixException("ZkBucketDataAccessor only supports a byte array as an argument!");
+      }
+
+      @Override
+      public Object deserialize(byte[] data) throws ZkMarshallingError {
+        return data;
+      }
+    });
+    _zkBaseDataAccessor = new ZkBaseDataAccessor<>(zkClient);
+
+    // Fill in some data for the record
+    record.setSimpleField(NAME_KEY, NAME_KEY);
+    record.setListField(NAME_KEY, LIST_FIELD);
+    record.setMapField(NAME_KEY, MAP_FIELD);
+  }
+
+  @AfterClass
+  public void afterClass() {
+    _bucketDataAccessor.disconnect();
+  }
+
+  /**
+   * Attempt writing a simple HelixProperty using compressedBucketWrite.
+   * @throws IOException
+   */
+  @Test
+  public void testCompressedBucketWrite() throws IOException {
+    Assert.assertTrue(_bucketDataAccessor.compressedBucketWrite(PATH, new HelixProperty(record)));
+  }
+
+  @Test(dependsOnMethods = "testCompressedBucketWrite")
+  public void testMultipleWrites() throws Exception {
+    int count = 50;
+
+    // Write "count" times
+    for (int i = 0; i < count; i++) {
+      _bucketDataAccessor.compressedBucketWrite(PATH, new HelixProperty(record));
+    }
+
+    // Last known good version number should be "count"
+    byte[] binarySuccessfulWriteVer = _zkBaseDataAccessor
+        .get(PATH + "/" + LAST_SUCCESSFUL_WRITE_KEY, null, AccessOption.PERSISTENT);
+    long lastSuccessfulWriteVer = Long.parseLong(new String(binarySuccessfulWriteVer));
+    Assert.assertEquals(lastSuccessfulWriteVer, count);
+
+    // Last write version should be "count"
+    byte[] binaryWriteVer =
+        _zkBaseDataAccessor.get(PATH + "/" + LAST_WRITE_KEY, null, AccessOption.PERSISTENT);
+    long writeVer = Long.parseLong(new String(binaryWriteVer));
+    Assert.assertEquals(writeVer, count);
+
+    // Test that all previous versions have been deleted; only the two write-metadata
+    // nodes and the latest version bucket should remain. Use a verifier because the
+    // version garbage collection is asynchronous and subject to ZK delays.
+    Assert.assertTrue(TestHelper.verify(() -> {
+      List<String> children = _zkBaseDataAccessor.getChildNames(PATH, AccessOption.PERSISTENT);
+      return children.size() == 3;
+    }, 60 * 1000L));
+  }
+
+  /**
+   * Verify that the record read back is identical to the record written in
+   * {@link #testCompressedBucketWrite()}.
+   */
+  @Test(dependsOnMethods = "testMultipleWrites")
+  public void testCompressedBucketRead() {
+    HelixProperty readRecord = _bucketDataAccessor.compressedBucketRead(PATH, HelixProperty.class);
+    Assert.assertEquals(readRecord.getRecord().getSimpleField(NAME_KEY), NAME_KEY);
+    Assert.assertEquals(readRecord.getRecord().getListField(NAME_KEY), LIST_FIELD);
+    Assert.assertEquals(readRecord.getRecord().getMapField(NAME_KEY), MAP_FIELD);
+    _bucketDataAccessor.compressedBucketDelete(PATH);
+  }
+
+  /**
+   * Write a HelixProperty with a large number of entries using BucketDataAccessor and read it back.
+   */
+  @Test(dependsOnMethods = "testCompressedBucketRead")
+  public void testLargeWriteAndRead() throws IOException {
+    String name = "largeResourceAssignment";
+    HelixProperty property = createLargeHelixProperty(name, 100000);
+
+    // Perform large write
+    long before = System.currentTimeMillis();
+    _bucketDataAccessor.compressedBucketWrite("/" + name, property);
+    long after = System.currentTimeMillis();
+    System.out.println("Write took " + (after - before) + " ms");
+
+    // Read it back
+    before = System.currentTimeMillis();
+    HelixProperty readRecord =
+        _bucketDataAccessor.compressedBucketRead("/" + name, HelixProperty.class);
+    after = System.currentTimeMillis();
+    System.out.println("Read took " + (after - before) + " ms");
+
+    // Check against the original HelixProperty
+    Assert.assertEquals(readRecord, property);
+  }
+
+  private HelixProperty createLargeHelixProperty(String name, int numEntries) {
+    HelixProperty property = new HelixProperty(name);
+    // Reuse a single Random instance instead of creating one per iteration
+    Random random = new Random();
+    for (int i = 0; i < numEntries; i++) {
+      // Create random key and value strings every time
+      byte[] arrayKey = new byte[20];
+      byte[] arrayVal = new byte[20];
+      random.nextBytes(arrayKey);
+      random.nextBytes(arrayVal);
+      String randomStrKey = new String(arrayKey, StandardCharsets.UTF_8);
+      String randomStrVal = new String(arrayVal, StandardCharsets.UTF_8);
+
+      // Dummy mapField
+      Map<String, String> mapField = new HashMap<>();
+      mapField.put(randomStrKey, randomStrVal);
+
+      property.getRecord().setMapField(randomStrKey, mapField);
+    }
+    return property;
+  }
+}
diff --git a/helix-core/src/test/java/org/apache/helix/manager/zk/TestZkHelixAdmin.java b/helix-core/src/test/java/org/apache/helix/manager/zk/TestZkHelixAdmin.java
index 20acef4..c391085 100644
--- a/helix-core/src/test/java/org/apache/helix/manager/zk/TestZkHelixAdmin.java
+++ b/helix-core/src/test/java/org/apache/helix/manager/zk/TestZkHelixAdmin.java
@@ -19,13 +19,16 @@
  * under the License.
  */
 
+import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Arrays;
+import java.util.Collections;
 import java.util.Date;
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 
+import com.google.common.collect.ImmutableMap;
 import org.apache.helix.BaseDataAccessor;
 import org.apache.helix.HelixAdmin;
 import org.apache.helix.HelixDataAccessor;
@@ -39,7 +42,10 @@
 import org.apache.helix.TestHelper;
 import org.apache.helix.ZNRecord;
 import org.apache.helix.ZkUnitTestBase;
+import org.apache.helix.controller.rebalancer.waged.WagedRebalancer;
 import org.apache.helix.examples.MasterSlaveStateModelFactory;
+import org.apache.helix.integration.manager.MockParticipantManager;
+import org.apache.helix.model.ClusterConfig;
 import org.apache.helix.model.ClusterConstraints;
 import org.apache.helix.model.ClusterConstraints.ConstraintAttribute;
 import org.apache.helix.model.ClusterConstraints.ConstraintType;
@@ -49,18 +55,22 @@
 import org.apache.helix.model.HelixConfigScope.ConfigScopeProperty;
 import org.apache.helix.model.IdealState;
 import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.MasterSlaveSMD;
+import org.apache.helix.model.ResourceConfig;
 import org.apache.helix.model.StateModelDefinition;
 import org.apache.helix.model.builder.ConstraintItemBuilder;
 import org.apache.helix.model.builder.HelixConfigScopeBuilder;
 import org.apache.helix.participant.StateMachineEngine;
 import org.apache.helix.tools.StateModelConfigGenerator;
 import org.apache.zookeeper.data.Stat;
+import org.codehaus.jackson.map.ObjectMapper;
 import org.testng.Assert;
 import org.testng.AssertJUnit;
 import org.testng.annotations.BeforeClass;
 import org.testng.annotations.Test;
 
 public class TestZkHelixAdmin extends ZkUnitTestBase {
+  private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
 
   @BeforeClass
   public void beforeClass() {
@@ -506,4 +516,102 @@
         .getListField(InstanceConfig.InstanceConfigProperty.HELIX_DISABLED_PARTITION.name()).size(),
         2);
   }
+
+  /**
+   * Test addResourceWithWeight() and validateResourcesForWagedRebalance() by trying to
+   * add a resource with an incomplete ResourceConfig.
+   */
+  @Test
+  public void testAddResourceWithWeightAndValidation()
+      throws IOException {
+    String className = TestHelper.getTestClassName();
+    String methodName = TestHelper.getTestMethodName();
+    String clusterName = className + "_" + methodName;
+    String mockInstance = "MockInstance";
+    String testResourcePrefix = "TestResource";
+    HelixAdmin admin = new ZKHelixAdmin(_gZkClient);
+    admin.addCluster(clusterName, true);
+    admin.addStateModelDef(clusterName, "MasterSlave", new MasterSlaveSMD());
+
+    // Create a dummy instance
+    InstanceConfig instanceConfig = new InstanceConfig(mockInstance);
+    Map<String, Integer> mockInstanceCapacity =
+        ImmutableMap.of("WCU", 100, "RCU", 100, "STORAGE", 100);
+    instanceConfig.setInstanceCapacityMap(mockInstanceCapacity);
+    admin.addInstance(clusterName, instanceConfig);
+    MockParticipantManager mockParticipantManager =
+        new MockParticipantManager(ZK_ADDR, clusterName, mockInstance);
+    mockParticipantManager.syncStart();
+
+    IdealState idealState = new IdealState(testResourcePrefix);
+    idealState.setNumPartitions(3);
+    idealState.setStateModelDefRef("MasterSlave");
+    idealState.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);
+
+    ResourceConfig resourceConfig = new ResourceConfig(testResourcePrefix);
+    // validate
+    Map<String, Boolean> validationResult = admin.validateResourcesForWagedRebalance(clusterName,
+        Collections.singletonList(testResourcePrefix));
+    Assert.assertEquals(validationResult.size(), 1);
+    Assert.assertFalse(validationResult.get(testResourcePrefix));
+    try {
+      admin.addResourceWithWeight(clusterName, idealState, resourceConfig);
+      Assert.fail();
+    } catch (HelixException e) {
+      // OK since resourceConfig is empty
+    }
+
+    // Set PARTITION_CAPACITY_MAP
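+    // The DEFAULT_PARTITION_KEY entry is presumably used as the default weight for all
+    // partitions that have no explicit capacity entry.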
+    Map<String, String> capacityDataMap =
+        ImmutableMap.of("WCU", "1", "RCU", "2", "STORAGE", "3");
+    resourceConfig.getRecord()
+        .setMapField(ResourceConfig.ResourceConfigProperty.PARTITION_CAPACITY_MAP.name(),
+            Collections.singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY,
+                OBJECT_MAPPER.writeValueAsString(capacityDataMap)));
+
+    // validate
+    validationResult = admin.validateResourcesForWagedRebalance(clusterName,
+        Collections.singletonList(testResourcePrefix));
+    Assert.assertEquals(validationResult.size(), 1);
+    Assert.assertFalse(validationResult.get(testResourcePrefix));
+
+    // Add the capacity key to ClusterConfig
+    HelixDataAccessor dataAccessor = new ZKHelixDataAccessor(clusterName, _baseAccessor);
+    PropertyKey.Builder keyBuilder = dataAccessor.keyBuilder();
+    ClusterConfig clusterConfig = dataAccessor.getProperty(keyBuilder.clusterConfig());
+    clusterConfig.setInstanceCapacityKeys(Arrays.asList("WCU", "RCU", "STORAGE"));
+    dataAccessor.setProperty(keyBuilder.clusterConfig(), clusterConfig);
+
+    // Should succeed now
+    Assert.assertTrue(admin.addResourceWithWeight(clusterName, idealState, resourceConfig));
+    // validate
+    validationResult = admin.validateResourcesForWagedRebalance(clusterName,
+        Collections.singletonList(testResourcePrefix));
+    Assert.assertEquals(validationResult.size(), 1);
+    Assert.assertTrue(validationResult.get(testResourcePrefix));
+  }
+
+  /**
+   * Test enableWagedRebalance by checking that the rebalancer class name has changed.
+   */
+  @Test
+  public void testEnableWagedRebalance() {
+    String className = TestHelper.getTestClassName();
+    String methodName = TestHelper.getTestMethodName();
+    String clusterName = className + "_" + methodName;
+    String testResourcePrefix = "TestResource";
+    HelixAdmin admin = new ZKHelixAdmin(_gZkClient);
+    admin.addCluster(clusterName, true);
+    admin.addStateModelDef(clusterName, "MasterSlave", new MasterSlaveSMD());
+
+    // Add an IdealState
+    IdealState idealState = new IdealState(testResourcePrefix);
+    idealState.setNumPartitions(3);
+    idealState.setStateModelDefRef("MasterSlave");
+    idealState.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);
+    admin.addResource(clusterName, testResourcePrefix, idealState);
+
+    admin.enableWagedRebalance(clusterName, Collections.singletonList(testResourcePrefix));
+    IdealState is = admin.getResourceIdealState(clusterName, testResourcePrefix);
+    Assert.assertEquals(is.getRebalancerClassName(), WagedRebalancer.class.getName());
+  }
 }
diff --git a/helix-core/src/test/java/org/apache/helix/mock/MockHelixAdmin.java b/helix-core/src/test/java/org/apache/helix/mock/MockHelixAdmin.java
index b299ab1..8e17d73 100644
--- a/helix-core/src/test/java/org/apache/helix/mock/MockHelixAdmin.java
+++ b/helix-core/src/test/java/org/apache/helix/mock/MockHelixAdmin.java
@@ -39,6 +39,7 @@
 import org.apache.helix.model.IdealState;
 import org.apache.helix.model.InstanceConfig;
 import org.apache.helix.model.MaintenanceSignal;
+import org.apache.helix.model.ResourceConfig;
 import org.apache.helix.model.StateModelDefinition;
 
 public class MockHelixAdmin implements HelixAdmin {
@@ -428,4 +429,25 @@
   @Override public void close() {
 
   }
+
+  @Override
+  public boolean addResourceWithWeight(String clusterName, IdealState idealState, ResourceConfig resourceConfig) {
+    return false;
+  }
+
+  @Override
+  public boolean enableWagedRebalance(String clusterName, List<String> resourceNames) {
+    return false;
+  }
+
+  @Override
+  public Map<String, Boolean> validateResourcesForWagedRebalance(String clusterName, List<String> resourceNames) {
+    return null;
+  }
+
+  @Override
+  public Map<String, Boolean> validateInstancesForWagedRebalance(String clusterName,
+      List<String> instancesNames) {
+    return null;
+  }
 }
diff --git a/helix-core/src/test/java/org/apache/helix/model/TestClusterConfig.java b/helix-core/src/test/java/org/apache/helix/model/TestClusterConfig.java
new file mode 100644
index 0000000..d688827
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/model/TestClusterConfig.java
@@ -0,0 +1,259 @@
+package org.apache.helix.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import org.apache.helix.ZNRecord;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+import static org.apache.helix.model.ClusterConfig.GlobalRebalancePreferenceKey.EVENNESS;
+import static org.apache.helix.model.ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT;
+
+public class TestClusterConfig {
+
+  @Test
+  public void testGetCapacityKeys() {
+    List<String> keys = ImmutableList.of("CPU", "MEMORY", "Random");
+
+    ClusterConfig testConfig = new ClusterConfig("testId");
+    testConfig.getRecord()
+        .setListField(ClusterConfig.ClusterConfigProperty.INSTANCE_CAPACITY_KEYS.name(), keys);
+
+    Assert.assertEquals(testConfig.getInstanceCapacityKeys(), keys);
+  }
+
+  @Test
+  public void testGetCapacityKeysEmpty() {
+    ClusterConfig testConfig = new ClusterConfig("testId");
+    Assert.assertEquals(testConfig.getInstanceCapacityKeys(), Collections.emptyList());
+  }
+
+  @Test
+  public void testSetCapacityKeys() {
+    List<String> keys = ImmutableList.of("CPU", "MEMORY", "Random");
+
+    ClusterConfig testConfig = new ClusterConfig("testId");
+    testConfig.setInstanceCapacityKeys(keys);
+
+    Assert.assertEquals(keys, testConfig.getRecord()
+        .getListField(ClusterConfig.ClusterConfigProperty.INSTANCE_CAPACITY_KEYS.name()));
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class)
+  public void testSetCapacityKeysEmptyList() {
+    ClusterConfig testConfig = new ClusterConfig("testId");
+    testConfig.setInstanceCapacityKeys(Collections.emptyList());
+  }
+
+  @Test
+  public void testGetRebalancePreference() {
+    Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> preference = new HashMap<>();
+    preference.put(EVENNESS, 5);
+    preference.put(LESS_MOVEMENT, 3);
+
+    Map<String, String> mapFieldData = new HashMap<>();
+    for (ClusterConfig.GlobalRebalancePreferenceKey key : preference.keySet()) {
+      mapFieldData.put(key.name(), String.valueOf(preference.get(key)));
+    }
+
+    ClusterConfig testConfig = new ClusterConfig("testId");
+    testConfig.getRecord()
+        .setMapField(ClusterConfig.ClusterConfigProperty.REBALANCE_PREFERENCE.name(), mapFieldData);
+
+    Assert.assertEquals(testConfig.getGlobalRebalancePreference(), preference);
+  }
+
+  @Test
+  public void testGetRebalancePreferenceDefault() {
+    ClusterConfig testConfig = new ClusterConfig("testId");
+    Assert.assertEquals(testConfig.getGlobalRebalancePreference(),
+        ClusterConfig.DEFAULT_GLOBAL_REBALANCE_PREFERENCE);
+
+    Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> preference = new HashMap<>();
+    preference.put(EVENNESS, 5);
+    testConfig.setGlobalRebalancePreference(preference);
+
+    Assert.assertEquals(testConfig.getGlobalRebalancePreference(),
+        ClusterConfig.DEFAULT_GLOBAL_REBALANCE_PREFERENCE);
+  }
+
+  @Test
+  public void testSetRebalancePreference() {
+    Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> preference = new HashMap<>();
+    preference.put(EVENNESS, 5);
+    preference.put(LESS_MOVEMENT, 3);
+
+    Map<String, String> mapFieldData = new HashMap<>();
+    for (ClusterConfig.GlobalRebalancePreferenceKey key : preference.keySet()) {
+      mapFieldData.put(key.name(), String.valueOf(preference.get(key)));
+    }
+
+    ClusterConfig testConfig = new ClusterConfig("testId");
+    testConfig.setGlobalRebalancePreference(preference);
+
+    Assert.assertEquals(testConfig.getRecord()
+            .getMapField(ClusterConfig.ClusterConfigProperty.REBALANCE_PREFERENCE.name()),
+        mapFieldData);
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class)
+  public void testSetRebalancePreferenceInvalidNumber() {
+    Map<ClusterConfig.GlobalRebalancePreferenceKey, Integer> preference = new HashMap<>();
+    preference.put(EVENNESS, -1);
+    preference.put(LESS_MOVEMENT, 3);
+
+    ClusterConfig testConfig = new ClusterConfig("testId");
+    testConfig.setGlobalRebalancePreference(preference);
+  }
+
+  @Test
+  public void testGetInstanceCapacityMap() {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1, "item2", 2, "item3", 3);
+
+    Map<String, String> capacityDataMapString =
+        ImmutableMap.of("item1", "1", "item2", "2", "item3", "3");
+
+    ZNRecord rec = new ZNRecord("testId");
+    rec.setMapField(ClusterConfig.ClusterConfigProperty.DEFAULT_INSTANCE_CAPACITY_MAP.name(),
+        capacityDataMapString);
+    ClusterConfig testConfig = new ClusterConfig(rec);
+
+    Assert.assertEquals(testConfig.getDefaultInstanceCapacityMap(), capacityDataMap);
+  }
+
+  @Test
+  public void testGetInstanceCapacityMapEmpty() {
+    ClusterConfig testConfig = new ClusterConfig("testId");
+
+    Assert.assertEquals(testConfig.getDefaultInstanceCapacityMap(), Collections.emptyMap());
+  }
+
+  @Test
+  public void testSetInstanceCapacityMap() {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1, "item2", 2, "item3", 3);
+
+    Map<String, String> capacityDataMapString =
+        ImmutableMap.of("item1", "1", "item2", "2", "item3", "3");
+
+    ClusterConfig testConfig = new ClusterConfig("testConfig");
+    testConfig.setDefaultInstanceCapacityMap(capacityDataMap);
+
+    Assert.assertEquals(testConfig.getRecord().getMapField(ClusterConfig.ClusterConfigProperty.
+        DEFAULT_INSTANCE_CAPACITY_MAP.name()), capacityDataMapString);
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class, expectedExceptionsMessageRegExp = "Default capacity data is null")
+  public void testSetInstanceCapacityMapEmpty() {
+    Map<String, Integer> capacityDataMap = new HashMap<>();
+
+    ClusterConfig testConfig = new ClusterConfig("testConfig");
+    // Setting an empty map is allowed; it clears the default values.
+    testConfig.setDefaultInstanceCapacityMap(capacityDataMap);
+    // Setting null, however, will fail.
+    testConfig.setDefaultInstanceCapacityMap(null);
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class, expectedExceptionsMessageRegExp = "Default capacity data contains a negative value: item3 = -3")
+  public void testSetInstanceCapacityMapInvalid() {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1, "item2", 2, "item3", -3);
+
+    ClusterConfig testConfig = new ClusterConfig("testConfig");
+    testConfig.setDefaultInstanceCapacityMap(capacityDataMap);
+  }
+
+  @Test
+  public void testGetPartitionWeightMap() {
+    Map<String, Integer> weightDataMap = ImmutableMap.of("item1", 1, "item2", 2, "item3", 3);
+
+    Map<String, String> weightDataMapString =
+        ImmutableMap.of("item1", "1", "item2", "2", "item3", "3");
+
+    ZNRecord rec = new ZNRecord("testId");
+    rec.setMapField(ClusterConfig.ClusterConfigProperty.DEFAULT_PARTITION_WEIGHT_MAP.name(),
+        weightDataMapString);
+    ClusterConfig testConfig = new ClusterConfig(rec);
+
+    Assert.assertEquals(testConfig.getDefaultPartitionWeightMap(), weightDataMap);
+  }
+
+  @Test
+  public void testGetPartitionWeightMapEmpty() {
+    ClusterConfig testConfig = new ClusterConfig("testId");
+
+    Assert.assertEquals(testConfig.getDefaultPartitionWeightMap(), Collections.emptyMap());
+  }
+
+  @Test
+  public void testSetPartitionWeightMap() {
+    Map<String, Integer> weightDataMap = ImmutableMap.of("item1", 1, "item2", 2, "item3", 3);
+
+    Map<String, String> weightDataMapString =
+        ImmutableMap.of("item1", "1", "item2", "2", "item3", "3");
+
+    ClusterConfig testConfig = new ClusterConfig("testConfig");
+    testConfig.setDefaultPartitionWeightMap(weightDataMap);
+
+    Assert.assertEquals(testConfig.getRecord().getMapField(ClusterConfig.ClusterConfigProperty.
+        DEFAULT_PARTITION_WEIGHT_MAP.name()), weightDataMapString);
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class, expectedExceptionsMessageRegExp = "Default capacity data is null")
+  public void testSetPartitionWeightMapEmpty() {
+    Map<String, Integer> weightDataMap = new HashMap<>();
+
+    ClusterConfig testConfig = new ClusterConfig("testConfig");
+    // Setting an empty map is allowed; it clears the default values.
+    testConfig.setDefaultPartitionWeightMap(weightDataMap);
+    // Setting null, however, will fail.
+    testConfig.setDefaultPartitionWeightMap(null);
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class, expectedExceptionsMessageRegExp = "Default capacity data contains a negative value: item3 = -3")
+  public void testSetPartitionWeightMapInvalid() {
+    Map<String, Integer> weightDataMap = ImmutableMap.of("item1", 1, "item2", 2, "item3", -3);
+
+    ClusterConfig testConfig = new ClusterConfig("testConfig");
+    testConfig.setDefaultPartitionWeightMap(weightDataMap);
+  }
+
+  @Test
+  public void testAsyncGlobalRebalanceOption() {
+    ClusterConfig testConfig = new ClusterConfig("testConfig");
+    // Default value is true.
+    Assert.assertTrue(testConfig.isGlobalRebalanceAsyncModeEnabled());
+    // Test reading the option after updating the underlying record directly.
+    testConfig.getRecord()
+        .setBooleanField(ClusterConfig.ClusterConfigProperty.GLOBAL_REBALANCE_ASYNC_MODE.name(),
+            false);
+    Assert.assertFalse(testConfig.isGlobalRebalanceAsyncModeEnabled());
+    // Test setting the option through the config API.
+    testConfig.setGlobalRebalanceAsyncMode(true);
+    Assert.assertTrue(testConfig.getRecord()
+        .getBooleanField(ClusterConfig.ClusterConfigProperty.GLOBAL_REBALANCE_ASYNC_MODE.name(),
+            false));
+  }
+}
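Note: the tests above pin down the validation contract for the cluster-level defaults: a null input throws IllegalArgumentException ("Default capacity data is null"), a negative value throws with a descriptive message, and an empty map is accepted and clears the field. The following is a minimal illustrative sketch of a setter satisfying that contract, written as a standalone helper; the property key string and the actual ClusterConfig implementation may differ.

// Illustrative only; the real code stores the value under a ClusterConfigProperty enum key.
static void setDefaultInstanceCapacityMap(ZNRecord record, Map<String, Integer> capacityDataMap) {
  if (capacityDataMap == null) {
    throw new IllegalArgumentException("Default capacity data is null");
  }
  Map<String, String> capacityData = new HashMap<>();
  for (Map.Entry<String, Integer> entry : capacityDataMap.entrySet()) {
    if (entry.getValue() < 0) {
      throw new IllegalArgumentException(String.format(
          "Default capacity data contains a negative value: %s = %d",
          entry.getKey(), entry.getValue()));
    }
    capacityData.put(entry.getKey(), entry.getValue().toString());
  }
  // An empty input map is allowed and simply clears the stored defaults.
  record.setMapField("DEFAULT_INSTANCE_CAPACITY_MAP", capacityData);
}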
diff --git a/helix-core/src/test/java/org/apache/helix/model/TestInstanceConfig.java b/helix-core/src/test/java/org/apache/helix/model/TestInstanceConfig.java
index 38b1c92..f4c7715 100644
--- a/helix-core/src/test/java/org/apache/helix/model/TestInstanceConfig.java
+++ b/helix-core/src/test/java/org/apache/helix/model/TestInstanceConfig.java
@@ -19,12 +19,14 @@
  * under the License.
  */
 
-import java.util.Map;
-
+import com.google.common.collect.ImmutableMap;
 import org.apache.helix.ZNRecord;
 import org.testng.Assert;
 import org.testng.annotations.Test;
 
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
 
 /**
  * Created with IntelliJ IDEA.
@@ -52,10 +54,73 @@
   }
 
   @Test
-  public void testGetParsedDomain_emptyDomain() {
+  public void testGetParsedDomainEmptyDomain() {
     InstanceConfig instanceConfig = new InstanceConfig(new ZNRecord("id"));
 
     Map<String, String> parsedDomain = instanceConfig.getDomainAsMap();
     Assert.assertTrue(parsedDomain.isEmpty());
   }
+
+  @Test
+  public void testGetInstanceCapacityMap() {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1,
+        "item2", 2,
+        "item3", 3);
+
+    Map<String, String> capacityDataMapString = ImmutableMap.of("item1", "1",
+        "item2", "2",
+        "item3", "3");
+
+    ZNRecord rec = new ZNRecord("testId");
+    rec.setMapField(InstanceConfig.InstanceConfigProperty.INSTANCE_CAPACITY_MAP.name(), capacityDataMapString);
+    InstanceConfig testConfig = new InstanceConfig(rec);
+
+    Assert.assertEquals(testConfig.getInstanceCapacityMap(), capacityDataMap);
+  }
+
+  @Test
+  public void testGetInstanceCapacityMapEmpty() {
+    InstanceConfig testConfig = new InstanceConfig("testId");
+
+    Assert.assertEquals(testConfig.getInstanceCapacityMap(), Collections.emptyMap());
+  }
+
+  @Test
+  public void testSetInstanceCapacityMap() {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1,
+        "item2", 2,
+        "item3", 3);
+
+    Map<String, String> capacityDataMapString = ImmutableMap.of("item1", "1",
+        "item2", "2",
+        "item3", "3");
+
+    InstanceConfig testConfig = new InstanceConfig("testConfig");
+    testConfig.setInstanceCapacityMap(capacityDataMap);
+
+    Assert.assertEquals(testConfig.getRecord()
+        .getMapField(InstanceConfig.InstanceConfigProperty.INSTANCE_CAPACITY_MAP.name()),
+        capacityDataMapString);
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class, expectedExceptionsMessageRegExp = "Capacity Data is null")
+  public void testSetInstanceCapacityMapEmpty() {
+    Map<String, Integer> capacityDataMap = new HashMap<>();
+
+    InstanceConfig testConfig = new InstanceConfig("testConfig");
+    // This operation is allowed; it clears the instance capacity map in the InstanceConfig.
+    testConfig.setInstanceCapacityMap(capacityDataMap);
+    // This operation fails because the input is null.
+    testConfig.setInstanceCapacityMap(null);
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class,
+      expectedExceptionsMessageRegExp = "Capacity Data contains a negative value: item3 = -3")
+  public void testSetInstanceCapacityMapInvalid() {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1,
+        "item2", 2,
+        "item3", -3);
+
+    InstanceConfig testConfig = new InstanceConfig("testConfig");
+    testConfig.setInstanceCapacityMap(capacityDataMap);
+  }
 }
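From a client's perspective, these setters are how instance capacity is configured for the WAGED rebalancer. A hypothetical usage sketch, where the capacity key names ("CU", "DISK") are illustrative rather than Helix-defined:

InstanceConfig instanceConfig = new InstanceConfig("localhost_12918");
instanceConfig.setInstanceCapacityMap(ImmutableMap.of("CU", 100, "DISK", 50));
// The Integer values are persisted as strings in the ZNRecord's map field,
// which is what ends up in ZooKeeper:
Map<String, String> persisted = instanceConfig.getRecord()
    .getMapField(InstanceConfig.InstanceConfigProperty.INSTANCE_CAPACITY_MAP.name());
// persisted -> {CU=100, DISK=50} (all values as strings)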
diff --git a/helix-core/src/test/java/org/apache/helix/model/TestResourceConfig.java b/helix-core/src/test/java/org/apache/helix/model/TestResourceConfig.java
new file mode 100644
index 0000000..8099486
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/model/TestResourceConfig.java
@@ -0,0 +1,186 @@
+package org.apache.helix.model;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import com.google.common.collect.ImmutableMap;
+import org.apache.helix.ZNRecord;
+import org.codehaus.jackson.map.ObjectMapper;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+public class TestResourceConfig {
+  private static final ObjectMapper _objectMapper = new ObjectMapper();
+
+  @Test
+  public void testGetPartitionCapacityMap() throws IOException {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1,
+        "item2", 2,
+        "item3", 3);
+
+    ZNRecord rec = new ZNRecord("testId");
+    rec.setMapField(ResourceConfig.ResourceConfigProperty.PARTITION_CAPACITY_MAP.name(), Collections
+        .singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY,
+            _objectMapper.writeValueAsString(capacityDataMap)));
+    ResourceConfig testConfig = new ResourceConfig(rec);
+
+    Assert.assertEquals(
+        testConfig.getPartitionCapacityMap().get(ResourceConfig.DEFAULT_PARTITION_KEY),
+        capacityDataMap);
+  }
+
+  @Test
+  public void testGetPartitionCapacityMapEmpty() throws IOException {
+    ResourceConfig testConfig = new ResourceConfig("testId");
+
+    Assert.assertEquals(testConfig.getPartitionCapacityMap(), Collections.emptyMap());
+  }
+
+  @Test(expectedExceptions = IOException.class)
+  public void testGetPartitionCapacityMapInvalidJson() throws IOException {
+    ZNRecord rec = new ZNRecord("testId");
+    rec.setMapField(ResourceConfig.ResourceConfigProperty.PARTITION_CAPACITY_MAP.name(),
+        Collections.singletonMap("test", "gibberish"));
+    ResourceConfig testConfig = new ResourceConfig(rec);
+
+    testConfig.getPartitionCapacityMap();
+  }
+
+  @Test(dependsOnMethods = "testGetPartitionCapacityMap", expectedExceptions = IOException.class)
+  public void testGetPartitionCapacityMapInvalidJsonType() throws IOException {
+    Map<String, String> capacityDataMap = ImmutableMap.of("item1", "1",
+        "item2", "2",
+        "item3", "three");
+
+    ZNRecord rec = new ZNRecord("testId");
+    rec.setMapField(ResourceConfig.ResourceConfigProperty.PARTITION_CAPACITY_MAP.name(), Collections
+        .singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY,
+            _objectMapper.writeValueAsString(capacityDataMap)));
+    ResourceConfig testConfig = new ResourceConfig(rec);
+
+    testConfig.getPartitionCapacityMap();
+  }
+
+  @Test
+  public void testSetPartitionCapacityMap() throws IOException {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1,
+        "item2", 2,
+        "item3", 3);
+
+    ResourceConfig testConfig = new ResourceConfig("testConfig");
+    testConfig.setPartitionCapacityMap(
+        Collections.singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY, capacityDataMap));
+
+    Assert.assertEquals(testConfig.getRecord()
+        .getMapField(ResourceConfig.ResourceConfigProperty.PARTITION_CAPACITY_MAP.name())
+        .get(ResourceConfig.DEFAULT_PARTITION_KEY),
+        _objectMapper.writeValueAsString(capacityDataMap));
+  }
+
+  @Test
+  public void testSetMultiplePartitionCapacityMap() throws IOException {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1,
+        "item2", 2,
+        "item3", 3);
+
+    Map<String, Map<String, Integer>> totalCapacityMap =
+        ImmutableMap.of(ResourceConfig.DEFAULT_PARTITION_KEY, capacityDataMap,
+        "partition2", capacityDataMap,
+        "partition3", capacityDataMap);
+
+    ResourceConfig testConfig = new ResourceConfig("testConfig");
+    testConfig.setPartitionCapacityMap(totalCapacityMap);
+
+    Map<String, String> capacityMapField = testConfig.getRecord()
+        .getMapField(ResourceConfig.ResourceConfigProperty.PARTITION_CAPACITY_MAP.name());
+    Assert.assertNull(capacityMapField.get("partition1"));
+    Assert.assertEquals(capacityMapField.get(ResourceConfig.DEFAULT_PARTITION_KEY),
+        _objectMapper.writeValueAsString(capacityDataMap));
+    Assert.assertEquals(capacityMapField.get("partition2"),
+        _objectMapper.writeValueAsString(capacityDataMap));
+    Assert.assertEquals(capacityMapField.get("partition3"),
+        _objectMapper.writeValueAsString(capacityDataMap));
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class, expectedExceptionsMessageRegExp = "Capacity Data is empty")
+  public void testSetPartitionCapacityMapEmpty() throws IOException {
+    Map<String, Integer> capacityDataMap = new HashMap<>();
+
+    ResourceConfig testConfig = new ResourceConfig("testConfig");
+    testConfig.setPartitionCapacityMap(
+        Collections.singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY, capacityDataMap));
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class, expectedExceptionsMessageRegExp = "The default partition capacity with the default key DEFAULT is required.")
+  public void testSetPartitionCapacityMapWithoutDefault() throws IOException {
+    Map<String, Integer> capacityDataMap = new HashMap<>();
+
+    ResourceConfig testConfig = new ResourceConfig("testConfig");
+    testConfig.setPartitionCapacityMap(
+        Collections.singletonMap("Random", capacityDataMap));
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class, expectedExceptionsMessageRegExp = "Capacity Data contains a negative value:.+")
+  public void testSetPartitionCapacityMapInvalid() throws IOException {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1,
+        "item2", 2,
+        "item3", -3);
+
+    ResourceConfig testConfig = new ResourceConfig("testConfig");
+    testConfig.setPartitionCapacityMap(
+        Collections.singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY, capacityDataMap));
+  }
+
+  @Test
+  public void testWithResourceBuilder() throws IOException {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1,
+        "item2", 2,
+        "item3", 3);
+
+    ResourceConfig.Builder builder = new ResourceConfig.Builder("testConfig");
+    builder.setPartitionCapacity(capacityDataMap);
+    builder.setPartitionCapacity("partition1", capacityDataMap);
+
+    Assert.assertEquals(
+        builder.build().getPartitionCapacityMap().get(ResourceConfig.DEFAULT_PARTITION_KEY),
+        capacityDataMap);
+    Assert.assertEquals(
+        builder.build().getPartitionCapacityMap().get("partition1"),
+        capacityDataMap);
+    Assert.assertNull(
+        builder.build().getPartitionCapacityMap().get("Random"));
+  }
+
+  @Test(expectedExceptions = IllegalArgumentException.class, expectedExceptionsMessageRegExp = "The default partition capacity with the default key DEFAULT is required.")
+  public void testWithResourceBuilderInvalidInput() {
+    Map<String, Integer> capacityDataMap = ImmutableMap.of("item1", 1,
+        "item2", 2,
+        "item3", 3);
+
+    ResourceConfig.Builder builder = new ResourceConfig.Builder("testConfig");
+    builder.setPartitionCapacity("Random", capacityDataMap);
+
+    builder.build();
+  }
+}
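Unlike the flat string maps used for instance capacity, ResourceConfig nests one capacity map per partition, so each partition's map is JSON-serialized into a single entry of the PARTITION_CAPACITY_MAP map field, keyed by partition name with the DEFAULT key required. A minimal sketch of the encoding these tests assert; the helper and variable names are illustrative, not the actual ResourceConfig implementation:

// Each partition's {capacityKey: weight} map becomes one JSON string entry.
private static void writePartitionCapacity(ZNRecord record,
    Map<String, Map<String, Integer>> capacityByPartition) throws IOException {
  ObjectMapper mapper = new ObjectMapper();
  Map<String, String> encoded = new HashMap<>();
  for (Map.Entry<String, Map<String, Integer>> entry : capacityByPartition.entrySet()) {
    encoded.put(entry.getKey(), mapper.writeValueAsString(entry.getValue()));
  }
  record.setMapField(
      ResourceConfig.ResourceConfigProperty.PARTITION_CAPACITY_MAP.name(), encoded);
}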
diff --git a/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestClusterStatusMonitor.java b/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestClusterStatusMonitor.java
index 79f284e..f4ba01f 100644
--- a/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestClusterStatusMonitor.java
+++ b/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestClusterStatusMonitor.java
@@ -19,18 +19,27 @@
  * under the License.
  */
 
+import java.io.IOException;
 import java.lang.management.ManagementFactory;
 import java.util.ArrayList;
+import java.util.Collections;
 import java.util.Date;
+import java.util.HashMap;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
 import java.util.Random;
+import javax.management.AttributeNotFoundException;
 import javax.management.InstanceNotFoundException;
 import javax.management.JMException;
+import javax.management.MBeanException;
 import javax.management.MBeanServerConnection;
+import javax.management.MalformedObjectNameException;
 import javax.management.ObjectName;
+import javax.management.ReflectionException;
 
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
 import com.google.common.collect.Maps;
 import org.apache.helix.TestHelper;
 import org.apache.helix.ZNRecord;
@@ -50,8 +59,8 @@
 
 public class TestClusterStatusMonitor {
   private static final MBeanServerConnection _server = ManagementFactory.getPlatformMBeanServer();
-  String testDB = "TestDB";
-  String testDB_0 = testDB + "_0";
+  private String testDB = "TestDB";
+  private String testDB_0 = testDB + "_0";
 
   @Test()
   public void testReportData() throws Exception {
@@ -65,11 +74,8 @@
     ClusterStatusMonitor monitor = new ClusterStatusMonitor(clusterName);
     monitor.active();
     ObjectName clusterMonitorObjName = monitor.getObjectName(monitor.clusterBeanName());
-    try {
-      _server.getMBeanInfo(clusterMonitorObjName);
-    } catch (Exception e) {
-      Assert.fail("Fail to register ClusterStatusMonitor");
-    }
+
+    Assert.assertTrue(_server.isRegistered(clusterMonitorObjName));
 
     // Test #setPerInstanceResourceStatus()
     BestPossibleStateOutput bestPossibleStates = new BestPossibleStateOutput();
@@ -138,42 +144,29 @@
         "localhost_12918");
     monitor.setPerInstanceResourceStatus(bestPossibleStates, instanceConfigMap, resourceMap,
         stateModelDefMap);
-    try {
-      objName =
-          monitor.getObjectName(monitor.getPerInstanceResourceBeanName("localhost_12918", testDB));
-      _server.getMBeanInfo(objName);
-      Assert.fail("Fail to unregister PerInstanceResource mbean for localhost_12918");
 
-    } catch (InstanceNotFoundException e) {
-      // OK
-    }
+    objName =
+        monitor.getObjectName(monitor.getPerInstanceResourceBeanName("localhost_12918", testDB));
+    Assert.assertFalse(_server.isRegistered(objName),
+        "Fail to unregister PerInstanceResource mbean for localhost_12918");
 
     // Clean up
     monitor.reset();
 
-    try {
-      objName =
-          monitor.getObjectName(monitor.getPerInstanceResourceBeanName("localhost_12920", testDB));
-      _server.getMBeanInfo(objName);
-      Assert.fail("Fail to unregister PerInstanceResource mbean for localhost_12920");
+    objName =
+        monitor.getObjectName(monitor.getPerInstanceResourceBeanName("localhost_12920", testDB));
+    Assert.assertFalse(_server.isRegistered(objName),
+        "Fail to unregister PerInstanceResource mbean for localhost_12920");
 
-    } catch (InstanceNotFoundException e) {
-      // OK
-    }
-
-    try {
-      _server.getMBeanInfo(clusterMonitorObjName);
-      Assert.fail("Fail to unregister ClusterStatusMonitor");
-    } catch (InstanceNotFoundException e) {
-      // OK
-    }
+    Assert.assertFalse(_server.isRegistered(clusterMonitorObjName),
+        "Failed to unregister ClusterStatusMonitor.");
 
     System.out.println("END " + clusterName + " at " + new Date(System.currentTimeMillis()));
   }
 
 
   @Test
-  public void testResourceAggregation() throws JMException {
+  public void testResourceAggregation() throws JMException, IOException {
     String className = TestHelper.getTestClassName();
     String methodName = TestHelper.getTestMethodName();
     String clusterName = className + "_" + methodName;
@@ -183,11 +176,8 @@
     ClusterStatusMonitor monitor = new ClusterStatusMonitor(clusterName);
     monitor.active();
     ObjectName clusterMonitorObjName = monitor.getObjectName(monitor.clusterBeanName());
-    try {
-      _server.getMBeanInfo(clusterMonitorObjName);
-    } catch (Exception e) {
-      Assert.fail("Fail to register ClusterStatusMonitor");
-    }
+
+    Assert.assertTrue(_server.isRegistered(clusterMonitorObjName));
 
     int numInstance = 5;
     int numPartition = 10;
@@ -316,5 +306,133 @@
     messageCount = new Random().nextInt(numPartition) + 1;
     monitor.setResourceStatus(externalView, idealState, stateModelDef, messageCount);
     Assert.assertEquals(monitor.getPendingStateTransitionGuage(), messageCount);
+
+    // Reset monitor.
+    monitor.reset();
+    Assert.assertFalse(_server.isRegistered(clusterMonitorObjName),
+        "Failed to unregister ClusterStatusMonitor.");
+  }
+
+  @Test
+  public void testUpdateInstanceCapacityStatus()
+      throws MalformedObjectNameException, IOException, AttributeNotFoundException, MBeanException,
+             ReflectionException, InstanceNotFoundException {
+    String clusterName = "testCluster";
+    List<Double> maxUsageList = ImmutableList.of(0.0d, 0.32d, 0.85d, 1.0d, 0.50d, 0.75d);
+    Map<String, Double> maxUsageMap = new HashMap<>();
+    Map<String, Map<String, Integer>> instanceCapacityMap = new HashMap<>();
+    Random rand = new Random();
+
+    for (int i = 0; i < maxUsageList.size(); i++) {
+      String instanceName = "instance" + i;
+      maxUsageMap.put(instanceName, maxUsageList.get(i));
+      instanceCapacityMap.put(instanceName,
+          ImmutableMap.of("capacity1", rand.nextInt(100), "capacity2", rand.nextInt(100)));
+    }
+
+    // Setup cluster status monitor.
+    ClusterStatusMonitor monitor = new ClusterStatusMonitor(clusterName);
+    monitor.active();
+    ObjectName clusterMonitorObjName = monitor.getObjectName(monitor.clusterBeanName());
+
+    // Cluster status monitor is registered.
+    Assert.assertTrue(_server.isRegistered(clusterMonitorObjName));
+
+    // Before calling setClusterInstanceStatus, instance monitors are not yet registered.
+    for (Map.Entry<String, Double> entry : maxUsageMap.entrySet()) {
+      String instance = entry.getKey();
+      String instanceBeanName = String
+          .format("%s,%s=%s", monitor.clusterBeanName(), ClusterStatusMonitor.INSTANCE_DN_KEY,
+              instance);
+      ObjectName instanceObjectName = monitor.getObjectName(instanceBeanName);
+
+      Assert.assertFalse(_server.isRegistered(instanceObjectName));
+    }
+
+    // Call setClusterInstanceStatus to register instance monitors.
+    monitor.setClusterInstanceStatus(maxUsageMap.keySet(), maxUsageMap.keySet(),
+        Collections.emptySet(), Collections.emptyMap(), Collections.emptyMap(),
+        Collections.emptyMap());
+
+    // Update instance capacity status.
+    for (Map.Entry<String, Double> usageEntry : maxUsageMap.entrySet()) {
+      String instanceName = usageEntry.getKey();
+      monitor.updateInstanceCapacityStatus(instanceName, usageEntry.getValue(),
+          instanceCapacityMap.get(instanceName));
+    }
+
+    verifyCapacityMetrics(monitor, maxUsageMap, instanceCapacityMap);
+
+    // Change capacity keys: "capacity2" -> "capacity3"
+    for (String instanceName : instanceCapacityMap.keySet()) {
+      instanceCapacityMap.put(instanceName,
+          ImmutableMap.of("capacity1", rand.nextInt(100), "capacity3", rand.nextInt(100)));
+    }
+
+    // Update instance capacity status.
+    for (Map.Entry<String, Double> usageEntry : maxUsageMap.entrySet()) {
+      String instanceName = usageEntry.getKey();
+      monitor.updateInstanceCapacityStatus(instanceName, usageEntry.getValue(),
+          instanceCapacityMap.get(instanceName));
+    }
+
+    // "capacity2" metric should not exist in MBean server.
+    String removedAttribute = "capacity2Gauge";
+    for (Map.Entry<String, Map<String, Integer>> instanceEntry : instanceCapacityMap.entrySet()) {
+      String instance = instanceEntry.getKey();
+      String instanceBeanName = String
+          .format("%s,%s=%s", monitor.clusterBeanName(), ClusterStatusMonitor.INSTANCE_DN_KEY,
+              instance);
+      ObjectName instanceObjectName = monitor.getObjectName(instanceBeanName);
+
+      try {
+        _server.getAttribute(instanceObjectName, removedAttribute);
+        Assert.fail("Attribute [" + removedAttribute + "] should have been removed.");
+      } catch (AttributeNotFoundException ex) {
+        // Expected: the "capacity2Gauge" metric no longer exists in the MBean server.
+      }
+    }
+
+    verifyCapacityMetrics(monitor, maxUsageMap, instanceCapacityMap);
+
+    // Reset monitor.
+    monitor.reset();
+    Assert.assertFalse(_server.isRegistered(clusterMonitorObjName),
+        "Failed to unregister ClusterStatusMonitor.");
+    for (String instance : maxUsageMap.keySet()) {
+      String instanceBeanName =
+          String.format("%s,%s=%s", monitor.clusterBeanName(), ClusterStatusMonitor.INSTANCE_DN_KEY, instance);
+      ObjectName instanceObjectName = monitor.getObjectName(instanceBeanName);
+      Assert.assertFalse(_server.isRegistered(instanceObjectName),
+          "Failed to unregister instance monitor for instance: " + instance);
+    }
+  }
+
+  private void verifyCapacityMetrics(ClusterStatusMonitor monitor, Map<String, Double> maxUsageMap,
+      Map<String, Map<String, Integer>> instanceCapacityMap)
+      throws MalformedObjectNameException, IOException, AttributeNotFoundException, MBeanException,
+             ReflectionException, InstanceNotFoundException {
+    // Verify results.
+    for (Map.Entry<String, Map<String, Integer>> instanceEntry : instanceCapacityMap.entrySet()) {
+      String instance = instanceEntry.getKey();
+      Map<String, Integer> capacityMap = instanceEntry.getValue();
+      String instanceBeanName = String
+          .format("%s,%s=%s", monitor.clusterBeanName(), ClusterStatusMonitor.INSTANCE_DN_KEY,
+              instance);
+      ObjectName instanceObjectName = monitor.getObjectName(instanceBeanName);
+
+      Assert.assertTrue(_server.isRegistered(instanceObjectName));
+      Assert.assertEquals(_server.getAttribute(instanceObjectName,
+          InstanceMonitor.InstanceMonitorMetric.MAX_CAPACITY_USAGE_GAUGE.metricName()),
+          maxUsageMap.get(instance));
+
+      for (Map.Entry<String, Integer> capacityEntry : capacityMap.entrySet()) {
+        String capacityKey = capacityEntry.getKey();
+        String attributeName = capacityKey + "Gauge";
+        Assert.assertEquals((long) _server.getAttribute(instanceObjectName, attributeName),
+            (long) instanceCapacityMap.get(instance).get(capacityKey));
+      }
+    }
   }
 }
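The instance-level gauges registered by updateInstanceCapacityStatus are plain JMX attributes, so any client can read them with the same bean-name convention the test uses. A sketch, assuming an active ClusterStatusMonitor named "monitor" and with checked JMX exceptions propagated to the caller:

MBeanServerConnection server = ManagementFactory.getPlatformMBeanServer();
String beanName = String.format("%s,%s=%s",
    monitor.clusterBeanName(), ClusterStatusMonitor.INSTANCE_DN_KEY, "instance0");
ObjectName instanceObjectName = monitor.getObjectName(beanName);
// Capacity gauges are exposed as "<capacityKey>Gauge"; reading a removed key
// throws AttributeNotFoundException.
long capacity1 = (long) server.getAttribute(instanceObjectName, "capacity1Gauge");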
diff --git a/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestInstanceMonitor.java b/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestInstanceMonitor.java
new file mode 100644
index 0000000..609581b
--- /dev/null
+++ b/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestInstanceMonitor.java
@@ -0,0 +1,75 @@
+package org.apache.helix.monitoring.mbeans;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import javax.management.JMException;
+import javax.management.ObjectName;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+public class TestInstanceMonitor {
+  @Test
+  public void testInstanceMonitor()
+      throws JMException {
+    String testCluster = "testCluster";
+    String testInstance = "testInstance";
+    String testDomain = "testDomain:key=value";
+    Set<String> tags = ImmutableSet.of("test", "DEFAULT");
+    Map<String, List<String>> disabledPartitions =
+        ImmutableMap.of("instance1", ImmutableList.of("partition1", "partition2"));
+    InstanceMonitor monitor =
+        new InstanceMonitor(testCluster, testInstance, new ObjectName(testDomain));
+
+    // Verify init status.
+    Assert.assertEquals(monitor.getSensorName(),
+        "ParticipantStatus.testCluster.DEFAULT.testInstance");
+    Assert.assertEquals(monitor.getInstanceName(), testInstance);
+    Assert.assertEquals(monitor.getOnline(), 0L);
+    Assert.assertEquals(monitor.getEnabled(), 0L);
+    Assert.assertEquals(monitor.getTotalMessageReceived(), 0L);
+    Assert.assertEquals(monitor.getDisabledPartitions(), 0L);
+    Assert.assertEquals(monitor.getMaxCapacityUsageGauge(), 0.0d);
+
+    // Update metrics.
+    monitor.updateMaxCapacityUsage(0.5d);
+    monitor.increaseMessageCount(10L);
+    monitor.updateInstance(tags, disabledPartitions, Collections.emptyList(), true, true);
+
+    // Verify metrics.
+    Assert.assertEquals(monitor.getTotalMessageReceived(), 10L);
+    Assert.assertEquals(monitor.getSensorName(),
+        "ParticipantStatus.testCluster.DEFAULT|test.testInstance");
+    Assert.assertEquals(monitor.getInstanceName(), testInstance);
+    Assert.assertEquals(monitor.getOnline(), 1L);
+    Assert.assertEquals(monitor.getEnabled(), 1L);
+    Assert.assertEquals(monitor.getDisabledPartitions(), 2L);
+    Assert.assertEquals(monitor.getMaxCapacityUsageGauge(), 0.5d);
+
+    monitor.unregister();
+  }
+}
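The asserted sensor names suggest the format ParticipantStatus.<cluster>.<tags>.<instance>, where an untagged instance falls back to DEFAULT and multiple tags are joined by "|" (the asserted "DEFAULT|test" value is consistent with sorting tags before joining). An illustrative reconstruction, not the actual InstanceMonitor code:

private static String sensorName(String cluster, String instance, Set<String> tags) {
  // Assumption inferred from the asserted values: tags are sorted, then joined by "|".
  String tagPart = (tags == null || tags.isEmpty())
      ? "DEFAULT" : String.join("|", new TreeSet<>(tags));
  return String.format("ParticipantStatus.%s.%s.%s", cluster, tagPart, instance);
}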
diff --git a/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestResourceMonitor.java b/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestResourceMonitor.java
index 18576dd..f630124 100644
--- a/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestResourceMonitor.java
+++ b/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestResourceMonitor.java
@@ -19,15 +19,24 @@
  * under the License.
  */
 
+import java.io.IOException;
+import java.lang.management.ManagementFactory;
 import java.util.ArrayList;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
 import java.util.Random;
 import java.util.TreeMap;
+import javax.management.AttributeNotFoundException;
+import javax.management.InstanceNotFoundException;
 import javax.management.JMException;
+import javax.management.MBeanException;
+import javax.management.MBeanServerConnection;
 import javax.management.ObjectName;
+import javax.management.ReflectionException;
 
+import com.google.common.collect.ImmutableMap;
+import org.apache.helix.TestHelper;
 import org.apache.helix.ZNRecord;
 import org.apache.helix.model.BuiltInStateModelDefinitions;
 import org.apache.helix.model.ExternalView;
@@ -46,168 +55,220 @@
   @Test()
   public void testReportData() throws JMException {
     final int n = 5;
-    ResourceMonitor monitor = new ResourceMonitor(_clusterName, _dbName, new ObjectName("testDomain:key=value"));
+    ResourceMonitor monitor =
+        new ResourceMonitor(_clusterName, _dbName, new ObjectName("testDomain:key=value"));
     monitor.register();
 
-    List<String> instances = new ArrayList<>();
-    for (int i = 0; i < n; i++) {
-      String instance = "localhost_" + (12918 + i);
-      instances.add(instance);
-    }
-
-    ZNRecord idealStateRecord = DefaultIdealStateCalculator
-        .calculateIdealState(instances, _partitions, _replicas - 1, _dbName, "MASTER", "SLAVE");
-    IdealState idealState = new IdealState(deepCopyZNRecord(idealStateRecord));
-    idealState.setMinActiveReplicas(_replicas - 1);
-    ExternalView externalView = new ExternalView(deepCopyZNRecord(idealStateRecord));
-    StateModelDefinition stateModelDef =
-        BuiltInStateModelDefinitions.MasterSlave.getStateModelDefinition();
-
-    monitor.updateResourceState(externalView, idealState, stateModelDef);
-
-    Assert.assertEquals(monitor.getDifferenceWithIdealStateGauge(), 0);
-    Assert.assertEquals(monitor.getErrorPartitionGauge(), 0);
-    Assert.assertEquals(monitor.getExternalViewPartitionGauge(), _partitions);
-    Assert.assertEquals(monitor.getPartitionGauge(), _partitions);
-    Assert.assertEquals(monitor.getMissingMinActiveReplicaPartitionGauge(), 0);
-    Assert.assertEquals(monitor.getMissingReplicaPartitionGauge(), 0);
-    Assert.assertEquals(monitor.getMissingTopStatePartitionGauge(), 0);
-    Assert.assertEquals(monitor.getBeanName(), _clusterName + " " + _dbName);
-
-    int errorCount = 5;
-    Random r = new Random();
-    int start = r.nextInt(_partitions - errorCount - 1);
-    for (int i = start; i < start + errorCount; i++) {
-      String partition = _dbName + "_" + i;
-      Map<String, String> map = externalView.getStateMap(partition);
-      for (String key : map.keySet()) {
-        if (map.get(key).equalsIgnoreCase("SLAVE")) {
-          map.put(key, "ERROR");
-          break;
-        }
+    try {
+      List<String> instances = new ArrayList<>();
+      for (int i = 0; i < n; i++) {
+        String instance = "localhost_" + (12918 + i);
+        instances.add(instance);
       }
-      externalView.setStateMap(partition, map);
-    }
 
-    monitor.updateResourceState(externalView, idealState, stateModelDef);
+      ZNRecord idealStateRecord = DefaultIdealStateCalculator
+          .calculateIdealState(instances, _partitions, _replicas - 1, _dbName, "MASTER", "SLAVE");
+      IdealState idealState = new IdealState(deepCopyZNRecord(idealStateRecord));
+      idealState.setMinActiveReplicas(_replicas - 1);
+      ExternalView externalView = new ExternalView(deepCopyZNRecord(idealStateRecord));
+      StateModelDefinition stateModelDef =
+          BuiltInStateModelDefinitions.MasterSlave.getStateModelDefinition();
 
-    Assert.assertEquals(monitor.getDifferenceWithIdealStateGauge(), errorCount);
-    Assert.assertEquals(monitor.getErrorPartitionGauge(), errorCount);
-    Assert.assertEquals(monitor.getExternalViewPartitionGauge(), _partitions);
-    Assert.assertEquals(monitor.getPartitionGauge(), _partitions);
-    Assert.assertEquals(monitor.getMissingMinActiveReplicaPartitionGauge(), 0);
-    Assert.assertEquals(monitor.getMissingReplicaPartitionGauge(), errorCount);
-    Assert.assertEquals(monitor.getMissingTopStatePartitionGauge(), 0);
+      monitor.updateResourceState(externalView, idealState, stateModelDef);
 
-    int lessMinActiveReplica = 6;
-    externalView = new ExternalView(deepCopyZNRecord(idealStateRecord));
-    start = r.nextInt(_partitions - lessMinActiveReplica - 1);
-    for (int i = start; i < start + lessMinActiveReplica; i++) {
-      String partition = _dbName + "_" + i;
-      Map<String, String> map = externalView.getStateMap(partition);
-      Iterator<String> it = map.keySet().iterator();
-      int flag = 0;
-      while (it.hasNext()) {
-        String key = it.next();
-        if (map.get(key).equalsIgnoreCase("SLAVE")) {
-          if (flag++ % 2 == 0) {
-            map.put(key, "OFFLINE");
-          } else {
-            it.remove();
+      Assert.assertEquals(monitor.getDifferenceWithIdealStateGauge(), 0);
+      Assert.assertEquals(monitor.getErrorPartitionGauge(), 0);
+      Assert.assertEquals(monitor.getExternalViewPartitionGauge(), _partitions);
+      Assert.assertEquals(monitor.getPartitionGauge(), _partitions);
+      Assert.assertEquals(monitor.getMissingMinActiveReplicaPartitionGauge(), 0);
+      Assert.assertEquals(monitor.getMissingReplicaPartitionGauge(), 0);
+      Assert.assertEquals(monitor.getMissingTopStatePartitionGauge(), 0);
+      Assert.assertEquals(monitor.getBeanName(), _clusterName + " " + _dbName);
+
+      int errorCount = 5;
+      Random r = new Random();
+      int start = r.nextInt(_partitions - errorCount - 1);
+      for (int i = start; i < start + errorCount; i++) {
+        String partition = _dbName + "_" + i;
+        Map<String, String> map = externalView.getStateMap(partition);
+        for (String key : map.keySet()) {
+          if (map.get(key).equalsIgnoreCase("SLAVE")) {
+            map.put(key, "ERROR");
+            break;
           }
         }
+        externalView.setStateMap(partition, map);
       }
-      externalView.setStateMap(partition, map);
-    }
 
-    monitor.updateResourceState(externalView, idealState, stateModelDef);
+      monitor.updateResourceState(externalView, idealState, stateModelDef);
 
-    Assert.assertEquals(monitor.getDifferenceWithIdealStateGauge(), lessMinActiveReplica);
-    Assert.assertEquals(monitor.getErrorPartitionGauge(), 0);
-    Assert.assertEquals(monitor.getExternalViewPartitionGauge(), _partitions);
-    Assert.assertEquals(monitor.getPartitionGauge(), _partitions);
-    Assert.assertEquals(monitor.getMissingMinActiveReplicaPartitionGauge(), lessMinActiveReplica);
-    Assert.assertEquals(monitor.getMissingReplicaPartitionGauge(), lessMinActiveReplica);
-    Assert.assertEquals(monitor.getMissingTopStatePartitionGauge(), 0);
+      Assert.assertEquals(monitor.getDifferenceWithIdealStateGauge(), errorCount);
+      Assert.assertEquals(monitor.getErrorPartitionGauge(), errorCount);
+      Assert.assertEquals(monitor.getExternalViewPartitionGauge(), _partitions);
+      Assert.assertEquals(monitor.getPartitionGauge(), _partitions);
+      Assert.assertEquals(monitor.getMissingMinActiveReplicaPartitionGauge(), 0);
+      Assert.assertEquals(monitor.getMissingReplicaPartitionGauge(), errorCount);
+      Assert.assertEquals(monitor.getMissingTopStatePartitionGauge(), 0);
 
-    int lessReplica = 4;
-    externalView = new ExternalView(deepCopyZNRecord(idealStateRecord));
-    start = r.nextInt(_partitions - lessReplica - 1);
-    for (int i = start; i < start + lessReplica; i++) {
-      String partition = _dbName + "_" + i;
-      Map<String, String> map = externalView.getStateMap(partition);
-      int flag = 0;
-      Iterator<String> it = map.keySet().iterator();
-      while (it.hasNext()) {
-        String key = it.next();
-        if (map.get(key).equalsIgnoreCase("SLAVE")) {
-          if (flag++ % 2 == 0) {
-            map.put(key, "OFFLINE");
-          } else {
-            it.remove();
+      int lessMinActiveReplica = 6;
+      externalView = new ExternalView(deepCopyZNRecord(idealStateRecord));
+      start = r.nextInt(_partitions - lessMinActiveReplica - 1);
+      for (int i = start; i < start + lessMinActiveReplica; i++) {
+        String partition = _dbName + "_" + i;
+        Map<String, String> map = externalView.getStateMap(partition);
+        Iterator<String> it = map.keySet().iterator();
+        int flag = 0;
+        while (it.hasNext()) {
+          String key = it.next();
+          if (map.get(key).equalsIgnoreCase("SLAVE")) {
+            if (flag++ % 2 == 0) {
+              map.put(key, "OFFLINE");
+            } else {
+              it.remove();
+            }
           }
-          break;
         }
+        externalView.setStateMap(partition, map);
       }
-      externalView.setStateMap(partition, map);
-    }
 
-    monitor.updateResourceState(externalView, idealState, stateModelDef);
+      monitor.updateResourceState(externalView, idealState, stateModelDef);
 
-    Assert.assertEquals(monitor.getDifferenceWithIdealStateGauge(), lessReplica);
-    Assert.assertEquals(monitor.getErrorPartitionGauge(), 0);
-    Assert.assertEquals(monitor.getExternalViewPartitionGauge(), _partitions);
-    Assert.assertEquals(monitor.getPartitionGauge(), _partitions);
-    Assert.assertEquals(monitor.getMissingMinActiveReplicaPartitionGauge(), 0);
-    Assert.assertEquals(monitor.getMissingReplicaPartitionGauge(), lessReplica);
-    Assert.assertEquals(monitor.getMissingTopStatePartitionGauge(), 0);
+      Assert.assertEquals(monitor.getDifferenceWithIdealStateGauge(), lessMinActiveReplica);
+      Assert.assertEquals(monitor.getErrorPartitionGauge(), 0);
+      Assert.assertEquals(monitor.getExternalViewPartitionGauge(), _partitions);
+      Assert.assertEquals(monitor.getPartitionGauge(), _partitions);
+      Assert.assertEquals(monitor.getMissingMinActiveReplicaPartitionGauge(), lessMinActiveReplica);
+      Assert.assertEquals(monitor.getMissingReplicaPartitionGauge(), lessMinActiveReplica);
+      Assert.assertEquals(monitor.getMissingTopStatePartitionGauge(), 0);
 
-    int missTopState = 7;
-    externalView = new ExternalView(deepCopyZNRecord(idealStateRecord));
-    start = r.nextInt(_partitions - missTopState - 1);
-    for (int i = start; i < start + missTopState; i++) {
-      String partition = _dbName + "_" + i;
-      Map<String, String> map = externalView.getStateMap(partition);
-      int flag = 0;
-      for (String key : map.keySet()) {
-        if (map.get(key).equalsIgnoreCase("MASTER")) {
-          if (flag++ % 2 == 0) {
-            map.put(key, "OFFLINE");
-          } else {
-            map.remove(key);
+      int lessReplica = 4;
+      externalView = new ExternalView(deepCopyZNRecord(idealStateRecord));
+      start = r.nextInt(_partitions - lessReplica - 1);
+      for (int i = start; i < start + lessReplica; i++) {
+        String partition = _dbName + "_" + i;
+        Map<String, String> map = externalView.getStateMap(partition);
+        int flag = 0;
+        Iterator<String> it = map.keySet().iterator();
+        while (it.hasNext()) {
+          String key = it.next();
+          if (map.get(key).equalsIgnoreCase("SLAVE")) {
+            if (flag++ % 2 == 0) {
+              map.put(key, "OFFLINE");
+            } else {
+              it.remove();
+            }
+            break;
           }
-          break;
         }
+        externalView.setStateMap(partition, map);
       }
-      externalView.setStateMap(partition, map);
+
+      monitor.updateResourceState(externalView, idealState, stateModelDef);
+
+      Assert.assertEquals(monitor.getDifferenceWithIdealStateGauge(), lessReplica);
+      Assert.assertEquals(monitor.getErrorPartitionGauge(), 0);
+      Assert.assertEquals(monitor.getExternalViewPartitionGauge(), _partitions);
+      Assert.assertEquals(monitor.getPartitionGauge(), _partitions);
+      Assert.assertEquals(monitor.getMissingMinActiveReplicaPartitionGauge(), 0);
+      Assert.assertEquals(monitor.getMissingReplicaPartitionGauge(), lessReplica);
+      Assert.assertEquals(monitor.getMissingTopStatePartitionGauge(), 0);
+
+      int missTopState = 7;
+      externalView = new ExternalView(deepCopyZNRecord(idealStateRecord));
+      start = r.nextInt(_partitions - missTopState - 1);
+      for (int i = start; i < start + missTopState; i++) {
+        String partition = _dbName + "_" + i;
+        Map<String, String> map = externalView.getStateMap(partition);
+        int flag = 0;
+        for (String key : map.keySet()) {
+          if (map.get(key).equalsIgnoreCase("MASTER")) {
+            if (flag++ % 2 == 0) {
+              map.put(key, "OFFLINE");
+            } else {
+              map.remove(key);
+            }
+            break;
+          }
+        }
+        externalView.setStateMap(partition, map);
+      }
+
+      monitor.updateResourceState(externalView, idealState, stateModelDef);
+
+      Assert.assertEquals(monitor.getDifferenceWithIdealStateGauge(), missTopState);
+      Assert.assertEquals(monitor.getErrorPartitionGauge(), 0);
+      Assert.assertEquals(monitor.getExternalViewPartitionGauge(), _partitions);
+      Assert.assertEquals(monitor.getPartitionGauge(), _partitions);
+      Assert.assertEquals(monitor.getMissingMinActiveReplicaPartitionGauge(), 0);
+      Assert.assertEquals(monitor.getMissingReplicaPartitionGauge(), missTopState);
+      Assert.assertEquals(monitor.getMissingTopStatePartitionGauge(), missTopState);
+
+      Assert.assertEquals(monitor.getNumPendingStateTransitionGauge(), 0);
+      // Test pending state transition message reporting and reads.
+      int messageCount = new Random().nextInt(_partitions) + 1;
+      monitor.updatePendingStateTransitionMessages(messageCount);
+      Assert.assertEquals(monitor.getNumPendingStateTransitionGauge(), messageCount);
+
+      Assert.assertEquals(monitor.getRebalanceState(),
+          ResourceMonitor.RebalanceStatus.UNKNOWN.name());
+      monitor.setRebalanceState(ResourceMonitor.RebalanceStatus.NORMAL);
+      Assert
+          .assertEquals(monitor.getRebalanceState(), ResourceMonitor.RebalanceStatus.NORMAL.name());
+      monitor.setRebalanceState(ResourceMonitor.RebalanceStatus.BEST_POSSIBLE_STATE_CAL_FAILED);
+      Assert.assertEquals(monitor.getRebalanceState(),
+          ResourceMonitor.RebalanceStatus.BEST_POSSIBLE_STATE_CAL_FAILED.name());
+      monitor.setRebalanceState(ResourceMonitor.RebalanceStatus.INTERMEDIATE_STATE_CAL_FAILED);
+      Assert.assertEquals(monitor.getRebalanceState(),
+          ResourceMonitor.RebalanceStatus.INTERMEDIATE_STATE_CAL_FAILED.name());
+    } finally {
+      // Unregister the monitor to clean up; otherwise, later tests may be affected and fail.
+      monitor.unregister();
     }
+  }
 
-    monitor.updateResourceState(externalView, idealState, stateModelDef);
+  @Test
+  public void testUpdatePartitionWeightStats() throws JMException, IOException {
+    final MBeanServerConnection mBeanServer = ManagementFactory.getPlatformMBeanServer();
+    final String clusterName = TestHelper.getTestMethodName();
+    final String resource = "testDB";
+    final ObjectName resourceObjectName = new ObjectName("testDomain:key=value");
+    final ResourceMonitor monitor =
+        new ResourceMonitor(clusterName, resource, resourceObjectName);
+    monitor.register();
 
-    Assert.assertEquals(monitor.getDifferenceWithIdealStateGauge(), missTopState);
-    Assert.assertEquals(monitor.getErrorPartitionGauge(), 0);
-    Assert.assertEquals(monitor.getExternalViewPartitionGauge(), _partitions);
-    Assert.assertEquals(monitor.getPartitionGauge(), _partitions);
-    Assert.assertEquals(monitor.getMissingMinActiveReplicaPartitionGauge(), 0);
-    Assert.assertEquals(monitor.getMissingReplicaPartitionGauge(), missTopState);
-    Assert.assertEquals(monitor.getMissingTopStatePartitionGauge(), missTopState);
+    try {
+      Map<String, Map<String, Integer>> partitionWeightMap =
+          ImmutableMap.of(resource, ImmutableMap.of("capacity1", 20, "capacity2", 40));
 
-    Assert.assertEquals(monitor.getNumPendingStateTransitionGauge(), 0);
-    // test pending state transition message report and read
-    int messageCount = new Random().nextInt(_partitions) + 1;
-    monitor.updatePendingStateTransitionMessages(messageCount);
-    Assert.assertEquals(monitor.getNumPendingStateTransitionGauge(), messageCount);
+      // Update Metrics
+      partitionWeightMap.values().forEach(monitor::updatePartitionWeightStats);
 
-    Assert
-        .assertEquals(monitor.getRebalanceState(), ResourceMonitor.RebalanceStatus.UNKNOWN.name());
-    monitor.setRebalanceState(ResourceMonitor.RebalanceStatus.NORMAL);
-    Assert.assertEquals(monitor.getRebalanceState(), ResourceMonitor.RebalanceStatus.NORMAL.name());
-    monitor.setRebalanceState(ResourceMonitor.RebalanceStatus.BEST_POSSIBLE_STATE_CAL_FAILED);
-    Assert.assertEquals(monitor.getRebalanceState(),
-        ResourceMonitor.RebalanceStatus.BEST_POSSIBLE_STATE_CAL_FAILED.name());
-    monitor.setRebalanceState(ResourceMonitor.RebalanceStatus.INTERMEDIATE_STATE_CAL_FAILED);
-    Assert.assertEquals(monitor.getRebalanceState(),
-        ResourceMonitor.RebalanceStatus.INTERMEDIATE_STATE_CAL_FAILED.name());
+      verifyPartitionWeightMetrics(mBeanServer, resourceObjectName, partitionWeightMap);
+
+      // Change capacity keys: "capacity2" -> "capacity3"
+      partitionWeightMap =
+          ImmutableMap.of(resource, ImmutableMap.of("capacity1", 20, "capacity3", 60));
+
+      // Update metrics.
+      partitionWeightMap.values().forEach(monitor::updatePartitionWeightStats);
+
+      // Verify results.
+      verifyPartitionWeightMetrics(mBeanServer, resourceObjectName, partitionWeightMap);
+
+      // "capacity2" metric should not exist in MBean server.
+      String removedAttribute = "capacity2Gauge";
+      try {
+        mBeanServer.getAttribute(resourceObjectName, removedAttribute);
+        Assert.fail("AttributeNotFoundException should be thrown because attribute [capacity2Gauge]"
+            + " is removed.");
+      } catch (AttributeNotFoundException expected) {
+      }
+    } finally {
+      // Reset monitor.
+      monitor.unregister();
+      Assert.assertFalse(mBeanServer.isRegistered(resourceObjectName),
+          "Failed to unregister resource monitor.");
+    }
   }
 
   /**
@@ -240,4 +301,28 @@
 
     return copy;
   }
+
+  private void verifyPartitionWeightMetrics(MBeanServerConnection mBeanServer,
+      ObjectName objectName, Map<String, Map<String, Integer>> expectedPartitionWeightMap)
+      throws IOException, AttributeNotFoundException, MBeanException, ReflectionException,
+             InstanceNotFoundException {
+    final String gaugeMetricSuffix = "Gauge";
+    for (Map.Entry<String, Map<String, Integer>> entry : expectedPartitionWeightMap.entrySet()) {
+      // Resource monitor for this resource is already registered.
+      Assert.assertTrue(mBeanServer.isRegistered(objectName));
+
+      for (Map.Entry<String, Integer> capacityEntry : entry.getValue().entrySet()) {
+        String attributeName = capacityEntry.getKey() + gaugeMetricSuffix;
+        try {
+          // Wait until the attribute is registered in the MBean server.
+          Assert.assertTrue(TestHelper.verify(
+              () -> !mBeanServer.getAttributes(objectName, new String[]{attributeName}).isEmpty(),
+              2000));
+        } catch (Exception ignored) {
+          // Ignored: the assertion below surfaces any remaining mismatch.
+        }
+        Assert.assertEquals((long) mBeanServer.getAttribute(objectName, attributeName),
+            (long) capacityEntry.getValue());
+      }
+    }
+  }
 }
diff --git a/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestRoutingTableProviderMonitor.java b/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestRoutingTableProviderMonitor.java
index f2b7631..5119d81 100644
--- a/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestRoutingTableProviderMonitor.java
+++ b/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestRoutingTableProviderMonitor.java
@@ -3,6 +3,7 @@
 import java.lang.management.ManagementFactory;
 import java.util.HashSet;
 import java.util.Set;
+import javax.management.AttributeNotFoundException;
 import javax.management.JMException;
 import javax.management.MBeanServer;
 import javax.management.MalformedObjectNameException;
@@ -82,8 +83,15 @@
     Assert.assertEquals((long) _beanServer.getAttribute(name, "EventQueueSizeGauge"), 15);
     Assert.assertEquals((long) _beanServer.getAttribute(name, "DataRefreshLatencyGauge.Max"), 0);
     Assert.assertEquals((long) _beanServer.getAttribute(name, "DataRefreshCounter"), 0);
+
     // StatePropagationLatencyGauge only apply for current state
-    Assert.assertEquals(_beanServer.getAttribute(name, "StatePropagationLatencyGauge.Max"), null);
+    try {
+      _beanServer.getAttribute(name, "StatePropagationLatencyGauge.Max");
+      Assert.fail("StatePropagationLatencyGauge should not be registered for this monitor.");
+    } catch (AttributeNotFoundException ex) {
+      // Expected: the metric does not exist in the MBean server.
+    }
 
     long startTime = System.currentTimeMillis();
     Thread.sleep(5);
diff --git a/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestZkClientMonitor.java b/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestZkClientMonitor.java
index 22be0a5..2695ee4 100644
--- a/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestZkClientMonitor.java
+++ b/helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestZkClientMonitor.java
@@ -20,6 +20,7 @@
  */
 
 import java.lang.management.ManagementFactory;
+import javax.management.AttributeNotFoundException;
 import javax.management.JMException;
 import javax.management.MBeanServer;
 import javax.management.MalformedObjectNameException;
@@ -117,7 +118,13 @@
     requestGauge = (long) _beanServer.getAttribute(name, "OutstandingRequestGauge");
     Assert.assertEquals(requestGauge, 0);
 
-    Assert.assertNull(_beanServer.getAttribute(name, "PendingCallbackGauge"));
+    try {
+      _beanServer.getAttribute(name, "PendingCallbackGauge");
+      Assert.fail("PendingCallbackGauge should not be registered in the MBean server.");
+    } catch (AttributeNotFoundException ex) {
+      // Expected: the metric does not exist in the MBean server.
+    }
 
     monitor.record("TEST/IDEALSTATES/myResource", 0, System.currentTimeMillis() - 10,
         ZkClientMonitor.AccessType.READ);
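The "attribute must be absent" check now appears as a try/fail/catch block in TestClusterStatusMonitor, TestResourceMonitor, TestRoutingTableProviderMonitor, and here. A small hypothetical helper could express the same check once:

private static void assertAttributeAbsent(MBeanServerConnection server, ObjectName name,
    String attribute) throws Exception {
  try {
    server.getAttribute(name, attribute);
    Assert.fail("Attribute [" + attribute + "] should not exist in the MBean server.");
  } catch (AttributeNotFoundException expected) {
    // Expected: the attribute is not registered.
  }
}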
diff --git a/helix-core/src/test/java/org/apache/helix/tools/TestClusterVerifier.java b/helix-core/src/test/java/org/apache/helix/tools/TestClusterVerifier.java
index 47113b8..3d50c4c 100644
--- a/helix-core/src/test/java/org/apache/helix/tools/TestClusterVerifier.java
+++ b/helix-core/src/test/java/org/apache/helix/tools/TestClusterVerifier.java
@@ -126,25 +126,43 @@
         new BestPossibleExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient).build();
     Assert.assertTrue(bestPossibleVerifier.verify(10000));
 
-    HelixClusterVerifier strictMatchVerifier =
-        new StrictMatchExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient).build();
-    Assert.assertTrue(strictMatchVerifier.verify(10000));
-
-    // Disable partition for 1 instance, then Full-Auto ExternalView should not match IdealState.
-    _admin.enablePartition(false, _clusterName, _participants[0].getInstanceName(), FULL_AUTO_RESOURCES[0],
-        Lists.newArrayList(FULL_AUTO_RESOURCES[0] + "_0"));
-
-    boolean isVerifiedFalse = TestHelper.verify(() -> {
-      HelixClusterVerifier strictMatchVerifierTemp =
-          new StrictMatchExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient).build();
-      boolean verified = strictMatchVerifierTemp.verify(3000);
-      return (!verified);
-    }, TestHelper.WAIT_DURATION);
-    Assert.assertTrue(isVerifiedFalse);
+    // Disable partition for 1 instance, then Full-Auto ExternalView should match IdealState.
+    _admin.enablePartition(false, _clusterName, _participants[0].getInstanceName(),
+        FULL_AUTO_RESOURCES[0], Lists.newArrayList(FULL_AUTO_RESOURCES[0] + "_0"));
+    Thread.sleep(1000);
+    Assert.assertTrue(bestPossibleVerifier.verify(3000));
 
     // Enable the partition back
-    _admin.enablePartition(true, _clusterName, _participants[0].getInstanceName(), FULL_AUTO_RESOURCES[0],
-        Lists.newArrayList(FULL_AUTO_RESOURCES[0] + "_0"));
+    _admin.enablePartition(true, _clusterName, _participants[0].getInstanceName(),
+        FULL_AUTO_RESOURCES[0], Lists.newArrayList(FULL_AUTO_RESOURCES[0] + "_0"));
+    Thread.sleep(1000);
+    Assert.assertTrue(bestPossibleVerifier.verify(10000));
+
+    // Make 1 instance non-live
+    _participants[0].syncStop();
+    Thread.sleep(1000);
+    Assert.assertTrue(bestPossibleVerifier.verify(10000));
+
+    // Recover the participant before next test
+    String id = _participants[0].getInstanceName();
+    _participants[0] = new MockParticipantManager(ZK_ADDR, _clusterName, id);
+    _participants[0].syncStart();
+
+    HelixClusterVerifier strictMatchVerifier =
+        new StrictMatchExternalViewVerifier.Builder(_clusterName)
+            .setResources(Sets.newHashSet(RESOURCES)).setZkClient(_gZkClient)
+            .setDeactivatedNodeAwareness(true).build();
+    Assert.assertTrue(strictMatchVerifier.verify(10000));
+
+    // Disable partition for 1 instance, then Full-Auto ExternalView should match IdealState.
+    _admin.enablePartition(false, _clusterName, _participants[0].getInstanceName(),
+        FULL_AUTO_RESOURCES[0], Lists.newArrayList(FULL_AUTO_RESOURCES[0] + "_0"));
+    Thread.sleep(1000);
+    Assert.assertTrue(strictMatchVerifier.verify(3000));
+
+    // Enable the partition back
+    _admin.enablePartition(true, _clusterName, _participants[0].getInstanceName(),
+        FULL_AUTO_RESOURCES[0], Lists.newArrayList(FULL_AUTO_RESOURCES[0] + "_0"));
     Thread.sleep(1000);
     Assert.assertTrue(strictMatchVerifier.verify(10000));
 
@@ -152,17 +170,20 @@
     _participants[0].syncStop();
     Thread.sleep(1000);
 
-    // Semi-Auto ExternalView should not match IdealState
+    // With deactivated node awareness, Semi-Auto ExternalView should match IdealState.
     for (String resource : SEMI_AUTO_RESOURCES) {
-      System.out.println("Un-verify resource: " + resource);
-      strictMatchVerifier = new StrictMatchExternalViewVerifier.Builder(_clusterName)
-          .setZkClient(_gZkClient).setResources(Sets.newHashSet(resource)).build();
-      Assert.assertFalse(strictMatchVerifier.verify(3000));
+      System.out.println("Verify resource: " + resource);
+      strictMatchVerifier =
+          new StrictMatchExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient)
+              .setResources(Sets.newHashSet(resource)).setDeactivatedNodeAwareness(true).build();
+      Assert.assertTrue(strictMatchVerifier.verify(3000));
     }
 
-    // Full-Auto still match, because preference list wouldn't contain non-live instances
-    strictMatchVerifier = new StrictMatchExternalViewVerifier.Builder(_clusterName)
-        .setZkClient(_gZkClient).setResources(Sets.newHashSet(FULL_AUTO_RESOURCES)).build();
+    // With deactivated node awareness, Full-Auto ExternalView should also match IdealState.
+    strictMatchVerifier =
+        new StrictMatchExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient)
+            .setResources(Sets.newHashSet(FULL_AUTO_RESOURCES)).setDeactivatedNodeAwareness(true)
+            .build();
     Assert.assertTrue(strictMatchVerifier.verify(10000));
   }
 
@@ -180,7 +201,7 @@
 
     // Ensure that this passes even when one resource is down
     _admin.enableInstance(_clusterName, "localhost_12918", false);
-
+    Thread.sleep(1000);
     ZkHelixClusterVerifier verifier =
         new BestPossibleExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient)
             .setResources(Sets.newHashSet(testDB)).build();
@@ -195,7 +216,7 @@
     Assert.assertTrue(verifier.verifyByPolling());
 
     verifier = new StrictMatchExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient)
-        .setResources(Sets.newHashSet(testDB)).build();
+        .setResources(Sets.newHashSet(testDB)).setDeactivatedNodeAwareness(true).build();
     Assert.assertTrue(verifier.verifyByPolling());
 
     // But the full cluster verification should fail
@@ -203,8 +224,8 @@
         new BestPossibleExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient).build();
     Assert.assertFalse(verifier.verify(3000));
 
-    verifier =
-        new StrictMatchExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient).build();
+    verifier = new StrictMatchExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient)
+        .setDeactivatedNodeAwareness(true).build();
     Assert.assertFalse(verifier.verify(3000));
 
     _admin.enableCluster(_clusterName, true);
@@ -218,7 +239,8 @@
     Assert.assertTrue(bestPossibleVerifier.verify(10000));
 
     HelixClusterVerifier strictMatchVerifier =
-        new StrictMatchExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient).build();
+        new StrictMatchExternalViewVerifier.Builder(_clusterName).setZkClient(_gZkClient)
+            .setDeactivatedNodeAwareness(true).build();
     Assert.assertTrue(strictMatchVerifier.verify(10000));
 
     // Re-start a new participant with sleeping transition(all state model transition cannot finish)
diff --git a/helix-core/src/test/resources/TestResourceUsageCalculator.MeasureBaselineDivergence.json b/helix-core/src/test/resources/TestResourceUsageCalculator.MeasureBaselineDivergence.json
new file mode 100644
index 0000000..dab432e
--- /dev/null
+++ b/helix-core/src/test/resources/TestResourceUsageCalculator.MeasureBaselineDivergence.json
@@ -0,0 +1,37 @@
+[
+  {
+    "baseline": {
+      "resource1": {
+        "partition1": {
+          "instance1": "MASTER",
+          "instance2": "SLAVE"
+        },
+        "partition2": {
+          "instance2": "SLAVE"
+        }
+      }
+    },
+    "someMatchBestPossible": {
+      "resource1": {
+        "partition1": {
+          "instance1": "MASTER",
+          "instance3": "SLAVE"
+        },
+        "partition2": {
+          "instance3": "MASTER"
+        }
+      }
+    },
+    "noMatchBestPossible": {
+      "resource1": {
+        "partition1": {
+          "instance2": "MASTER",
+          "instance3": "SLAVE"
+        },
+        "partition2": {
+          "instance3": "MASTER"
+        }
+      }
+    }
+  }
+]
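
The fixture above pairs one baseline assignment with a partially matching and a fully mismatched best-possible assignment. A sketch of the assumed semantics being exercised (not necessarily the exact production formula in ResourceUsageCalculator): divergence is the fraction of baseline replica placements, i.e. (resource, partition, instance -> state) entries, that the best-possible assignment fails to reproduce. Under that reading the fixture yields 2/3 for someMatchBestPossible and 1.0 for noMatchBestPossible.

    import java.util.Collections;
    import java.util.Map;

    static double measureBaselineDivergence(
        Map<String, Map<String, Map<String, String>>> baseline,
        Map<String, Map<String, Map<String, String>>> bestPossible) {
      int total = 0;
      int matched = 0;
      for (Map.Entry<String, Map<String, Map<String, String>>> resource : baseline.entrySet()) {
        Map<String, Map<String, String>> bpPartitions =
            bestPossible.getOrDefault(resource.getKey(), Collections.emptyMap());
        for (Map.Entry<String, Map<String, String>> partition : resource.getValue().entrySet()) {
          Map<String, String> bpReplicas =
              bpPartitions.getOrDefault(partition.getKey(), Collections.emptyMap());
          for (Map.Entry<String, String> replica : partition.getValue().entrySet()) {
            total++;
            // A placement counts as matched only if the same instance holds the same state.
            if (replica.getValue().equals(bpReplicas.get(replica.getKey()))) {
              matched++;
            }
          }
        }
      }
      return total == 0 ? 0.0 : 1.0 - (double) matched / total;
    }
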
diff --git a/helix-core/src/test/resources/TestStrictMatchExternalViewVerifier.ComputeIdealMapping.json b/helix-core/src/test/resources/TestStrictMatchExternalViewVerifier.ComputeIdealMapping.json
index b237b55..4fde15b 100644
--- a/helix-core/src/test/resources/TestStrictMatchExternalViewVerifier.ComputeIdealMapping.json
+++ b/helix-core/src/test/resources/TestStrictMatchExternalViewVerifier.ComputeIdealMapping.json
@@ -7,7 +7,11 @@
       "node_2",
       "node_3"
     ],
-    "liveAndEnabledInstances": [],
+    "liveAndEnabledInstances": [
+      "node_1",
+      "node_2",
+      "node_3"
+    ],
     "expectedBestPossibleStateMap": {
       "node_1": "MASTER",
       "node_2": "SLAVE",
@@ -22,11 +26,15 @@
       "node_2",
       "node_3"
     ],
-    "liveAndEnabledInstances": [],
+    "liveAndEnabledInstances": [
+      "node_1",
+      "node_2",
+      "node_3"
+    ],
     "expectedBestPossibleStateMap": {
       "node_1": "ONLINE",
       "node_2": "ONLINE",
       "node_3": "ONLINE"
     }
   }
-]
\ No newline at end of file
+]
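
The corrected fixture now lists all three nodes as live and enabled, which is what makes the expected state maps derivable at all. A sketch of the kind of computation being tested (assumed behavior; the real helper lives alongside StrictMatchExternalViewVerifier): walk the preference list in order, skip anything not live and enabled, and hand out states by priority until each state's replica count is exhausted.

    import java.util.Iterator;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // stateCounts must be ordered by state priority, e.g. {MASTER=1, SLAVE=2} or {ONLINE=3}.
    static Map<String, String> computeIdealMapping(List<String> preferenceList,
        Set<String> liveAndEnabled, LinkedHashMap<String, Integer> stateCounts) {
      Map<String, String> mapping = new LinkedHashMap<>();
      Iterator<Map.Entry<String, Integer>> states = stateCounts.entrySet().iterator();
      Map.Entry<String, Integer> current = states.hasNext() ? states.next() : null;
      int assigned = 0;
      for (String instance : preferenceList) {
        if (current == null) {
          break; // every state's quota is filled
        }
        if (!liveAndEnabled.contains(instance)) {
          continue; // dead or disabled nodes never appear in the expected map
        }
        mapping.put(instance, current.getKey());
        if (++assigned == current.getValue()) {
          current = states.hasNext() ? states.next() : null;
          assigned = 0;
        }
      }
      return mapping;
    }

With the fixture's MasterSlave case this yields {node_1=MASTER, node_2=SLAVE, node_3=SLAVE}, matching expectedBestPossibleStateMap.
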
diff --git a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/AbstractResource.java b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/AbstractResource.java
index 78bfd77..8e47b77 100644
--- a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/AbstractResource.java
+++ b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/AbstractResource.java
@@ -67,7 +67,15 @@
     rebalance,
     reset,
     resetPartitions,
-    removeInstanceTag
+    removeInstanceTag,
+    addResource,
+    addWagedResource,
+    getResource,
+    validateWeight,
+    enableWagedRebalance,
+    enableWagedRebalanceForAllResources,
+    getInstance,
+    getAllInstances
   }
 
   @Context
diff --git a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/ClusterAccessor.java b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/ClusterAccessor.java
index e4f0358..f7f9af5 100644
--- a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/ClusterAccessor.java
+++ b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/ClusterAccessor.java
@@ -238,7 +238,15 @@
       helixAdmin.manuallyEnableMaintenanceMode(clusterId, command == Command.enableMaintenanceMode,
           content, customFieldsMap);
       break;
-
+    case enableWagedRebalanceForAllResources:
+      // Enable WAGED rebalance for all resources in the cluster
+      List<String> resources = helixAdmin.getResourcesInCluster(clusterId);
+      try {
+        helixAdmin.enableWagedRebalance(clusterId, resources);
+      } catch (HelixException e) {
+        return badRequest(e.getMessage());
+      }
+      break;
     default:
       return badRequest("Unsupported command " + command);
     }
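
For reference, the new cluster-level command can be exercised with any JAX-RS 2.x client. The host, port, and cluster name below are placeholders, not something this patch prescribes:

    import javax.ws.rs.client.Client;
    import javax.ws.rs.client.ClientBuilder;
    import javax.ws.rs.client.Entity;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;

    public class EnableWagedForAllResources {
      public static void main(String[] args) {
        Client client = ClientBuilder.newClient();
        Response response = client
            .target("http://localhost:8100/admin/v2/clusters/myCluster")
            .queryParam("command", "enableWagedRebalanceForAllResources")
            .request()
            .post(Entity.entity("", MediaType.APPLICATION_JSON_TYPE));
        // 200 means every resource's IdealState now references the WAGED rebalancer;
        // 400 carries the HelixException message surfaced by enableWagedRebalance.
        System.out.println(response.getStatus());
        client.close();
      }
    }
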
diff --git a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstancesAccessor.java b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstancesAccessor.java
index 9aedddd..4578172 100644
--- a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstancesAccessor.java
+++ b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstancesAccessor.java
@@ -1,5 +1,24 @@
 package org.apache.helix.rest.server.resources.helix;
 
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collections;
@@ -8,6 +27,7 @@
 import java.util.Map;
 import java.util.Set;
 import java.util.TreeSet;
+import javax.ws.rs.DefaultValue;
 import javax.ws.rs.GET;
 import javax.ws.rs.POST;
 import javax.ws.rs.Path;
@@ -61,41 +81,65 @@
   }
 
   @GET
-  public Response getAllInstances(@PathParam("clusterId") String clusterId) {
+  public Response getAllInstances(@PathParam("clusterId") String clusterId,
+      @DefaultValue("getAllInstances") @QueryParam("command") String command) {
+    // Get the command. If not provided, the default would be "getAllInstances"
+    Command cmd;
+    try {
+      cmd = Command.valueOf(command);
+    } catch (Exception e) {
+      return badRequest("Invalid command : " + command);
+    }
+
     HelixDataAccessor accessor = getDataAccssor(clusterId);
     List<String> instances = accessor.getChildNames(accessor.keyBuilder().instanceConfigs());
-
     if (instances == null) {
       return notFound();
     }
 
-    ObjectNode root = JsonNodeFactory.instance.objectNode();
-    root.put(Properties.id.name(), JsonNodeFactory.instance.textNode(clusterId));
+    switch (cmd) {
+    case getAllInstances:
+      ObjectNode root = JsonNodeFactory.instance.objectNode();
+      root.put(Properties.id.name(), JsonNodeFactory.instance.textNode(clusterId));
 
-    ArrayNode instancesNode = root.putArray(InstancesAccessor.InstancesProperties.instances.name());
-    instancesNode.addAll((ArrayNode) OBJECT_MAPPER.valueToTree(instances));
-    ArrayNode onlineNode = root.putArray(InstancesAccessor.InstancesProperties.online.name());
-    ArrayNode disabledNode = root.putArray(InstancesAccessor.InstancesProperties.disabled.name());
+      ArrayNode instancesNode =
+          root.putArray(InstancesAccessor.InstancesProperties.instances.name());
+      instancesNode.addAll((ArrayNode) OBJECT_MAPPER.valueToTree(instances));
+      ArrayNode onlineNode = root.putArray(InstancesAccessor.InstancesProperties.online.name());
+      ArrayNode disabledNode = root.putArray(InstancesAccessor.InstancesProperties.disabled.name());
 
-    List<String> liveInstances = accessor.getChildNames(accessor.keyBuilder().liveInstances());
-    ClusterConfig clusterConfig = accessor.getProperty(accessor.keyBuilder().clusterConfig());
+      List<String> liveInstances = accessor.getChildNames(accessor.keyBuilder().liveInstances());
+      ClusterConfig clusterConfig = accessor.getProperty(accessor.keyBuilder().clusterConfig());
 
-    for (String instanceName : instances) {
-      InstanceConfig instanceConfig =
-          accessor.getProperty(accessor.keyBuilder().instanceConfig(instanceName));
-      if (instanceConfig != null) {
-        if (!instanceConfig.getInstanceEnabled() || (clusterConfig.getDisabledInstances() != null
-            && clusterConfig.getDisabledInstances().containsKey(instanceName))) {
-          disabledNode.add(JsonNodeFactory.instance.textNode(instanceName));
-        }
+      for (String instanceName : instances) {
+        InstanceConfig instanceConfig =
+            accessor.getProperty(accessor.keyBuilder().instanceConfig(instanceName));
+        if (instanceConfig != null) {
+          if (!instanceConfig.getInstanceEnabled() || (clusterConfig.getDisabledInstances() != null
+              && clusterConfig.getDisabledInstances().containsKey(instanceName))) {
+            disabledNode.add(JsonNodeFactory.instance.textNode(instanceName));
+          }
 
-        if (liveInstances.contains(instanceName)){
-          onlineNode.add(JsonNodeFactory.instance.textNode(instanceName));
+          if (liveInstances.contains(instanceName)) {
+            onlineNode.add(JsonNodeFactory.instance.textNode(instanceName));
+          }
         }
       }
+      return JSONRepresentation(root);
+    case validateWeight:
+      // Validate all instances for WAGED rebalance
+      HelixAdmin admin = getHelixAdmin();
+      Map<String, Boolean> validationResultMap;
+      try {
+        validationResultMap = admin.validateInstancesForWagedRebalance(clusterId, instances);
+      } catch (HelixException e) {
+        return badRequest(e.getMessage());
+      }
+      return JSONRepresentation(validationResultMap);
+    default:
+      _logger.error("Unsupported command :" + command);
+      return badRequest("Unsupported command :" + command);
     }
-
-    return JSONRepresentation(root);
   }
 
   @POST
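
On the caller's side, a validateWeight response from this endpoint deserializes into a plain instance-to-boolean map. A hedged sketch of consuming it (the URL is a placeholder, and any Jackson 2.x ObjectMapper works):

    import java.util.Map;
    import javax.ws.rs.client.Client;
    import javax.ws.rs.client.ClientBuilder;
    import com.fasterxml.jackson.core.type.TypeReference;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class ValidateInstanceWeights {
      public static void main(String[] args) throws Exception {
        Client client = ClientBuilder.newClient();
        String json = client
            .target("http://localhost:8100/admin/v2/clusters/myCluster/instances")
            .queryParam("command", "validateWeight")
            .request()
            .get(String.class);
        client.close();

        Map<String, Boolean> results = new ObjectMapper()
            .readValue(json, new TypeReference<Map<String, Boolean>>() { });
        // false means the instance is missing capacity entries for the configured keys
        results.forEach((instance, valid) -> System.out.println(instance + " -> " + valid));
      }
    }
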
diff --git a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/PerInstanceAccessor.java b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/PerInstanceAccessor.java
index a6dfb9a..368c730 100644
--- a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/PerInstanceAccessor.java
+++ b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/PerInstanceAccessor.java
@@ -20,9 +20,12 @@
  */
 
 import java.io.IOException;
+import java.util.Collections;
 import java.util.List;
+import java.util.Map;
 import javax.ws.rs.Consumes;
 import javax.ws.rs.DELETE;
+import javax.ws.rs.DefaultValue;
 import javax.ws.rs.GET;
 import javax.ws.rs.POST;
 import javax.ws.rs.PUT;
@@ -81,16 +84,41 @@
 
   @GET
   public Response getInstanceById(@PathParam("clusterId") String clusterId,
-      @PathParam("instanceName") String instanceName) throws IOException {
-    ObjectMapper objectMapper = new ObjectMapper();
-    HelixDataAccessor dataAccessor = getDataAccssor(clusterId);
-    // TODO reduce GC by dependency injection
-    InstanceService instanceService =
-        new InstanceServiceImpl(new HelixDataAccessorWrapper((ZKHelixDataAccessor) dataAccessor), getConfigAccessor());
-    InstanceInfo instanceInfo = instanceService.getInstanceInfo(clusterId, instanceName,
-        InstanceService.HealthCheck.STARTED_AND_HEALTH_CHECK_LIST);
+      @PathParam("instanceName") String instanceName,
+      @DefaultValue("getInstance") @QueryParam("command") String command) throws IOException {
+    // Get the command. If not provided, the default would be "getInstance"
+    Command cmd;
+    try {
+      cmd = Command.valueOf(command);
+    } catch (Exception e) {
+      return badRequest("Invalid command : " + command);
+    }
 
-    return OK(objectMapper.writeValueAsString(instanceInfo));
+    switch (cmd) {
+    case getInstance:
+      ObjectMapper objectMapper = new ObjectMapper();
+      HelixDataAccessor dataAccessor = getDataAccssor(clusterId);
+      // TODO reduce GC by dependency injection
+      InstanceService instanceService = new InstanceServiceImpl(
+          new HelixDataAccessorWrapper((ZKHelixDataAccessor) dataAccessor), getConfigAccessor());
+      InstanceInfo instanceInfo = instanceService.getInstanceInfo(clusterId, instanceName,
+          InstanceService.HealthCheck.STARTED_AND_HEALTH_CHECK_LIST);
+      return OK(objectMapper.writeValueAsString(instanceInfo));
+    case validateWeight:
+      // Validate the InstanceConfig for WAGED rebalance
+      HelixAdmin admin = getHelixAdmin();
+      Map<String, Boolean> validationResultMap;
+      try {
+        validationResultMap = admin.validateInstancesForWagedRebalance(clusterId,
+            Collections.singletonList(instanceName));
+      } catch (HelixException e) {
+        return badRequest(e.getMessage());
+      }
+      return JSONRepresentation(validationResultMap);
+    default:
+      LOG.error("Unsupported command :" + command);
+      return badRequest("Unsupported command :" + command);
+    }
   }
 
   @POST
@@ -345,7 +373,8 @@
     return notFound();
   }
 
-  @GET @Path("errors")
+  @GET
+  @Path("errors")
   public Response getErrorsOnInstance(@PathParam("clusterId") String clusterId,
       @PathParam("instanceName") String instanceName) throws IOException {
     HelixDataAccessor accessor = getDataAccssor(clusterId);
diff --git a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/ResourceAccessor.java b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/ResourceAccessor.java
index 47f7ec9..c41d024 100644
--- a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/ResourceAccessor.java
+++ b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/ResourceAccessor.java
@@ -49,6 +49,7 @@
 import org.apache.helix.model.ResourceConfig;
 import org.apache.helix.model.StateModelDefinition;
 import org.apache.helix.model.builder.HelixConfigScopeBuilder;
 import org.codehaus.jackson.node.ArrayNode;
 import org.codehaus.jackson.node.JsonNodeFactory;
 import org.codehaus.jackson.node.ObjectNode;
+import org.codehaus.jackson.type.TypeReference;
@@ -161,33 +162,56 @@
   @GET
   @Path("{resourceName}")
   public Response getResource(@PathParam("clusterId") String clusterId,
-      @PathParam("resourceName") String resourceName) {
+      @PathParam("resourceName") String resourceName,
+      @DefaultValue("getResource") @QueryParam("command") String command) {
+    // Get the command. If not provided, the default would be "getResource"
+    Command cmd;
+    try {
+      cmd = Command.valueOf(command);
+    } catch (Exception e) {
+      return badRequest("Invalid command : " + command);
+    }
     ConfigAccessor accessor = getConfigAccessor();
     HelixAdmin admin = getHelixAdmin();
 
-    ResourceConfig resourceConfig = accessor.getResourceConfig(clusterId, resourceName);
-    IdealState idealState = admin.getResourceIdealState(clusterId, resourceName);
-    ExternalView externalView = admin.getResourceExternalView(clusterId, resourceName);
+    switch (cmd) {
+    case getResource:
+      ResourceConfig resourceConfig = accessor.getResourceConfig(clusterId, resourceName);
+      IdealState idealState = admin.getResourceIdealState(clusterId, resourceName);
+      ExternalView externalView = admin.getResourceExternalView(clusterId, resourceName);
 
-    Map<String, ZNRecord> resourceMap = new HashMap<>();
-    if (idealState != null) {
-      resourceMap.put(ResourceProperties.idealState.name(), idealState.getRecord());
-    } else {
-      return notFound();
+      Map<String, ZNRecord> resourceMap = new HashMap<>();
+      if (idealState != null) {
+        resourceMap.put(ResourceProperties.idealState.name(), idealState.getRecord());
+      } else {
+        return notFound();
+      }
+
+      resourceMap.put(ResourceProperties.resourceConfig.name(), null);
+      resourceMap.put(ResourceProperties.externalView.name(), null);
+
+      if (resourceConfig != null) {
+        resourceMap.put(ResourceProperties.resourceConfig.name(), resourceConfig.getRecord());
+      }
+
+      if (externalView != null) {
+        resourceMap.put(ResourceProperties.externalView.name(), externalView.getRecord());
+      }
+      return JSONRepresentation(resourceMap);
+    case validateWeight:
+      // Validate ResourceConfig for WAGED rebalance
+      Map<String, Boolean> validationResultMap;
+      try {
+        validationResultMap = admin.validateResourcesForWagedRebalance(clusterId,
+            Collections.singletonList(resourceName));
+      } catch (HelixException e) {
+        return badRequest(e.getMessage());
+      }
+      return JSONRepresentation(validationResultMap);
+    default:
+      _logger.error("Unsupported command :" + command);
+      return badRequest("Unsupported command :" + command);
     }
-
-    resourceMap.put(ResourceProperties.resourceConfig.name(), null);
-    resourceMap.put(ResourceProperties.externalView.name(), null);
-
-    if (resourceConfig != null) {
-      resourceMap.put(ResourceProperties.resourceConfig.name(), resourceConfig.getRecord());
-    }
-
-    if (externalView != null) {
-      resourceMap.put(ResourceProperties.externalView.name(), externalView.getRecord());
-    }
-
-    return JSONRepresentation(resourceMap);
   }
 
   @PUT
@@ -200,32 +224,81 @@
       @DefaultValue("DEFAULT") @QueryParam("rebalanceStrategy") String rebalanceStrategy,
       @DefaultValue("0") @QueryParam("bucketSize") int bucketSize,
       @DefaultValue("-1") @QueryParam("maxPartitionsPerInstance") int maxPartitionsPerInstance,
-      String content) {
-
-    HelixAdmin admin = getHelixAdmin();
-
+      @DefaultValue("addResource") @QueryParam("command") String command, String content) {
+    // Get the command. If not provided, the default would be "addResource"
+    Command cmd;
     try {
-      if (content.length() != 0) {
-        ZNRecord record;
-        try {
-          record = toZNRecord(content);
-        } catch (IOException e) {
-          _logger.error("Failed to deserialize user's input " + content + ", Exception: " + e);
-          return badRequest("Input is not a vaild ZNRecord!");
-        }
+      cmd = Command.valueOf(command);
+    } catch (Exception e) {
+      return badRequest("Invalid command : " + command);
+    }
+    HelixAdmin admin = getHelixAdmin();
+    try {
+      switch (cmd) {
+      case addResource:
+        if (content.length() != 0) {
+          ZNRecord record;
+          try {
+            record = toZNRecord(content);
+          } catch (IOException e) {
+            _logger.error("Failed to deserialize user's input " + content + ", Exception: " + e);
+            return badRequest("Input is not a valid ZNRecord!");
+          }
 
-        if (record.getSimpleFields() != null) {
-          admin.addResource(clusterId, resourceName, new IdealState(record));
+          if (record.getSimpleFields() != null) {
+            admin.addResource(clusterId, resourceName, new IdealState(record));
+          }
+        } else {
+          admin.addResource(clusterId, resourceName, numPartitions, stateModelRef, rebalancerMode,
+              rebalanceStrategy, bucketSize, maxPartitionsPerInstance);
         }
-      } else {
-        admin.addResource(clusterId, resourceName, numPartitions, stateModelRef, rebalancerMode,
-            rebalanceStrategy, bucketSize, maxPartitionsPerInstance);
+        break;
+      case addWagedResource:
+        // Check if content is valid
+        if (content == null || content.length() == 0) {
+          _logger.error("Input is null or empty!");
+          return badRequest("Input is null or empty!");
+        }
+        Map<String, ZNRecord> input;
+        // Content must supply both IdealState and ResourceConfig
+        try {
+          TypeReference<Map<String, ZNRecord>> typeRef =
+              new TypeReference<Map<String, ZNRecord>>() {
+              };
+          input = OBJECT_MAPPER.readValue(content, typeRef);
+        } catch (IOException e) {
+          _logger.error("Failed to deserialize user's input {}, Exception: {}", content, e);
+          return badRequest("Input is not a valid map of String-ZNRecord pairs!");
+        }
+        // Check if the map contains both IdealState and ResourceConfig
+        ZNRecord idealStateRecord =
+            input.get(ResourceAccessor.ResourceProperties.idealState.name());
+        ZNRecord resourceConfigRecord =
+            input.get(ResourceAccessor.ResourceProperties.resourceConfig.name());
+
+        if (idealStateRecord == null || resourceConfigRecord == null) {
+          _logger.error("Input does not contain both IdealState and ResourceConfig!");
+          return badRequest("Input does not contain both IdealState and ResourceConfig!");
+        }
+        // Add using HelixAdmin API
+        try {
+          admin.addResourceWithWeight(clusterId, new IdealState(idealStateRecord),
+              new ResourceConfig(resourceConfigRecord));
+        } catch (HelixException e) {
+          String errMsg = String.format("Failed to add resource %s with weight in cluster %s!",
+              idealStateRecord.getId(), clusterId);
+          _logger.error(errMsg, e);
+          return badRequest(errMsg);
+        }
+        break;
+      default:
+        _logger.error("Unsupported command :" + command);
+        return badRequest("Unsupported command :" + command);
       }
     } catch (Exception e) {
       _logger.error("Error in adding a resource: " + resourceName, e);
       return serverError(e);
     }
-
     return OK();
   }
 
@@ -259,6 +332,13 @@
         keyPrefix = keyPrefix.length() == 0 ? resourceName : keyPrefix;
         admin.rebalance(clusterId, resourceName, replicas, keyPrefix, group);
         break;
+      case enableWagedRebalance:
+        try {
+          admin.enableWagedRebalance(clusterId, Collections.singletonList(resourceName));
+        } catch (HelixException e) {
+          return badRequest(e.getMessage());
+        }
+        break;
       default:
         _logger.error("Unsupported command :" + command);
         return badRequest("Unsupported command :" + command);
@@ -318,7 +398,7 @@
       record = toZNRecord(content);
     } catch (IOException e) {
       _logger.error("Failed to deserialize user's input " + content + ", Exception: " + e);
-      return badRequest("Input is not a vaild ZNRecord!");
+      return badRequest("Input is not a valid ZNRecord!");
     }
     ResourceConfig resourceConfig = new ResourceConfig(record);
     ConfigAccessor configAccessor = getConfigAccessor();
@@ -380,7 +460,7 @@
       record = toZNRecord(content);
     } catch (IOException e) {
       _logger.error("Failed to deserialize user's input " + content + ", Exception: " + e);
-      return badRequest("Input is not a vaild ZNRecord!");
+      return badRequest("Input is not a valid ZNRecord!");
     }
     IdealState idealState = new IdealState(record);
     HelixAdmin helixAdmin = getHelixAdmin();
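
Tying the addWagedResource contract together: the PUT body is a JSON map with exactly the keys idealState and resourceConfig, each holding a serialized ZNRecord. A sketch of building that payload (the resource name, state model, and capacities are illustrative; the validation itself happens inside HelixAdmin.addResourceWithWeight):

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.helix.ZNRecord;
    import org.apache.helix.model.IdealState;
    import org.apache.helix.model.ResourceConfig;
    import org.codehaus.jackson.map.ObjectMapper;

    public class BuildWagedResourcePayload {
      public static void main(String[] args) throws Exception {
        IdealState idealState = new IdealState("myWagedResource");
        idealState.setNumPartitions(1);
        idealState.setStateModelDefRef("MasterSlave");
        idealState.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);

        // Per-partition weights; the DEFAULT key applies to partitions without an explicit entry.
        Map<String, Integer> weights = new HashMap<>();
        weights.put("FOO", 100);
        weights.put("BAR", 100);
        Map<String, Map<String, Integer>> capacityMap = new HashMap<>();
        capacityMap.put(ResourceConfig.DEFAULT_PARTITION_KEY, weights);

        ResourceConfig resourceConfig = new ResourceConfig("myWagedResource");
        resourceConfig.setPartitionCapacityMap(capacityMap);

        // Keys match ResourceAccessor.ResourceProperties.idealState / .resourceConfig.
        Map<String, ZNRecord> payload = new HashMap<>();
        payload.put("idealState", idealState.getRecord());
        payload.put("resourceConfig", resourceConfig.getRecord());
        String body = new ObjectMapper().writeValueAsString(payload);
        // PUT body to /clusters/{cluster}/resources/myWagedResource?command=addWagedResource
        System.out.println(body);
      }
    }
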
diff --git a/helix-rest/src/test/java/org/apache/helix/rest/server/AbstractTestClass.java b/helix-rest/src/test/java/org/apache/helix/rest/server/AbstractTestClass.java
index 11a450d..e3b33fb 100644
--- a/helix-rest/src/test/java/org/apache/helix/rest/server/AbstractTestClass.java
+++ b/helix-rest/src/test/java/org/apache/helix/rest/server/AbstractTestClass.java
@@ -471,8 +471,9 @@
     final Response response = webTarget.request().get();
     Assert.assertEquals(response.getStatus(), expectedReturnStatus);
 
-    // NOT_FOUND will throw text based html
-    if (expectedReturnStatus != Response.Status.NOT_FOUND.getStatusCode()) {
+    // NOT_FOUND and BAD_REQUEST responses are returned as text-based HTML
+    if (expectedReturnStatus != Response.Status.NOT_FOUND.getStatusCode()
+        && expectedReturnStatus != Response.Status.BAD_REQUEST.getStatusCode()) {
       Assert.assertEquals(response.getMediaType().getType(), "application");
     } else {
       Assert.assertEquals(response.getMediaType().getType(), "text");
diff --git a/helix-rest/src/test/java/org/apache/helix/rest/server/TestClusterAccessor.java b/helix-rest/src/test/java/org/apache/helix/rest/server/TestClusterAccessor.java
index b8d8018..05525c0 100644
--- a/helix-rest/src/test/java/org/apache/helix/rest/server/TestClusterAccessor.java
+++ b/helix-rest/src/test/java/org/apache/helix/rest/server/TestClusterAccessor.java
@@ -39,6 +39,7 @@
 import org.apache.helix.ZNRecord;
 import org.apache.helix.controller.rebalancer.DelayedAutoRebalancer;
 import org.apache.helix.controller.rebalancer.strategy.CrushEdRebalanceStrategy;
+import org.apache.helix.controller.rebalancer.waged.WagedRebalancer;
 import org.apache.helix.integration.manager.ClusterDistributedController;
 import org.apache.helix.manager.zk.ZKHelixDataAccessor;
 import org.apache.helix.manager.zk.ZKUtil;
@@ -563,6 +564,18 @@
     System.out.println("End test :" + TestHelper.getTestMethodName());
   }
 
+  @Test(dependsOnMethods = "testActivateSuperCluster")
+  public void testEnableWagedRebalanceForAllResources() {
+    String cluster = "TestCluster_2";
+    post("clusters/" + cluster, ImmutableMap.of("command", "enableWagedRebalanceForAllResources"),
+        Entity.entity("", MediaType.APPLICATION_JSON_TYPE), Response.Status.OK.getStatusCode());
+    for (String resource : _gSetupTool.getClusterManagementTool().getResourcesInCluster(cluster)) {
+      IdealState idealState =
+          _gSetupTool.getClusterManagementTool().getResourceIdealState(cluster, resource);
+      Assert.assertEquals(idealState.getRebalancerClassName(), WagedRebalancer.class.getName());
+    }
+  }
+
   private ClusterConfig getClusterConfigFromRest(String cluster) throws IOException {
     String body = get("clusters/" + cluster + "/configs", null, Response.Status.OK.getStatusCode(), true);
 
diff --git a/helix-rest/src/test/java/org/apache/helix/rest/server/TestInstancesAccessor.java b/helix-rest/src/test/java/org/apache/helix/rest/server/TestInstancesAccessor.java
index da8f911..6ffd1e5 100644
--- a/helix-rest/src/test/java/org/apache/helix/rest/server/TestInstancesAccessor.java
+++ b/helix-rest/src/test/java/org/apache/helix/rest/server/TestInstancesAccessor.java
@@ -1,7 +1,28 @@
 package org.apache.helix.rest.server;
 
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 import java.io.IOException;
+import java.util.ArrayList;
 import java.util.Arrays;
+import java.util.Collections;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Set;
@@ -152,6 +173,62 @@
     System.out.println("End test :" + TestHelper.getTestMethodName());
   }
 
+  @Test(dependsOnMethods = "testGetAllInstances")
+  public void testValidateWeightForAllInstances() throws IOException {
+    System.out.println("Start test :" + TestHelper.getTestMethodName());
+
+    // Empty out ClusterConfig's weight key setting and InstanceConfig's capacity maps for testing
+    ClusterConfig clusterConfig = _configAccessor.getClusterConfig(CLUSTER_NAME);
+    clusterConfig.getRecord().setListField(
+        ClusterConfig.ClusterConfigProperty.INSTANCE_CAPACITY_KEYS.name(), new ArrayList<>());
+    _configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+    List<String> instances =
+        _gSetupTool.getClusterManagementTool().getInstancesInCluster(CLUSTER_NAME);
+    for (String instance : instances) {
+      InstanceConfig instanceConfig = _configAccessor.getInstanceConfig(CLUSTER_NAME, instance);
+      instanceConfig.setInstanceCapacityMap(Collections.emptyMap());
+      _configAccessor.setInstanceConfig(CLUSTER_NAME, instance, instanceConfig);
+    }
+
+    // Issue a validate call
+    String body = new JerseyUriRequestBuilder("clusters/{}/instances?command=validateWeight")
+        .isBodyReturnExpected(true).format(CLUSTER_NAME).get(this);
+
+    JsonNode node = OBJECT_MAPPER.readTree(body);
+    // All results must be valid (true) because there are no capacity keys set
+    // in ClusterConfig
+    node.iterator().forEachRemaining(child -> Assert.assertTrue(child.booleanValue()));
+
+    clusterConfig = _configAccessor.getClusterConfig(CLUSTER_NAME);
+    clusterConfig.setInstanceCapacityKeys(Arrays.asList("FOO", "BAR"));
+    _configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+
+    body = new JerseyUriRequestBuilder("clusters/{}/instances?command=validateWeight")
+        .isBodyReturnExpected(true).format(CLUSTER_NAME)
+        .expectedReturnStatusCode(Response.Status.BAD_REQUEST.getStatusCode()).get(this);
+    node = OBJECT_MAPPER.readTree(body);
+    // Since instances do not have weight-related configs, the request should return an error
+    Assert.assertTrue(node.has("error"));
+
+    // Now set weight-related configs in InstanceConfigs
+    instances = _gSetupTool.getClusterManagementTool().getInstancesInCluster(CLUSTER_NAME);
+    for (String instance : instances) {
+      InstanceConfig instanceConfig = _configAccessor.getInstanceConfig(CLUSTER_NAME, instance);
+      instanceConfig.setInstanceCapacityMap(ImmutableMap.of("FOO", 1000, "BAR", 1000));
+      _configAccessor.setInstanceConfig(CLUSTER_NAME, instance, instanceConfig);
+    }
+
+    body = new JerseyUriRequestBuilder("clusters/{}/instances?command=validateWeight")
+        .isBodyReturnExpected(true).format(CLUSTER_NAME)
+        .expectedReturnStatusCode(Response.Status.OK.getStatusCode()).get(this);
+    node = OBJECT_MAPPER.readTree(body);
+    // All results must be valid (true) because every instance now has a capacity map
+    // covering the keys in ClusterConfig
+    node.iterator().forEachRemaining(child -> Assert.assertTrue(child.booleanValue()));
+
+    System.out.println("End test :" + TestHelper.getTestMethodName());
+  }
+
   private Set<String> getStringSet(JsonNode jsonNode, String key) {
     Set<String> result = new HashSet<>();
     jsonNode.withArray(key).forEach(s -> result.add(s.textValue()));
diff --git a/helix-rest/src/test/java/org/apache/helix/rest/server/TestPerInstanceAccessor.java b/helix-rest/src/test/java/org/apache/helix/rest/server/TestPerInstanceAccessor.java
index cc56ef2..d57bbde 100644
--- a/helix-rest/src/test/java/org/apache/helix/rest/server/TestPerInstanceAccessor.java
+++ b/helix-rest/src/test/java/org/apache/helix/rest/server/TestPerInstanceAccessor.java
@@ -22,6 +22,7 @@
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Arrays;
+import java.util.Collections;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
@@ -37,6 +38,7 @@
 import org.apache.helix.TestHelper;
 import org.apache.helix.ZNRecord;
 import org.apache.helix.manager.zk.ZKHelixDataAccessor;
+import org.apache.helix.model.ClusterConfig;
 import org.apache.helix.model.InstanceConfig;
 import org.apache.helix.model.Message;
 import org.apache.helix.rest.server.resources.AbstractResource;
@@ -398,4 +400,68 @@
         .format(CLUSTER_NAME, instanceName).post(this, entity);
     System.out.println("End test :" + TestHelper.getTestMethodName());
   }
+
+  /**
+   * Check that the per-instance validateWeight command works by
+   * 1. First call validate -> We should get "true" because nothing is set in ClusterConfig.
+   * 2. Define keys in ClusterConfig and call validate -> We should get BadRequest.
+   * 3. Define weight configs in InstanceConfig and call validate -> We should get OK with "true".
+   */
+  @Test(dependsOnMethods = "checkUpdateFails")
+  public void testValidateWeightForInstance()
+      throws IOException {
+    // Empty out ClusterConfig's weight key setting and InstanceConfig's capacity maps for testing
+    ClusterConfig clusterConfig = _configAccessor.getClusterConfig(CLUSTER_NAME);
+    clusterConfig.getRecord()
+        .setListField(ClusterConfig.ClusterConfigProperty.INSTANCE_CAPACITY_KEYS.name(),
+            new ArrayList<>());
+    _configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+    List<String> instances =
+        _gSetupTool.getClusterManagementTool().getInstancesInCluster(CLUSTER_NAME);
+    for (String instance : instances) {
+      InstanceConfig instanceConfig = _configAccessor.getInstanceConfig(CLUSTER_NAME, instance);
+      instanceConfig.setInstanceCapacityMap(Collections.emptyMap());
+      _configAccessor.setInstanceConfig(CLUSTER_NAME, instance, instanceConfig);
+    }
+
+    // Get one instance in the cluster
+    String selectedInstance =
+        _gSetupTool.getClusterManagementTool().getInstancesInCluster(CLUSTER_NAME).iterator()
+            .next();
+
+    // Issue a validate call
+    String body = new JerseyUriRequestBuilder("clusters/{}/instances/{}?command=validateWeight")
+        .isBodyReturnExpected(true).format(CLUSTER_NAME, selectedInstance).get(this);
+
+    JsonNode node = OBJECT_MAPPER.readTree(body);
+    // The result must be valid (true) because no capacity keys are set
+    // in ClusterConfig
+    node.iterator().forEachRemaining(child -> Assert.assertTrue(child.getBooleanValue()));
+
+    // Define keys in ClusterConfig
+    clusterConfig = _configAccessor.getClusterConfig(CLUSTER_NAME);
+    clusterConfig.setInstanceCapacityKeys(Arrays.asList("FOO", "BAR"));
+    _configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+
+    body = new JerseyUriRequestBuilder("clusters/{}/instances/{}?command=validateWeight")
+        .isBodyReturnExpected(true).format(CLUSTER_NAME, selectedInstance)
+        .expectedReturnStatusCode(Response.Status.BAD_REQUEST.getStatusCode()).get(this);
+    node = OBJECT_MAPPER.readTree(body);
+    // Since the instance does not have weight-related configs, the request should return an error
+    Assert.assertTrue(node.has("error"));
+
+    // Now set weight-related config in InstanceConfig
+    InstanceConfig instanceConfig =
+        _configAccessor.getInstanceConfig(CLUSTER_NAME, selectedInstance);
+    instanceConfig.setInstanceCapacityMap(ImmutableMap.of("FOO", 1000, "BAR", 1000));
+    _configAccessor.setInstanceConfig(CLUSTER_NAME, selectedInstance, instanceConfig);
+
+    body = new JerseyUriRequestBuilder("clusters/{}/instances/{}?command=validateWeight")
+        .isBodyReturnExpected(true).format(CLUSTER_NAME, selectedInstance)
+        .expectedReturnStatusCode(Response.Status.OK.getStatusCode()).get(this);
+    node = OBJECT_MAPPER.readTree(body);
+    // All results must be valid (true) because the instance now has a capacity map
+    // covering the keys in ClusterConfig
+    node.iterator().forEachRemaining(child -> Assert.assertTrue(child.getBooleanValue()));
+  }
 }
diff --git a/helix-rest/src/test/java/org/apache/helix/rest/server/TestResourceAccessor.java b/helix-rest/src/test/java/org/apache/helix/rest/server/TestResourceAccessor.java
index 62e68c5..f865674 100644
--- a/helix-rest/src/test/java/org/apache/helix/rest/server/TestResourceAccessor.java
+++ b/helix-rest/src/test/java/org/apache/helix/rest/server/TestResourceAccessor.java
@@ -41,8 +41,11 @@
 import org.apache.helix.PropertyPathBuilder;
 import org.apache.helix.TestHelper;
 import org.apache.helix.ZNRecord;
+import org.apache.helix.controller.rebalancer.waged.WagedRebalancer;
+import org.apache.helix.model.ClusterConfig;
 import org.apache.helix.model.ExternalView;
 import org.apache.helix.model.IdealState;
+import org.apache.helix.model.InstanceConfig;
 import org.apache.helix.model.ResourceConfig;
 import org.apache.helix.model.builder.FullAutoModeISBuilder;
 import org.apache.helix.rest.server.resources.helix.ResourceAccessor;
@@ -429,10 +432,30 @@
   }
 
   /**
+   * Test "enableWagedRebalance" command of updateResource.
+   */
+  @Test(dependsOnMethods = "updateResourceIdealState")
+  public void testEnableWagedRebalance() {
+    IdealState idealState =
+        _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, RESOURCE_NAME);
+    Assert.assertNotEquals(idealState.getRebalancerClassName(), WagedRebalancer.class.getName());
+
+    // Enable waged rebalance, which should change the rebalancer class name
+    Entity entity = Entity.entity(null, MediaType.APPLICATION_JSON_TYPE);
+    post("clusters/" + CLUSTER_NAME + "/resources/" + RESOURCE_NAME,
+        Collections.singletonMap("command", "enableWagedRebalance"), entity,
+        Response.Status.OK.getStatusCode());
+
+    idealState =
+        _gSetupTool.getClusterManagementTool().getResourceIdealState(CLUSTER_NAME, RESOURCE_NAME);
+    Assert.assertEquals(idealState.getRebalancerClassName(), WagedRebalancer.class.getName());
+  }
+
+  /**
    * Test "delete" command of updateResourceIdealState.
    * @throws Exception
    */
-  @Test(dependsOnMethods = "updateResourceIdealState")
+  @Test(dependsOnMethods = "testEnableWagedRebalance")
   public void deleteFromResourceIdealState() throws Exception {
     String zkPath = PropertyPathBuilder.idealState(CLUSTER_NAME, RESOURCE_NAME);
     ZNRecord record = new ZNRecord(RESOURCE_NAME);
@@ -470,6 +493,98 @@
     System.out.println("End test :" + TestHelper.getTestMethodName());
   }
 
+  @Test(dependsOnMethods = "deleteFromResourceIdealState")
+  public void testAddResourceWithWeight() throws IOException {
+    // Test case 1: Add a valid resource with valid weights
+    // Create a resource with IdealState and ResourceConfig
+    String wagedResourceName = "newWagedResource";
+
+    // Create an IdealState on full-auto with 1 partition
+    IdealState idealState = new IdealState(wagedResourceName);
+    idealState.getRecord().getSimpleFields().putAll(_gSetupTool.getClusterManagementTool()
+        .getResourceIdealState(CLUSTER_NAME, RESOURCE_NAME).getRecord().getSimpleFields());
+    idealState.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);
+    idealState.setRebalancerClassName(WagedRebalancer.class.getName());
+    idealState.setNumPartitions(1); // 1 partition for convenience of testing
+
+    // Create a ResourceConfig with capacities of 100 for both FOO and BAR
+    ResourceConfig resourceConfig = new ResourceConfig(wagedResourceName);
+    Map<String, Map<String, Integer>> partitionCapacityMap = new HashMap<>();
+    Map<String, Integer> partitionCapacity = ImmutableMap.of("FOO", 100, "BAR", 100);
+    partitionCapacityMap.put(wagedResourceName + "_0", partitionCapacity);
+    // Also add a default key
+    partitionCapacityMap.put(ResourceConfig.DEFAULT_PARTITION_KEY, partitionCapacity);
+    resourceConfig.setPartitionCapacityMap(partitionCapacityMap);
+
+    // Put both IdealState and ResourceConfig into a map as required
+    Map<String, ZNRecord> inputMap = ImmutableMap.of(
+        ResourceAccessor.ResourceProperties.idealState.name(), idealState.getRecord(),
+        ResourceAccessor.ResourceProperties.resourceConfig.name(), resourceConfig.getRecord());
+
+    // Create an entity using the inputMap
+    Entity entity =
+        Entity.entity(OBJECT_MAPPER.writeValueAsString(inputMap), MediaType.APPLICATION_JSON_TYPE);
+
+    // Make an HTTP call to the REST endpoint
+    put("clusters/" + CLUSTER_NAME + "/resources/" + wagedResourceName,
+        ImmutableMap.of("command", "addWagedResource"), entity, Response.Status.OK.getStatusCode());
+
+    // Test case 2: Add a resource with invalid weights
+    String invalidResourceName = "invalidWagedResource";
+    ResourceConfig invalidWeightResourceConfig = new ResourceConfig(invalidResourceName);
+    IdealState invalidWeightIdealState = new IdealState(invalidResourceName);
+
+    Map<String, ZNRecord> invalidInputMap = ImmutableMap.of(
+        ResourceAccessor.ResourceProperties.idealState.name(), invalidWeightIdealState.getRecord(),
+        ResourceAccessor.ResourceProperties.resourceConfig.name(),
+        invalidWeightResourceConfig.getRecord());
+
+    // Create an entity using invalidInputMap
+    entity = Entity.entity(OBJECT_MAPPER.writeValueAsString(invalidInputMap),
+        MediaType.APPLICATION_JSON_TYPE);
+
+    // Make an HTTP call to the REST endpoint
+    put("clusters/" + CLUSTER_NAME + "/resources/" + invalidResourceName,
+        ImmutableMap.of("command", "addWagedResource"), entity,
+        Response.Status.BAD_REQUEST.getStatusCode());
+  }
+
+  @Test(dependsOnMethods = "testAddResourceWithWeight")
+  public void testValidateResource() throws IOException {
+    // Define weight keys in ClusterConfig
+    ClusterConfig clusterConfig = _configAccessor.getClusterConfig(CLUSTER_NAME);
+    clusterConfig.setInstanceCapacityKeys(Arrays.asList("FOO", "BAR"));
+    _configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
+
+    // Remove all weight configs in InstanceConfig for testing
+    for (String instance : _instancesMap.get(CLUSTER_NAME)) {
+      InstanceConfig instanceConfig = _configAccessor.getInstanceConfig(CLUSTER_NAME, instance);
+      instanceConfig.setInstanceCapacityMap(Collections.emptyMap());
+      _configAccessor.setInstanceConfig(CLUSTER_NAME, instance, instanceConfig);
+    }
+
+    // Validate the resource added in testAddResourceWithWeight()
+    String resourceToValidate = "newWagedResource";
+    // This should fail because none of the instances have weight configured
+    get("clusters/" + CLUSTER_NAME + "/resources/" + resourceToValidate,
+        ImmutableMap.of("command", "validateWeight"), Response.Status.BAD_REQUEST.getStatusCode(),
+        true);
+
+    // Add back weight configurations to all instance configs
+    Map<String, Integer> instanceCapacityMap = ImmutableMap.of("FOO", 1000, "BAR", 1000);
+    for (String instance : _instancesMap.get(CLUSTER_NAME)) {
+      InstanceConfig instanceConfig = _configAccessor.getInstanceConfig(CLUSTER_NAME, instance);
+      instanceConfig.setInstanceCapacityMap(instanceCapacityMap);
+      _configAccessor.setInstanceConfig(CLUSTER_NAME, instance, instanceConfig);
+    }
+
+    // Now try validating again - it should go through and return a 200
+    String body = get("clusters/" + CLUSTER_NAME + "/resources/" + resourceToValidate,
+        ImmutableMap.of("command", "validateWeight"), Response.Status.OK.getStatusCode(), true);
+    JsonNode node = OBJECT_MAPPER.readTree(body);
+    Assert.assertEquals(node.get(resourceToValidate).toString(), "true");
+  }
+
   /**
    * Creates a setup where the health API can be tested.
    * @param clusterName
diff --git a/helix-rest/src/test/java/org/apache/helix/rest/server/util/JerseyUriRequestBuilder.java b/helix-rest/src/test/java/org/apache/helix/rest/server/util/JerseyUriRequestBuilder.java
index 359999e..4552240 100644
--- a/helix-rest/src/test/java/org/apache/helix/rest/server/util/JerseyUriRequestBuilder.java
+++ b/helix-rest/src/test/java/org/apache/helix/rest/server/util/JerseyUriRequestBuilder.java
@@ -76,7 +76,8 @@
     Assert.assertEquals(response.getStatus(), _expectedStatusCode);
 
     // NOT_FOUND will throw text based html
-    if (_expectedStatusCode != Response.Status.NOT_FOUND.getStatusCode()) {
+    if (_expectedStatusCode != Response.Status.NOT_FOUND.getStatusCode()
+        && _expectedStatusCode != Response.Status.BAD_REQUEST.getStatusCode()) {
       Assert.assertEquals(response.getMediaType().getType(), "application");
     } else {
       Assert.assertEquals(response.getMediaType().getType(), "text");