---
title: Container Balancer
menu:
  main:
    parent: Features
summary: How to use the Container Balancer in Ozone.
---

Overview

The Container Balancer is a tool in Apache Ozone that balances data containers across the cluster. Its primary goal is to ensure an even distribution of data based on disk space usage on datanodes. This helps to prevent some datanodes from becoming full while others remain underutilized.

The balancer operates by moving CLOSED container replicas, which means it doesn’t interfere with active I/O operations. It is designed to work with both regular and Erasure Coded (EC) containers. To maintain cluster stability, the Container Balancer’s startup is delayed after a Storage Container Manager (SCM) failover.
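
To make "even distribution" concrete, the following sketch classifies datanodes against a threshold (10% by default, see the options below). It is an illustration only, not the balancer's actual code. It assumes that cluster utilization is total used space divided by total capacity, and that the threshold is read as percentage points around that average. The function name `classify_datanodes`, the state labels, and the sample figures are hypothetical.

```shell
#!/bin/sh
# Illustration only: classify datanodes by utilization relative to the cluster.
# Input on stdin, one datanode per line: <name> <used_gb> <capacity_gb>
# Assumption: cluster utilization = total used / total capacity, and the
# threshold is applied as percentage points around that value.
classify_datanodes() {
  threshold_pct="$1"
  awk -v t="$threshold_pct" '
    { name[NR] = $1; used[NR] = $2; cap[NR] = $3
      total_used += $2; total_cap += $3 }
    END {
      if (total_cap == 0) exit          # nothing to classify
      avg = 100 * total_used / total_cap
      for (i = 1; i <= NR; i++) {
        u = 100 * used[i] / cap[i]
        if (u > avg + t)      state = "over-utilized"
        else if (u < avg - t) state = "under-utilized"
        else                  state = "within-threshold"
        printf "%s %.1f%% %s\n", name[i], u, state
      }
    }'
}

# Hypothetical three-node cluster, 1000 GB per node, 60% used overall.
printf '%s\n' 'dn1 900 1000' 'dn2 600 1000' 'dn3 300 1000' | classify_datanodes 10
```

With these sample figures the cluster average is 60%, so the sketch prints `dn1 90.0% over-utilized`, `dn2 60.0% within-threshold`, and `dn3 30.0% under-utilized`. Under this reading, dn1 would be a candidate source and dn3 a candidate target.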

Command Line Usage

The Container Balancer is managed through the ozone admin containerbalancer command.

Start

To start the Container Balancer with default settings:

ozone admin containerbalancer start

You can also start the balancer with specific options:

ozone admin containerbalancer start [options]

Options:

| Option | Description |
|---|---|
| -t, --threshold | The percentage deviation from the average utilization of the cluster after which a datanode will be rebalanced. Default is 10%. |
| -i, --iterations | The maximum number of consecutive iterations the balancer will run for. Default is 10. Use -1 for infinite iterations. |
| -d, --maxDatanodesPercentageToInvolvePerIteration | The maximum percentage of healthy, in-service datanodes that can be involved in balancing in one iteration. Default is 20%. |
| -s, --maxSizeToMovePerIterationInGB | The maximum size of data in GB to be moved in one iteration. Default is 500GB. |
| -e, --maxSizeEnteringTargetInGB | The maximum size in GB that can enter a target datanode in one iteration. Default is 26GB. |
| -l, --maxSizeLeavingSourceInGB | The maximum size in GB that can leave a source datanode in one iteration. Default is 26GB. |
| --balancing-iteration-interval-minutes | The interval in minutes between each iteration of the Container Balancer. Default is 70 minutes. |
| --move-timeout-minutes | The time in minutes to allow a single container to move from source to target. Default is 65 minutes. |
| --move-replication-timeout-minutes | The time in minutes to allow a single container's replication from source to target as part of a container move. Default is 50 minutes. |
| --move-network-topology-enable | Whether to consider network topology when selecting a target for a source. Default is false. |
| --include-datanodes | A comma-separated list of datanode hostnames or IP addresses to be included in balancing. |
| --exclude-datanodes | A comma-separated list of datanode hostnames or IP addresses to be excluded from balancing. |
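
As an example of combining several of these options, the sketch below starts the balancer with a tighter threshold, more iterations, and a smaller per-iteration size limit. The values are illustrative, and it assumes the threshold is passed as a plain number without a percent sign. The wrapper function `start_balancer` and the `DRY_RUN` variable are hypothetical helpers, not part of Ozone; by default the wrapper only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical wrapper around "ozone admin containerbalancer start".
# Arguments: <threshold> <iterations> <max size to move per iteration in GB>
# With DRY_RUN=1 (the default here) the command is printed, not executed.
start_balancer() {
  threshold="$1"
  iterations="$2"
  max_size_gb="$3"
  # Only flags listed in the options table are used.
  set -- ozone admin containerbalancer start \
    -t "$threshold" -i "$iterations" -s "$max_size_gb"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Review the command first; rerun with DRY_RUN=0 to actually start the balancer.
start_balancer 5 20 200
```

Run as written, this prints `ozone admin containerbalancer start -t 5 -i 20 -s 200` without contacting the cluster.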

Status

To check the status of the Container Balancer:

ozone admin containerbalancer status

To get a more detailed status, including the history of iterations:

ozone admin containerbalancer status -v --history
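
For longer balancing runs it can be handy to keep snapshots of the detailed status. The sketch below writes the output of the verbose status command to a timestamped file. The function name `snapshot_balancer_status`, the `OZONE_BIN` override, and the file naming scheme are hypothetical; no assumption is made about the format of the status output itself.

```shell
#!/bin/sh
# Hypothetical helper: save the verbose balancer status to a timestamped file.
# Argument: <output directory>. Prints the path of the file it wrote.
# OZONE_BIN can point at a different ozone launcher; it defaults to "ozone".
snapshot_balancer_status() {
  out_dir="$1"
  mkdir -p "$out_dir" || return 1
  out_file="${out_dir}/balancer-status-$(date +%Y%m%dT%H%M%S).log"
  # Capture both stdout and stderr so error messages are kept as well.
  "${OZONE_BIN:-ozone}" admin containerbalancer status -v --history \
    > "$out_file" 2>&1
  rc=$?
  echo "$out_file"
  return "$rc"
}

# Example (run on a host with the ozone CLI configured):
#   snapshot_balancer_status /var/tmp/balancer-status
```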

Stop

To stop the Container Balancer:

ozone admin containerbalancer stop

Configuration

The Container Balancer can also be configured through the ozone-site.xml file.

| Property | Description | Default Value |
|---|---|---|
| hdds.container.balancer.utilization.threshold | A cluster is considered balanced if, for each datanode, the utilization of the datanode differs from the utilization of the cluster by no more than this threshold. | 10% |
| hdds.container.balancer.datanodes.involved.max.percentage.per.iteration | Maximum percentage of healthy, in-service datanodes that can be involved in balancing in one iteration. | 20% |
| hdds.container.balancer.size.moved.max.per.iteration | The maximum size of data that will be moved by Container Balancer in one iteration. | 500GB |
| hdds.container.balancer.size.entering.target.max | The maximum size that can enter a target datanode in each iteration. | 26GB |
| hdds.container.balancer.size.leaving.source.max | The maximum size that can leave a source datanode in each iteration. | 26GB |
| hdds.container.balancer.iterations | The number of iterations that Container Balancer will run for. | 10 |
| hdds.container.balancer.exclude.containers | A comma-separated list of container IDs to exclude from balancing. | "" |
| hdds.container.balancer.move.timeout | The amount of time to allow a single container to move from source to target. | 65m |
| hdds.container.balancer.move.replication.timeout | The amount of time to allow a single container's replication from source to target as part of a container move. | 50m |
| hdds.container.balancer.balancing.iteration.interval | The interval between iterations of Container Balancer. | 70m |
| hdds.container.balancer.include.datanodes | A comma-separated list of datanode hostnames or IP addresses. Only the datanodes specified in this list are balanced. | "" |
| hdds.container.balancer.exclude.datanodes | A comma-separated list of datanode hostnames or IP addresses. The datanodes specified in this list are excluded from balancing. | "" |
| hdds.container.balancer.move.networkTopology.enable | Whether to take network topology into account when selecting a target for a source. | false |
| hdds.container.balancer.trigger.du.before.move.enable | Whether to send a command to all healthy, in-service datanodes to run du immediately before starting a balance iteration. | false |
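
As a sketch of how a few of these properties might look in ozone-site.xml, the fragment below lowers the iteration count and the per-iteration size limit and excludes two datanodes. The values and hostnames are illustrative assumptions, and the value formats (`200GB`, `65m`) simply mirror the defaults listed above.

```xml
<configuration>
  <!-- Run fewer iterations than the default of 10 (illustrative value). -->
  <property>
    <name>hdds.container.balancer.iterations</name>
    <value>5</value>
  </property>
  <!-- Move at most 200GB per iteration instead of 500GB (illustrative value). -->
  <property>
    <name>hdds.container.balancer.size.moved.max.per.iteration</name>
    <value>200GB</value>
  </property>
  <!-- Keep the default move timeout, shown here only to illustrate the format. -->
  <property>
    <name>hdds.container.balancer.move.timeout</name>
    <value>65m</value>
  </property>
  <!-- Hypothetical hostnames to leave out of balancing. -->
  <property>
    <name>hdds.container.balancer.exclude.datanodes</name>
    <value>dn7.example.com,dn8.example.com</value>
  </property>
</configuration>
```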