Multi-Raft support allows a Datanode to participate in multiple Ratis replication groups (pipelines) at the same time, which improves write throughput and makes better use of disk and network resources. This is particularly useful when Datanodes have multiple disks or the network has very high bandwidth.
Early Ozone versions supported only one Raft pipeline per Datanode. This limited each Datanode's capacity to handle concurrent writes of replicated data and left disk and network resources under-utilized. Multi-Raft removes the single-pipeline restriction and improves resource utilization.
SCM can now create overlapping pipelines: each Datanode can join multiple Raft groups, up to a configurable limit. This boosts concurrency and avoids idle nodes and idle disks. Raft logs are stored on separate metadata directories (hdds.container.ratis.datanode.storage.dir) to reduce disk contention, and Ratis handles the concurrent logs on each node.
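The effect of the per-Datanode limit on overlapping pipelines can be illustrated with a toy model. The sketch below is not SCM's placement policy: the function name `build_pipelines`, the greedy least-loaded selection, and the four node names are assumptions made purely for illustration. It only shows why a limit of 1 can leave a node idle while a higher limit lets pipelines overlap.

```python
def build_pipelines(datanodes, limit, factor=3):
    """Greedily form pipelines of `factor` nodes; no node exceeds `limit`.

    Hypothetical toy model, not the algorithm SCM uses.
    """
    load = {dn: 0 for dn in datanodes}
    pipelines = []
    while True:
        # Candidates are nodes that can still join another Raft group,
        # least-loaded first (sorted() is stable, so ties keep input order).
        free = sorted((dn for dn in datanodes if load[dn] < limit),
                      key=lambda dn: load[dn])
        if len(free) < factor:
            return pipelines
        members = free[:factor]
        for dn in members:
            load[dn] += 1
        pipelines.append(tuple(members))


nodes = ["dn1", "dn2", "dn3", "dn4"]
print(build_pipelines(nodes, limit=1))  # [('dn1', 'dn2', 'dn3')]
print(build_pipelines(nodes, limit=2))  # [('dn1', 'dn2', 'dn3'), ('dn4', 'dn1', 'dn2')]
```

With a limit of 1, dn4 never joins a pipeline because only one node is left over. With a limit of 2, dn1 and dn2 each serve two Raft groups and dn4 is put to work.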
hdds.container.ratis.datanode.storage.dir (no default)
ozone.scm.datanode.pipeline.limit (default: 2)
ozone.scm.pipeline.per.metadata.disk (default: 2)

<property>
  <name>hdds.container.ratis.datanode.storage.dir</name>
  <value>/disk1/ratis,/disk2/ratis,/disk3/ratis,/disk4/ratis</value>
</property>
ozone-site.xml:

<property>
  <name>ozone.scm.datanode.pipeline.limit</name>
  <value>0</value>
</property>
<property>
  <name>ozone.scm.pipeline.per.metadata.disk</name>
  <value>2</value>
</property>
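How the two limit properties combine can be sketched as follows. The sketch assumes that a value of 0 for ozone.scm.datanode.pipeline.limit tells SCM to derive the per-Datanode limit from the number of Ratis metadata directories multiplied by ozone.scm.pipeline.per.metadata.disk. That reading should be checked against the property descriptions shipped with your Ozone version. The function name `effective_pipeline_limit` is hypothetical.

```python
def effective_pipeline_limit(pipeline_limit, per_metadata_disk, metadata_dirs):
    """Return the assumed maximum number of pipelines one Datanode may join.

    pipeline_limit    -- value of ozone.scm.datanode.pipeline.limit
    per_metadata_disk -- value of ozone.scm.pipeline.per.metadata.disk
    metadata_dirs     -- directories from hdds.container.ratis.datanode.storage.dir
    """
    if pipeline_limit > 0:
        # An explicit limit is used as-is.
        return pipeline_limit
    # Assumption: 0 means "derive the limit from the metadata disks".
    return per_metadata_disk * len(metadata_dirs)


dirs = "/disk1/ratis,/disk2/ratis,/disk3/ratis,/disk4/ratis".split(",")
print(effective_pipeline_limit(0, 2, dirs))  # 8
print(effective_pipeline_limit(2, 2, dirs))  # 2
```

Under this assumption, the example values shown here (limit 0, two pipelines per metadata disk, four Ratis directories) would let each Datanode join up to eight pipelines.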
ozone admin pipeline list
ozone admin datanode list
Pipelines can be monitored with the ozone admin CLI and the Recon UI.
Set hdds.container.ratis.datanode.storage.dir to paths on multiple distinct physical disks. The Ratis pipelines will be distributed accordingly.
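A quick way to catch Ratis directories that accidentally share a disk is to compare the device IDs reported by the operating system. The sketch below is a rough check only, with hypothetical function names. It relies on `os.stat(...).st_dev`, which identifies the mounted filesystem rather than the physical disk, so two partitions of one disk would still look distinct.

```python
import os
from collections import defaultdict


def group_by_device(dirs):
    """Group existing directories by the device ID of their filesystem."""
    groups = defaultdict(list)
    for d in dirs:
        groups[os.stat(d).st_dev].append(d)
    return dict(groups)


def shared_device_dirs(dirs):
    """Return the groups of directories that live on the same device."""
    return [sorted(g) for g in group_by_device(dirs).values() if len(g) > 1]
```

Calling `shared_device_dirs` on a Datanode with the comma-separated value of hdds.container.ratis.datanode.storage.dir split into a list should return an empty list when every directory is on its own filesystem.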