6b6910870ce2c35bf8b9be9408f44a8cec6b580a KUDU-3001 Multi-thread to load containers in a data directory
When a data directory has many block containers, loading the container
files with a single thread is inefficient. This patch loads them with
multiple threads instead.
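The idea can be sketched as follows. This is a minimal illustration, not the code from this patch: the Container struct, the LoadContainers() function and the load_one callback are hypothetical names, and plain std::thread is used in place of whatever thread pool the block manager really uses. The max_threads parameter plays the role that 'fs_max_thread_count_per_data_dir' is assumed to play for one data directory.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for one block container found in a data directory.
struct Container {
  std::string name;
  bool loaded = false;
};

// Loads every container in `containers` using at most `max_threads` worker
// threads. `load_one` is a caller-supplied callback that does the real work
// for a single container. Returns the number of containers loaded.
std::size_t LoadContainers(std::vector<Container>& containers,
                           std::size_t max_threads,
                           const std::function<void(Container&)>& load_one) {
  assert(max_threads >= 1);
  std::atomic<std::size_t> next{0};  // index of the next unclaimed container
  std::atomic<std::size_t> done{0};  // number of containers loaded so far

  // Each worker repeatedly claims the next unclaimed index until none remain,
  // so no two workers ever touch the same container.
  auto worker = [&]() {
    for (;;) {
      const std::size_t i = next.fetch_add(1);
      if (i >= containers.size()) return;
      load_one(containers[i]);
      done.fetch_add(1);
    }
  };

  // Never start more threads than there are containers to load.
  const std::size_t n = std::min(max_threads, containers.size());
  std::vector<std::thread> threads;
  threads.reserve(n);
  for (std::size_t t = 0; t < n; ++t) threads.emplace_back(worker);
  for (auto& th : threads) th.join();
  return done.load();
}
```

Claiming work through a shared counter keeps the threads evenly busy even when some containers take much longer to load than others.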
We ran some simple benchmarks to verify the improvement. We set
'log_container_max_size' to 1GB to generate more containers, set
'startup_benchmark_data_dir_count_for_testing' to 8 to make sure the
existing concurrent loading of data directories is still effective, and
varied 'fs_max_thread_count_per_data_dir' and
'startup_benchmark_block_count_for_testing'. The results of timing 10
runs of ReopenBlockManager(), in milliseconds, are as follows:
disk type: SSD (times in milliseconds)
                          | new version
Block count | old version | 1 thread | 2 threads | 4 threads | 8 threads | 16 threads | 32 threads
    100,000 |       2,375 |    2,382 |     2,342 |     2,372 |     2,343 |      2,353 |      2,393
  1,000,000 |      24,018 |   23,813 |    22,628 |    22,407 |    22,367 |     22,636 |     23,173
  2,000,000 |      50,163 |   51,120 |    39,726 |    37,589 |    37,671 |     37,501 |     37,710
  4,000,000 |     104,051 |  105,560 |    90,427 |    79,778 |    73,129 |     73,205 |     74,947
  8,000,000 |     214,347 |  216,210 |   199,456 |   159,143 |   157,190 |    158,798 |    157,056
disk type: spinning disk (times in milliseconds)
                          | new version
Block count | old version | 1 thread | 2 threads | 4 threads | 8 threads | 16 threads | 32 threads
    100,000 |       3,207 |    3,347 |     3,345 |     3,279 |     3,237 |      3,263 |      3,221
  1,000,000 |      33,659 |   34,106 |    32,081 |    30,261 |    30,142 |     30,115 |     30,876
  2,000,000 |      68,097 |   74,939 |    56,976 |    51,407 |    50,957 |     56,299 |     58,456
  4,000,000 |     146,503 |  162,389 |   116,956 |   104,435 |    94,905 |    102,606 |    100,526
  8,000,000 |     331,201 |  349,609 |   267,259 |   247,069 |   243,064 |    247,810 |    247,472
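To read the tables as speedups rather than raw times, divide the old-version time by the new-version time for the same block count. The helper below is only an illustrative sketch. Speedup() is a hypothetical name and not part of this patch.

```cpp
#include <cassert>

// Ratio of the old load time to the new load time. Values above 1.0 mean
// the new version is faster. Both times must be in the same unit.
double Speedup(double old_ms, double new_ms) {
  assert(old_ms > 0 && new_ms > 0);
  return old_ms / new_ms;
}
```

With the figures above for 8,000,000 blocks and 8 threads, Speedup(214347, 157190) on SSD and Speedup(331201, 243064) on spinning disk both come out at roughly 1.36. At 100,000 blocks the ratio stays close to 1.0 on both disk types.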
Change-Id: I0721ee4a5a6824db146ba0658e60eec25dd0c65c
Reviewed-on: http://gerrit.cloudera.org:8080/14743
Reviewed-by: Adar Dembo <adar@cloudera.com>
Tested-by: Adar Dembo <adar@cloudera.com>
4 files changed