Posted to issues@kudu.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2019/12/09 04:17:00 UTC
[jira] [Commented] (KUDU-3001) Multi-thread to load containers in a
data directory
[ https://issues.apache.org/jira/browse/KUDU-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991124#comment-16991124 ]
ASF subversion and git services commented on KUDU-3001:
-------------------------------------------------------
Commit 6b6910870ce2c35bf8b9be9408f44a8cec6b580a in kudu's branch refs/heads/master from Yingchun Lai
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=6b69108 ]
KUDU-3001 Multi-thread to load containers in a data directory
When a data directory has many block containers, loading the container
files with a single thread is inefficient. We can speed this up by
loading them with multiple threads.
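To illustrate the idea, here is a minimal sketch of loading the containers
of one data directory with a bounded number of worker threads. It is not
the code from this commit: LoadContainersInParallel, its parameters, and
the load_one callback are hypothetical names used only for illustration,
and max_threads merely plays a role similar to the per-directory thread
limit. The callback is assumed to be safe to call from several threads.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not Kudu's API): calls load_one on every container
// name of one data directory, using at most max_threads worker threads.
// Returns the number of containers for which load_one returned true.
inline std::size_t LoadContainersInParallel(
    const std::vector<std::string>& containers,
    std::size_t max_threads,
    const std::function<bool(const std::string&)>& load_one) {
  assert(max_threads > 0);
  std::atomic<std::size_t> next{0};    // index of the next unclaimed container
  std::atomic<std::size_t> loaded{0};  // count of successful loads

  // Each worker repeatedly claims the next index until none are left.
  auto worker = [&]() {
    for (;;) {
      const std::size_t i = next.fetch_add(1);
      if (i >= containers.size()) return;
      if (load_one(containers[i])) loaded.fetch_add(1);
    }
  };

  // Never start more threads than there are containers.
  const std::size_t n = std::min(max_threads, containers.size());
  std::vector<std::thread> threads;
  threads.reserve(n);
  for (std::size_t t = 0; t < n; ++t) threads.emplace_back(worker);
  for (auto& th : threads) th.join();
  return loaded.load();
}
```

Claiming work through a shared atomic index keeps the threads busy even
when individual containers take very different amounts of time to load.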
We ran some simple benchmarks to verify this. For the benchmarks, we set
'log_container_max_size' to 1GB to generate more containers, set
'startup_benchmark_data_dir_count_for_testing' to 8 to make sure the
existing concurrent loading of data directories was still effective,
and varied 'fs_max_thread_count_per_data_dir' and
'startup_benchmark_block_count_for_testing'. We timed 10 calls to
ReopenBlockManager(); results are in milliseconds. Detailed results
follow:
disk type: SSD
            |             | new version
Block count | old version |  1 thread | 2 threads | 4 threads | 8 threads | 16 threads | 32 threads
    100,000 |       2,375 |     2,382 |     2,342 |     2,372 |     2,343 |      2,353 |      2,393
  1,000,000 |      24,018 |    23,813 |    22,628 |    22,407 |    22,367 |     22,636 |     23,173
  2,000,000 |      50,163 |    51,120 |    39,726 |    37,589 |    37,671 |     37,501 |     37,710
  4,000,000 |     104,051 |   105,560 |    90,427 |    79,778 |    73,129 |     73,205 |     74,947
  8,000,000 |     214,347 |   216,210 |   199,456 |   159,143 |   157,190 |    158,798 |    157,056
disk type: spinning disk
            |             | new version
Block count | old version |  1 thread | 2 threads | 4 threads | 8 threads | 16 threads | 32 threads
    100,000 |       3,207 |     3,347 |     3,345 |     3,279 |     3,237 |      3,263 |      3,221
  1,000,000 |      33,659 |    34,106 |    32,081 |    30,261 |    30,142 |     30,115 |     30,876
  2,000,000 |      68,097 |    74,939 |    56,976 |    51,407 |    50,957 |     56,299 |     58,456
  4,000,000 |     146,503 |   162,389 |   116,956 |   104,435 |    94,905 |    102,606 |    100,526
  8,000,000 |     331,201 |   349,609 |   267,259 |   247,069 |   243,064 |    247,810 |    247,472
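As a reading aid, the speedup implied by the tables above can be computed
as old time divided by new time. The sketch below does this for the
8,000,000-block rows, comparing the old version with the 8-thread runs.
The function names are hypothetical and the timings are copied from the
tables. Both disk types come out at roughly 1.36x.

```cpp
#include <cassert>

// Speedup of a new timing relative to an old one (both in milliseconds).
inline double Speedup(double old_ms, double new_ms) {
  assert(new_ms > 0);
  return old_ms / new_ms;
}

// Timings from the 8,000,000-block rows: old version vs. 8 threads.
inline double SsdSpeedup8M() { return Speedup(214347.0, 157190.0); }
inline double SpinningSpeedup8M() { return Speedup(331201.0, 243064.0); }
```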
Change-Id: I0721ee4a5a6824db146ba0658e60eec25dd0c65c
Reviewed-on: http://gerrit.cloudera.org:8080/14743
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Adar Dembo <ad...@cloudera.com>
> Multi-thread to load containers in a data directory
> ---------------------------------------------------
>
> Key: KUDU-3001
> URL: https://issues.apache.org/jira/browse/KUDU-3001
> Project: Kudu
> Issue Type: Improvement
> Reporter: Yingchun Lai
> Assignee: Yingchun Lai
> Priority: Major
> Fix For: 1.12.0
>
>
> As [~tlipcon] mentioned in https://issues.apache.org/jira/browse/KUDU-2014, we can improve tserver startup time by loading the containers in a data directory with multiple threads.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)