You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2019/11/01 00:07:19 UTC

[kudu-CR] KUDU-2977 sharding block map to speed up tserver startup

Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14555 )

Change subject: KUDU-2977 sharding block map to speed up tserver startup
......................................................................


Patch Set 2:

(7 comments)

Thanks for the detailed perf testing.

Could you summarize your findings in the commit description? You don't need to provide all of the numbers, but just indicate, given a particular choice X, which option seemed best from the data you collected.

http://gerrit.cloudera.org:8080/#/c/14555/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14555/1//COMMIT_MSG@11
PS1, Line 11: 
> Yea, I ran the benchmark multiple times, pretty stable.
Could you summarize this finding and add it to the commit message?


http://gerrit.cloudera.org:8080/#/c/14555/2/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

http://gerrit.cloudera.org:8080/#/c/14555/2/src/kudu/fs/log_block_manager.h@409
PS2, Line 409:   // Block IDs container used to prevent collisions when creating new anonymous blocks.
Nit: the word "container" is already used elsewhere in the log block manager, so it's confusing to see it here too.

Perhaps we can call these shards? So the struct could be called ManagedBlockShard, and managed_blocks_ could be managed_block_shards_?


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.h@410
PS1, Line 410:   struct ManagedBlocks {
> unique_ptr is OK, Done.
Why do we need the indirection provided by a unique_ptr? Couldn't we just declare the members directly in the struct?


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc@1861
PS1, Line 1861: static const uint64_t kBlockMapMask = kBlockMapChunk - 1;
> I also ran StartupBenchmark on 2 servers with different kBlockMapChunk valu
What was the significance of running on two servers? Different hardware?

Also, could you summarize this and add it to the commit message?


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc@2288
PS1, Line 2288:     Status s = RemoveLogBlock(block_id, &lb);
> I did some benchmark for RemoveLogBlocks(in LogBlockManagerTest.StartupBenc
Summarize and add to commit message?


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc@2612
PS1, Line 2612:                    << " which already is alive from another container when "
> If we lock once before per ith shard's sub-map insertion, concurrency will 
Same.


http://gerrit.cloudera.org:8080/#/c/14555/2/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14555/2/src/kudu/fs/log_block_manager.cc@1916
PS2, Line 1916: LogBlockManager::~LogBlockManager() {
When doing something on each shard, could you iterate on the number of elements in managed_blocks_? Something like:

  for (const auto& mb : managed_blocks_) {
    ...
  }



-- 
To view, visit http://gerrit.cloudera.org:8080/14555
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0d5c13e051a2c1d6cfd1c9ad7db8a3cd195459d
Gerrit-Change-Number: 14555
Gerrit-PatchSet: 2
Gerrit-Owner: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Comment-Date: Fri, 01 Nov 2019 00:07:19 +0000
Gerrit-HasComments: Yes