You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2020/06/13 04:31:05 UTC

[kudu-CR] [consensus] use move semantics for LogEntryBatchPB

Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16075


Change subject: [consensus] use move semantics for LogEntryBatchPB
......................................................................

[consensus] use move semantics for LogEntryBatchPB

As it turned out, in the context of the Log class and related utilities,
it's possible to get rid of the heap allocation and unique_ptr-wrapping
for LogEntryBatchPB.  With this patch, instances of LogEntryBatchPB are
allocated on the stack and passed around using the move semantics.

Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
---
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/log_reader.cc
M src/kudu/consensus/log_reader.h
M src/kudu/consensus/log_util.cc
M src/kudu/consensus/log_util.h
6 files changed, 54 insertions(+), 56 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/16075/1
-- 
To view, visit http://gerrit.cloudera.org:8080/16075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
Gerrit-Change-Number: 16075
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] [consensus] use move semantics for LogEntryBatchPB

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16075 )

Change subject: [consensus] use move semantics for LogEntryBatchPB
......................................................................

[consensus] use move semantics for LogEntryBatchPB

As it turned out, in the context of the Log class and related utilities,
it's possible to get rid of the heap allocation and unique_ptr-wrapping
for LogEntryBatchPB.  With this patch, instances of LogEntryBatchPB are
allocated on the stack and passed around using the move semantics.

There aren't "low-level unit" tests for the affected code paths and I'm
not sure it makes sense to introduce one.  However, there is scenario
named 'MultiThreadedLogTest.TestAppends' in the "synthetic" mt-log-test.
To assess performance impact of this patch, I ran the following for
RELEASE builds with and without this patch:

  ./bin/mt-log-test --num_writer_threads=1 --num_batches_per_thread=500000 --num_reader_threads=0 --num_ops_per_batch_avg=1 --verify_log=false --log_segment_size_mb=4096 --never_fsync --unlock_unsafe_flags --gtest_filter='MultiThreadedLogTest.TestAppends' --gtest_repeat=100 | \
  grep 'Time spent inserting 500000 batches' | awk '{print $14}' | \
  sed 's/s$//' | awk 'BEGIN { sum = 0 } { sum += $1 } END { print sum }'

With each run producing summary like below:
  Time spent inserting 500000 batches(1 threads, 500000 per-thread): real 7.506s      user 0.000s     sys 0.000s

I got the following results:
  with the patch:
    711.679

  without the patch:
    736.141

So, the performance at least didn't degrade.

Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
Reviewed-on: http://gerrit.cloudera.org:8080/16075
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/log_cache-test.cc
M src/kudu/consensus/log_cache.cc
M src/kudu/consensus/log_cache.h
M src/kudu/consensus/log_reader.cc
M src/kudu/consensus/log_reader.h
M src/kudu/consensus/log_util.cc
M src/kudu/consensus/log_util.h
11 files changed, 70 insertions(+), 71 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/16075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
Gerrit-Change-Number: 16075
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [consensus] use move semantics for LogEntryBatchPB

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16075

to look at the new patch set (#2).

Change subject: [consensus] use move semantics for LogEntryBatchPB
......................................................................

[consensus] use move semantics for LogEntryBatchPB

As it turned out, in the context of the Log class and related utilities,
it's possible to get rid of the heap allocation and unique_ptr-wrapping
for LogEntryBatchPB.  With this patch, instances of LogEntryBatchPB are
allocated on the stack and passed around using the move semantics.

There aren't "low-level unit" tests for the affected code paths and I'm
not sure it makes sense to introduce one.  However, there is scenario
named 'MultiThreadedLogTest.TestAppends' in the "synthetic" mt-log-test.
To assess performance impact of this patch, I ran the following for
RELEASE builds with and without this patch:

  ./bin/mt-log-test --num_writer_threads=1 --num_batches_per_thread=500000 --num_reader_threads=0 --num_ops_per_batch_avg=1 --verify_log=false --log_segment_size_mb=4096 --never_fsync --unlock_unsafe_flags --gtest_filter='MultiThreadedLogTest.TestAppends' --gtest_repeat=100 | \
  grep 'Time spent inserting 500000 batches' | awk '{print $14}' | \
  sed 's/s$//' | awk 'BEGIN { sum = 0 } { sum += $1 } END { print sum }'

With each run producing summary like below:
  Time spent inserting 500000 batches(1 threads, 500000 per-thread): real 7.506s      user 0.000s     sys 0.000s

I got the following results:
  with the patch:
    711.679

  without the patch:
    736.141

So, the performance at least didn't degrade.

Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
---
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/log_cache-test.cc
M src/kudu/consensus/log_cache.cc
M src/kudu/consensus/log_cache.h
M src/kudu/consensus/log_reader.cc
M src/kudu/consensus/log_reader.h
M src/kudu/consensus/log_util.cc
M src/kudu/consensus/log_util.h
11 files changed, 70 insertions(+), 71 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/16075/2
-- 
To view, visit http://gerrit.cloudera.org:8080/16075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
Gerrit-Change-Number: 16075
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [consensus] use move semantics for LogEntryBatchPB

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed a vote on this change.

Change subject: [consensus] use move semantics for LogEntryBatchPB
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/16075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
Gerrit-Change-Number: 16075
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [consensus] use move semantics for LogEntryBatchPB

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16075 )

Change subject: [consensus] use move semantics for LogEntryBatchPB
......................................................................


Patch Set 1:

Would be nice to quantify the perf improvement for this. While heap allocation might be expensive, iiuc moving PBs entails trivial copies on move, which might add up, vs pointer swizzling.


-- 
To view, visit http://gerrit.cloudera.org:8080/16075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
Gerrit-Change-Number: 16075
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 13 Jun 2020 19:40:09 +0000
Gerrit-HasComments: No

[kudu-CR] [consensus] use move semantics for LogEntryBatchPB

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16075 )

Change subject: [consensus] use move semantics for LogEntryBatchPB
......................................................................


Patch Set 1:

> Would be nice to quantify the perf improvement for this. While heap
 > allocation might be expensive, iiuc moving PBs entails trivial
 > copies on move, which might add up, vs pointer swizzling.

I added a blurb on the measured performance impact for this patch in PS2.  We don't have a targeted low-level unit test to assess these kind of thinks for consensus/log.{cc,h}, but I ran mt-log-test as a synthetic benchmark.


-- 
To view, visit http://gerrit.cloudera.org:8080/16075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
Gerrit-Change-Number: 16075
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 15 Jun 2020 03:49:35 +0000
Gerrit-HasComments: No

[kudu-CR] [consensus] use move semantics for LogEntryBatchPB

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16075 )

Change subject: [consensus] use move semantics for LogEntryBatchPB
......................................................................


Patch Set 2:

> Would be nice to quantify the perf improvement for this. While heap
 > allocation might be expensive, iiuc moving PBs entails trivial
 > copies on move, which might add up, vs pointer swizzling.

Keep in mind that moving around std::unique_ptr also involves trivial copying and calling a couple of std::unique_ptr methods, and now this and also creation of std::unique_ptr instances are both gone.


-- 
To view, visit http://gerrit.cloudera.org:8080/16075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
Gerrit-Change-Number: 16075
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 15 Jun 2020 04:03:28 +0000
Gerrit-HasComments: No

[kudu-CR] [consensus] use move semantics for LogEntryBatchPB

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16075 )

Change subject: [consensus] use move semantics for LogEntryBatchPB
......................................................................


Patch Set 2: Code-Review+2

> Patch Set 2:
> 
> > Would be nice to quantify the perf improvement for this. While heap
>  > allocation might be expensive, iiuc moving PBs entails trivial
>  > copies on move, which might add up, vs pointer swizzling.
> 
> Keep in mind that moving around std::unique_ptr also involves trivial copying and calling a couple of std::unique_ptr methods, and now this and also creation of std::unique_ptr instances are both gone.

Thanks for the findings! It seems that the move constructor for PBs works via swapping metadata pointers, which I imagine is quite cheap -- at least comparable to swapping a pointer as we would for std::unique_ptr, as you found.


-- 
To view, visit http://gerrit.cloudera.org:8080/16075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
Gerrit-Change-Number: 16075
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 15 Jun 2020 15:51:22 +0000
Gerrit-HasComments: No

[kudu-CR] [consensus] use move semantics for LogEntryBatchPB

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16075 )

Change subject: [consensus] use move semantics for LogEntryBatchPB
......................................................................


Patch Set 1: Verified+1

(2 comments)

Unrelated failures in a couple of tests:
  * ClientFailoverTServerTimeoutITest.FailoverOnLeaderTimeout: timeout creating table
  * CreateTableITest.TestCreateWhenMajorityOfReplicasFailCreation: failed to start chronyd

http://gerrit.cloudera.org:8080/#/c/16075/1/src/kudu/consensus/log_util.cc
File src/kudu/consensus/log_util.cc:

http://gerrit.cloudera.org:8080/#/c/16075/1/src/kudu/consensus/log_util.cc@708
PS1, Line 708:                                    *offset, header.msg_length));
> warning: 1st function call argument is an uninitialized value [clang-analyz
I'm going to ignore this garbage warning from TidyBot.


http://gerrit.cloudera.org:8080/#/c/16075/1/src/kudu/consensus/log_util.cc@709
PS1, Line 709:   if (header.msg_length == 0) {
> warning: The left operand of '==' is a garbage value [clang-analyzer-core.U
ditto



-- 
To view, visit http://gerrit.cloudera.org:8080/16075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5d3c384dd6f17e8c4c71eec074af9e98827262b
Gerrit-Change-Number: 16075
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 13 Jun 2020 05:56:14 +0000
Gerrit-HasComments: Yes