You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2019/12/09 18:36:34 UTC

[kudu-CR] client: optimize destruction of WriteRpc

Hello Alexey Serbin, Andrew Wong,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/14868

to review the following change.


Change subject: client: optimize destruction of WriteRpc
......................................................................

client: optimize destruction of WriteRpc

When writing batches with lots of operations, the WriteRpc destructor
ends up cache-miss bound, since the various InFlightOp and WriteOps are
strewn all about memory. This adds some prefetching which sped things up
noticeably (~37%) in a benchmark which ends up bound by the reactor thread on
the client side.

  $ perf stat ./build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8

Before:
  Generator report
    time total  : 51403.6 ms
    time per row: 0.000642545 ms
  Dropping auto-created table 'default.loadgen_auto_d289807fc12a4b1c861f79b19af9ec8e'

   Performance counter stats for './build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8':

          180,585.24 msec task-clock                #    3.508 CPUs utilized
              25,373      context-switches          #    0.141 K/sec
               1,648      cpu-migrations            #    0.009 K/sec
              50,927      page-faults               #    0.282 K/sec
     726,022,544,856      cycles                    #    4.020 GHz                      (83.33%)
      71,782,315,500      stalled-cycles-frontend   #    9.89% frontend cycles idle     (83.36%)
     412,273,652,207      stalled-cycles-backend    #   56.79% backend cycles idle      (83.29%)
     408,271,477,858      instructions              #    0.56  insn per cycle
                                                    #    1.01  stalled cycles per insn  (83.35%)
      75,750,045,948      branches                  #  419.470 M/sec                    (83.33%)
         296,247,270      branch-misses             #    0.39% of all branches          (83.34%)

        51.475433628 seconds time elapsed

       178.590913000 seconds user
         1.935099000 seconds sys

After:
  Generator report
    time total  : 37293.8 ms
    time per row: 0.000466172 ms
  Dropping auto-created table 'default.loadgen_auto_ece2f41beef94a9fa032c77899f7e61c'

   Performance counter stats for './build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8':

          189,125.49 msec task-clock                #    5.060 CPUs utilized
              29,363      context-switches          #    0.155 K/sec
               2,043      cpu-migrations            #    0.011 K/sec
              48,405      page-faults               #    0.256 K/sec
     772,496,448,279      cycles                    #    4.085 GHz                      (83.33%)
     129,999,474,226      stalled-cycles-frontend   #   16.83% frontend cycles idle     (83.36%)
     300,049,388,250      stalled-cycles-backend    #   38.84% backend cycles idle      (83.30%)
     414,415,517,571      instructions              #    0.54  insn per cycle
                                                    #    0.72  stalled cycles per insn  (83.32%)
      76,829,647,882      branches                  #  406.236 M/sec                    (83.34%)
         352,749,453      branch-misses             #    0.46% of all branches          (83.35%)

        37.376785122 seconds time elapsed

       186.834651000 seconds user
         2.143945000 seconds sys

Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
---
M src/kudu/client/batcher.cc
M src/kudu/common/partial_row.h
2 files changed, 75 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/14868/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Gerrit-Change-Number: 14868
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>

[kudu-CR] client: optimize destruction of WriteRpc

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/14868 )

Change subject: client: optimize destruction of WriteRpc
......................................................................


Patch Set 2:

> Patch Set 2: Code-Review+2
> 
> > not a race, I just didn't handle the failure case where the write_op() had been set to null prior to WriteRpc destruction, I think. fixing
> 
> Can you remind me how that can happen?

In the MarkInFlightOpFailed() code path, we release the op from an InFlightOp


-- 
To view, visit http://gerrit.cloudera.org:8080/14868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Gerrit-Change-Number: 14868
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 09 Dec 2019 21:14:58 +0000
Gerrit-HasComments: No

[kudu-CR] client: optimize destruction of WriteRpc

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has removed a vote on this change.

Change subject: client: optimize destruction of WriteRpc
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/14868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Gerrit-Change-Number: 14868
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] client: optimize destruction of WriteRpc

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Andrew Wong, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14868

to look at the new patch set (#2).

Change subject: client: optimize destruction of WriteRpc
......................................................................

client: optimize destruction of WriteRpc

When writing batches with lots of operations, the WriteRpc destructor
ends up cache-miss bound, since the various InFlightOp and WriteOps are
strewn all about memory. This adds some prefetching which sped things up
noticeably (~37%) in a benchmark which ends up bound by the reactor thread on
the client side.

  $ perf stat ./build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8

Before:
  Generator report
    time total  : 51403.6 ms
    time per row: 0.000642545 ms
  Dropping auto-created table 'default.loadgen_auto_d289807fc12a4b1c861f79b19af9ec8e'

   Performance counter stats for './build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8':

          180,585.24 msec task-clock                #    3.508 CPUs utilized
              25,373      context-switches          #    0.141 K/sec
               1,648      cpu-migrations            #    0.009 K/sec
              50,927      page-faults               #    0.282 K/sec
     726,022,544,856      cycles                    #    4.020 GHz                      (83.33%)
      71,782,315,500      stalled-cycles-frontend   #    9.89% frontend cycles idle     (83.36%)
     412,273,652,207      stalled-cycles-backend    #   56.79% backend cycles idle      (83.29%)
     408,271,477,858      instructions              #    0.56  insn per cycle
                                                    #    1.01  stalled cycles per insn  (83.35%)
      75,750,045,948      branches                  #  419.470 M/sec                    (83.33%)
         296,247,270      branch-misses             #    0.39% of all branches          (83.34%)

        51.475433628 seconds time elapsed

       178.590913000 seconds user
         1.935099000 seconds sys

After:
  Generator report
    time total  : 37293.8 ms
    time per row: 0.000466172 ms
  Dropping auto-created table 'default.loadgen_auto_ece2f41beef94a9fa032c77899f7e61c'

   Performance counter stats for './build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8':

          189,125.49 msec task-clock                #    5.060 CPUs utilized
              29,363      context-switches          #    0.155 K/sec
               2,043      cpu-migrations            #    0.011 K/sec
              48,405      page-faults               #    0.256 K/sec
     772,496,448,279      cycles                    #    4.085 GHz                      (83.33%)
     129,999,474,226      stalled-cycles-frontend   #   16.83% frontend cycles idle     (83.36%)
     300,049,388,250      stalled-cycles-backend    #   38.84% backend cycles idle      (83.30%)
     414,415,517,571      instructions              #    0.54  insn per cycle
                                                    #    0.72  stalled cycles per insn  (83.32%)
      76,829,647,882      branches                  #  406.236 M/sec                    (83.34%)
         352,749,453      branch-misses             #    0.46% of all branches          (83.35%)

        37.376785122 seconds time elapsed

       186.834651000 seconds user
         2.143945000 seconds sys

Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
---
M src/kudu/client/batcher.cc
M src/kudu/common/partial_row.h
2 files changed, 80 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/14868/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Gerrit-Change-Number: 14868
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] client: optimize destruction of WriteRpc

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14868 )

Change subject: client: optimize destruction of WriteRpc
......................................................................

client: optimize destruction of WriteRpc

When writing batches with lots of operations, the WriteRpc destructor
ends up cache-miss bound, since the various InFlightOp and WriteOps are
strewn all about memory. This adds some prefetching which sped things up
noticeably (~37%) in a benchmark which ends up bound by the reactor thread on
the client side.

  $ perf stat ./build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8

Before:
  Generator report
    time total  : 51403.6 ms
    time per row: 0.000642545 ms
  Dropping auto-created table 'default.loadgen_auto_d289807fc12a4b1c861f79b19af9ec8e'

   Performance counter stats for './build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8':

          180,585.24 msec task-clock                #    3.508 CPUs utilized
              25,373      context-switches          #    0.141 K/sec
               1,648      cpu-migrations            #    0.009 K/sec
              50,927      page-faults               #    0.282 K/sec
     726,022,544,856      cycles                    #    4.020 GHz                      (83.33%)
      71,782,315,500      stalled-cycles-frontend   #    9.89% frontend cycles idle     (83.36%)
     412,273,652,207      stalled-cycles-backend    #   56.79% backend cycles idle      (83.29%)
     408,271,477,858      instructions              #    0.56  insn per cycle
                                                    #    1.01  stalled cycles per insn  (83.35%)
      75,750,045,948      branches                  #  419.470 M/sec                    (83.33%)
         296,247,270      branch-misses             #    0.39% of all branches          (83.34%)

        51.475433628 seconds time elapsed

       178.590913000 seconds user
         1.935099000 seconds sys

After:
  Generator report
    time total  : 37293.8 ms
    time per row: 0.000466172 ms
  Dropping auto-created table 'default.loadgen_auto_ece2f41beef94a9fa032c77899f7e61c'

   Performance counter stats for './build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8':

          189,125.49 msec task-clock                #    5.060 CPUs utilized
              29,363      context-switches          #    0.155 K/sec
               2,043      cpu-migrations            #    0.011 K/sec
              48,405      page-faults               #    0.256 K/sec
     772,496,448,279      cycles                    #    4.085 GHz                      (83.33%)
     129,999,474,226      stalled-cycles-frontend   #   16.83% frontend cycles idle     (83.36%)
     300,049,388,250      stalled-cycles-backend    #   38.84% backend cycles idle      (83.30%)
     414,415,517,571      instructions              #    0.54  insn per cycle
                                                    #    0.72  stalled cycles per insn  (83.32%)
      76,829,647,882      branches                  #  406.236 M/sec                    (83.34%)
         352,749,453      branch-misses             #    0.46% of all branches          (83.35%)

        37.376785122 seconds time elapsed

       186.834651000 seconds user
         2.143945000 seconds sys

Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Reviewed-on: http://gerrit.cloudera.org:8080/14868
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Todd Lipcon <to...@apache.org>
---
M src/kudu/client/batcher.cc
M src/kudu/common/partial_row.h
2 files changed, 80 insertions(+), 3 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Todd Lipcon: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/14868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Gerrit-Change-Number: 14868
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] client: optimize destruction of WriteRpc

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/14868 )

Change subject: client: optimize destruction of WriteRpc
......................................................................


Patch Set 2: Verified+1

Strange python packaging issue (unable to find 'wheel')


-- 
To view, visit http://gerrit.cloudera.org:8080/14868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Gerrit-Change-Number: 14868
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 09 Dec 2019 21:40:50 +0000
Gerrit-HasComments: No

[kudu-CR] client: optimize destruction of WriteRpc

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14868 )

Change subject: client: optimize destruction of WriteRpc
......................................................................


Patch Set 1:

(1 comment)

Test failures suggest you've unearthed an interesting race.

http://gerrit.cloudera.org:8080/#/c/14868/1/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

http://gerrit.cloudera.org:8080/#/c/14868/1/src/kudu/client/batcher.cc@371
PS1, Line 371:  
Missing a period here.



-- 
To view, visit http://gerrit.cloudera.org:8080/14868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Gerrit-Change-Number: 14868
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 09 Dec 2019 19:50:29 +0000
Gerrit-HasComments: Yes

[kudu-CR] client: optimize destruction of WriteRpc

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14868 )

Change subject: client: optimize destruction of WriteRpc
......................................................................


Patch Set 2: Code-Review+2

> not a race, I just didn't handle the failure case where the write_op() had been set to null prior to WriteRpc destruction, I think. fixing

Can you remind me how that can happen?


-- 
To view, visit http://gerrit.cloudera.org:8080/14868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Gerrit-Change-Number: 14868
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 09 Dec 2019 20:06:00 +0000
Gerrit-HasComments: No

[kudu-CR] client: optimize destruction of WriteRpc

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/14868 )

Change subject: client: optimize destruction of WriteRpc
......................................................................


Patch Set 1:

(1 comment)

not a race, I just didn't handle the failure case where the write_op() had been set to null prior to WriteRpc destruction, I think. fixing

http://gerrit.cloudera.org:8080/#/c/14868/1/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

http://gerrit.cloudera.org:8080/#/c/14868/1/src/kudu/client/batcher.cc@371
PS1, Line 371:  
> Missing a period here.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Gerrit-Change-Number: 14868
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 09 Dec 2019 19:55:25 +0000
Gerrit-HasComments: Yes