You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2017/08/24 21:15:45 UTC

[kudu-CR] [rpc] faster generation of KRPC call ID

Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7813

Change subject: [rpc] faster generation of KRPC call ID
......................................................................

[rpc] faster generation of KRPC call ID

Updated the way how the KRPC call identifiers are generated.

To compare, I used the following code (headers are omitted):

old.cc:
-------------------------

int32_t cur_id = 0;
int32_t next_id() {
  if (PREDICT_FALSE(cur_id == std::numeric_limits<int32_t>::max())) {
    cur_id = 0;
  } else {
    ++cur_id;
  }
  return cur_id;
}

int main() {
  srandom(time(nullptr));
  const int u_bound = 800000000 + (random() % 100);
  int n = 0;
  for (int i = 0; i < u_bound; ++i) {
    n = next_id();
  }
  cout << n << endl;
  return 0;
}
-------------------------

new.cc:
-------------------------
int32_t cur_id = 0;
int32_t next_id() {
  ++cur_id;
  return (cur_id & 0x7fffffff);
}

int main() {
  srandom(time(nullptr));
  const int u_bound = 800000000 + (random() % 100);
  int n = 0;
  for (int i = 0; i < u_bound; ++i) {
    n = next_id();
  }
  std::cout << n << std::endl;
  return 0;
}
-------------------------

My ad-hoc measurements show that the new version is faster even if
compiling both with -O3 optimization flag:
  old:
    real  0m0.831s
    user  0m0.822s
    sys   0m0.005s

  new:
    real  0m0.005s
    user  0m0.001s
    sys   0m0.002s

From the 'perf stat' perspective it looks good as well:

-------------------------
 Performance counter stats for './old' (10 runs):

        790.584545 task-clock                #    1.001 CPUs utilized
                 1 context-switches          #    0.002 K/sec
                 1 cpu-migrations            #    0.001 K/sec
               273 page-faults               #    0.345 K/sec
     2,583,562,591 cycles                    #    3.268 GHz
     2,805,308,074 instructions              #    1.09  insns per cycle
       200,917,139 branches                  #  254.137 M/sec
            13,513 branch-misses             #    0.01% of all branches

       0.790023494 seconds time elapsed
-------------------------

-------------------------
 Performance counter stats for './new' (10 runs):

          0.888215 task-clock                #    0.831 CPUs utilized
                 0 context-switches          #    0.000 K/sec
                 0 cpu-migrations            #    0.000 K/sec
               273 page-faults               #    0.307 M/sec
         2,561,042 cycles                    #    2.883 GHz
         2,804,468 instructions              #    1.10  insns per cycle
           480,394 branches                  #  540.853 M/sec
            11,575 branch-misses             #    2.41% of all branches

       0.001068744 seconds time elapsed
-------------------------

Change-Id: I03726343d222bcd241c2c2a5a1670a672f8e5cb6
---
M src/kudu/rpc/connection.h
1 file changed, 7 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/7813/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I03726343d222bcd241c2c2a5a1670a672f8e5cb6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] [rpc] faster generation of KRPC call ID

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: [rpc] faster generation of KRPC call ID
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7813/2//COMMIT_MSG
Commit Message:

PS2, Line 48:   const int u_bound = 800000000 + (random() % 100);
            :   int n = 0;
            :   for (int i = 0; i < u_bound; ++i) {
            :     n = next_id();
            :   }
> I think your microbenchmark is invalid. With -O3 it compiles to this assemb
Yep, I suspected that it's not a good benchmark, but the newer code looked better IMO.


-- 
To view, visit http://gerrit.cloudera.org:8080/7813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I03726343d222bcd241c2c2a5a1670a672f8e5cb6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [rpc] faster generation of KRPC call ID

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: [rpc] faster generation of KRPC call ID
......................................................................


Patch Set 2: Code-Review-1

(1 comment)

don't think this is a worthwhile optimization since this isn't a hot code path

http://gerrit.cloudera.org:8080/#/c/7813/2//COMMIT_MSG
Commit Message:

PS2, Line 48:   const int u_bound = 800000000 + (random() % 100);
            :   int n = 0;
            :   for (int i = 0; i < u_bound; ++i) {
            :     n = next_id();
            :   }
I think your microbenchmark is invalid. With -O3 it compiles to this assembly:

int main() {
  4007a0:       48 83 ec 08             sub    $0x8,%rsp
  srandom(time(nullptr));
  4007a4:       31 ff                   xor    %edi,%edi
  4007a6:       e8 d5 ff ff ff          callq  400780 <ti...@plt>
  4007ab:       89 c7                   mov    %eax,%edi
  4007ad:       e8 8e ff ff ff          callq  400740 <sr...@plt>
  const int u_bound = 800000000 + (random() % 100);
  4007b2:       e8 a9 ff ff ff          callq  400760 <ra...@plt>
  4007b7:       48 ba 0b d7 a3 70 3d    movabs $0xa3d70a3d70a3d70b,%rdx
  4007be:       0a d7 a3 
  4007c1:       48 89 c1                mov    %rax,%rcx
  4007c4:       48 f7 ea                imul   %rdx
  4007c7:       48 89 c8                mov    %rcx,%rax
  4007ca:       48 c1 f8 3f             sar    $0x3f,%rax
  4007ce:       48 01 ca                add    %rcx,%rdx
  4007d1:       48 c1 fa 06             sar    $0x6,%rdx
  4007d5:       48 29 c2                sub    %rax,%rdx
  4007d8:       48 8d 04 92             lea    (%rdx,%rdx,4),%rax


ie it's turning the loop into a single constant time multiplication.


-- 
To view, visit http://gerrit.cloudera.org:8080/7813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I03726343d222bcd241c2c2a5a1670a672f8e5cb6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [rpc] faster generation of KRPC call ID

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: [rpc] faster generation of KRPC call ID
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7813/2//COMMIT_MSG
Commit Message:

PS2, Line 48:   const int u_bound = 800000000 + (random() % 100);
            :   int n = 0;
            :   for (int i = 0; i < u_bound; ++i) {
            :     n = next_id();
            :   }
> Yep, I suspected that it's not a good benchmark, but the newer code looked 
yea, if you compile in clang it turns it into a constant time operation. g++ on my machine isn't quite smart enough but it seems it is on yours, perhaps. Adding the noinline attribute on the function breaks that optimization of course and it goes back to being the same speed.

Either way, I think it's a "if it aint broke dont fix it" scenario IMO


-- 
To view, visit http://gerrit.cloudera.org:8080/7813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I03726343d222bcd241c2c2a5a1670a672f8e5cb6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [rpc] faster generation of KRPC call ID

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: [rpc] faster generation of KRPC call ID
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7813/2//COMMIT_MSG
Commit Message:

PS2, Line 48:   const int u_bound = 800000000 + (random() % 100);
            :   int n = 0;
            :   for (int i = 0; i < u_bound; ++i) {
            :     n = next_id();
            :   }
> yea, if you compile in clang it turns it into a constant time operation. g+
Thank you for looking at this.

Just FYI: even with noinline, clang 3.5 with -O2 gives slightly better results on ve0518:

new:
  real    0m1.485s
  user    0m1.486s
  sys     0m0.000s

old:
  real    0m1.997s
  user    0m1.999s
  sys     0m0.001s

The fact that the function can be easily inlined and get better result is a good sign, IMO.  Especially given the fact we usually use clang++ to compile our binaries.
 
However, it's a minor thing and I agree it's not worth updating if the incentive to do so is low.


-- 
To view, visit http://gerrit.cloudera.org:8080/7813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I03726343d222bcd241c2c2a5a1670a672f8e5cb6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [rpc] faster generation of KRPC call ID

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: [rpc] faster generation of KRPC call ID
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7813/2//COMMIT_MSG
Commit Message:

PS2, Line 48:   const int u_bound = 800000000 + (random() % 100);
            :   int n = 0;
            :   for (int i = 0; i < u_bound; ++i) {
            :     n = next_id();
            :   }
> I think your microbenchmark is invalid. With -O3 it compiles to this assemb
oh wait, I may have misread the assembly output... you can ignore this. But still I'm skeptical of the benchmark here.


-- 
To view, visit http://gerrit.cloudera.org:8080/7813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I03726343d222bcd241c2c2a5a1670a672f8e5cb6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [rpc] faster generation of KRPC call ID

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has abandoned this change.

Change subject: [rpc] faster generation of KRPC call ID
......................................................................


Abandoned

This is not needed: the current implementation seems to be good enough.

-- 
To view, visit http://gerrit.cloudera.org:8080/7813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I03726343d222bcd241c2c2a5a1670a672f8e5cb6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [rpc] faster generation of KRPC call ID

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has uploaded a new patch set (#2).

Change subject: [rpc] faster generation of KRPC call ID
......................................................................

[rpc] faster generation of KRPC call ID

Updated the way how the KRPC call identifiers are generated.

To compare, I used the following code (headers are omitted):

old.cc:
-------------------------

int32_t cur_id = 0;
int32_t next_id() {
  if (PREDICT_FALSE(cur_id == std::numeric_limits<int32_t>::max())) {
    cur_id = 0;
  } else {
    ++cur_id;
  }
  return cur_id;
}

int main() {
  srandom(time(nullptr));
  const int u_bound = 800000000 + (random() % 100);
  int n = 0;
  for (int i = 0; i < u_bound; ++i) {
    n = next_id();
  }
  cout << n << endl;
  return 0;
}
-------------------------

new.cc:
-------------------------
int32_t cur_id = 0;
int32_t next_id() {
  ++cur_id;
  return (cur_id & 0x7fffffff);
}

int main() {
  srandom(time(nullptr));
  const int u_bound = 800000000 + (random() % 100);
  int n = 0;
  for (int i = 0; i < u_bound; ++i) {
    n = next_id();
  }
  std::cout << n << std::endl;
  return 0;
}
-------------------------

My ad-hoc measurements show that the new version is faster even if
compiling both with -O3 optimization flag:
  old:
    real  0m0.831s
    user  0m0.822s
    sys   0m0.005s

  new:
    real  0m0.005s
    user  0m0.001s
    sys   0m0.002s

From the 'perf stat' perspective it looks good as well:

-------------------------
 Performance counter stats for './old' (10 runs):

        790.584545 task-clock                #    1.001 CPUs utilized
                 1 context-switches          #    0.002 K/sec
                 1 cpu-migrations            #    0.001 K/sec
               273 page-faults               #    0.345 K/sec
     2,583,562,591 cycles                    #    3.268 GHz
     2,805,308,074 instructions              #    1.09  insns per cycle
       200,917,139 branches                  #  254.137 M/sec
            13,513 branch-misses             #    0.01% of all branches

       0.790023494 seconds time elapsed
-------------------------

-------------------------
 Performance counter stats for './new' (10 runs):

          0.888215 task-clock                #    0.831 CPUs utilized
                 0 context-switches          #    0.000 K/sec
                 0 cpu-migrations            #    0.000 K/sec
               273 page-faults               #    0.307 M/sec
         2,561,042 cycles                    #    2.883 GHz
         2,804,468 instructions              #    1.10  insns per cycle
           480,394 branches                  #  540.853 M/sec
            11,575 branch-misses             #    2.41% of all branches

       0.001068744 seconds time elapsed
-------------------------

Change-Id: I03726343d222bcd241c2c2a5a1670a672f8e5cb6
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
2 files changed, 9 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/7813/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I03726343d222bcd241c2c2a5a1670a672f8e5cb6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>