Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2020/06/10 16:46:30 UTC

[kudu-CR] [tests] add same tablet concurrent writes test

Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16060 )


Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................

[tests] add same_tablet_concurrent_writes test

Added SameTabletConcurrentWritesTest.InsertsOnly test scenario.
The scenario exercises concurrent inserts from multiple clients
into the same tablet.

The purpose of the newly introduced test is to check for lock contention
if running multiple write operations on the same tablet concurrently.
There is an interaction between threads pushing Raft consensus updates
and RPC worker threads serving write requests, and the test pinpoints
the contention over the lock primitives in RaftConsensus.

To validate the results reported by the test, I verified that RPC queue
overflows happen a bit less often if using the lock-free implementation
of RaftConsensus::CheckLeadershipAndBindTerm() with patch posted here:
  https://gerrit.cloudera.org/#/c/16034/

The rate of successful write operations was the same in both cases.
However, the number of messages from spinlock_profiling.cc like
  Waited 190 ms on lock 0x237acd4 ...
dropped significantly after applying patch 16034 on top.  Less contention
is good news because the freed CPU resources might be
spent on something useful.
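
One way to quantify "the number of messages dropped" is to count the lock-wait
messages in a test log and sum the reported wait times. The sketch below is
only an illustration: the regular expression is modeled on the single sample
line quoted above, the rest of a real log line is not shown in this thread, and
the function name and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the one sample line quoted in the message; anything
# before or after that fragment in a real log line is unknown here, so the
# pattern anchors only on the quoted wording (assumption).
WAIT_RE = re.compile(r"Waited (\d+) ms on lock (0x[0-9a-fA-F]+)")


def summarize_lock_waits(lines):
    """Return (message_count, total_wait_ms, per_lock_message_counts)."""
    count = 0
    total_ms = 0
    per_lock = Counter()
    for line in lines:
        match = WAIT_RE.search(line)
        if match is None:
            continue  # not a lock-wait message
        count += 1
        total_ms += int(match.group(1))
        per_lock[match.group(2)] += 1
    return count, total_ms, per_lock


# Hypothetical log excerpt: only the wording of the first line comes from the
# message above; the remaining lines are made up for illustration.
sample_log = [
    "Waited 190 ms on lock 0x237acd4 ...",
    "some unrelated log line",
    "Waited 15 ms on lock 0x237acd4 ...",
    "Waited 7 ms on lock 0x9f00c10 ...",
]

count, total_ms, per_lock = summarize_lock_waits(sample_log)
print(count, total_ms, per_lock.most_common(1))
```

Running the same summary over a log captured with and without the patch would
give a before/after count rather than an impression of "fewer messages".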

Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
2 files changed, 244 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/16060/1
-- 
To view, visit http://gerrit.cloudera.org:8080/16060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16060/3/src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
File src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16060/3/src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc@214
PS3, Line 214:   FLAGS_max_num_columns = kNumColumns;
Would it make sense to use a larger value instead of more columns given 1000 columns isn't representative of any real workload?




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 29 Jun 2020 15:24:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Patch Set 3: Verified+1

unrelated test failure due to SSL issue



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Sat, 20 Jun 2020 05:06:00 +0000
Gerrit-HasComments: No

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16060

to look at the new patch set (#5).

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................

[tests] add same_tablet_concurrent_writes test

Added SameTabletConcurrentWritesTest.InsertsOnly test scenario.
The scenario exercises concurrent inserts from multiple clients
into the same tablet.

The purpose of the newly introduced test is to check for lock contention
if running multiple write operations on the same tablet concurrently.
There is an interaction between threads pushing Raft consensus updates
and RPC worker threads serving write requests, and the test pinpoints
the contention over the lock primitives used in RaftConsensus.

To validate the results reported by the test, I verified that RPC queue
overflows happen a bit less often if using the lock-free implementation
of RaftConsensus::CheckLeadershipAndBindTerm() with patch posted here:
  https://gerrit.cloudera.org/#/c/16034/

The rate of successful write operations was the same in both cases,
and that's expected since the bottleneck is the WAL (where additional
static delays are introduced for each fsync).  However, the number of
messages from spinlock_profiling.cc like
  Waited 190 ms on lock 0x237acd4 ...
dropped significantly after applying patch 16034 on top.  Less contention
is good news because the freed CPU resources might be
spent on something useful, like handling another RPC request from the
queue (which hasn't overflowed and is able to accommodate extra requests).

Below are snippets of various measurements done for this new test
before and after applying patch from 16034 review item on top.

========================================================================

Without 16034 patch:
 Performance counter stats for './bin/same_tablet_concurrent_writes-itest':

       7449.425640 task-clock                #    0.504 CPUs utilized
            47,882 context-switches          #    0.006 M/sec
             3,454 cpu-migrations            #    0.464 K/sec
            28,592 page-faults               #    0.004 M/sec
    10,211,586,270 cycles                    #    1.371 GHz
    10,647,306,766 instructions              #    1.04  insns per cycle
     1,861,229,149 branches                  #  249.849 M/sec
        25,370,590 branch-misses             #    1.36% of all branches

      14.767762000 seconds time elapsed

With 16034 patch:
 Performance counter stats for './bin/same_tablet_concurrent_writes-itest':

       5646.543970 task-clock                #    0.394 CPUs utilized
            39,194 context-switches          #    0.007 M/sec
             3,715 cpu-migrations            #    0.658 K/sec
            30,090 page-faults               #    0.005 M/sec
     8,543,832,082 cycles                    #    1.513 GHz
     9,301,870,856 instructions              #    1.09  insns per cycle
     1,590,579,357 branches                  #  281.691 M/sec
        18,563,203 branch-misses             #    1.17% of all branches

      14.339274728 seconds time elapsed

========================================================================
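
The two `perf stat` reports above can be compared mechanically. The following
is a minimal sketch, assuming the counter lines keep the "<value> <event>"
layout shown above; the function names are hypothetical and only a few counter
lines are reproduced in the sample strings.

```python
import re

# Matches "<value> <event-name>" at the start of a perf stat counter line,
# e.g. "    10,211,586,270 cycles    #    1.371 GHz".
COUNTER_RE = re.compile(r"^\s*([\d,]+(?:\.\d+)?)\s+([A-Za-z][\w-]*)")


def parse_perf_stat(text):
    """Map event name -> numeric value for every counter line in the text."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            value, name = match.groups()
            counters[name] = float(value.replace(",", ""))
    return counters


def relative_change(before, after):
    """Fractional change from before to after (negative means a reduction)."""
    return (after - before) / before


# Counter lines copied from the two reports above.
without_patch = parse_perf_stat("""
       7449.425640 task-clock                #    0.504 CPUs utilized
    10,211,586,270 cycles                    #    1.371 GHz
    10,647,306,766 instructions              #    1.04  insns per cycle
""")
with_patch = parse_perf_stat("""
       5646.543970 task-clock                #    0.394 CPUs utilized
     8,543,832,082 cycles                    #    1.513 GHz
     9,301,870,856 instructions              #    1.09  insns per cycle
""")

for event in ("task-clock", "cycles", "instructions"):
    change = relative_change(without_patch[event], with_patch[event])
    print(f"{event:>12}: {change:+.1%}")
```

Note that these counters cover the whole test process for a single run each, so
the percentages describe this one pair of runs rather than a general result.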

------------------------------------------------------------------------
                           | Without 16034 patch   |  With 16034 patch
------------------------------------------------------------------------
  write RPC request rate   | 15.8 req/sec          | 16 req/sec
  RPC queue overflows      | 1898                  | 50
  spinlock_contention_time | 22966310              | 9161557
------------------------------------------------------------------------
  rpc_incoming_queue_time  |                       |
                           | Count: 82             | Count: 82
                           | Mean: 199704          | Mean: 1037.87
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 35      |   0%  (min) = 21
                           |  25%        = 5844    |  25%        = 51
                           |  50%  (med) = 196096  |  50%  (med) = 67
                           |  75%        = 388352  |  75%        = 1608
                           |  95%        = 400640  |  95%        = 3334
                           |  99%        = 598016  |  99%        = 9960
                           |  99.9%      = 599552  |  99.9%      = 10064
                           |  99.99%     = 599552  |  99.99%     = 10064
                           |  100% (max) = 600048  |  100% (max) = 10066
------------------------------------------------------------------------
  op_apply_run_time        |                       |
                           | Count: 79             | Count: 80
                           | Mean: 99377.1         | Mean: 80796.3
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 429     |   0%  (min) = 575
                           |  25%        = 940     |  25%        = 828
                           |  50%  (med) = 1336    |  50%  (med) = 1064
                           |  75%        = 200704  |  75%        = 200704
                           |  95%        = 391168  |  95%        = 200704
                           |  99%        = 401408  |  99%        = 200704
                           |  99.9%      = 401408  |  99.9%      = 399360
                           |  99.99%     = 401408  |  99.99%     = 399360
                           |  100% (max) = 401432  |  100% (max) = 399703
------------------------------------------------------------------------
  handler_latency_kudu_tserver_TabletServerService_Write:
                           | Count: 49             | Count: 45
                           | Mean: 3.11688e+06     | Mean: 3.08435e+06
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 611494  |   0%  (min) = 636928
                           |  25%        = 1802240 |  25%        = 2392064
                           |  50%  (med) = 3391488 |  50%  (med) = 3178496
                           |  75%        = 4390912 |  75%        = 3997696
                           |  95%        = 4816896 |  95%        = 4587520
                           |  99%        = 5013504 |  99%        = 4587520
                           |  99.9%      = 5013504 |  99.9%      = 4587520
                           |  99.99%     = 5013504 |  99.99%     = 4587520
                           |  100% (max) = 5023673 |  100% (max) = 4616174
------------------------------------------------------------------------
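
To read the rpc_incoming_queue_time columns as ratios rather than raw numbers,
a short sketch like the one below can help. The values are copied from the
table above; the metric's units are not stated in this thread, so the sketch
only reports unitless factors. The helper name is hypothetical.

```python
# rpc_incoming_queue_time values copied from the table above
# (units as reported by the metric; not stated in this thread).
without_patch = {"50%": 196096, "95%": 400640, "99%": 598016, "max": 600048}
with_patch = {"50%": 67, "95%": 3334, "99%": 9960, "max": 10066}


def improvement_factors(before, after):
    """Return before/after ratio for every percentile present in both runs."""
    return {
        key: before[key] / after[key]
        for key in before
        if key in after and after[key] != 0
    }


factors = improvement_factors(without_patch, with_patch)
for key, factor in factors.items():
    print(f"{key:>4}: {factor:8.1f}x lower with the patch")
```

The histograms hold only 82 samples each, so the higher percentiles are
determined by a handful of requests and should be read with that in mind.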

Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
2 files changed, 335 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/16060/5

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16060

to look at the new patch set (#2).

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................

[tests] add same_tablet_concurrent_writes test

Added SameTabletConcurrentWritesTest.InsertsOnly test scenario.
The scenario exercises concurrent inserts from multiple clients
into the same tablet.

The purpose of the newly introduced test is to check for lock contention
if running multiple write operations on the same tablet concurrently.
There is an interaction between threads pushing Raft consensus updates
and RPC worker threads serving write requests, and the test pinpoints
the contention over the lock primitives in RaftConsensus.

To validate the results reported by the test, I verified that RPC queue
overflows happen a bit less often if using the lock-free implementation
of RaftConsensus::CheckLeadershipAndBindTerm() with patch posted here:
  https://gerrit.cloudera.org/#/c/16034/

The rate of successful write operations was the same in both cases.
However, the number of messages from spinlock_profiling.cc like
  Waited 190 ms on lock 0x237acd4 ...
dropped significantly after applying patch 16034 on top.  Less contention
is good news because the freed CPU resources might be
spent on something useful.

Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
2 files changed, 245 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/16060/2

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16060

to look at the new patch set (#3).

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................

[tests] add same_tablet_concurrent_writes test

Added SameTabletConcurrentWritesTest.InsertsOnly test scenario.
The scenario exercises concurrent inserts from multiple clients
into the same tablet.

The purpose of the newly introduced test is to check for lock contention
if running multiple write operations on the same tablet concurrently.
There is an interaction between threads pushing Raft consensus updates
and RPC worker threads serving write requests, and the test pinpoints
the contention over the lock primitives used in RaftConsensus.

To validate the results reported by the test, I verified that RPC queue
overflows happen a bit less often if using the lock-free implementation
of RaftConsensus::CheckLeadershipAndBindTerm() with patch posted here:
  https://gerrit.cloudera.org/#/c/16034/

The rate of successful write operations was the same in both cases,
and that's expected since the bottleneck is the WAL (where additional
static delays are introduced for each fsync).  However, the number of
messages from spinlock_profiling.cc like
  Waited 190 ms on lock 0x237acd4 ...
dropped significantly after applying patch 16034 on top.  Less contention
is good news because the freed CPU resources might be
spent on something useful, like handling another RPC request from the
queue (which hasn't overflowed and is able to accommodate extra requests).

Below are snippets of various measurements done for this new test
before and after applying patch from 16034 review item on top.

========================================================================

Without 16034 patch:
 Performance counter stats for './bin/same_tablet_concurrent_writes-itest':

       7449.425640 task-clock                #    0.504 CPUs utilized
            47,882 context-switches          #    0.006 M/sec
             3,454 cpu-migrations            #    0.464 K/sec
            28,592 page-faults               #    0.004 M/sec
    10,211,586,270 cycles                    #    1.371 GHz
    10,647,306,766 instructions              #    1.04  insns per cycle
     1,861,229,149 branches                  #  249.849 M/sec
        25,370,590 branch-misses             #    1.36% of all branches

      14.767762000 seconds time elapsed

With 16034 patch:
 Performance counter stats for './bin/same_tablet_concurrent_writes-itest':

       5646.543970 task-clock                #    0.394 CPUs utilized
            39,194 context-switches          #    0.007 M/sec
             3,715 cpu-migrations            #    0.658 K/sec
            30,090 page-faults               #    0.005 M/sec
     8,543,832,082 cycles                    #    1.513 GHz
     9,301,870,856 instructions              #    1.09  insns per cycle
     1,590,579,357 branches                  #  281.691 M/sec
        18,563,203 branch-misses             #    1.17% of all branches

      14.339274728 seconds time elapsed

========================================================================

------------------------------------------------------------------------
                           | Without 16034 patch   |  With 16034 patch
------------------------------------------------------------------------
  write RPC request rate   | 15.8 req/sec          | 16 req/sec
  RPC queue overflows      | 1898                  | 50
  spinlock_contention_time | 22966310              | 9161557
------------------------------------------------------------------------
  rpc_incoming_queue_time  |                       |
                           | Count: 82             | Count: 82
                           | Mean: 199704          | Mean: 1037.87
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 35      |   0%  (min) = 21
                           |  25%        = 5844    |  25%        = 51
                           |  50%  (med) = 196096  |  50%  (med) = 67
                           |  75%        = 388352  |  75%        = 1608
                           |  95%        = 400640  |  95%        = 3334
                           |  99%        = 598016  |  99%        = 9960
                           |  99.9%      = 599552  |  99.9%      = 10064
                           |  99.99%     = 599552  |  99.99%     = 10064
                           |  100% (max) = 600048  |  100% (max) = 10066
------------------------------------------------------------------------
  op_apply_run_time        |                       |
                           | Count: 79             | Count: 80
                           | Mean: 99377.1         | Mean: 80796.3
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 429     |   0%  (min) = 575
                           |  25%        = 940     |  25%        = 828
                           |  50%  (med) = 1336    |  50%  (med) = 1064
                           |  75%        = 200704  |  75%        = 200704
                           |  95%        = 391168  |  95%        = 200704
                           |  99%        = 401408  |  99%        = 200704
                           |  99.9%      = 401408  |  99.9%      = 399360
                           |  99.99%     = 401408  |  99.99%     = 399360
                           |  100% (max) = 401432  |  100% (max) = 399703
------------------------------------------------------------------------
  handler_latency_kudu_tserver_TabletServerService_Write:
                           | Count: 49             | Count: 45
                           | Mean: 3.11688e+06     | Mean: 3.08435e+06
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 611494  |   0%  (min) = 636928
                           |  25%        = 1802240 |  25%        = 2392064
                           |  50%  (med) = 3391488 |  50%  (med) = 3178496
                           |  75%        = 4390912 |  75%        = 3997696
                           |  95%        = 4816896 |  95%        = 4587520
                           |  99%        = 5013504 |  99%        = 4587520
                           |  99.9%      = 5013504 |  99.9%      = 4587520
                           |  99.99%     = 5013504 |  99.99%     = 4587520
                           |  100% (max) = 5023673 |  100% (max) = 4616174
------------------------------------------------------------------------

Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
2 files changed, 316 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/16060/3

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16060/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16060/2//COMMIT_MSG@24
PS2, Line 24: The rates of successful write operations was the same for both cases.
            : However, the number of messages from spinlock_profiling.cc like
            :   Waited 190 ms on lock 0x237acd4 ...
            : dropped significantly after applying patch 16034 on top.  That's a good
            : news to have less contention because the freed CPU resources might be
            : spend on something useful.
can you compare a latency histogram, either from the point of view of the client, or using the existing latency histogram metrics? (eg on the Write RPC or the prepare/apply time histos?)

or if you are claiming a CPU reduction, measure CPU cycles per completed write using perf-stat on a before/after?
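
The per-write normalization suggested here could be computed along these lines.
This is a sketch under stated assumptions: the cycle totals are the ones from
the `perf stat` output posted elsewhere in this thread, and the write counts
use the Count of the TabletServerService Write handler histogram (49 and 45) as
an assumed stand-in for "completed writes", which may not be the right
denominator. The function name is hypothetical.

```python
def cycles_per_write(total_cycles, completed_writes):
    """Average CPU cycles spent per completed write operation."""
    if completed_writes <= 0:
        raise ValueError("completed_writes must be positive")
    return total_cycles / completed_writes


# Cycle totals from the perf stat reports in this thread; the write counts
# are an ASSUMED proxy (Write handler histogram Count), not a measured
# number of completed writes.
before = cycles_per_write(10_211_586_270, 49)
after = cycles_per_write(8_543_832_082, 45)

print(f"without patch: {before:,.0f} cycles/write")
print(f"with patch:    {after:,.0f} cycles/write")
print(f"change:        {(after - before) / before:+.1%}")
```

Because perf stat counts cycles for the whole test process, a cleaner
comparison would take the number of writes reported by the test itself for the
same run.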




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 10 Jun 2020 22:12:08 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Patch Set 4:

> Uploaded patch set 4.

Whoops, it seems I pushed extra changes I used for experiments.  I will update it shortly.



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 09 Jul 2020 19:20:00 +0000
Gerrit-HasComments: No

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................

[tests] add same_tablet_concurrent_writes test

Added SameTabletConcurrentWritesTest.InsertsOnly test scenario.
The scenario exercises concurrent inserts from multiple clients
into the same tablet.

The purpose of the newly introduced test is to check for lock contention
if running multiple write operations on the same tablet concurrently.
There is an interaction between threads pushing Raft consensus updates
and RPC worker threads serving write requests, and the test pinpoints
the contention over the lock primitives used in RaftConsensus.

To validate the results reported by the test, I verified that RPC queue
overflows happen much less often if using the lock-free implementation
of RaftConsensus::CheckLeadershipAndBindTerm() addressed in KUDU-2727
fix (da1c66b61).

The rate of successful write operations was the same in both cases,
and that's expected since the bottleneck is the WAL (where additional
static delays are introduced for each fsync).  However, the number of
messages from spinlock_profiling.cc like
  Waited 190 ms on lock 0x237acd4 ...
dropped significantly after the KUDU-2727 fix (da1c66b61).  Less contention
is good news because the freed CPU resources might be
spent on something useful, like handling another RPC request from the
queue (which hasn't overflowed and is able to accommodate extra requests).

Below are snippets of various measurements done for this new test
before and after applying patch from 16034 review item on top.

========================================================================

Without 16034 patch:
 Performance counter stats for './bin/same_tablet_concurrent_writes-itest':

       7449.425640 task-clock                #    0.504 CPUs utilized
            47,882 context-switches          #    0.006 M/sec
             3,454 cpu-migrations            #    0.464 K/sec
            28,592 page-faults               #    0.004 M/sec
    10,211,586,270 cycles                    #    1.371 GHz
    10,647,306,766 instructions              #    1.04  insns per cycle
     1,861,229,149 branches                  #  249.849 M/sec
        25,370,590 branch-misses             #    1.36% of all branches

      14.767762000 seconds time elapsed

With 16034 patch:
 Performance counter stats for './bin/same_tablet_concurrent_writes-itest':

       5646.543970 task-clock                #    0.394 CPUs utilized
            39,194 context-switches          #    0.007 M/sec
             3,715 cpu-migrations            #    0.658 K/sec
            30,090 page-faults               #    0.005 M/sec
     8,543,832,082 cycles                    #    1.513 GHz
     9,301,870,856 instructions              #    1.09  insns per cycle
     1,590,579,357 branches                  #  281.691 M/sec
        18,563,203 branch-misses             #    1.17% of all branches

      14.339274728 seconds time elapsed

========================================================================

------------------------------------------------------------------------
                           | Without 16034 patch   |  With 16034 patch
------------------------------------------------------------------------
  write RPC request rate   | 15.8 req/sec          | 16 req/sec
  RPC queue overflows      | 1898                  | 50
  spinlock_contention_time | 22966310              | 9161557
------------------------------------------------------------------------
  rpc_incoming_queue_time  |                       |
                           | Count: 82             | Count: 82
                           | Mean: 199704          | Mean: 1037.87
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 35      |   0%  (min) = 21
                           |  25%        = 5844    |  25%        = 51
                           |  50%  (med) = 196096  |  50%  (med) = 67
                           |  75%        = 388352  |  75%        = 1608
                           |  95%        = 400640  |  95%        = 3334
                           |  99%        = 598016  |  99%        = 9960
                           |  99.9%      = 599552  |  99.9%      = 10064
                           |  99.99%     = 599552  |  99.99%     = 10064
                           |  100% (max) = 600048  |  100% (max) = 10066
------------------------------------------------------------------------
  op_apply_run_time        |                       |
                           | Count: 79             | Count: 80
                           | Mean: 99377.1         | Mean: 80796.3
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 429     |   0%  (min) = 575
                           |  25%        = 940     |  25%        = 828
                           |  50%  (med) = 1336    |  50%  (med) = 1064
                           |  75%        = 200704  |  75%        = 200704
                           |  95%        = 391168  |  95%        = 200704
                           |  99%        = 401408  |  99%        = 200704
                           |  99.9%      = 401408  |  99.9%      = 399360
                           |  99.99%     = 401408  |  99.99%     = 399360
                           |  100% (max) = 401432  |  100% (max) = 399703
------------------------------------------------------------------------
  handler_latency_kudu_tserver_TabletServerService_Write:
                           | Count: 49             | Count: 45
                           | Mean: 3.11688e+06     | Mean: 3.08435e+06
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 611494  |   0%  (min) = 636928
                           |  25%        = 1802240 |  25%        = 2392064
                           |  50%  (med) = 3391488 |  50%  (med) = 3178496
                           |  75%        = 4390912 |  75%        = 3997696
                           |  95%        = 4816896 |  95%        = 4587520
                           |  99%        = 5013504 |  99%        = 4587520
                           |  99.9%      = 5013504 |  99.9%      = 4587520
                           |  99.99%     = 5013504 |  99.99%     = 4587520
                           |  100% (max) = 5023673 |  100% (max) = 4616174
------------------------------------------------------------------------

Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Reviewed-on: http://gerrit.cloudera.org:8080/16060
Reviewed-by: Attila Bukor <ab...@apache.org>
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke <gr...@apache.org>
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
2 files changed, 336 insertions(+), 0 deletions(-)

Approvals:
  Attila Bukor: Looks good to me, but someone else must approve
  Kudu Jenkins: Verified
  Grant Henke: Looks good to me, approved


Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed a vote on this change.

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Removed Verified-1 by Kudu Jenkins (120)

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16060/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16060/2//COMMIT_MSG@24
PS2, Line 24: The rates of successful write operations was the same for both cases.
            : However, the number of messages from spinlock_profiling.cc like
            :   Waited 190 ms on lock 0x237acd4 ...
            : dropped significantly after applying patch 16034 on top.  That's a good
            : news to have less contention because the freed CPU resources might be
            : spend on something useful.
> can you compare a latency histogram, either from the point of view of the c
Done.

As expected, the latencies of Write RPC and apply haven't changed much: the main factor limiting the throughput in this scenario is extra latency added for every fsync() WAL operation.

However, lock contention and rpc_incoming_queue_time are significantly lower with 16034.
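
For a rough sense of scale, the before/after figures from the measurement table in the commit message can be turned into ratios. The sketch below is only illustrative arithmetic over numbers copied from that table; the dictionary layout and the helper name `reduction_factor` are assumptions made for this example, and the source does not state the units of the metrics.

```python
# Before/after values copied from the measurement table in the commit
# message ("Without 16034 patch" vs "With 16034 patch").
# The structure and names here are illustrative, not part of Kudu.
METRICS = {
    "RPC queue overflows": (1898, 50),
    "spinlock_contention_time": (22966310, 9161557),
    "rpc_incoming_queue_time mean": (199704, 1037.87),
    "op_apply_run_time mean": (99377.1, 80796.3),
    "Write handler latency mean": (3.11688e06, 3.08435e06),
}


def reduction_factor(before, after):
    """Return how many times smaller `after` is compared to `before`."""
    return before / after


for name, (before, after) in METRICS.items():
    print(f"{name}: {reduction_factor(before, after):.2f}x lower")
```

A ratio close to 1 (the Write handler latency) is consistent with the WAL fsync delay dominating the end-to-end latency, while the much larger ratios for queue time and queue overflows reflect the reduced lock contention.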




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Sat, 20 Jun 2020 03:14:12 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16060

to look at the new patch set (#4).

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................

[tests] add same_tablet_concurrent_writes test

Added SameTabletConcurrentWritesTest.InsertsOnly test scenario.
The scenario exercises concurrent inserts from multiple clients
into the same tablet.

The purpose of the newly introduced test is to check for lock contention
when running multiple write operations on the same tablet concurrently.
There is an interaction between threads pushing Raft consensus updates
and RPC worker threads serving write requests, and the test pinpoints
the contention over the lock primitives used in RaftConsensus.

To validate the results reported by the test, I verified that RPC queue
overflows happen a bit less often when using the lock-free implementation
of RaftConsensus::CheckLeadershipAndBindTerm() with the patch posted here:
  https://gerrit.cloudera.org/#/c/16034/

The rates of successful write operations were the same in both cases,
and that's expected since the bottleneck is the WAL (where additional
static delays are introduced for each fsync).  However, the number of
messages from spinlock_profiling.cc like
  Waited 190 ms on lock 0x237acd4 ...
dropped significantly after applying patch 16034 on top.  It's good
news to have less contention because the freed CPU resources might be
spent on something useful, like handling another RPC request from the
queue (which hasn't overflowed and is able to accommodate extra requests).

Below are snippets of various measurements taken for this new test
before and after applying the patch from review item 16034 on top.

========================================================================

Without 16034 patch:
 Performance counter stats for './bin/same_tablet_concurrent_writes-itest':

       7449.425640 task-clock                #    0.504 CPUs utilized
            47,882 context-switches          #    0.006 M/sec
             3,454 cpu-migrations            #    0.464 K/sec
            28,592 page-faults               #    0.004 M/sec
    10,211,586,270 cycles                    #    1.371 GHz
    10,647,306,766 instructions              #    1.04  insns per cycle
     1,861,229,149 branches                  #  249.849 M/sec
        25,370,590 branch-misses             #    1.36% of all branches

      14.767762000 seconds time elapsed

With 16034 patch:
 Performance counter stats for './bin/same_tablet_concurrent_writes-itest':

       5646.543970 task-clock                #    0.394 CPUs utilized
            39,194 context-switches          #    0.007 M/sec
             3,715 cpu-migrations            #    0.658 K/sec
            30,090 page-faults               #    0.005 M/sec
     8,543,832,082 cycles                    #    1.513 GHz
     9,301,870,856 instructions              #    1.09  insns per cycle
     1,590,579,357 branches                  #  281.691 M/sec
        18,563,203 branch-misses             #    1.17% of all branches

      14.339274728 seconds time elapsed

========================================================================
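
The derived columns in the `perf stat` output above follow from the raw counters by simple arithmetic. The sketch below recomputes them as a sanity check; it assumes task-clock is reported in milliseconds (which matches the "CPUs utilized" column), and the dictionary layout and the `derived` helper are names made up for this illustration.

```python
# Raw counters copied from the two `perf stat` runs quoted above.
# task_clock_ms is assumed to be in milliseconds.
RUNS = {
    "without 16034": {
        "task_clock_ms": 7449.425640,
        "cycles": 10_211_586_270,
        "instructions": 10_647_306_766,
        "branches": 1_861_229_149,
        "branch_misses": 25_370_590,
        "elapsed_s": 14.767762000,
    },
    "with 16034": {
        "task_clock_ms": 5646.543970,
        "cycles": 8_543_832_082,
        "instructions": 9_301_870_856,
        "branches": 1_590_579_357,
        "branch_misses": 18_563_203,
        "elapsed_s": 14.339274728,
    },
}


def derived(run):
    """Recompute the right-hand columns of `perf stat` from raw counters."""
    task_clock_s = run["task_clock_ms"] / 1000.0
    return {
        # CPU time consumed divided by wall-clock time.
        "cpus_utilized": task_clock_s / run["elapsed_s"],
        # Cycles per second of CPU time, expressed in GHz.
        "ghz": run["cycles"] / task_clock_s / 1e9,
        "insns_per_cycle": run["instructions"] / run["cycles"],
        "branch_miss_pct": 100.0 * run["branch_misses"] / run["branches"],
    }


for label, run in RUNS.items():
    print(label, {k: round(v, 3) for k, v in derived(run).items()})
```

Read this way, the run with the patch spends less CPU time (task-clock) over roughly the same wall-clock time, which is in line with fewer cycles being spent on contended locks.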

------------------------------------------------------------------------
                           | Without 16034 patch   |  With 16034 patch
------------------------------------------------------------------------
  write RPC request rate   | 15.8 req/sec          | 16 req/sec
  RPC queue overflows      | 1898                  | 50
  spinlock_contention_time | 22966310              | 9161557
------------------------------------------------------------------------
  rpc_incoming_queue_time  |                       |
                           | Count: 82             | Count: 82
                           | Mean: 199704          | Mean: 1037.87
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 35      |   0%  (min) = 21
                           |  25%        = 5844    |  25%        = 51
                           |  50%  (med) = 196096  |  50%  (med) = 67
                           |  75%        = 388352  |  75%        = 1608
                           |  95%        = 400640  |  95%        = 3334
                           |  99%        = 598016  |  99%        = 9960
                           |  99.9%      = 599552  |  99.9%      = 10064
                           |  99.99%     = 599552  |  99.99%     = 10064
                           |  100% (max) = 600048  |  100% (max) = 10066
------------------------------------------------------------------------
  op_apply_run_time        |                       |
                           | Count: 79             | Count: 80
                           | Mean: 99377.1         | Mean: 80796.3
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 429     |   0%  (min) = 575
                           |  25%        = 940     |  25%        = 828
                           |  50%  (med) = 1336    |  50%  (med) = 1064
                           |  75%        = 200704  |  75%        = 200704
                           |  95%        = 391168  |  95%        = 200704
                           |  99%        = 401408  |  99%        = 200704
                           |  99.9%      = 401408  |  99.9%      = 399360
                           |  99.99%     = 401408  |  99.99%     = 399360
                           |  100% (max) = 401432  |  100% (max) = 399703
------------------------------------------------------------------------
  handler_latency_kudu_tserver_TabletServerService_Write:
                           | Count: 49             | Count: 45
                           | Mean: 3.11688e+06     | Mean: 3.08435e+06
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 611494  |   0%  (min) = 636928
                           |  25%        = 1802240 |  25%        = 2392064
                           |  50%  (med) = 3391488 |  50%  (med) = 3178496
                           |  75%        = 4390912 |  75%        = 3997696
                           |  95%        = 4816896 |  95%        = 4587520
                           |  99%        = 5013504 |  99%        = 4587520
                           |  99.9%      = 5013504 |  99.9%      = 4587520
                           |  99.99%     = 5013504 |  99.99%     = 4587520
                           |  100% (max) = 5023673 |  100% (max) = 4616174
------------------------------------------------------------------------

Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
2 files changed, 341 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/16060/4

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Patch Set 6: Code-Review+1



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 10 Jul 2020 15:30:13 +0000
Gerrit-HasComments: No

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Patch Set 6: Code-Review+2



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 15 Jul 2020 14:42:36 +0000
Gerrit-HasComments: No

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16060/3/src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
File src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16060/3/src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc@214
PS3, Line 214:   FLAGS_max_num_columns = kNumColumns;
> Would it make sense to use a larger value instead of more columns given 100
You mean having larger cells (like long strings)?  Yes: the desired effect can be achieved either way, and I updated the test to use a smaller number of columns (250), so it now has both a high enough number of columns and heavier cells.




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 09 Jul 2020 18:58:35 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16060/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16060/5//COMMIT_MSG@22
PS5, Line 22:   https://gerrit.cloudera.org/#/c/16034/
This patch is already merged as da1c66b, so I think it would be better to refer to it by commit hash, which doesn't require Gerrit to be online.


http://gerrit.cloudera.org:8080/#/c/16060/5//COMMIT_MSG@29
PS5, Line 29: 16034
same here


http://gerrit.cloudera.org:8080/#/c/16060/5/src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
File src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16060/5/src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc@222
PS5, Line 222: 
I think you missed a few words here.


http://gerrit.cloudera.org:8080/#/c/16060/5/src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc@289
PS5, Line 289: spinlock_contentino_time
nit: typo in contention




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 10 Jul 2020 14:24:02 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Attila Bukor, Kudu Jenkins, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16060

to look at the new patch set (#6).

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................

[tests] add same_tablet_concurrent_writes test

Added SameTabletConcurrentWritesTest.InsertsOnly test scenario.
The scenario exercises concurrent inserts from multiple clients
into the same tablet.

The purpose of the newly introduced test is to check for lock contention
when running multiple write operations on the same tablet concurrently.
There is an interaction between threads pushing Raft consensus updates
and RPC worker threads serving write requests, and the test pinpoints
the contention over the lock primitives used in RaftConsensus.

To validate the results reported by the test, I verified that RPC queue
overflows happen much less often when using the lock-free implementation
of RaftConsensus::CheckLeadershipAndBindTerm() provided by the KUDU-2727
fix (da1c66b61).

The rates of successful write operations were the same in both cases,
and that's expected since the bottleneck is the WAL (where additional
static delays are introduced for each fsync).  However, the number of
messages from spinlock_profiling.cc like
  Waited 190 ms on lock 0x237acd4 ...
dropped significantly after the KUDU-2727 fix (da1c66b61).  It's good
news to have less contention because the freed CPU resources might be
spent on something useful, like handling another RPC request from the
queue (which hasn't overflowed and is able to accommodate extra requests).

Below are snippets of various measurements taken for this new test
before and after applying the patch from review item 16034 on top.

========================================================================

Without 16034 patch:
 Performance counter stats for './bin/same_tablet_concurrent_writes-itest':

       7449.425640 task-clock                #    0.504 CPUs utilized
            47,882 context-switches          #    0.006 M/sec
             3,454 cpu-migrations            #    0.464 K/sec
            28,592 page-faults               #    0.004 M/sec
    10,211,586,270 cycles                    #    1.371 GHz
    10,647,306,766 instructions              #    1.04  insns per cycle
     1,861,229,149 branches                  #  249.849 M/sec
        25,370,590 branch-misses             #    1.36% of all branches

      14.767762000 seconds time elapsed

With 16034 patch:
 Performance counter stats for './bin/same_tablet_concurrent_writes-itest':

       5646.543970 task-clock                #    0.394 CPUs utilized
            39,194 context-switches          #    0.007 M/sec
             3,715 cpu-migrations            #    0.658 K/sec
            30,090 page-faults               #    0.005 M/sec
     8,543,832,082 cycles                    #    1.513 GHz
     9,301,870,856 instructions              #    1.09  insns per cycle
     1,590,579,357 branches                  #  281.691 M/sec
        18,563,203 branch-misses             #    1.17% of all branches

      14.339274728 seconds time elapsed

========================================================================

------------------------------------------------------------------------
                           | Without 16034 patch   |  With 16034 patch
------------------------------------------------------------------------
  write RPC request rate   | 15.8 req/sec          | 16 req/sec
  RPC queue overflows      | 1898                  | 50
  spinlock_contention_time | 22966310              | 9161557
------------------------------------------------------------------------
  rpc_incoming_queue_time  |                       |
                           | Count: 82             | Count: 82
                           | Mean: 199704          | Mean: 1037.87
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 35      |   0%  (min) = 21
                           |  25%        = 5844    |  25%        = 51
                           |  50%  (med) = 196096  |  50%  (med) = 67
                           |  75%        = 388352  |  75%        = 1608
                           |  95%        = 400640  |  95%        = 3334
                           |  99%        = 598016  |  99%        = 9960
                           |  99.9%      = 599552  |  99.9%      = 10064
                           |  99.99%     = 599552  |  99.99%     = 10064
                           |  100% (max) = 600048  |  100% (max) = 10066
------------------------------------------------------------------------
  op_apply_run_time        |                       |
                           | Count: 79             | Count: 80
                           | Mean: 99377.1         | Mean: 80796.3
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 429     |   0%  (min) = 575
                           |  25%        = 940     |  25%        = 828
                           |  50%  (med) = 1336    |  50%  (med) = 1064
                           |  75%        = 200704  |  75%        = 200704
                           |  95%        = 391168  |  95%        = 200704
                           |  99%        = 401408  |  99%        = 200704
                           |  99.9%      = 401408  |  99.9%      = 399360
                           |  99.99%     = 401408  |  99.99%     = 399360
                           |  100% (max) = 401432  |  100% (max) = 399703
------------------------------------------------------------------------
  handler_latency_kudu_tserver_TabletServerService_Write:
                           | Count: 49             | Count: 45
                           | Mean: 3.11688e+06     | Mean: 3.08435e+06
                           | Percentiles:          | Percentiles:
                           |   0%  (min) = 611494  |   0%  (min) = 636928
                           |  25%        = 1802240 |  25%        = 2392064
                           |  50%  (med) = 3391488 |  50%  (med) = 3178496
                           |  75%        = 4390912 |  75%        = 3997696
                           |  95%        = 4816896 |  95%        = 4587520
                           |  99%        = 5013504 |  99%        = 4587520
                           |  99.9%      = 5013504 |  99.9%      = 4587520
                           |  99.99%     = 5013504 |  99.99%     = 4587520
                           |  100% (max) = 5023673 |  100% (max) = 4616174
------------------------------------------------------------------------

Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
2 files changed, 336 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/16060/6

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [tests] add same tablet concurrent writes test

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16060 )

Change subject: [tests] add same_tablet_concurrent_writes test
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16060/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16060/5//COMMIT_MSG@22
PS5, Line 22:   https://gerrit.cloudera.org/#/c/16034/
> this patch is already merged as da1c66b so I think it would be better to re
Done


http://gerrit.cloudera.org:8080/#/c/16060/5//COMMIT_MSG@29
PS5, Line 29: 16034
> same here
Done


http://gerrit.cloudera.org:8080/#/c/16060/5/src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc
File src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16060/5/src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc@222
PS5, Line 222: 
> I think you missed a few words here.
Whoops, a whole line was missing here.


http://gerrit.cloudera.org:8080/#/c/16060/5/src/kudu/integration-tests/same_tablet_concurrent_writes-itest.cc@289
PS5, Line 289: spinlock_contentino_time
> nit: typo in contention
Done




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7eef6e46e7685450354473cee9d804c5054723eb
Gerrit-Change-Number: 16060
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 10 Jul 2020 15:17:42 +0000
Gerrit-HasComments: Yes