You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2019/05/20 21:32:14 UTC

[kudu-CR] [tool-test] scenario for ksck and unusual tserver flags

Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13380


Change subject: [tool-test] scenario for ksck and unusual tserver flags
......................................................................

[tool-test] scenario for ksck and unusual tserver flags

Added a test scenario to check how ksck works in case of
collecting information from many tablets servers when tablet
servers have 'unusual' run-time flags.

This is to cover regressions for a race condition that existed
in FetchInfoFromTabletServers() some time ago before it was
fixed with d17f9fce684272df28e6291839f162b33796bc5c along with
bringing other improvements.

With this test, I verified that the version prior to the above mentioned
patch is susceptible to the race as reported by AddressSantitizer:

heap-buffer-overflow and double-free reproduced (~1 of 100):
  http://dist-test.cloudera.org//job?job_id=aserbin.1558386479.123305

Also, I verified that commenting out the guard that protects concurrent
calls of emplace_back() on warning_messages in src/kudu/toolts/ksck.cc,
FetchInfoFromTabletServers() method, ASAN catches SIGBUS (~1 of 100):
  http://dist-test.cloudera.org//job?job_id=aserbin.1558385919.118394

Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
---
M src/kudu/tools/kudu-tool-test.cc
1 file changed, 62 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/13380/1
-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] [tool-test] scenario for ksck and unusual tserver flags

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13380

to look at the new patch set (#2).

Change subject: [tool-test] scenario for ksck and unusual tserver flags
......................................................................

[tool-test] scenario for ksck and unusual tserver flags

Added a test scenario to check how ksck works in case of
collecting information from many tablets servers when tablet
servers have 'unusual' run-time flags.

This is to cover regressions for a race condition that existed
in FetchInfoFromTabletServers() some time ago before it was
fixed with d17f9fce6 along with bringing other improvements.

With this test, I verified that the version prior to the above mentioned
patch d17f9fce6 is susceptible to the race as reported by sanitizers.

  ThreadSantitizer (100% failure rate):
    http://dist-test.cloudera.org/job?job_id=aserbin.1558485044.66856

  AddressSantitizer (~1 out of 20 failed), reproducing double-free,
  heap-buffer-overflow, memory leaks and friends:
    http://dist-test.cloudera.org/job?job_id=aserbin.1558484756.62807

Also, I verified that commenting out the guard that protects concurrent
calls of emplace_back() on warning_messages in src/kudu/toolts/ksck.cc,
FetchInfoFromTabletServers() method:

  AddressSantitizer (~1 out of 20 failed):
    http://dist-test.cloudera.org/job?job_id=aserbin.1558485217.70253

  ThreadSantitizer (100% failure rate):
    http://dist-test.cloudera.org/job?job_id=aserbin.1558485274.73242

Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
---
M src/kudu/tools/kudu-tool-test.cc
1 file changed, 40 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/13380/2
-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tool-test] scenario for ksck and unusual tserver flags

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] scenario for ksck and unusual tserver flags
......................................................................


Patch Set 1: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13380/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13380/1//COMMIT_MSG@18
PS1, Line 18: With this test, I verified that the version prior to the above mentioned
It's too bad this is so hard to repro, but having a test that'll become flaky if we regress is certainly better than nothing.


http://gerrit.cloudera.org:8080/#/c/13380/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/13380/1/src/kudu/tools/kudu-tool-test.cc@4538
PS1, Line 4538:   // Run tablet servers with a few flags considered unusual by ksck.
Could you doc why these custom values are useful?

Oh, it's just so that ksck will have something to show in the "unusual flags" section. Is that actually necessary to tickle the race though? I thought the "find unusual flags" calls are made regardless of whether the cluster actually has unusual flags set or not.



-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Tue, 21 May 2019 20:08:00 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tool-test] scenario for ksck and unusual tserver flags

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] scenario for ksck and unusual tserver flags
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13380/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13380/2//COMMIT_MSG@20
PS2, Line 20: ThreadSantitizer
"ThreadSanitizer" (and "AddressSanitizer", below).


http://gerrit.cloudera.org:8080/#/c/13380/2/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/13380/2/src/kudu/tools/kudu-tool-test.cc@4547
PS2, Line 4547:     Status s = RunTool(Substitute("cluster ksck $0", master_addr),
RunActionStderrString to simplify?

Below too.


http://gerrit.cloudera.org:8080/#/c/13380/2/src/kudu/tools/kudu-tool-test.cc@4560
PS2, Line 4560:     LOG(INFO) << "STDERR: " << err;
Was this for debugging? Should it be removed?



-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 22 May 2019 16:49:58 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tool-test] scenario for ksck and unusual tserver flags

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] scenario for ksck and unusual tserver flags
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tool-test] add scenario for KUDU-2819 regressions

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] add scenario for KUDU-2819 regressions
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tool-test] add scenario for KUDU-2819 regressions

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] add scenario for KUDU-2819 regressions
......................................................................

[tool-test] add scenario for KUDU-2819 regressions

Added a test scenario to check how ksck works in case of
collecting information from many tablets servers a Kudu cluster.

This is to cover regressions for a race condition that existed in
FetchInfoFromTabletServers() some time ago before it was fixed with
d17f9fce6 (along with bringing other improvements).  One manifestation
of the race condition is described in KUDU-2819.

With this test, I verified that the version prior to the above mentioned
patch d17f9fce6 was susceptible to the race as reported by sanitizers:

  ThreadSanitizer (100% failure rate):
    http://dist-test.cloudera.org/job?job_id=aserbin.1558485044.66856

  AddressSanitizer (~1 out of 20 failed), reproducing double-free,
  heap-buffer-overflow, memory leaks and friends:
    http://dist-test.cloudera.org/job?job_id=aserbin.1558484756.62807

Also, I verified that by commenting out the guard that protects
concurrent calls of std::vector::emplace_back() on 'warning_messages'
container in FetchInfoFromTabletServers() method
(src/kudu/toolts/ksck.cc), both sanitizers were able to detect the race:

  ThreadSanitizer (100% failure rate):
    http://dist-test.cloudera.org/job?job_id=aserbin.1558485274.73242

  AddressSanitizer (~1 out of 20 failed):
    http://dist-test.cloudera.org/job?job_id=aserbin.1558485217.70253

Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Reviewed-on: http://gerrit.cloudera.org:8080/13380
Tested-by: Alexey Serbin <as...@cloudera.com>
Reviewed-by: Adar Dembo <ad...@cloudera.com>
---
M src/kudu/tools/kudu-tool-test.cc
1 file changed, 41 insertions(+), 0 deletions(-)

Approvals:
  Alexey Serbin: Verified
  Adar Dembo: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tool-test] scenario for ksck and unusual tserver flags

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] scenario for ksck and unusual tserver flags
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13380/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/13380/1/src/kudu/tools/kudu-tool-test.cc@4538
PS1, Line 4538:   // Run tablet servers with a few flags considered unusual by ksck.
> Could you doc why these custom values are useful?
Yep, you are right -- these are not actually necessary to tickle the race.  The important thing to have for the race is unavailability of tablet servers.  And yes, due to the test setup that we have for mini-cluster, the 'unusual' flags are actually there already (in addition to those specified here explicitly).

However, these were useful to distinguish between crash/race and normal run.  Probably, I can do better in that regard and distinguish between the latter by some other means.



-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Tue, 21 May 2019 20:51:51 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tool-test] add scenario for KUDU-2819 regressions

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] add scenario for KUDU-2819 regressions
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13380/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13380/2//COMMIT_MSG@20
PS2, Line 20: ThreadSantitizer
> "ThreadSanitizer" (and "AddressSanitizer", below).
Done


http://gerrit.cloudera.org:8080/#/c/13380/2/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/13380/2/src/kudu/tools/kudu-tool-test.cc@4547
PS2, Line 4547:     Status s = RunTool(Substitute("cluster ksck $0", master_addr),
> RunActionStderrString to simplify?
Done


http://gerrit.cloudera.org:8080/#/c/13380/2/src/kudu/tools/kudu-tool-test.cc@4560
PS2, Line 4560:     LOG(INFO) << "STDERR: " << err;
> Was this for debugging? Should it be removed?
Whoops, indeed.  Good catch!



-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 22 May 2019 17:46:20 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tool-test] scenario for ksck and unusual tserver flags

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] scenario for ksck and unusual tserver flags
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>

[kudu-CR] [tool-test] scenario for ksck and unusual tserver flags

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] scenario for ksck and unusual tserver flags
......................................................................


Patch Set 1: Verified+1

unrelated failures in tests:
  * org.apache.kudu.client.TestSecurity
  * org.apache.kudu.client.TestHybridTime


-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 20 May 2019 22:37:49 +0000
Gerrit-HasComments: No

[kudu-CR] [tool-test] scenario for ksck and unusual tserver flags

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] scenario for ksck and unusual tserver flags
......................................................................


Patch Set 2: Verified+1

unrelated failures in:
  * org.apache.kudu.backup.TestKuduBackup
  * org.apache.kudu.flume.sink.SecureKuduSinkTest


-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 22 May 2019 02:28:41 +0000
Gerrit-HasComments: No

[kudu-CR] [tool-test] add scenario for KUDU-2819 regressions

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] add scenario for KUDU-2819 regressions
......................................................................


Patch Set 3: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 22 May 2019 18:29:12 +0000
Gerrit-HasComments: No

[kudu-CR] [tool-test] add scenario for KUDU-2819 regressions

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/13380 )

Change subject: [tool-test] add scenario for KUDU-2819 regressions
......................................................................


Patch Set 3: Verified+1

unrelated failure in:
  * org.apache.kudu.client.TestAsyncKuduSession


-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 22 May 2019 18:09:30 +0000
Gerrit-HasComments: No

[kudu-CR] [tool-test] add scenario for KUDU-2819 regressions

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13380

to look at the new patch set (#3).

Change subject: [tool-test] add scenario for KUDU-2819 regressions
......................................................................

[tool-test] add scenario for KUDU-2819 regressions

Added a test scenario to check how ksck works in case of
collecting information from many tablets servers a Kudu cluster.

This is to cover regressions for a race condition that existed in
FetchInfoFromTabletServers() some time ago before it was fixed with
d17f9fce6 (along with bringing other improvements).  One manifestation
of the race condition is described in KUDU-2819.

With this test, I verified that the version prior to the above mentioned
patch d17f9fce6 was susceptible to the race as reported by sanitizers:

  ThreadSanitizer (100% failure rate):
    http://dist-test.cloudera.org/job?job_id=aserbin.1558485044.66856

  AddressSanitizer (~1 out of 20 failed), reproducing double-free,
  heap-buffer-overflow, memory leaks and friends:
    http://dist-test.cloudera.org/job?job_id=aserbin.1558484756.62807

Also, I verified that by commenting out the guard that protects
concurrent calls of std::vector::emplace_back() on 'warning_messages'
container in FetchInfoFromTabletServers() method
(src/kudu/toolts/ksck.cc), both sanitizers were able to detect the race:

  ThreadSanitizer (100% failure rate):
    http://dist-test.cloudera.org/job?job_id=aserbin.1558485274.73242

  AddressSanitizer (~1 out of 20 failed):
    http://dist-test.cloudera.org/job?job_id=aserbin.1558485217.70253

Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
---
M src/kudu/tools/kudu-tool-test.cc
1 file changed, 41 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/13380/3
-- 
To view, visit http://gerrit.cloudera.org:8080/13380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16cf69b6f7d2fb59014df26601dfc30e124a52ee
Gerrit-Change-Number: 13380
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>