Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2018/03/08 00:55:04 UTC

[kudu-CR](branch-1.6.x) [tablet] fix nullptr dereference while capturing iterators

Hello Mike Percy, Kudu Jenkins,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/9549

to review the following change.


Change subject: [tablet] fix nullptr dereference while capturing iterators
......................................................................

[tablet] fix nullptr dereference while capturing iterators

Added a check to Tablet::CaptureConsistentIterators() to make sure
the tablet has not been stopped or shut down.
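
For readers without the diff at hand, the kind of guard described above might
look roughly like the sketch below. It is not the actual change to
src/kudu/tablet/tablet.cc: TabletSketch, Components, State, Stop() and
CaptureIterators() are hypothetical stand-ins, and std::shared_ptr is used in
place of Kudu's scoped_refptr. The idea is only that the state is checked under
the same lock that protects the components pointer, so a stopped tablet yields
an error instead of a null dereference.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not Kudu's real classes.
struct Components {
  std::vector<std::string> rowsets;
};

enum class State { kOpen, kStopped };

class TabletSketch {
 public:
  TabletSketch()
      : state_(State::kOpen),
        components_(std::make_shared<Components>()) {
    components_->rowsets = {"memrowset", "diskrowset-0"};
  }

  // Mimics shutdown: marks the tablet stopped and drops the components.
  void Stop() {
    std::lock_guard<std::mutex> l(lock_);
    state_ = State::kStopped;
    components_.reset();
  }

  // Returns true and fills 'out' if the tablet is open. Otherwise returns
  // false and sets 'err' rather than dereferencing a null pointer.
  bool CaptureIterators(std::vector<std::string>* out,
                        std::string* err) const {
    std::shared_ptr<Components> snapshot;
    {
      std::lock_guard<std::mutex> l(lock_);
      if (state_ != State::kOpen) {
        *err = "tablet is stopped";
        return false;
      }
      snapshot = components_;  // take a reference while holding the lock
    }
    // Holds because the state was checked under the lock that guards
    // components_.
    assert(snapshot != nullptr);
    *out = snapshot->rowsets;
    return true;
  }

 private:
  mutable std::mutex lock_;
  State state_;
  std::shared_ptr<Components> components_;
};
```

In this sketch a scan that races with Stop() either captures a valid snapshot
or gets the "tablet is stopped" error; it never reaches the dereference with a
null pointer.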

Before this patch, in one test scenario I saw stack traces
like the one below (built in DEBUG configuration):

kudu-tserver: src/kudu/gutil/ref_counted.h:284: T *scoped_refptr<kudu::tablet::TabletComponents>::operator->() const [T = kudu::tablet::TabletComponents]: Assertion `ptr_ != __null' failed.
*** Aborted at 1517534012 (unix time) try "date -d @1517534012" if you are using GNU date ***
PC: @     0x7ff9ad39cc37 gsignal
*** SIGABRT (@0x3e80000745f) received by PID 29791 (TID 0x7ff99a0bc700) from PID 29791; stack trace: ***
    @     0x7ff9b5129330 (unknown) at ??:0
    @     0x7ff9ad39cc37 gsignal at ??:0
    @     0x7ff9ad3a0028 abort at ??:0
    @     0x7ff9ad395bf6 (unknown) at ??:0
    @     0x7ff9ad395ca2 __assert_fail at ??:0
    @     0x7ff9b7f2ce52 scoped_refptr<>::operator->() at ??:0
    @     0x7ff9b7f1bf6d kudu::tablet::Tablet::CaptureConsistentIterators() at ??:0
    @     0x7ff9b7f225f6 kudu::tablet::Tablet::Iterator::Init() at ??:0
    @     0x7ff9b94372e3 kudu::tserver::TabletServiceImpl::HandleNewScanRequest() at ??:0
    @     0x7ff9b943a906 kudu::tserver::TabletServiceImpl::Checksum() at ??:0
    @     0x7ff9b3d3c83d kudu::tserver::TabletServerServiceIf::TabletServerServiceIf()::$_11::operator()() at ??:0
    @     0x7ff9b3d3c682 std::_Function_handler<>::_M_invoke() at ??:0
    @     0x7ff9b2ea026b std::function<>::operator()() at ??:0
    @     0x7ff9b2e9fb2d kudu::rpc::GeneratedServiceIf::Handle() at ??:0
    @     0x7ff9b2ea1ee6 kudu::rpc::ServicePool::RunThread() at ??:0
    @     0x7ff9b2ea4499 boost::_mfi::mf0<>::operator()() at ??:0
    @     0x7ff9b2ea4400 boost::_bi::list1<>::operator()<>() at ??:0
    @     0x7ff9b2ea43aa boost::_bi::bind_t<>::operator()() at ??:0
    @     0x7ff9b2ea418d boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
    @     0x7ff9b2e45f68 boost::function0<>::operator()() at ??:0
    @     0x7ff9b115162d kudu::Thread::SuperviseThread() at ??:0
    @     0x7ff9b5121184 start_thread at ??:0
    @     0x7ff9ad463ffd clone at ??:0
    @                0x0 (unknown)

I used the following WIP stress test for the reproduction scenario:
  https://gerrit.cloudera.org/#/c/9255/

For DEBUG builds, without the fix the issue appeared in ~0.5% of cases.
With the fix, the issue could not be reproduced:

Without fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1518492521.137030

With fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1518492937.141401

Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff
Reviewed-on: http://gerrit.cloudera.org:8080/9189
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy <mp...@apache.org>
(cherry picked from commit 5d10a56f9d06dc695f2a4469edbabce978912eb4)
---
M src/kudu/tablet/tablet.cc
1 file changed, 11 insertions(+), 9 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/9549/1
-- 
To view, visit http://gerrit.cloudera.org:8080/9549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.6.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff
Gerrit-Change-Number: 9549
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR](branch-1.6.x) [tablet] fix nullptr dereference while capturing iterators

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9549 )

Change subject: [tablet] fix nullptr dereference while capturing iterators
......................................................................


Patch Set 2: Code-Review+2



Gerrit-Project: kudu
Gerrit-Branch: branch-1.6.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff
Gerrit-Change-Number: 9549
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 09 Mar 2018 19:57:08 +0000
Gerrit-HasComments: No

[kudu-CR](branch-1.6.x) [tablet] fix nullptr dereference while capturing iterators

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/9549 )

Change subject: [tablet] fix nullptr dereference while capturing iterators
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)

Gerrit-Project: kudu
Gerrit-Branch: branch-1.6.x
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff
Gerrit-Change-Number: 9549
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR](branch-1.6.x) [tablet] fix nullptr dereference while capturing iterators

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9549 )

Change subject: [tablet] fix nullptr dereference while capturing iterators
......................................................................


Patch Set 2: Verified+1

Flakes in org.apache.kudu.client.TestKuduClient.testCloseShortlyAfterOpen



Gerrit-Project: kudu
Gerrit-Branch: branch-1.6.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff
Gerrit-Change-Number: 9549
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Thu, 08 Mar 2018 04:00:25 +0000
Gerrit-HasComments: No

[kudu-CR](branch-1.6.x) [tablet] fix nullptr dereference while capturing iterators

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9549 )

Change subject: [tablet] fix nullptr dereference while capturing iterators
......................................................................

[tablet] fix nullptr dereference while capturing iterators

Reviewed-on: http://gerrit.cloudera.org:8080/9549
Tested-by: Alexey Serbin <as...@cloudera.com>
Reviewed-by: Todd Lipcon <to...@apache.org>
---
M src/kudu/tablet/tablet.cc
1 file changed, 11 insertions(+), 9 deletions(-)

Approvals:
  Alexey Serbin: Verified
  Todd Lipcon: Looks good to me, approved


Gerrit-Project: kudu
Gerrit-Branch: branch-1.6.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff
Gerrit-Change-Number: 9549
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>