You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2016/06/22 06:08:02 UTC

[kudu-CR] Fix stray memory writes due to tcmalloc profiling

Hello David Ribeiro Alves, Mike Percy,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/3445

to review the following change.

Change subject: Fix stray memory writes due to tcmalloc profiling
......................................................................

Fix stray memory writes due to tcmalloc profiling

This fixes an issue that has been causing frequent crashes in
JD.com's production cluster as well as various Cloudera test clusters.
The crashes would be in various different places, but the key signature
was that offset 120 in some data structure or array (eg the 16th element
of a vector) would be corrupted.

After doing a git bisect using an integration testing cluster running
an ITBLL workload, I found that this was a regression caused by the
introduction of tcmalloc contention profiling[1]. The short explanation
is that, if we experienced contention while freeing a Trace object, we
could in some cases increment offset 120 of some other allocation
which occurred soon after the deallocation of the Trace.

The issue is described in more detail in a new comment in trace.h.

With this patch, I was unable to reproduce the issue on the test
cluster. No new test is added since this is quite timing-dependent
and not amenable to unit testing or even stress testing.

[1] commit f6691e744b9cb796e1bbc6e07953f21f387c9a88

Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb
---
M src/kudu/util/trace.h
1 file changed, 23 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/45/3445/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] Fix stray memory writes due to tcmalloc profiling

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: Fix stray memory writes due to tcmalloc profiling
......................................................................


Patch Set 1: Code-Review+2

good catch

-- 
To view, visit http://gerrit.cloudera.org:8080/3445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: song bruce zhang <zs...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] Fix stray memory writes due to tcmalloc profiling

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: Fix stray memory writes due to tcmalloc profiling
......................................................................


Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/1923/

-- 
To view, visit http://gerrit.cloudera.org:8080/3445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-HasComments: No

[kudu-CR] Fix stray memory writes due to tcmalloc profiling

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: Fix stray memory writes due to tcmalloc profiling
......................................................................


Fix stray memory writes due to tcmalloc profiling

This fixes an issue that has been causing frequent crashes in
JD.com's production cluster as well as various Cloudera test clusters.
The crashes would be in various different places, but the key signature
was that offset 120 in some data structure or array (eg the 16th element
of a vector) would be corrupted.

After doing a git bisect using an integration testing cluster running
an ITBLL workload, I found that this was a regression caused by the
introduction of tcmalloc contention profiling[1]. The short explanation
is that, if we experienced contention while freeing a Trace object, we
could in some cases increment offset 120 of some other allocation
which occurred soon after the deallocation of the Trace.

The issue is described in more detail in a new comment in trace.h.

With this patch, I was unable to reproduce the issue on the test
cluster. No new test is added since this is quite timing-dependent
and not amenable to unit testing or even stress testing.

[1] commit f6691e744b9cb796e1bbc6e07953f21f387c9a88

Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb
Reviewed-on: http://gerrit.cloudera.org:8080/3445
Reviewed-by: David Ribeiro Alves <dr...@apache.org>
Reviewed-by: Mike Percy <mp...@apache.org>
Tested-by: Kudu Jenkins
---
M src/kudu/util/trace.h
1 file changed, 23 insertions(+), 3 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] Fix stray memory writes due to tcmalloc profiling

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: Fix stray memory writes due to tcmalloc profiling
......................................................................


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: song bruce zhang <zs...@gmail.com>
Gerrit-HasComments: No