You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Jim Apple (Code Review)" <ge...@cloudera.org> on 2018/07/16 02:38:07 UTC

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Jim Apple has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10948


Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................

IMPALA-5031: Fix undefined behavior: memset NULL

memset has undefined behavior when its first argument is NULL. The
instance fixed here was found by Clang's undefined behavior
sanitizer.

It was found in the end-to-end tests. The interesting part of the
stack trace is:

exec/data-source-scan-node.cc:152:10: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:62:79: note: nonnull attribute specified here

Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
---
M be/src/exec/data-source-scan-node.cc
A be/src/util/ubsan.h
2 files changed, 36 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/10948/1
-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple <jb...@apache.org>

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 3:

> (1 comment)

Another wrinkle here is that this doesn't actually provide a memory-safe memset. It prevents trying to write to nullptr when n > 0 but p = nullptr, but that's incidental, and that would segfault anyway, right?

What it's really preventing is the undefined behavior, not the dangers we normally associate with memory unsafety like double frees, writing beyond bounds, slicing, and so on.


-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 23:57:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 3: Code-Review+1

(2 comments)

I'm fine with moving forward with this as-is, Todd's suggestion makes sense but I'll leave it up to you two

http://gerrit.cloudera.org:8080/#/c/10948/3/be/src/util/ubsan.h
File be/src/util/ubsan.h:

http://gerrit.cloudera.org:8080/#/c/10948/3/be/src/util/ubsan.h@26
PS3, Line 26: class Ubsan {
> I think in those cases I'd probably put each of the methods into wherever t
Todd's suggestion makes sense to me. I don't think this way of organising utilities is going to cause any significant issues though.


http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h
File be/src/util/ubsan.h:

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h@29
PS2, Line 29:     if (s == nullptr) {
> I picked the first one, because in a release build, the calling MemSet(NULL
In principle I don't agree with writing code to defensively handle invariant violations. In this case it doesn't complicate the code but I feel like maybe this is part of the reason we had trouble getting the ubsan changes through.

IMO, we should add the appropriate DCHECKs and tests and then streamline the code as much as possible. It's impossible to test handling of invariant violations by definition and once you start doing it there's an endless number of opportunities to complicate code by being unnecessarily defensive.



-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Jul 2018 20:55:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h
File be/src/util/ubsan.h:

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h@29
PS2, Line 29:     if (s == nullptr) return s;
> n > 0 && s == nullptr sorry
I picked the first one, because in a release build, the calling MemSet(NULL, 'q', 10) has incorrect (but defined) behavior. In the latter, it has undefined behavior.



-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 16:57:34 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org>.
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 2: Code-Review+1

Thank you for your answers, lgtm.


-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 13:19:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 2:

> Do we want to update the other call sites of std::memset?

Not yet.

First, when I last submitted a patch with many memset fixes some months ago,it had trouble getting through review.

Second, not every call needs to be fixed, since some don't call it with a NULL parameter and so do not induce undefined behavior.

 > IMPALA-5031 is quite general, while this patch set is very
 > specific. Is IMPALA-5031 some kind of umbrella Jira for all the
 > ubsan-related issues?

Yes.


-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 13:02:56 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10948/3/be/src/util/ubsan.h
File be/src/util/ubsan.h:

http://gerrit.cloudera.org:8080/#/c/10948/3/be/src/util/ubsan.h@26
PS3, Line 26: class Ubsan {
> bikeshedding opportunity here: when I saw the name 'ubsan.h' and the name o
I had intended for this class to include safe versions of other undefined operations, not all of which are memory:

https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html

Thoughts?



-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 21:48:50 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 2:

(1 comment)

Yeah I generally disagree with the idea of adding NULL checks to every invocation of memset() - I think it makes the invariants of the code harder to understand and adds runtime overhead. In practice I think all of the callsites pass n == 0 when there's a null pointer and glibc memset() won't dereference the pointer in that case.

There are more theoretical possibilities if the compiler decides to inline a custom memset implementation but I find it unlikely in practice that that would be compiled to anything strange since that code still has to handle the n == 0 case correctly by not dereferencing the pointer. You could have an implementation like below

  if (p == NULL) DoSomethingWild();
  if (n >= ...) {
  }
  if (n >= ...) {
  }

But something like below makes way more sense.

  if (n >= ...) {
  }
  if (n >= ...) {
  }

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h
File be/src/util/ubsan.h:

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h@29
PS2, Line 29:     if (s == nullptr) return s;
We should DCHECK that n == 0 in this case since otherwise it's a bug.

Or actually, this check could be if (n == 0) and we could DCHECK != NULL - I think that's closer to the intent.



-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 15:56:27 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 3: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 23 Jul 2018 16:33:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 4: Code-Review+2

carry Tim's +2


-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 4
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 23 Jul 2018 19:20:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h
File be/src/util/ubsan.h:

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h@29
PS2, Line 29:     if (s == nullptr) return s;
> Which suggestion are you referring to? I don't think that problem applies t
n > 0 && s == nullptr sorry



-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 16:55:49 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h
File be/src/util/ubsan.h:

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h@29
PS2, Line 29:     if (s == nullptr) return s;
> If DCHECKs are no-ops in release mode, the NULL pointer check will be missi
Which suggestion are you referring to? I don't think that problem applies to either alternative I had in mind.

  if (s == nullptr) {
    DCHECK_EQ(0, n);
    return s;
  }
  ...

or

  if (n == 0) {
    return s;
  }
  DCHECK(s != nullptr);
  ...

Either case hits the DCHECK on a DEBUG build when there's a bug (n > 0 || s == nullptr) and behaves in a defined way on both release and debug builds when the arguments are valid.



-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 16:46:34 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org>.
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 2:

Do we want to update the other call sites of std::memset?

IMPALA-5031 is quite general, while this patch set is very specific. Is IMPALA-5031 some kind of umbrella Jira for all the ubsan-related issues? Or, this is the last issue related to undefined behavior, and after this ubsan builds will be error-free?


-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 10:10:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10948

to look at the new patch set (#2).

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................

IMPALA-5031: Fix undefined behavior: memset NULL

memset has undefined behavior when its first argument is NULL. The
instance fixed here was found by Clang's undefined behavior
sanitizer.

It was found in the end-to-end tests. The interesting part of the
stack trace is:

/exec/data-source-scan-node.cc:152:10: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:62:79: note: nonnull attribute specified here
    #0 0x482fd8e in DataSourceScanNode::GetNextInputBatch() /exec/data-source-scan-node.cc:152:3
    #1 0x482fb40 in DataSourceScanNode::Open(RuntimeState*) /exec/data-source-scan-node.cc:124:10
    #2 0x47ef854 in AggregationNode::Open(RuntimeState*) /exec/aggregation-node.cc:71:49
    #3 0x23506a4 in FragmentInstanceState::Open() /runtime/fragment-instance-state.cc:266:53
    #4 0x234b6a8 in FragmentInstanceState::Exec() /runtime/fragment-instance-state.cc:81:12
    #5 0x236ee52 in QueryState::ExecFInstance(FragmentInstanceState*) /runtime/query-state.cc:401:24
    #6 0x237093e in QueryState::StartFInstances()::$_0::operator()() const /runtime/query-state.cc:341:44

Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
---
M be/src/exec/data-source-scan-node.cc
A be/src/util/ubsan.h
2 files changed, 36 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/10948/2
-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Hello Zoltan Borok-Nagy, Tim Armstrong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10948

to look at the new patch set (#3).

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................

IMPALA-5031: Fix undefined behavior: memset NULL

memset has undefined behavior when its first argument is NULL. The
instance fixed here was found by Clang's undefined behavior
sanitizer.

It was found in the end-to-end tests. The interesting part of the
stack trace is:

/exec/data-source-scan-node.cc:152:10: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:62:79: note: nonnull attribute specified here
    #0 0x482fd8e in DataSourceScanNode::GetNextInputBatch() /exec/data-source-scan-node.cc:152:3
    #1 0x482fb40 in DataSourceScanNode::Open(RuntimeState*) /exec/data-source-scan-node.cc:124:10
    #2 0x47ef854 in AggregationNode::Open(RuntimeState*) /exec/aggregation-node.cc:71:49
    #3 0x23506a4 in FragmentInstanceState::Open() /runtime/fragment-instance-state.cc:266:53
    #4 0x234b6a8 in FragmentInstanceState::Exec() /runtime/fragment-instance-state.cc:81:12
    #5 0x236ee52 in QueryState::ExecFInstance(FragmentInstanceState*) /runtime/query-state.cc:401:24
    #6 0x237093e in QueryState::StartFInstances()::$_0::operator()() const /runtime/query-state.cc:341:44

Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
---
M be/src/exec/data-source-scan-node.cc
A be/src/util/ubsan.h
2 files changed, 39 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/10948/3
-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10948/3/be/src/util/ubsan.h
File be/src/util/ubsan.h:

http://gerrit.cloudera.org:8080/#/c/10948/3/be/src/util/ubsan.h@26
PS3, Line 26: class Ubsan {
> I had intended for this class to include safe versions of other undefined o
I think in those cases I'd probably put each of the methods into wherever they make sense, since these aren't workarounds for UBSAN so much as safe variants of utility code. eg this one would be in memory.h, and overflow-safe math code could be in safe_math.h (we have such a header in kudu/util fwiw). Bit-shift behavior probably belongs in bit-util.h etc.

If others disagree with me though I'm not gonna be too much of a stickler.



-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 22:25:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2848/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 4
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 23 Jul 2018 19:21:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................

IMPALA-5031: Fix undefined behavior: memset NULL

memset has undefined behavior when its first argument is NULL. The
instance fixed here was found by Clang's undefined behavior
sanitizer.

It was found in the end-to-end tests. The interesting part of the
stack trace is:

/exec/data-source-scan-node.cc:152:10: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:62:79: note: nonnull attribute specified here
    #0 0x482fd8e in DataSourceScanNode::GetNextInputBatch() /exec/data-source-scan-node.cc:152:3
    #1 0x482fb40 in DataSourceScanNode::Open(RuntimeState*) /exec/data-source-scan-node.cc:124:10
    #2 0x47ef854 in AggregationNode::Open(RuntimeState*) /exec/aggregation-node.cc:71:49
    #3 0x23506a4 in FragmentInstanceState::Open() /runtime/fragment-instance-state.cc:266:53
    #4 0x234b6a8 in FragmentInstanceState::Exec() /runtime/fragment-instance-state.cc:81:12
    #5 0x236ee52 in QueryState::ExecFInstance(FragmentInstanceState*) /runtime/query-state.cc:401:24
    #6 0x237093e in QueryState::StartFInstances()::$_0::operator()() const /runtime/query-state.cc:341:44

Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Reviewed-on: http://gerrit.cloudera.org:8080/10948
Reviewed-by: Jim Apple <jb...@apache.org>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/data-source-scan-node.cc
A be/src/util/ubsan.h
2 files changed, 39 insertions(+), 1 deletion(-)

Approvals:
  Jim Apple: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 5
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10948/3/be/src/util/ubsan.h
File be/src/util/ubsan.h:

http://gerrit.cloudera.org:8080/#/c/10948/3/be/src/util/ubsan.h@26
PS3, Line 26: class Ubsan {
bikeshedding opportunity here: when I saw the name 'ubsan.h' and the name of this class, I thought this would have various utility code to deal with interfacing with ubsan itself (eg util/asan.h has macros for interfacing with ASAN)

In fact this is more like a "safe memory utilities" class. Consider a different name?



-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 18:44:04 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 4: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 4
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 23 Jul 2018 22:37:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: Fix undefined behavior: memset NULL

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/10948 )

Change subject: IMPALA-5031: Fix undefined behavior: memset NULL
......................................................................


Patch Set 2:

(1 comment)

> (1 comment)
 > 
 > Yeah I generally disagree with the idea of adding NULL checks to
 > every invocation of memset() - I think it makes the invariants of
 > the code harder to understand and adds runtime overhead. In
 > practice I think all of the callsites pass n == 0 when there's a
 > null pointer and glibc memset() won't dereference the pointer in
 > that case.
 > 
 > There are more theoretical possibilities if the compiler decides to
 > inline a custom memset implementation but I find it unlikely in
 > practice that that would be compiled to anything strange since that
 > code still has to handle the n == 0 case correctly by not
 > dereferencing the pointer. You could have an implementation like
 > below
 > 
 > if (p == NULL) DoSomethingWild();
 > if (n >= ...) {
 > }
 > if (n >= ...) {
 > }
 > 
 > But something like below makes way more sense.
 > 
 > if (n >= ...) {
 > }
 > if (n >= ...) {
 > }

https://blog.regehr.org/archives/1292

shows

https://goo.gl/h7K89h

which shows

    int set(char *p, int c, size_t n) {
      memset(p, c, n);
      return p == 0;
    }

which compiles to

    set:
        subq    $8, %rsp
        call    memset
        xorl    %eax, %eax
        addq    $8, %rsp
        ret

which uses the fact that the first argument can't be NULL to optimize away the comparison in the return statement.

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h
File be/src/util/ubsan.h:

http://gerrit.cloudera.org:8080/#/c/10948/2/be/src/util/ubsan.h@29
PS2, Line 29:     if (s == nullptr) return s;
> We should DCHECK that n == 0 in this case since otherwise it's a bug.
If DCHECKs are no-ops in release mode, the NULL pointer check will be missing, which allows the compiler to have demons fly out of my nose.

https://groups.google.com/d/msg/comp.std.c/ycpVKxTZkgw/S2hHdTbv4d8J



-- 
To view, visit http://gerrit.cloudera.org:8080/10948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18fa02dc887a42a94c6f81e4923d17568f2184f2
Gerrit-Change-Number: 10948
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 16:11:07 +0000
Gerrit-HasComments: Yes