You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Thomas Marshall (Code Review)" <ge...@cloudera.org> on 2018/10/10 18:57:31 UTC

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Thomas Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11650


Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................

IMPALA-7272: Fix crash in StringMinMaxFilter

StringMinMaxFilter uses a MemPool to allocate space for StringBuffers.
Previously, the MemPool was created by RuntimeFilterBank and passed to
each StringMinMaxFilter. In queries with multiple StringMinMaxFilters
being generated by the same fragment instance, this leads to
overlapping use of the MemPool by different threads, which is
incorrect as MemPools are not thread-safe.

The solution is to have each StringMinMaxFilter create its own
MemPool. This patch also documents MemPool as not thread-safe.

Testing:
- I have been unable to repro the actual crash despite trying a large
  variety of different things. However, with additional logging added
  its clear that the MemPool is being used concurrently, which is
  incorrect.

Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
---
M be/src/runtime/mem-pool.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
5 files changed, 25 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11650/1
-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3329/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Thu, 18 Oct 2018 17:08:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 3: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11650/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11650/3//COMMIT_MSG@29
PS3, Line 29: 
            :   MemPool::Allocate.
Can you please consider adding this query as a test as a regression test ? Although it may not be reproducible as standalone, it will still exercise the path in question. Since it's also timing dependent, it may be more readily reproducible in our regular testing environment where we run multiple queries concurrently.


http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter-test.cc
File be/src/util/min-max-filter-test.cc:

http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter-test.cc@355
PS3, Line 355:   EXPECT_EQ(TimestampValue::FromTColumnValue(tFilter2.max), t3);
Should we call Close() on the filters above too ?


http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter.h@188
PS3, Line 188: MemPool mem_pool_;
As mentioned previously, please add a comment for this.



-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Thu, 18 Oct 2018 02:34:46 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 5: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Thu, 18 Oct 2018 20:56:51 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Thomas Marshall (Code Review)" <ge...@cloudera.org>.
Hello Michael Ho, Philip Zeyliger, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11650

to look at the new patch set (#4).

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................

IMPALA-7272: Fix crash in StringMinMaxFilter

StringMinMaxFilter uses a MemPool to allocate space for StringBuffers.
Previously, the MemPool was created by RuntimeFilterBank and passed to
each StringMinMaxFilter. In queries with multiple StringMinMaxFilters
being generated by the same fragment instance, this leads to
overlapping use of the MemPool by different threads, which is
incorrect as MemPools are not thread-safe.

The solution is to have each StringMinMaxFilter create its own
MemPool.

This patch also documents MemPool as not thread-safe and introduces a
DFAKE_MUTEX to help enforce correct usage. Doing this requires
modifying our CMakeLists.txt to pass '-DNDEBUG' to clang only in
RELEASE builds, so that the DFAKE_MUTEX will be present in the
compiled IR for DEBUG builds.

Testing:
- I have been unable to repro the actual crash despite trying a large
  variety of different things. However, with additional logging added
  its clear that the MemPool is being used concurrently, which is
  incorrect.
- Added an e2e test that covers the potential issue. It hits the
  DFAKE_MUTEX with a sleep added to MemPool::Allocate.
- Ran a full exhaustive build in both DEBUG and RELEASE.

Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
---
M be/CMakeLists.txt
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/util/min-max-filter-test.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
9 files changed, 113 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11650/4
-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Philip Zeyliger (Code Review)" <ge...@cloudera.org>.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 1:

Nice catch.

I'd be interested in a way that we could DCHECK that all users of mempool only come from one thread, if that's in fact the API.


-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Comment-Date: Wed, 10 Oct 2018 19:05:10 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h
File be/src/runtime/mem-pool.h:

http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h@260
PS1, Line 260:   uint8_t* ALWAYS_INLINE Allocate(int64_t size, int alignment) noexcept {
> After a first pass, it looks like this isn't our only mis-use of MemPools (
Good to hear that DFAKE_SCOPED_LOCK() is catching the problem.

How many other different instances are we talking about here ? My preference would be to fix all of them in this patch as there is a workaround for the bug in the min/max filter. Of course, if fixing some of those new cases requires a lot of refactoring or if  there are too many places which suffer from this problem (which would be surprising), we can consider deferring the adding of DFAKE_SCOPED_LOCK() here.



-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Wed, 10 Oct 2018 23:12:04 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Thomas Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h
File be/src/runtime/mem-pool.h:

http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h@260
PS1, Line 260:   uint8_t* ALWAYS_INLINE Allocate(int64_t size, int alignment) noexcept {
> May benefit from DFAKE_SCOPED_LOCK(). See https://gerrit.cloudera.org/#/c/1
After a first pass, it looks like this isn't our only mis-use of MemPools (i.e. we're hitting the DFAKE_SCOPED_LOCK even with this patch)

I'll see about fixing that, but we may also want to consider getting this in as is and doing the fake lock as a followup task.



-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Wed, 10 Oct 2018 23:02:52 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................

IMPALA-7272: Fix crash in StringMinMaxFilter

StringMinMaxFilter uses a MemPool to allocate space for StringBuffers.
Previously, the MemPool was created by RuntimeFilterBank and passed to
each StringMinMaxFilter. In queries with multiple StringMinMaxFilters
being generated by the same fragment instance, this leads to
overlapping use of the MemPool by different threads, which is
incorrect as MemPools are not thread-safe.

The solution is to have each StringMinMaxFilter create its own
MemPool.

This patch also documents MemPool as not thread-safe and introduces a
DFAKE_MUTEX to help enforce correct usage. Doing this requires
modifying our CMakeLists.txt to pass '-DNDEBUG' to clang only in
RELEASE builds, so that the DFAKE_MUTEX will be present in the
compiled IR for DEBUG builds.

Testing:
- I have been unable to repro the actual crash despite trying a large
  variety of different things. However, with additional logging added
  its clear that the MemPool is being used concurrently, which is
  incorrect.
- Added an e2e test that covers the potential issue. It hits the
  DFAKE_MUTEX with a sleep added to MemPool::Allocate.
- Ran a full exhaustive build in both DEBUG and RELEASE.

Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Reviewed-on: http://gerrit.cloudera.org:8080/11650
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/CMakeLists.txt
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/util/min-max-filter-test.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
9 files changed, 113 insertions(+), 49 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 6
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1067/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Tue, 16 Oct 2018 22:06:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 1:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/1012/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Comment-Date: Wed, 10 Oct 2018 19:45:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Thomas Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h
File be/src/runtime/mem-pool.h:

http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h@260
PS1, Line 260:   uint8_t* ALWAYS_INLINE Allocate(int64_t size, int alignment) noexcept {
> Good to hear that DFAKE_SCOPED_LOCK() is catching the problem.
After digging in more, it looks like the problem is actually related to codegen - even correct uses of MemPool with the DFAKE_MUTEX from a codegen'd function cause crashes and disabling codegen fixes the crashes.

I'm not sure what's going on - there isn't any sort of error and the backtraces from the crashes are useless - but I'm continuing to dig in. Any thoughts on what might be happening?



-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Thu, 11 Oct 2018 23:17:45 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 1:

(3 comments)

The fix makes sense to me. Can you please try the DFAKE_SCOPED_LOCK() suggestion and use a query with multiple joins inside the same fragment to see if you can trigger the crash ?

http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h
File be/src/runtime/mem-pool.h:

http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h@260
PS1, Line 260:   uint8_t* ALWAYS_INLINE Allocate(int64_t size, int alignment) noexcept {
May benefit from DFAKE_SCOPED_LOCK(). See https://gerrit.cloudera.org/#/c/10855/15/be/src/runtime/fragment-instance-state.cc@396

Same for other functions.


http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/util/min-max-filter.h@80
PS1, Line 80:   /// Returns a new MinMaxFilter with the given type, allocated from 'pool'.
Please add a comment for 'mem_tracker'


http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/util/min-max-filter.h@186
PS1, Line 186: MemPool mem_pool_;
Please add a comment for it.



-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Comment-Date: Wed, 10 Oct 2018 19:23:38 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Thomas Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11650/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11650/3//COMMIT_MSG@29
PS3, Line 29: 
            : - Added an e2e test 
> Can you please consider adding this query as a test as a regression test ? 
Done


http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter-test.cc
File be/src/util/min-max-filter-test.cc:

http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter-test.cc@355
PS3, Line 355:   t4.ToTColumnValue(&tFilter1.max);
> Should we call Close() on the filters above too ?
Done


http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter.h@188
PS3, Line 188: /// MemPool that '
> As mentioned previously, please add a comment for this.
Done


http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/util/min-max-filter.h@80
PS1, Line 80: 
> Please add a comment for 'mem_tracker'
Done


http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/util/min-max-filter.h@186
PS1, Line 186: static const int M
> Please add a comment for it.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Thu, 18 Oct 2018 03:47:35 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 4: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Thu, 18 Oct 2018 05:49:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1083/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Thu, 18 Oct 2018 04:20:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11650/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11650/2//COMMIT_MSG@26
PS2, Line 26: I have been unable to repro the actual crash despite trying a large
            :   variety of different things. However, with additional logging added
            :   its clear that the MemPool is being used concurrently, which is
            :   incorrect.
Can you reproduce the problem now with DFAKE_MUTEX ?


http://gerrit.cloudera.org:8080/#/c/11650/2/be/CMakeLists.txt
File be/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/11650/2/be/CMakeLists.txt@234
PS2, Line 234:  SET(CLANG_IR_CXX_FLAGS "${CLANG_IR_CXX_FLAGS}" "-DNDEBUG")
I think this makes sense. I wonder what the historical context for not doing it in the first place. I checked the code on some older releases from more than 4 years ago and it was there all along. May be the codegen time was too slow back then to include all the debug code ?

How much slow down (if any) did you notice in debug builds with this change ?


http://gerrit.cloudera.org:8080/#/c/11650/2/be/src/runtime/mem-pool.h
File be/src/runtime/mem-pool.h:

http://gerrit.cloudera.org:8080/#/c/11650/2/be/src/runtime/mem-pool.h@135
PS2, Line 135: DFAKE_SCOPED_LOCK(mutex_);
nit: Feel free to ignore but it seems easier to just add DFAKE_SCOPED_LOCK() directly in Allocate(). That said, the current code also looks fine as-is.



-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Tue, 16 Oct 2018 23:40:22 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 5: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Thu, 18 Oct 2018 17:08:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Thomas Marshall (Code Review)" <ge...@cloudera.org>.
Hello Michael Ho, Philip Zeyliger, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11650

to look at the new patch set (#3).

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................

IMPALA-7272: Fix crash in StringMinMaxFilter

StringMinMaxFilter uses a MemPool to allocate space for StringBuffers.
Previously, the MemPool was created by RuntimeFilterBank and passed to
each StringMinMaxFilter. In queries with multiple StringMinMaxFilters
being generated by the same fragment instance, this leads to
overlapping use of the MemPool by different threads, which is
incorrect as MemPools are not thread-safe.

The solution is to have each StringMinMaxFilter create its own
MemPool.

This patch also documents MemPool as not thread-safe and introduces a
DFAKE_MUTEX to help enforce correct usage. Doing this requires
modifying our CMakeLists.txt to pass '-DNDEBUG' to clang only in
RELEASE builds, so that the DFAKE_MUTEX will be present in the
compiled IR for DEBUG builds.

Testing:
- I have been unable to repro the actual crash despite trying a large
  variety of different things. However, with additional logging added
  its clear that the MemPool is being used concurrently, which is
  incorrect. I can also hit the DFAKE_MUTEX by adding a sleep to
  MemPool::Allocate.
- Ran a full exhaustive build in both DEBUG and RELEASE.

Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
---
M be/CMakeLists.txt
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/util/min-max-filter-test.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
8 files changed, 76 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11650/3
-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Thomas Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 )

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11650/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11650/2//COMMIT_MSG@26
PS2, Line 26: I have been unable to repro the actual crash despite trying a large
            :   variety of different things. However, with additional logging added
            :   its clear that the MemPool is being used concurrently, which is
            :   incorrect.
> Can you reproduce the problem now with DFAKE_MUTEX ?
I looped for awhile and it didn't hit, but I can generate a thread collision by adding an appropriate sleep() to Allocate to increase the period the lock is held for.


http://gerrit.cloudera.org:8080/#/c/11650/2/be/CMakeLists.txt
File be/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/11650/2/be/CMakeLists.txt@234
PS2, Line 234:  SET(CLANG_IR_CXX_FLAGS "${CLANG_IR_CXX_FLAGS}" "-DNDEBUG")
> I think this makes sense. I wonder what the historical context for not doin
Yeah, it seems that its always been this way since we first introduced CLANG_IR_CXX_FLAGS, which unfortunately was back before the Impala team used good commit messages.

The perf difference seems very minor - I think exhaustive DEBUG usually takes about 7-8 hours and the exhaustive DEBUG build I ran with this patch took 7h49m.


http://gerrit.cloudera.org:8080/#/c/11650/2/be/src/runtime/mem-pool.h
File be/src/runtime/mem-pool.h:

http://gerrit.cloudera.org:8080/#/c/11650/2/be/src/runtime/mem-pool.h@135
PS2, Line 135: DFAKE_SCOPED_LOCK(mutex_);
> nit: Feel free to ignore but it seems easier to just add DFAKE_SCOPED_LOCK(
Yeah, I guess it doesn't make much difference, but I just wanted to maximize the amount of time the lock is held for since actually hitting the bug is quite difficult.



-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Comment-Date: Wed, 17 Oct 2018 19:18:07 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter

Posted by "Thomas Marshall (Code Review)" <ge...@cloudera.org>.
Hello Michael Ho, Philip Zeyliger, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11650

to look at the new patch set (#2).

Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter
......................................................................

IMPALA-7272: Fix crash in StringMinMaxFilter

StringMinMaxFilter uses a MemPool to allocate space for StringBuffers.
Previously, the MemPool was created by RuntimeFilterBank and passed to
each StringMinMaxFilter. In queries with multiple StringMinMaxFilters
being generated by the same fragment instance, this leads to
overlapping use of the MemPool by different threads, which is
incorrect as MemPools are not thread-safe.

The solution is to have each StringMinMaxFilter create its own
MemPool.

This patch also documents MemPool as not thread-safe and introduces a
DFAKE_MUTEX to help enforce correct usage. Doing this requires
modifying our CMakeLists.txt to pass '-DNDEBUG' to clang only in
RELEASE builds, so that the DFAKE_MUTEX will be present in the
compiled IR for DEBUG builds.

Testing:
- I have been unable to repro the actual crash despite trying a large
  variety of different things. However, with additional logging added
  its clear that the MemPool is being used concurrently, which is
  incorrect.
- Ran a full exhaustive build in both DEBUG and RELEASE.

Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
---
M be/CMakeLists.txt
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/util/min-max-filter-test.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
8 files changed, 76 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11650/2
-- 
To view, visit http://gerrit.cloudera.org:8080/11650
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Gerrit-Change-Number: 11650
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>