You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Michael Ho (Code Review)" <ge...@cloudera.org> on 2017/06/01 23:57:06 UTC

[Impala-ASF-CR] IMPALA-5419: Check for cancellation when building hash tables

Michael Ho has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7047

Change subject: IMPALA-5419: Check for cancellation when building hash tables
......................................................................

IMPALA-5419: Check for cancellation when building hash tables

When inserting build rows into the hash table for each partition
in PhjBuilder::Partition::BuildHashTable(), we didn't check for
cancellation. This may lead to orphaned fragments running for
excessive amount of time and consuming a lot of memory after the
query has been cancelled.

This change fixes the problem by checking for cancellation per
row batch in the loop inside PhjBuilder::Partition::BuildHashTable().

Manually verified with the following query that cancelling the query
right after all build rows have been returned by the scan node will
not cause memory usage to continue going up.

select /* +straight_join */ count(*)
from tpch20_parquet.lineitem t1
  join /* +broadcast */ tpch20_parquet.lineitem t2
  on t1.l_orderkey = t2.l_orderkey

Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
---
M be/src/exec/partitioned-hash-join-builder.cc
1 file changed, 1 insertion(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/7047/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kw...@cloudera.com>

[Impala-ASF-CR] IMPALA-5419: Check for cancellation when building hash tables

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5419: Check for cancellation when building hash tables
......................................................................


Patch Set 1:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/670/

-- 
To view, visit http://gerrit.cloudera.org:8080/7047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5419: Check for cancellation when building hash tables

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5419: Check for cancellation when building hash tables
......................................................................


IMPALA-5419: Check for cancellation when building hash tables

When inserting build rows into the hash table for each partition
in PhjBuilder::Partition::BuildHashTable(), we didn't check for
cancellation. This may lead to orphaned fragments running for
excessive amount of time and consuming a lot of memory after the
query has been cancelled.

This change fixes the problem by checking for cancellation per
row batch in the loop inside PhjBuilder::Partition::BuildHashTable().

Manually verified with the following query that cancelling the query
right after all build rows have been returned by the scan node will
not cause memory usage to continue going up.

select /* +straight_join */ count(*)
from tpch20_parquet.lineitem t1
  join /* +broadcast */ tpch20_parquet.lineitem t2
  on t1.l_orderkey = t2.l_orderkey

Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
Reviewed-on: http://gerrit.cloudera.org:8080/7047
Reviewed-by: Sailesh Mukil <sa...@cloudera.com>
Reviewed-by: Dan Hecht <dh...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/exec/partitioned-hash-join-builder.cc
1 file changed, 1 insertion(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Sailesh Mukil: Looks good to me, but someone else must approve
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-5419: Check for cancellation when building hash tables

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5419: Check for cancellation when building hash tables
......................................................................


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7047/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

PS1, Line 690: RETURN_IF_ERROR(state->GetQueryStatus());
> Sadly, no. When a query is cancelled, it only sets the is_cancelled_ flag i
Ah, that doesn't match with this then:
https://github.com/apache/incubator-impala/blob/master/be/src/common/status.h#L197

Probably not in scope for this patch but we should fix that separately.


-- 
To view, visit http://gerrit.cloudera.org:8080/7047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5419: Check for cancellation when building hash tables

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5419: Check for cancellation when building hash tables
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7047/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

PS1, Line 690: RETURN_IF_ERROR(state->GetQueryStatus());
Wouldn't this return if it got cancelled?


-- 
To view, visit http://gerrit.cloudera.org:8080/7047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5419: Check for cancellation when building hash tables

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5419: Check for cancellation when building hash tables
......................................................................


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5419: Check for cancellation when building hash tables

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change.

Change subject: IMPALA-5419: Check for cancellation when building hash tables
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7047/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

PS1, Line 690: RETURN_IF_ERROR(state->GetQueryStatus());
> Wouldn't this return if it got cancelled?
Sadly, no. When a query is cancelled, it only sets the is_cancelled_ flag in the RuntimeState and leaves the query_status_ untouched.


-- 
To view, visit http://gerrit.cloudera.org:8080/7047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5419: Check for cancellation when building hash tables

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5419: Check for cancellation when building hash tables
......................................................................


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5419: Check for cancellation when building hash tables

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change.

Change subject: IMPALA-5419: Check for cancellation when building hash tables
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7047/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

PS1, Line 690: RETURN_IF_ERROR(state->GetQueryStatus());
> Ah, that doesn't match with this then:
RETURN_IF_CANCELLED(state) will return Status::CANCELLED.


-- 
To view, visit http://gerrit.cloudera.org:8080/7047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8047f532f55dc0118f7a843c91275f752c8a190d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes