You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Riza Suminto (Code Review)" <ge...@cloudera.org> on 2021/03/15 15:35:33 UTC

[Impala-ASF-CR] IMPALA-10583: Fix bug on unbounded result spooling memory config.

Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17187


Change subject: IMPALA-10583: Fix bug on unbounded result spooling memory config.
......................................................................

IMPALA-10583: Fix bug on unbounded result spooling memory config.

Two bugs happening on result spooling when we set two of its query
options to 0 (unbounded).

The first bug happens if max_spilled_result_spooling_mem =
0 (unbounded). max_unpinned_bytes_ in SpillableRowBatchQueue will be set
to 0, SpillableRowBatchQueue::IsFull() then will always return true, and
the query hang. This patch fix it by setting max_unpinned_bytes_ to
INT64_MAX if max_spilled_result_spooling_mem = 0.

The second bug happens if we set max_result_spooling_mem =
0 (unbounded). PlanRootSink.java will peg maxMemReservationBytes to
always equal to minMemReservationBytes. This patch fixes this by
reverting to the default max_result_spooling_mem (100MB).

Testing:
- Add test_unbounded_result_spooling_mem.
- Pass core tests.

Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
---
M be/src/exec/buffered-plan-root-sink.cc
M be/src/runtime/spillable-row-batch-queue.cc
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/query_test/test_result_spooling.py
4 files changed, 43 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/17187/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
Gerrit-Change-Number: 17187
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10583: Fix bug on unbounded result spooling memory config.

Posted by "Aman Sinha (Code Review)" <ge...@cloudera.org>.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17187 )

Change subject: IMPALA-10583: Fix bug on unbounded result spooling memory config.
......................................................................


Patch Set 1: Code-Review+1

(1 comment)

Thanks for fixing the edge cases. One related question below.

http://gerrit.cloudera.org:8080/#/c/17187/1/be/src/exec/buffered-plan-root-sink.cc
File be/src/exec/buffered-plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/17187/1/be/src/exec/buffered-plan-root-sink.cc@55
PS1, Line 55:   if (max_spilled_mem <= 0) max_spilled_mem = INT64_MAX;
In practice, this would eventually encounter other limits such as running out of disk space if too much has been spilled but do we check for disk space limits elsewhere ?



-- 
To view, visit http://gerrit.cloudera.org:8080/17187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
Gerrit-Change-Number: 17187
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 09 Apr 2021 15:57:48 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10583: Fix bug on unbounded result spooling memory config.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17187 )

Change subject: IMPALA-10583: Fix bug on unbounded result spooling memory config.
......................................................................

IMPALA-10583: Fix bug on unbounded result spooling memory config.

Two bugs happening on result spooling when we set two of its query
options to 0 (unbounded).

The first bug happens if max_spilled_result_spooling_mem =
0 (unbounded). max_unpinned_bytes_ in SpillableRowBatchQueue will be set
to 0, SpillableRowBatchQueue::IsFull() then will always return true, and
the query hang. This patch fix it by setting max_unpinned_bytes_ to
INT64_MAX if max_spilled_result_spooling_mem = 0.

The second bug happens if we set max_result_spooling_mem =
0 (unbounded). PlanRootSink.java will peg maxMemReservationBytes to
always equal to minMemReservationBytes. This patch fixes this by
reverting to the default max_result_spooling_mem (100MB).

Testing:
- Add test_unbounded_result_spooling_mem.
- Pass core tests.

Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
Reviewed-on: http://gerrit.cloudera.org:8080/17187
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/buffered-plan-root-sink.cc
M be/src/runtime/spillable-row-batch-queue.cc
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/query_test/test_result_spooling.py
4 files changed, 43 insertions(+), 6 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/17187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
Gerrit-Change-Number: 17187
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10583: Fix bug on unbounded result spooling memory config.

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17187 )

Change subject: IMPALA-10583: Fix bug on unbounded result spooling memory config.
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17187/1/be/src/exec/buffered-plan-root-sink.cc
File be/src/exec/buffered-plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/17187/1/be/src/exec/buffered-plan-root-sink.cc@55
PS1, Line 55:   if (max_spilled_mem <= 0) max_spilled_mem = INT64_MAX;
> In practice, this would eventually encounter other limits such as running o
The closest limit I find is scratch_dirs startup flag. In this flag, we define the path of scratch dirs and capacity quota (maximum bytes allowed to spill) for each dirs. But the quota specification itself is optional. So without quota being specified, I suppose we depend on OS to tell impala when disk space has ran out.
https://impala.apache.org/docs/build/html/topics/impala_disk_space.html



-- 
To view, visit http://gerrit.cloudera.org:8080/17187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
Gerrit-Change-Number: 17187
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 09 Apr 2021 17:26:32 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10583: Fix bug on unbounded result spooling memory config.

Posted by "Aman Sinha (Code Review)" <ge...@cloudera.org>.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17187 )

Change subject: IMPALA-10583: Fix bug on unbounded result spooling memory config.
......................................................................


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17187/1/be/src/exec/buffered-plan-root-sink.cc
File be/src/exec/buffered-plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/17187/1/be/src/exec/buffered-plan-root-sink.cc@55
PS1, Line 55:   if (max_spilled_mem <= 0) max_spilled_mem = INT64_MAX;
> The closest limit I find is scratch_dirs startup flag. In this flag, we def
I suppose it makes sense to keep the quota optional because the spill location could be either local disk/SSD or could be remote on S3 and the limits could be very different.



-- 
To view, visit http://gerrit.cloudera.org:8080/17187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
Gerrit-Change-Number: 17187
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 09 Apr 2021 19:21:40 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10583: Fix bug on unbounded result spooling memory config.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17187 )

Change subject: IMPALA-10583: Fix bug on unbounded result spooling memory config.
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8370/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
Gerrit-Change-Number: 17187
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 15 Mar 2021 15:55:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10583: Fix bug on unbounded result spooling memory config.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17187 )

Change subject: IMPALA-10583: Fix bug on unbounded result spooling memory config.
......................................................................


Patch Set 2: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
Gerrit-Change-Number: 17187
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Sat, 10 Apr 2021 01:13:04 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10583: Fix bug on unbounded result spooling memory config.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17187 )

Change subject: IMPALA-10583: Fix bug on unbounded result spooling memory config.
......................................................................


Patch Set 2: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/17187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
Gerrit-Change-Number: 17187
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 09 Apr 2021 19:22:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10583: Fix bug on unbounded result spooling memory config.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17187 )

Change subject: IMPALA-10583: Fix bug on unbounded result spooling memory config.
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7061/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/17187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8f5e3668281bba8813f8082f45b4faa7721530e
Gerrit-Change-Number: 17187
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 09 Apr 2021 19:22:58 +0000
Gerrit-HasComments: No