You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Riza Suminto (Code Review)" <ge...@cloudera.org> on 2020/11/21 15:55:55 UTC

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16765


Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. Underneath, a
SpillableRowBatchQueue need 2 buffer (read and write) each to fit at
least MAX_ROW_SIZE bytes. This patch change the ResourceProfile's
maxMemReservationBytes as:

max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)

Testing:
- Pass exhaustive tests.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
3 files changed, 4 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/1
-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@106
PS3, Line 106:       assert re.search(plan_root_sink_reservation_limit, profile)
> This assertion, however, is OK to delete, because it is not the focus of th
Thanks for the context above, makes sense to keep this too.



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 30 Nov 2020 19:52:36 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS1, Line 87: getDefault_spillable_buffer_size
> got it, thanks for the explanation
I was wrong. Large DEFAULT_SPILLABLE_BUFFER_SIZE can wins calculation for maxMemReservationBytes by increasing minMemReservationBytes. The test case I mention earlier, however, stays true.

I adjust the commit message and add TestResultSpoolingMaxReservation::test_high_default_spillable_buffer to reflect this.


http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py@589
PS1, Line 589: 'max_row_size': 8 * 1024,
> can you add a test that confirms that changing this will have an effect. So
I add verification to check that query ReservationLimit remains unchanged after addition of MAX_ROW_SIZE query option.

I also add TestResultSpoolingMaxReservation to verify that all three of MAX_ROW_SIZE, MAX_RESULT_SPOOLING_MEM, and DEFAULT_SPILLABLE_BUFFER_SIZE query options can contribute towards increasing PLAN_ROOT_SINK's ReservationLimit.



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 25 Nov 2020 19:03:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 7: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 04 Dec 2020 17:46:33 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS1, Line 87: getDefault_spillable_buffer_size
> My understanding is that DEFAULT_SPILLABLE_BUFFER_SIZE control the minimal 
got it, thanks for the explanation


http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py@589
PS1, Line 589: 'max_row_size': 4 * 1024,
> It intended to keep the test the same, keeping the maxMemReservationBytes t
can you add a test that confirms that changing this will have an effect. Something like have a test that would fail without using this and then setting this would let it run. Or maybe something that you can confirm, a metric perhaps that changes  when you set this



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Nov 2020 21:11:07 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS1, Line 87: getDefault_spillable_buffer_size
Is there any relation to this and the max_row_Size?
should it be able to hold the full row size as well?


http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py@589
PS1, Line 589: 'max_row_size': 4 * 1024,
is this having any effect on the query? the max_result_spooling_mem is already set to 8 and this is set to 4 which will result in the maxMemReservationBytes set to 8 in either.
Similar thing for the other test



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Nov 2020 00:15:36 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/7734/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 25 Nov 2020 19:02:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/7753/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 30 Nov 2020 22:29:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS1, Line 87: getDefault_spillable_buffer_size
> Is there any relation to this and the max_row_Size?
My understanding is that DEFAULT_SPILLABLE_BUFFER_SIZE control the minimal unit of page size when increasing memory reservation. It is intended to tune performance in spill operation (larger buffer sizes result in Impala issuing larger I/O requests to storage devices, which might result in higher throughput) rather than in-memory allocation.
it contribute to calculation of maxRowBufferSize, but not to maxMemReservationBytes.

Let say user only set MAX_ROW_SIZE to 256MB, and leave DEFAULT_SPILLABLE_BUFFER_SIZE and MAX_RESULT_SPOOLING_MEM as default (2MB and 100 MB accordingly). Without the patch, the result is the following:
bufferSize = 2MB
maxRowBufferSize = 256MB (rounded up to nearest power of two)
minMemReservationBytes = 4MB
maxMemReservationBytes = 100MB

maxMemReservationBytes should be at least 2 * MAX_ROW_SIZE for reservation of read+write page to be satisfied, which is not true in this case.


http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py@589
PS1, Line 589: 'max_row_size': 4 * 1024,
> is this having any effect on the query? the max_result_spooling_mem is alre
It intended to keep the test the same, keeping the maxMemReservationBytes to 8KB.

Default MAX_ROW_SIZE is 512KB. With the proposed changes in PlannerRootSink.java and without explicitly lowering MAX_ROW_SIZE to 4KB, MAX_RESULT_SPOOLING_MEM will be ignored and maxMemReservationBytes will be set to 1MB instead of intended 8KB.



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Nov 2020 01:28:28 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG@10
PS3, Line 10: This happens
            : because results are spilled using a SpillableRowBatchQueue which needs 2
            : buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer.
            : This patch fixes this by s
> nit: how about :
Done


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@68
PS3, Line 68:     exec_options['max_row_size'] = 16 * 1024
> I think you can remove this change and the one in test_query_retries since 
test_spilling and test_query_retries need to force spill by lowering spill related query options.
As consequence of this patch, MAX_ROW_SIZE now also need to be lowered in order to force spill.

The default MAX_ROW_SIZE is too high (512 KB) for the context of the tests.
With changes introduced in PlanRootSink.java, default MAX_ROW_SIZE will now contribute to pushing maxMemReservationBytes to 1 MB, which in turn will fit all rows in memory (no spill happens).

I have double check by commenting the MAX_ROW_SIZE option, and these tests failed without it.


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@106
PS3, Line 106:       assert re.search(plan_root_sink_reservation_limit, profile)
This assertion, however, is OK to delete, because it is not the focus of the test.
I just add this to verify that ReservationLimit stay the same as before when MAX_ROW_SIZE does not contribute to maxMemReservationBytes.
Similarly with the one I added in test_query_retries.
Let me know what you think of.


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@419
PS3, Line 419:   """These tests verify that while calculating max_reservation for spooling these query
> nit: you can add a comment here saying that these tests verify that while c
Done


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@479
PS3, Line 479: ")
> nit: spills
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 26 Nov 2020 04:22:48 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc
File be/src/runtime/spillable-row-batch-queue.cc:

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc@97
PS5, Line 97:         // have been set.
> Good point! I didin't realize IMPALA-9851 when adding that DebugString() in
Done.
Filed as IMPALA-10374.



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 02 Dec 2020 18:34:13 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/7739/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 26 Nov 2020 04:21:53 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 8: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 04 Dec 2020 18:23:46 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 6: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc
File be/src/runtime/spillable-row-batch-queue.cc:

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc@97
PS5, Line 97:         // have been set.
> Added row size and batch_queue_->DebugString() info in the log.
Good point! I didin't realize IMPALA-9851 when adding that DebugString() in GroupingAggregator::Partition::SerializeStreamForSpilling(). Please file a separate JIRA for it.



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 02 Dec 2020 04:07:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#2).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. Underneath, a
SpillableRowBatchQueue need 2 buffer (read and write) each to fit at
least MAX_ROW_SIZE bytes. This patch change the ResourceProfile's
maxMemReservationBytes as:

  max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)

minMemReservationBytes itself remain unchanged as:

  2 * DEFAULT_SPILLABLE_BUFFER_SIZE

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
3 files changed, 92 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/2
-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc
File be/src/runtime/spillable-row-batch-queue.cc:

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc@97
PS5, Line 97:         DCHECK(false) << "Rows should be added in unpinned mode unless an error occurred";
It'd be helpful to log the row size and batch_queue_->DebugString() here.



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 26 Nov 2020 09:14:09 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#6).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. This happens
because results are spilled using a SpillableRowBatchQueue which needs 2
buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer.
This patch fixes this by setting the PlanRootSink's
minMemReservationBytes as 2 * maxRowBufferSize.

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/spillable-row-batch-queue.cc
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
5 files changed, 115 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/6
-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#7).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. This happens
because results are spilled using a SpillableRowBatchQueue which needs 2
buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer.
This patch fixes this by setting a lower bound of 2 * MAX_ROW_SIZE while
computing the min reservation for the PlanRootSink.

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/spillable-row-batch-queue.cc
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
5 files changed, 118 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/7
-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/7735/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 25 Nov 2020 19:13:02 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 8: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 04 Dec 2020 23:55:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/2/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/2/tests/query_test/test_result_spooling.py@447
PS2, Line 447: q
flake8: F821 undefined name 'query'


http://gerrit.cloudera.org:8080/#/c/16765/2/tests/query_test/test_result_spooling.py@493
PS2, Line 493: 
flake8: W391 blank line at end of file



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 25 Nov 2020 18:43:00 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/7712/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Sat, 21 Nov 2020 16:17:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#3).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. Underneath, a
SpillableRowBatchQueue need 2 buffer (read and write) each to fit at
least MAX_ROW_SIZE bytes. This patch change the ResourceProfile's
maxMemReservationBytes as:

  max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)

minMemReservationBytes itself remain unchanged as:

  2 * DEFAULT_SPILLABLE_BUFFER_SIZE

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
3 files changed, 92 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/3
-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG@26
PS3, Line 26: 
> A better test is using SET_DENY_RESERVATION_PROBABILITY to verify the Buffe
Added debug action SET_DENY_RESERVATION_PROBABILITY@1.0 in TestResultSpoolingMaxReservation.
test_spilling_large_rows already exercise SET_DENY_RESERVATION_PROBABILITY with mt_dop=1.


http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc
File be/src/runtime/spillable-row-batch-queue.cc:

http://gerrit.cloudera.org:8080/#/c/16765/5/be/src/runtime/spillable-row-batch-queue.cc@97
PS5, Line 97:         // have been set.
> It'd be helpful to log the row size and batch_queue_->DebugString() here.
Added row size and batch_queue_->DebugString() info in the log.

However, this reminds me of IMPALA-9851.
BufferedTupleStream::DebugString() iterate std::list<Page> that can
potentially grow very large. While most references to this method are
behind DCHEK logging (thus stripped in release build), one of them is
used to build Status message through call to
GroupingAggregator::Partition::DebugString() at this line:
https://github.com/apache/impala/blob/5530b62/be/src/exec/grouping-aggregator-partition.cc#L155

IMPALA-9851 already cap error message to 128kb at most, but that does
not eliminate the cost to iterate page list in
BufferedTupleStream::DebugString(). Should I file this as separate Jira?


http://gerrit.cloudera.org:8080/#/c/16765/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@90
PS3, Line 90:       long minMemReservationBytes = 2 * maxRowBufferSize;
> We should be able to work using the minimal reservation. So I think the pro
Make sense.
Bump up minMemReservation instead of maxMemReservationBytes in patch set #6.


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@68
PS3, Line 68:     exec_options['max_row_size'] = 16 * 1024
> test_spilling and test_query_retries need to force spill by lowering spill 
Done


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@106
PS3, Line 106:       assert re.search(plan_root_sink_reservation_limit, profile)
> Thanks for the context above, makes sense to keep this too.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 30 Nov 2020 22:32:30 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. This happens
because results are spilled using a SpillableRowBatchQueue which needs 2
buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer.
This patch fixes this by setting a lower bound of 2 * MAX_ROW_SIZE while
computing the min reservation for the PlanRootSink.

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Reviewed-on: http://gerrit.cloudera.org:8080/16765
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/spillable-row-batch-queue.cc
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
5 files changed, 118 insertions(+), 9 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 7:

(3 comments)

Thanks Bikram,

http://gerrit.cloudera.org:8080/#/c/16765/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/6//COMMIT_MSG@14
PS6, Line 14: computing the min reservation for the PlanRootS
> nit: i think this description is slightly opaque since the reader would hav
Done


http://gerrit.cloudera.org:8080/#/c/16765/6/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/6/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@90
PS6, Line 90:     bufferSize, queryOptions.getMax_row_size());
> nit: can you add a small comment on why we need 2 maxRowBufferSize, either 
I edited the method documentation a bit to explain this. Hope it is clear.


http://gerrit.cloudera.org:8080/#/c/16765/6/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/6/tests/query_test/test_result_spooling.py@426
PS6, Line 426: DEBUG_ACTION_V
> nit: switch to ALL_CAPS to be consistent with rest of the codebase.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 04 Dec 2020 04:55:43 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6729/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 04 Dec 2020 18:23:47 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/7774/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 04 Dec 2020 05:15:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG@10
PS3, Line 10: Underneath, a
            : SpillableRowBatchQueue need 2 buffer (read and write) each to fit at
            : least MAX_ROW_SIZE bytes. This patch change the ResourceProfile's
            : maxMemReservationBytes as:
nit: how about :
This happens because results are spilled using a SpillableRowBatchQueue which needs 2 buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer. This patch fixes this by setting the PlanRootSink's maxMemReservationBytes as: max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@68
PS3, Line 68:     exec_options['max_row_size'] = 16 * 1024
I think you can remove this change and the one in test_query_retries since you have added a self contained test to verify max_row_size's contribution and these other tests are more concerned with testing other functionality.


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@419
PS3, Line 419: 
nit: you can add a comment here saying that these tests verify that while calculating max_reservation for spooling these query options are taken into account: MAX_ROW_SIZE, MAX_RESULT_SPOOLING_MEM and DEFAULT_SPILLABLE_BUFFER_SIZE.


http://gerrit.cloudera.org:8080/#/c/16765/3/tests/query_test/test_result_spooling.py@479
PS3, Line 479: spill
nit: spills



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 26 Nov 2020 02:11:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/3//COMMIT_MSG@26
PS3, Line 26:   remain unchanged after lowering the MAX_ROW_SIZE.
A better test is using SET_DENY_RESERVATION_PROBABILITY to verify the BufferedPlanRootSink can work in minimal reservation. There are some examples in query_test/test_spilling.py::TestSpillingDebugActionDimensions. Maybe we can add a test in test_spilling_large_rows.


http://gerrit.cloudera.org:8080/#/c/16765/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@90
PS3, Line 90:       long minMemReservationBytes = 2 * bufferSize;
We should be able to work using the minimal reservation. So I think the problem is minMemReservationBytes should be 2 * maxRowBufferSize here, i.e. reserve mem for one large read page and one large write page.

The minMemReservation calculation of the final aggregation node is an example:
https://github.com/apache/impala/blob/8ea49e9b026d48b46e9fbd98dc5286f3e6dfa93d/fe/src/main/java/org/apache/impala/planner/AggregationNode.java#L582. The final AggregationNode needs n pages, then the min mem reservation is bufferSize * (n-2) + maxRowBufferSize * 2.

Here the BufferedPlanRootSink has a SpillableRowBatchQueue which is backed by a BufferedTupleStream. The stream is a read-write stream so may pin one read page and one write page at the same time, and they could both be a large page. So we should at least reserve 2 * maxRowBufferSize of mem for it. With enough reservation, batch_queue_->AddRow won't fail and hit the DCHECK:
https://github.com/apache/impala/blob/eea617b/be/src/runtime/spillable-row-batch-queue.cc#L97



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 26 Nov 2020 07:33:28 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#5).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. This happens
because results are spilled using a SpillableRowBatchQueue which needs 2
buffer (read and write) with at least MAX_ROW_SIZE bytes per buffer.
This patch fixes this by setting the PlanRootSink's
maxMemReservationBytes as:

  max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)

minMemReservationBytes itself remain unchanged as:

  2 * DEFAULT_SPILLABLE_BUFFER_SIZE

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
3 files changed, 95 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/5
-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................


Patch Set 6:

(3 comments)

looks good, will +2 once the nits are resolved

http://gerrit.cloudera.org:8080/#/c/16765/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16765/6//COMMIT_MSG@14
PS6, Line 14: minMemReservationBytes as 2 * maxRowBufferSize.
nit: i think this description is slightly opaque since the reader would have to know how maxRowBufferSize is computed. I think something like this would suffice:

This patch fixes this by setting a lower bound of 2 * MAX_ROW_SIZE while computing the min reservation for the PlanRootSink


http://gerrit.cloudera.org:8080/#/c/16765/6/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/6/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@90
PS6, Line 90: long minMemReservationBytes = 2 * maxRowBufferSize;
nit: can you add a small comment on why we need 2 maxRowBufferSize, either here or in the method comment above


http://gerrit.cloudera.org:8080/#/c/16765/6/tests/query_test/test_result_spooling.py
File tests/query_test/test_result_spooling.py:

http://gerrit.cloudera.org:8080/#/c/16765/6/tests/query_test/test_result_spooling.py@426
PS6, Line 426: _debug_actions
nit: switch to ALL_CAPS to be consistent with rest of the codebase.
DEBUG_ACTION_VALUES



-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 04 Dec 2020 01:08:37 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Bikramjeet Vig, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16765

to look at the new patch set (#4).

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation
......................................................................

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. Underneath, a
SpillableRowBatchQueue need 2 buffer (read and write) each to fit at
least MAX_ROW_SIZE bytes. This patch change the ResourceProfile's
maxMemReservationBytes as:

  max(minMemReservationBytes, MAX_RESULT_SPOOLING_MEM, 2 * MAX_ROW_SIZE)

minMemReservationBytes itself remain unchanged as:

  2 * DEFAULT_SPILLABLE_BUFFER_SIZE

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  extremely low value. Also verify that PLAN_ROOT_SINK's ReservationLimit
  remain unchanged after lowering the MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
---
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M tests/custom_cluster/test_query_retries.py
M tests/query_test/test_result_spooling.py
3 files changed, 95 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/16765/4
-- 
To view, visit http://gerrit.cloudera.org:8080/16765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>