You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2018/06/01 16:52:43 UTC

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10573


Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................

IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads

This gives a tighter bound on memory consumption when running with
a lower num_scanner_threads value.

Cap the maximum row batch queue size at 5 * max_num_scanner_threads_
so that we guarantee lower amounts of memory in the row batch queue
when num_scanner_threads is set, rather than just achieving it
statistically because of the producer running slower relative to
consumer. It does not reduce the default significantly on typical
server configurations that would have 24+ cores except under high
concurrency or low memory environments where the number of scanner
threads is limited. We should evaluate reducing the default further
or otherwise better controlling memory consumption in a follow-up,
based on experiments.

Testing:
Tested along with Part 1.

Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
---
M be/src/exec/hdfs-scan-node.cc
1 file changed, 19 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/10573/1
-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10573 )

Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................

IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads

This gives a tighter bound on memory consumption when running with
a lower num_scanner_threads value. With IMPALA-7096 we'll revisit
the approach to reliably avoid OOM.

Cap the maximum row batch queue size at 5 * max_num_scanner_threads_
so that we guarantee lower amounts of memory in the row batch queue
when num_scanner_threads is set, rather than just achieving it
statistically because of the producer running slower relative to
consumer. It does not reduce the default significantly on typical
server configurations that would have 24+ cores except under high
concurrency or low memory environments where the number of scanner
threads is limited. We should evaluate reducing the default further
or otherwise better controlling memory consumption in a follow-up,
based on experiments.

Testing:
Tested along with Part 1.

Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Reviewed-on: http://gerrit.cloudera.org:8080/10573
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/hdfs-scan-node.cc
1 file changed, 19 insertions(+), 6 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10573

to look at the new patch set (#3).

Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................

IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads

This gives a tighter bound on memory consumption when running with
a lower num_scanner_threads value. With IMPALA-7096 we'll revisit
the approach to reliably avoid OOM.

Cap the maximum row batch queue size at 5 * max_num_scanner_threads_
so that we guarantee lower amounts of memory in the row batch queue
when num_scanner_threads is set, rather than just achieving it
statistically because of the producer running slower relative to
consumer. It does not reduce the default significantly on typical
server configurations that would have 24+ cores except under high
concurrency or low memory environments where the number of scanner
threads is limited. We should evaluate reducing the default further
or otherwise better controlling memory consumption in a follow-up,
based on experiments.

Testing:
Tested along with Part 1.

Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
---
M be/src/exec/hdfs-scan-node.cc
1 file changed, 19 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/10573/3
-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10573 )

Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................


Patch Set 5: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 06 Jun 2018 20:40:45 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10573

to look at the new patch set (#4).

Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................

IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads

This gives a tighter bound on memory consumption when running with
a lower num_scanner_threads value. With IMPALA-7096 we'll revisit
the approach to reliably avoid OOM.

Cap the maximum row batch queue size at 5 * max_num_scanner_threads_
so that we guarantee lower amounts of memory in the row batch queue
when num_scanner_threads is set, rather than just achieving it
statistically because of the producer running slower relative to
consumer. It does not reduce the default significantly on typical
server configurations that would have 24+ cores except under high
concurrency or low memory environments where the number of scanner
threads is limited. We should evaluate reducing the default further
or otherwise better controlling memory consumption in a follow-up,
based on experiments.

Testing:
Tested along with Part 1.

Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
---
M be/src/exec/hdfs-scan-node.cc
1 file changed, 19 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/10573/4
-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10573 )

Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10573/2/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/10573/2/be/src/exec/hdfs-scan-node.cc@204
PS2, Line 204: f (file_descs_.empty() || progress_.done()) return Status::OK();
is this somehow related to this change? why do this?



-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Jun 2018 23:48:37 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10573 )

Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................


Patch Set 4: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 06 Jun 2018 17:00:04 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10573 )

Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2612/


-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 06 Jun 2018 21:01:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10573 )

Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................


Patch Set 5: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 07 Jun 2018 00:45:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10573 )

Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................

IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads

This gives a tighter bound on memory consumption when running with
a lower num_scanner_threads value. With IMPALA-7096 we'll revisit
the approach to reliably avoid OOM.

Cap the maximum row batch queue size at 5 * max_num_scanner_threads_
so that we guarantee lower amounts of memory in the row batch queue
when num_scanner_threads is set, rather than just achieving it
statistically because of the producer running slower relative to
consumer. It does not reduce the default significantly on typical
server configurations that would have 24+ cores except under high
concurrency or low memory environments where the number of scanner
threads is limited. We should evaluate reducing the default further
or otherwise better controlling memory consumption in a follow-up,
based on experiments.

Testing:
Tested along with Part 1.

Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
---
M be/src/exec/hdfs-scan-node.cc
1 file changed, 20 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/10573/2
-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-7078: Part 2: reduce queue size based on num scanner threads

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10573 )

Change subject: IMPALA-7078: Part 2: reduce queue size based on num_scanner_threads
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10573/2/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/10573/2/be/src/exec/hdfs-scan-node.cc@204
PS2, Line 204: f (file_descs_.empty() || progress_.done()) return Status::OK();
> is this somehow related to this change? why do this?
This was an error in separating out this from the previous patchset. The previous patchset erroneously deleted this line.



-- 
To view, visit http://gerrit.cloudera.org:8080/10573
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c0bd14e6bcd9fc1655e344a5307ea0eb4600e8b
Gerrit-Change-Number: 10573
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 06 Jun 2018 00:31:25 +0000
Gerrit-HasComments: Yes