You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Marcel Kornacker (Code Review)" <ge...@cloudera.org> on 2016/10/26 20:40:06 UTC

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Marcel Kornacker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4853

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................

IMPALA-4314: Standardize on MT-related data structures

This removes the data structures that were "superceded" in
IMPALA-3903 and changes all control flow to utilize the
new data structures. The new data structures are renamed
to remove the "Mt" prefix.

Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler-test-util.cc
M be/src/scheduling/simple-scheduler-test-util.h
M be/src/scheduling/simple-scheduler.cc
M be/src/scheduling/simple-scheduler.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/query-exec-state.cc
M common/thrift/Frontend.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
23 files changed, 639 insertions(+), 1,150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/4853/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/4853/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 457:   bool is_mt_execution = request.query_ctx.request.query_options.mt_dop > 0;
> set in FE?
didn't seem worth it. it would be a single bool instead of a single int with a '> 0' around it.


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/query-schedule.cc
File be/src/scheduling/query-schedule.cc:

Line 60:   // extract TPlanFragments and order by fragment id
> order by fragment idx?
Done


Line 70:   DCHECK_EQ(fragment_exec_params_.size(), 0);
> comment that we asserting this is the first call to Init()
Done


Line 244:   if (fragment->partition.type == TPartitionType::UNPARTITIONED) {
> Shouldn't this be a DCHECK instead? Every TStmtType::QUERY must have such a
but this should be callable for a non-query stmt (so the coordinator doesn't need to understand at all times what it's dealing with).


Line 264:   const FragmentExecParams* fragment_params = &fragment_exec_params_[coord_fragment.idx];
> make this a FragmentExecParams&
Done


Line 265:   DCHECK(fragment_params != nullptr);
> weird DCHECK, fix by using const& as suggested above
Done


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

Line 211:   // populated by Scheduler::Schedule (SimpleScheduler::ComputeFInstanceExecParams())
> now populated in Init() right?
clarified


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 725: #if 0
> ?
Done


Line 750:     // TODO: we'll need to revisit this strategy once we can partition joins
> Maybe we should deal with this in the planner? I think the only way we can 
could think about that, but let's do that in a follow-on change.


http://gerrit.cloudera.org:8080/#/c/4853/1/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

Line 351:   // The node ids refer to scan nodes in fragments[].plan_tree
> fragments[].plan (not plan_tree)
Done


http://gerrit.cloudera.org:8080/#/c/4853/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 1000:     boolean disableSpilling = 
> whitespace
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4853

to look at the new patch set (#3).

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................

IMPALA-4314: Standardize on MT-related data structures

This removes the data structures that were "superceded" in
IMPALA-3903 and changes all control flow to utilize the
new data structures. The new data structures are renamed
to remove the "Mt" prefix.

Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler-test-util.cc
M be/src/scheduling/simple-scheduler-test-util.h
M be/src/scheduling/simple-scheduler.cc
M be/src/scheduling/simple-scheduler.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/query-exec-state.cc
M common/thrift/Frontend.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
23 files changed, 631 insertions(+), 1,168 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/4853/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 2:

(12 comments)

Mostly trivial comments, next round should be +2 from me. Great to see all the deleted code!

http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS2, Line 485: /// Initializes the structures in fragment_profiles_. Must be
             :   /// called before RPCs to start remote fragments.
Nit: this can be wrapped more concisely. Run clang-format over your changes?


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/query-schedule.cc
File be/src/scheduling/query-schedule.cc:

PS2, Line 76: coord
Nit: call this root_fragment. At this point we don't know if it's a coord fragment or not.


PS2, Line 77: if (coord_fragment.partition.type == TPartitionType::UNPARTITIONED) {
Let's check stmt_type == QUERY. This check is indirect and prone to future bugs (what if we introduce some other kind of sink that always has UNPARTITIONED input?).


Line 142:       map<TPlanNodeId, int>& node_map = host_it->second;
const?


PS2, Line 162: ) {
long line


Line 244:   if (fragment->partition.type == TPartitionType::UNPARTITIONED) {
> but this should be callable for a non-query stmt (so the coordinator doesn'
But in that case, there is no coordinator fragment (if it's not a QUERY).

Just to be clear, my understanding is that there is a coord fragment iff the query produces result sets, i.e. it has a PlanRootSink and is of type QUERY.


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

PS2, Line 95: // extract instance indices from instance_exec_params.instance_id
            :   void 
why not return vector<int>? Compiler will do RVO, and no chance for accidental nullptr mistakes.


PS2, Line 113: Verifies that the schedule is well-formed:
             :   /// - all fragments have a FragmentExecParams
             :   /// - all scan ranges are assigned
What's the behaviour if it's not valid? Presumably DCHECK, but should mention that here.


PS2, Line 116: Verify
Validate()? AssertWellFormed()?


PS2, Line 203: RuntimeProfile::EventSequence* query_events_;
This is here so that the schedule can mark events on the query-wide event timeline.


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

PS2, Line 489: // fake load-balancing for Kudu and Hbase: every split has length 1
             :     // TODO: implement more accurate logic for Kudu and Hbase
move this to line 496 (I thought it applied to the whole loop at first).


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/simple-scheduler.h
File be/src/scheduling/simple-scheduler.h:

PS2, Line 437: /// For each instance of fragment_params's input fragment, create a collocated
             :   /// instance for fragment_params's fragment.
I can't quite parse this.


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

PS2, Line 113: Verifies that the schedule is well-formed:
             :   /// - all fragments have a FragmentExecParams
             :   /// - all scan ranges are assigned
> What's the behaviour if it's not valid? Presumably DCHECK, but should menti
missed that, but i'll add the comment before checking it in.


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Hello Henry Robinson, Alex Behm,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4853

to look at the new patch set (#4).

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................

IMPALA-4314: Standardize on MT-related data structures

This removes the data structures that were "superceded" in
IMPALA-3903 and changes all control flow to utilize the
new data structures. The new data structures are renamed
to remove the "Mt" prefix.

Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler-test-util.cc
M be/src/scheduling/simple-scheduler-test-util.h
M be/src/scheduling/simple-scheduler.cc
M be/src/scheduling/simple-scheduler.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/query-exec-state.cc
M common/thrift/Frontend.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
23 files changed, 632 insertions(+), 1,169 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/4853/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 1:

(3 comments)

+2 for .thrift and .java files. So much better!

I'll go through the BE changes next.

http://gerrit.cloudera.org:8080/#/c/4853/1/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

Line 351:   // The node ids refer to scan nodes in fragments[].plan_tree
fragments[].plan (not plan_tree)


http://gerrit.cloudera.org:8080/#/c/4853/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 1000:     boolean disableSpilling = 
whitespace


http://gerrit.cloudera.org:8080/#/c/4853/1/testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
File testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test:

Line 144: |  hosts=3 per-host-mem=0B
weird mem estimate; ok to leave for now


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 1:

Regarding tests: I've done some rounds of bug fixing in response to test failures, but there are probably more lurking.

I'd like to get the review cycle started now, though, since the overall structure won't be affected by additional bug fixes.

-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 2:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS2, Line 485: /// Initializes the structures in fragment_profiles_. Must be
             :   /// called before RPCs to start remote fragments.
> Nit: this can be wrapped more concisely. Run clang-format over your changes
Done


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/query-schedule.cc
File be/src/scheduling/query-schedule.cc:

PS2, Line 76: coord
> Nit: call this root_fragment. At this point we don't know if it's a coord f
Done


PS2, Line 77: if (coord_fragment.partition.type == TPartitionType::UNPARTITIONED) {
> Let's check stmt_type == QUERY. This check is indirect and prone to future 
Done


Line 142:       map<TPlanNodeId, int>& node_map = host_it->second;
> const?
l148 inserts


PS2, Line 162: ) {
> long line
Done


Line 244:   if (fragment->partition.type == TPartitionType::UNPARTITIONED) {
> But in that case, there is no coordinator fragment (if it's not a QUERY).
changed to dcheck


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

PS2, Line 95: // extract instance indices from instance_exec_params.instance_id
            :   void 
> why not return vector<int>? Compiler will do RVO, and no chance for acciden
Done


PS2, Line 116: Verify
> Validate()? AssertWellFormed()?
change to validate


PS2, Line 203: RuntimeProfile::EventSequence* query_events_;
> This is here so that the schedule can mark events on the query-wide event t
yeah, just a bit odd that it's part of the schedule.


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

PS2, Line 489: // fake load-balancing for Kudu and Hbase: every split has length 1
             :     // TODO: implement more accurate logic for Kudu and Hbase
> move this to line 496 (I thought it applied to the whole loop at first).
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4853/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS3, Line 246: boost
> remove boost::
Done


PS3, Line 456: runtime
> Will you file a JIRA to fix this?
filed https://issues.cloudera.org/browse/IMPALA-4400


http://gerrit.cloudera.org:8080/#/c/4853/3/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

Line 116:   void Validate() const;
> Still missing mention of how this signals failure.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 2:

(10 comments)

Code looks much cleaner than before! I'm not an expert on this part of the code, but it looks pretty good me. No major concerns.

http://gerrit.cloudera.org:8080/#/c/4853/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 457:   bool is_mt_execution = request.query_ctx.request.query_options.mt_dop > 0;
set in FE?


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/query-schedule.cc
File be/src/scheduling/query-schedule.cc:

Line 60:   // extract TPlanFragments and order by fragment id
order by fragment idx?


Line 70:   DCHECK_EQ(fragment_exec_params_.size(), 0);
comment that we asserting this is the first call to Init()


Line 244:   if (fragment->partition.type == TPartitionType::UNPARTITIONED) {
Shouldn't this be a DCHECK instead? Every TStmtType::QUERY must have such a fragment and it must be unpartitioned.


Line 264:   const FragmentExecParams* fragment_params = &fragment_exec_params_[coord_fragment.idx];
make this a FragmentExecParams&


Line 265:   DCHECK(fragment_params != nullptr);
weird DCHECK, fix by using const& as suggested above


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

Line 211:   // populated by Scheduler::Schedule (SimpleScheduler::ComputeFInstanceExecParams())
now populated in Init() right?


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 455:   // TODO-MT: figure out how to parallelize Union
agree, seems like a practical solution for now


Line 725: #if 0
?


Line 750:     // TODO: we'll need to revisit this strategy once we can partition joins
Maybe we should deal with this in the planner? I think the only way we can get into this scenario is with a scan that had all partitions pruned. We could easily swap the scan for an empty set node and then invert the join.


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has uploaded a new patch set (#2).

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................

IMPALA-4314: Standardize on MT-related data structures

This removes the data structures that were "superceded" in
IMPALA-3903 and changes all control flow to utilize the
new data structures. The new data structures are renamed
to remove the "Mt" prefix.

Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler-test-util.cc
M be/src/scheduling/simple-scheduler-test-util.h
M be/src/scheduling/simple-scheduler.cc
M be/src/scheduling/simple-scheduler.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/query-exec-state.cc
M common/thrift/Frontend.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
23 files changed, 629 insertions(+), 1,151 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/4853/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 2: Code-Review+1

(2 comments)

+2 on FE.

http://gerrit.cloudera.org:8080/#/c/4853/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 457:   bool is_mt_execution = request.query_ctx.request.query_options.mt_dop > 0;
> didn't seem worth it. it would be a single bool instead of a single int wit
except that plans will contain filters that are silently never run, but ok works for me


http://gerrit.cloudera.org:8080/#/c/4853/2/be/src/scheduling/query-schedule.cc
File be/src/scheduling/query-schedule.cc:

Line 244:   if (fragment->partition.type == TPartitionType::UNPARTITIONED) {
> but this should be callable for a non-query stmt (so the coordinator doesn'
non-query stmts will exit this function in L242


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 3: Code-Review+1

(4 comments)

Looked at .cc / .thrift only.

http://gerrit.cloudera.org:8080/#/c/4853/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS3, Line 246: boost
remove boost::


PS3, Line 456: runtime
Will you file a JIRA to fix this?


http://gerrit.cloudera.org:8080/#/c/4853/3/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

Line 116:   void Validate() const;
Still missing mention of how this signals failure.


PS3, Line 201: /// TODO: why are these part of the schedule?
The alternative is that there is a pointer to the enclosing QueryExecState.


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 3:

all tests pass.

-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4853/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

Line 527:   void ValidateCollectionSlots(RowBatch* batch);
i'll remove this


Line 531:   Status PrepareCoordFragment(std::vector<ExprContext*>* output_expr_ctxs);
and this.


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has submitted this change and it was merged.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


IMPALA-4314: Standardize on MT-related data structures

This removes the data structures that were "superceded" in
IMPALA-3903 and changes all control flow to utilize the
new data structures. The new data structures are renamed
to remove the "Mt" prefix.

Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Reviewed-on: http://gerrit.cloudera.org:8080/4853
Reviewed-by: Marcel Kornacker <ma...@cloudera.com>
Tested-by: Tim Armstrong <ta...@cloudera.com>
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler-test-util.cc
M be/src/scheduling/simple-scheduler-test-util.h
M be/src/scheduling/simple-scheduler.cc
M be/src/scheduling/simple-scheduler.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/query-exec-state.cc
M common/thrift/Frontend.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
23 files changed, 632 insertions(+), 1,169 deletions(-)

Approvals:
  Marcel Kornacker: Looks good to me, approved
  Tim Armstrong: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 4: Code-Review+2

final comment changes and rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
......................................................................


Patch Set 2:

with the latest changes the standard test suite should pass (there were 5 failures caused by what i just fixed in simple-scheduler.cc).

-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: No