You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Marcel Kornacker (Code Review)" <ge...@cloudera.org> on 2016/09/15 00:48:07 UTC

[Impala-ASF-CR] IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state.

Marcel Kornacker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4418

Change subject: IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state.
......................................................................

IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state.

This patch is the beginning of the reorganization of the hierarchy of
execution-related classes in runtime/ and service. It introduces
additional classes which will eventually subsume the existing ones
(FragmentMgr, FragmentExecState, PlanFragmentExecutor), but for now
leaves the old classes in place.

This patch only contains public interface; the implementations of the
new classes will be provided in a follow-on patch.

New classes:
- FragmentInstanceState: subsumes FragmentExecState and
  PlanFragmentExecutor
- QueryState
- QueryExecMgr: subsumes FragmentMgr

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
A be/src/runtime/fragment-instance-state.h
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.h
3 files changed, 275 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 12: Code-Review+2

fixed merge accidents

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has uploaded a new patch set (#10).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
66 files changed, 1,171 insertions(+), 775 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/10
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 7:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS7, Line 938: !
nit: missing space


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS7, Line 507: fragment_instance_state
instance_state to match below and the definition.


PS7, Line 511: fragment_instance_state
same


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

Line 75: // including the final status when execution finishes.
I had deleted this comment because it wasn't accurate (and had added the accurate info the the header comment).  Let's not add it back.


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

Line 53: /// - move ReportStatusCb() logic into PFE::SendReport() and get rid of the callback
thanks, these four are going to make things a lot better.


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

Line 115:   // always decrement refcount
not sure what this comment is trying to tell me.  would be more informative as: decrement the refcount taken in StartFInstance()


Line 148:     qs = it->second;
it's probably worth commenting here explaining why we need to get a new qs pointer, i.e. someone else may have gc'd the old one and then someone else created a new one.  it's as subtle as the other cases that are commented.


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

Line 38: /// entry point for gaining refcounted access to a QueryState.
this doesn't really explain that it's also the thing that executes instances.


PS7, Line 47: decrement
this really happens after execution finishes, so would be good to reword.


PS7, Line 53: CancelFInstance
what's that? (not a method of this class)


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS7, Line 79: desc_tbl
do we plan to pull this out then? (maybe it's mentioned somewhere i haven't read yet).


Line 83:   bool released_resources() const { return released_resources_; }
who outside this class will care about this? besides, in order to call these methods, you have to have a refcnt, which means the resources couldn't have been released, right?


Line 93:   /// Illegal to call any other function of this class after this call returns.
Isn't it more that a reference must be held when calling the other methods, and that this method can't be called until all references are returned? why not specify that?


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

Line 90:   DCHECK(status.ok()) << status.GetDetail();
not for this change but we should clean this up.


Line 345: }
are any of these on critical paths (i.e. does the call instruction you're adding matter)?


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 354:   /// TODO: get rid of this and use ExecEnv::GetInstance() instead
thanks


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/service/impala-internal-service.cc
File be/src/service/impala-internal-service.cc:

Line 68:   fis->Cancel();
is it intentional that we ignore the status returned by Cancel()?  We used to propagate back as the return_val status (inside CancelPlanFragment).


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

Line 55: /// TODO: Consider renaming to RequestExecState for consistency.
don't do it as part of this change, but this renaming is more important now given the other similarly named class.


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

(12 comments)

Mostly clarifying questions. Sorry if Henry and you had already reached consensus on some of these.

http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

Line 27: /// Tear-down happens automatically in the d'tor and frees all memory allocated for
Didn't we want to move away from implicit destruction? Why is it ok to rely on that here?


Line 31: /// ReportProfile() is invoked periodically to report the execution status. The
No ReportProfile() in here. Wouldn't the reporting be controlled by the QueryState by periodically collecting and aggregating the profile() from all fragment instance states?


Line 52:   /// It is an error to delete a FragmentInstanceState before Open()/GetNext() (depending
What if Prepare() fails?


Line 94:   Status Open();
Given my comment below, should this be Exec() or similar?


Line 100:   Status GetNext(RowBatch** batch);
What's this call for? Don't all fragment instances end in a sink? I know that today the coord fragment is an exception, but that's going to change.


Line 120:   void ReleaseThreadToken();
When is this token acquired exactly?


http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 24: /// Central class for all backend execution state created for a particular query,
This first paragraph could use some cleanup, e.g., grammar of the first sentence is weird, "such as" seems misplaced. Also there seems to be some redundancy with the FragmentInstanceState explanations.


Line 30: /// The lifetime of an instance of this class is dictated by a reference count.
This sounds much like a shared_ptr, i.e., implicit management of the lifetime. What is the motivation for this design? It does not seem to prevent certain interesting races with e.g., RPCs wanting to access the query state after other threads are done with it (has transitioned to a 0 ref count).

Incoming RPCs for data exchanges seem especially interesting, e.g. when there is a limit somewhere downstream that could cause cancellation.


Line 41:   class Guard {
ScopedQueryState? This makes it clear that the access is scoped, and we already have similarly names classes such as ScopedSessionState.


Line 66:   /// Illegal to call any other function of this class after this call returns.
When is this legal to call with respect to the reference count? Will this call block until reference count has reached 0?


Line 76:   Status PublishFilter(const TUniqueId& dst_finstance_id, int32_t filter_id,
Isn't a filter really published to a plan node contained in all relevant fragment instances? In other words, why not have:
PublishFilter(TPlanNodeId dst_plan_node_id, int32_t filter_id, const TBloomFilter& thrift_bloom_filter);

That way we'll only have to deserialize the thrift_bloom_filter once and we can shield callers from knowing how many target fragment instances there are.

Another thought: Aren't runtime filters really shared among all relevant fragment instances? It might make more sense to have the RuntimeFilterBanks in here instead of the FragmentInstanceStates. Are we worried about exploding the memory consumption of runtime filters by DOP?


Line 78: };
We'll deal with local filter aggregation separately correct?


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has uploaded a new patch set (#7).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

It also adds a static function Thread::Run() to run a function in a
newly created thread. This is now used to execute a fragment
instance, instead of managing and destroying Thread objects via
shared_ptrs, etc.

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.cc
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
67 files changed, 1,216 insertions(+), 772 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 5:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS5, Line 938: ||
> joining condition should go before the line break?
no


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS5, Line 72: qs->refcnt_.Add(1);
> If this is the logic we're going with, I would suggest that this patch and 
i'm not sure what you're talking about. the qs is reused for subsequent fragment instances, unless you have some corner case (e.g., first instance fails and gc's qs).


PS5, Line 125: int32_t cnt = qs->refcnt_.Add(1);
> After we have a per-query RPC, I think we should disallow getting the QS if
why? it's not just the execution threads that need qs references.


PS5, Line 135: qs->refcnt_.Load()
> I think it will be better to put this LOG after getting the local 'cnt' val
Done


Line 141: 
> One problem I see with this is, every time we hit refcount=0, there is some
i'm not sure what you mean by 'unnecessarily'. the qs is gc'd when the refcnt goes to 0. i don't see a problem with another async thread increment the refcnt again, given that that same thread will eventually also decrement it. why would that be a problem?


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS5, Line 66: /// lock order:
            :   /// 1. qs_map_lock_
            :   /// 2. QueryState::refcnt_lock_
> Remove now that refcnt is an AtomicInt
Done


PS5, Line 69: boost::mutex qs_map_lock_
> This will soon need to be a RWlock, or it could introduce scaling issues. B
i don't think it'll cause scaling issues, at least not at typical qps levels, but i'll let mostafa look into it.


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS5, Line 338: local_query_ctx_
> If we're going to have a local_query_ctx_ anyway, why bother with accessing
the .h file points out that it's only used when there's no querystate, ie, for tests.

we definitely do not want to make a gratuitous copy of the queryctx, which can be a *very* large structure.


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS5, Line 49: class ExecEnv;
            : class DataStreamMgr;
            : class HBaseTableFactory;
            : class TPlanFragmentCtx;
            : class TPlanFragmentInstanceCtx;
            : class QueryState;
> Alphabetical order?
that's immaterial to correctness or comprehensibility.


PS5, Line 329:   // needed?
             :   // static const int DEFAULT_BATCH_SIZE = 1024;
> Can remove? since its moved to query-state.h
Done


PS5, Line 351: TUniqueId no_instance_id_;
> Who sets this? It doesn't seem to be used now. My guess is it should be set
it's initialized to an invalid id and used in l127 of this file.


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has uploaded a new patch set (#3).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

It also adds a static function Thread::Run() to run a function in a
newly created thread. This is now used to execute a fragment
instance, instead managing and destroying Thread objects via
shared_ptrs, etc.

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.cc
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
62 files changed, 1,118 insertions(+), 719 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 2:

(27 comments)

Regarding tests: I ran a bunch of queries locally, and basic cases work, but I seem to get some hbase-related crashes.

However, I don't think that the fixes will change the new functionality here in a meaningful way, and given the size of this patch I'd like to get the review rolling now.

http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

Line 27: 
> Didn't we want to move away from implicit destruction? Why is it ok to rely
we do, and it's not okay. i just didn't want to change too much in one go. updated commit msg.


Line 31: class TPlanFragmentInstanceCtx;
> No ReportProfile() in here. Wouldn't the reporting be controlled by the Que
i'll remove this from the headers for now. i haven't spent much time looking at this code.


PS1, Line 32: 
            : class TUniqueId;
> Possibly a good time to retire that functionality which I don't think is ev
Done


PS1, Line 43: this fragment instance and closes all data streams.
            : ///
> Why can't we do that in this patch (given it's header only...)?
see above. i think it's too much to bite off in this set of changes.

updated commit msg.


Line 52:  public:
> What if Prepare() fails?
left todo, better addressed in the actual implementation.


PS1, Line 56: /// Frees up all resources allocated in Exec().
> When would we want to get a mutable query_state() from its FIS? That circum
we might pass the fis* into some function, and that function might need access the qs*. we could always pass around qs* and fis*, but this is more convenient.


PS1, Line 69: 
            :   QueryState* query_state() { return query_state_; }
> I do find this API weird, always have. Why doesn't this object start its ow
see updated commit msg.


Line 94:   /// if set to anything other than OK, execution has terminated w/ an error
> Given my comment below, should this be Exec() or similar?
Done


Line 100:   /// Update 'exec_status_' w/ 'status', if the former is not already an error.
> What's this call for? Don't all fragment instances end in a sink? I know th
see updated commit msg.


Line 120
> When is this token acquired exactly?
see updated commit msg.


http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS1, Line 24: 
            : #include "common/status.h"
> I don't think this is relevant - we shouldn't comment on what might call th
Done


PS1, Line 33: TExecPlanFragmentParams;
            : class TUniqueId;
> Not clear from this who owns that reference, and who is therefore responsib
let me know if this is better.


PS1, Line 43: ass QueryExecMgr {
            :  public:
> Why is this here, and not accessible through GetQueryState(query_id)->Cance
Done


PS1, Line 55: StartFInstanc
> Again, don't see that this needs to be in this class.
Done


http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 24: 
> This first paragraph could use some cleanup, e.g., grammar of the first sen
moved a fragment of the sentence.

there is a comparison/contrasting with fis because the classes are related.


Line 30: namespace impala {
> This sounds much like a shared_ptr, i.e., implicit management of the lifeti
this is indeed like a shared_ptr in the sense that it's refcounted. the difference is that tear-down doesn't happen implicitly in the d'tor.

an rpc that arrives after the refcount goes to 0 is a no-op. if that would be an error (ie, if that were truly a race), we need to prevent that from happening in some other way, such as overhauling our rpc protocol.

in your example, data arriving after the final query result has been computed to me sounds like a no-op.


PS1, Line 31: 
> too specific: clients of this class must obtain a reference.
i don't know what you mean by 'clients of this class'. from a correctness perspective it is important that each thread that accesses query state of any kind has a reference to the qs for at least the duration of that access.


PS1, Line 41: / Any thread 
> comment on usage pattern, which is going to be common. Particularly that ca
changed to ScopedRef, expanded comment


Line 41: /// Any thread that executes on behalf of a query, and accesses any of its state,
> ScopedQueryState? This makes it clear that the access is scoped, and we alr
then we have declarations like 'QueryState::ScopedQueryState qs(...);', which is pretty awkward.


PS1, Line 44: cMgr::Get-/ReleaseQueryStat
> shouldn't there be corresponding changes to ExecEnv then? Or are you leavin
yes, leaving out changes to existing classes for now.


PS1, Line 49: 
> I don't think we need this, but presumably it's redundant here if the const
Done


PS1, Line 60: 
> nit picking here, but it might be helpful to distinguish the query lifetime
i'm not sure it makes sense to spend too much time polishing these headers for code that hasn't been written yet.

the purpose of this object pool is really for query-duration objects. for instance-duration objects we need an object pool in the fis, i'd think.


PS1, Line 65: 
> Would just say it releases all resources owned by this class ("accessible t
i actually meant exactly that (example: something sitting in RuntimeState). but "release resources" is more general.


Line 66:     /// may return nullptr
> When is this legal to call with respect to the reference count? Will this c
the point of reference counting is to have all accesses be legal unless the refcount is 0 (in which case you can't access the qs anymore). in other words, it's always legal to call this.


PS1, Line 73:   DISA
> Under what conditions would this return an error?
clarified.


Line 76:   /// a shared pool for all objects that have query lifetime
> Isn't a filter really published to a plan node contained in all relevant fr
good point, they are indeed shared across the entire query, as is the filter bank.


Line 78: 
> We'll deal with local filter aggregation separately correct?
yes, that requires new logic/new classes (an instance of which will live in the qs).


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 3:

(18 comments)

Looked over the ref count logic so far.

http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS4, Line 502:  QueryState* qs = ExecEnv::GetInstance()->query_exec_mgr()->GetQueryState(query_id_);
ScopedQueryState ?


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS4, Line 59: boost::
remove


Line 66:     } else {
Why is there no refcnt increment here?


PS4, Line 69: 
remove here and below.


PS4, Line 86: ImpaladMetrics::IMPALA_SERVER_NUM_FRAGMENTS_IN_FLIGHT->Increment(1L);
            :   ImpaladMetrics::IMPALA_SERVER_NUM_FRAGMENTS->Increment(1L);
            :   return Status::OK();
Why is this not in the QueryState? It seems like it should be part of RegisterFInstance().


PS4, Line 95: stance completed. in
Call this ExecFInstance() to make it clear that it blocks until instance completion.


PS4, Line 115: VLOG_FILE << "GetQueryState(): query_i
You have a bug here - if a fragment instance finishes quickly (perhaps it didn't prepare() successfully) the query state will get GC'ed before another fragment instance can be started. 

I realise this will be addressed by batching fragment start-up, but until then I suggest a workaround like adding the number of expected fragment instances to the fragment instance params, so that when the QEM is created it can take the right number of references.


PS4, Line 124: id QueryExecMgr::ReleaseQueryState(QueryState* qs) {
             :   VLOG_FILE << "
Prefer atomics rather than heavyweight mutexes for the ref count. Fewer locks -> fewer locking bugs.


PS4, Line 126:           << " refcnt=" <<
What are you checking for here - overflow or that the refcnt_ didn't start below 0? If it's the latter, move this to line 125 and check before increment that its GE(..., 0).


PS4, Line 147: 
How? Only one caller to ReleaseQueryState() can set refcnt == 0.


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS4, Line 47: In both cases it brackets the instance execution with an increment/decrement
            :   /// of the refcount.
This needs clarification - why is this relevant to callers? I think you mean that the QEM retains a reference to the QueryState until this FInstance has completed.


PS4, Line 58: Otherwise returns nullptr
AFAICT, GetQueryState() will return a QueryState even if its refcount == 0. It should not, because then you can avoid the complex logic in ReleaseQueryState() that deals with the possibility that some other caller took a refcount between setting the count to 0 and trying to GC the query state.


PS4, Line 66: /// TODO: isn't this available in std:: now?
remove, once we switch to std::mutex we'll get this one as well.


PS4, Line 72: clean up.
Comment on how it's allocated.


PS4, Line 75: 
Clean up what? Does this block until instance has finished?


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS4, Line 42: released_r
set this to 0. Callers should always manage the refcount with a balanced number of increment / decrement operations, otherwise bugs will be harder to understand.


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS4, Line 79: /// don't access query_ctx().desc_tbl, it's most likely for a different fragment instance
I don't understand this comment. Is this going to change after batching? Otherwise let's think about ways to avoid having to tell a caller not to touch a certain part of this data structure.


PS4, Line 107: 
Does access to this need to be synchronised?


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 11:

> Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/100/

It's a little tricky to find the error - you have to go through gerrit-verify-dryrun to parallel-all-tests to ubuntu-14.04-from-scratch. The error is:

    [ERROR] COMPILATION ERROR : 
    [ERROR] /home/ubuntu/Impala/fe/src/main/java/org/apache/impala/service/Frontend.java:[1043,42] cannot find symbol
    [INFO] BUILD FAILURE
    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.3:compile (default-compile) on project impala-frontend: Compilation failure
    [ERROR] /home/ubuntu/Impala/fe/src/main/java/org/apache/impala/service/Frontend.java:[1043,42] cannot find symbol
    [ERROR] symbol:   variable request
    [ERROR] location: variable queryCtx of type org.apache.impala.thrift.TQueryCtx

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4418/6/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS6, Line 509: qs = nullptr;
> superfluous, since it is never read again.
it still feels safer to reset it, the logic below can change.


http://gerrit.cloudera.org:8080/#/c/4418/6/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS6, Line 76: uery_id=" << param
> beware that this can change between line 72 and now. Just store the result 
Done


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 83:   /// rpc.
> Okay, yes, that's what was discussed. And yes, the sentence below (that you
Done


http://gerrit.cloudera.org:8080/#/c/4418/6/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

PS6, Line 89: // Helper function to translate between Beeswax and HiveServer2 type
            : static TOperationState::type QueryStateToTOperationState(
            :     const beeswax::QueryState::type& query_state) {
            :   switch (query_state) {
            :     case beeswax::QueryState::CREATED: return TOperationState::INITIALIZED_STATE;
            :     case beeswax::QueryState::RUNNING: return TOperationState::RUNNING_STATE;
            :     case beeswax::QueryState::FINISHED: return TOperationState::FINISHED_STATE;
            :     case beeswax::QueryState::EXCEPTION: return TOperationState::ERROR_STATE;
            :     default: return TOperationState::UKNOWN_STATE;
            :   }
            : }
> While you're here, put in anonymous namespace.
static will work just fine here.


http://gerrit.cloudera.org:8080/#/c/4418/6/be/src/util/thread.h
File be/src/util/thread.h:

PS6, Line 146: /// Starts a detached thread running 'functor' and returns immediately.
             :   static void StartDetachedThread(con
> What's a detached thread? Why doesn't this method register threads so that 
i reverted the changes to thread.{cc,h}.


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has uploaded a new patch set (#6).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

It also adds a static function Thread::Run() to run a function in a
newly created thread. This is now used to execute a fragment
instance, instead of managing and destroying Thread objects via
shared_ptrs, etc.

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.cc
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
67 files changed, 1,215 insertions(+), 778 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/test-env.h
File be/src/runtime/test-env.h:

Line 66:   /// Create a new RuntimeState sharing global environment with given query options
sorry, i'll clean that up (same for .cc).


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 10:

> Looks good. I can take a final look after you do the thread detach
 > fix.

Done.

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4418/14/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

Line 136:   {
Why is there a scoped block here? It doesn't seem to do anything.


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has uploaded a new patch set (#4).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

It also adds a static function Thread::Run() to run a function in a
newly created thread. This is now used to execute a fragment
instance, instead of managing and destroying Thread objects via
shared_ptrs, etc.

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.cc
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
65 files changed, 1,199 insertions(+), 751 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4418/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 1903: // TODO: call this as soon as it's clear that we won't reference the state 
removed.


http://gerrit.cloudera.org:8080/#/c/4418/2/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

Line 107:   /// TODO: remove desc_tbl parameter once we do a per-query exec rpc (and we 
removed


http://gerrit.cloudera.org:8080/#/c/4418/2/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

Line 82:   // TODO: destroy thread
removed


http://gerrit.cloudera.org:8080/#/c/4418/2/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

Line 67:   boost::mutex qs_map_lock_; 
removed


http://gerrit.cloudera.org:8080/#/c/4418/2/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 21:  
removed


http://gerrit.cloudera.org:8080/#/c/4418/2/be/src/service/impala-internal-service.cc
File be/src/service/impala-internal-service.cc:

Line 58:     return; 
removed


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS5, Line 72: qs->refcnt_.Add(1);
> Another example: an ExecPlanFragment RPC can arrive late after some instanc
what you're describing is a complete corner case. i don't see a need to optimize for that.


Line 141: 
> With the current state of things, after all fragment instances finish execu
i do not want to introduce another counter to keep track of fragment instances, the whole point of this is to keep the qs around until all references to it are gone (and not just those needed for instance execution - i'd prefer to have the final status report rpc be sent when the qs gets gc'ed).

those late-arriving rpcs you're describing sound like another corner case, and i don't see how they would end up consuming a lot of resources. i would be worried about them if they increased the map lock contention by a sizable amount, say 30 or 50 or 100%, but i just don't see how that's possible.

aside from that, it's very easy to reduce the lock contention by an arbitrary amount (by partitioning the key space and having a separate map plus lock per partition). i just don't think we need that quite yet.


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS5, Line 351: TUniqueId no_instance_id_;
> Sorry I meant that it doesn't seem to be initialized. Where is it initializ
the default c'tor initializes it.


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 12: Verified-1

Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/632/

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 13: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 11:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/100/

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has uploaded a new patch set (#2).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

It also adds a static function Thread::Run() to run a function in a
newly created thread. This is now used to execute a fragment
instance, instead managing and destroying Thread objects via
shared_ptrs, etc.

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
A be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.cc
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
58 files changed, 1,241 insertions(+), 338 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has uploaded a new patch set (#8).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

It also adds a static function Thread::Run() to run a function in a
newly created thread. This is now used to execute a fragment
instance, instead of managing and destroying Thread objects via
shared_ptrs, etc.

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.cc
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
67 files changed, 1,219 insertions(+), 775 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/8
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS5, Line 72: qs->refcnt_.Add(1);
> i'm not sure what you're talking about. the qs is reused for subsequent fra
Another example: an ExecPlanFragment RPC can arrive late after some instances for the same query have already arrived and finished executing and GC'd the QS.

The example you mention also sounds scary, since it doesn't seem like the code anticipates that. That could mean FInstances that arrive late may start execution after the query has been cancelled, wasting resources.


PS5, Line 125: int32_t cnt = qs->refcnt_.Add(1);
> why? it's not just the execution threads that need qs references.
For the same reason I mention in the comments in L141.


Line 141: 
> i'm not sure what you mean by 'unnecessarily'. the qs is gc'd when the refc
With the current state of things, after all fragment instances finish executing (either successfully or not), there is no use of servicing any of the async calls. Those are UpdateFilter, TransmitData, CancelPlanFragment, etc.

Having the code this way could lead to a chain effect where these late RPCs that arrive after all FInstances have completed (successful or not) keep getting references to the QS only to find that the query is already complete, causing unnecessary contention on the qs_map_lock_ and having the QS longer than necessary.

Late RPCs will be more evident when we move to a model where we queue RPCs rather than send them immediately (i.e. after we move to KuduRPC), so I feel we need to keep that in mind with future patches moving forward.

So what I'm proposing is to know when all FInstances have completed execution (via another counter) and not give away new references to async RPCs after that.

Of course this is all moot if you think we will be introducing RPCs in the future that will still need to be serviced after ALL FInstances have completed executing.


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS5, Line 338: local_query_ctx_
> the .h file points out that it's only used when there's no querystate, ie, 
Ok, makes sense.


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS5, Line 351: /// Use pointer to avoid i
> it's initialized to an invalid id and used in l127 of this file.
Sorry I meant that it doesn't seem to be initialized. Where is it initialized to an invalid ID?


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 7:

(3 comments)

Other than the remaining questions around the released_resources() bool, this change looks fine to me.

http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 83:   bool released_resources() const { return released_resources_; }
> the idea with explicit resource release (as opposed to implicit, via some d
Okay, yes, that's what was discussed. And yes, the sentence below (that you've removed) was part of what led to to these comments. But also the fact that ReleaseResources() call is currently triggered by refcnt going to 0.  i.e. it looks like the resources are also tied to the refcnt as well as the control structure.  I suppose you plan to move the ReleaseResources() to a different point (like when all instance executions are complete, which may happen before all references go away)?

Even so, given that multiple threads will be accessing the QueryState, I don't see how a single bool is sufficient to safely determine whether the resources are accessible by code outside of this class.  So, I'm still not sure what the usefulness of this bool is.  Either we'll have to make accessing/updating this bool and accessing/tearing-down the resources atomic, or we'll have rules about when during the query lifecycle the resources are valid (but then this seems to be less explicit).

maybe we can defer adding the bool until the change that actually makes use of it, so that getting this change committed doesn't get blocked on this discussion?


Line 93:   /// Illegal to call any other function of this class after this call returns.
> see above.
Thanks, this sentence was part of what led to the comment above.


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

Line 90:   DCHECK(status.ok()) << status.GetDetail();
> actually, Init() always returns ok. i'm removing this dcheck.
thanks


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 6:

(11 comments)

Some more comments about the refcounting, many in response to yours.

http://gerrit.cloudera.org:8080/#/c/4418/6/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS6, Line 509: qs = nullptr;
superfluous, since it is never read again.


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS4, Line 86: // start new thread to execute instance
            :   Thread::Run("query-exec-mgr",
            :       Substitute("exec-fragment-instance-$0", PrintId(
> i felt both approaches are equally valid.
Having some fragment instance management logic here confuses the responsibilities of this class and of QueryState (which I would expect to control the lifecycle of the fragment instance as well as contain its metadata). There naturally seems to be a hierarchy of responsibilities: QueryExecMgr manages QueryStates, QueryStates manage FragmentInstances. That encapsulation makes the code easier to reason about.


PS4, Line 115: // always decrement refcount
> i thought about this, but didn't see it as a bug.
Maybe it's not a bug as implemented, but it's counter-intuitive that there could be more than one QS over time (it's tripped two reviewers up now). It is a simpler invariant that there is one QES over the lifetime of the query. I think there are bugs waiting to happen. For example: we add aggregated state / metrics to the QES and then wonder why they get reset over time.


PS4, Line 147: if (it == qs_map_.end()) return;
> after decreasing the refcnt to 0 in l138 someone else could have increased 
I think this is again unnecessarily confusing. Why not have the semantics such that:

  refcount == 0 -> no more references may be taken && caller to RQS() that sets count == 0 performs (or schedules) the GC.

If we stipulate that a query is live for as long as its fragments are running, then having the QES start out with refcount == 1 (like you already do) is compatible with the above invariant, and much easier to reason about.


http://gerrit.cloudera.org:8080/#/c/4418/6/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS6, Line 76: qs->refcnt_.Load()
beware that this can change between line 72 and now. Just store the result of Add() and log that.


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS4, Line 58: 
> i'll change this comment to fit the existing behavior.
Elsewhere you say that you're happy with a QS getting recreated if the original gets GCed. I think that allowing refcount == 0 as a valid live state complicates the state machine for a refcounted object. For now, if the refcount hits 0, it's not deterministic whether or not a QS will get GC'ed. It's more understandable to make that relationship unambiguous, and I've expanded on that in other comments.

Who are the clients that you expect to do a second registration? If you're concerned about the case where a fragment completes before another fragment arrives, that seems like an edge case that's not worth adapting the semantics for (given that that case won't be possible once batching happens). Eventually I would expect registration to happen once per query, called from one path only (the batched RPC handler).


http://gerrit.cloudera.org:8080/#/c/4418/6/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

PS6, Line 89: // Helper function to translate between Beeswax and HiveServer2 type
            : static TOperationState::type QueryStateToTOperationState(
            :     const beeswax::QueryState::type& query_state) {
            :   switch (query_state) {
            :     case beeswax::QueryState::CREATED: return TOperationState::INITIALIZED_STATE;
            :     case beeswax::QueryState::RUNNING: return TOperationState::RUNNING_STATE;
            :     case beeswax::QueryState::FINISHED: return TOperationState::FINISHED_STATE;
            :     case beeswax::QueryState::EXCEPTION: return TOperationState::ERROR_STATE;
            :     default: return TOperationState::UKNOWN_STATE;
            :   }
            : }
While you're here, put in anonymous namespace.


http://gerrit.cloudera.org:8080/#/c/4418/6/be/src/util/thread.cc
File be/src/util/thread.cc:

PS6, Line 305: //ThreadMgr* thread_mgr = ExecEnv::GetInstance()->thread_mgr();
?


PS6, Line 313: //thread_mgr_ref->AddThread(this_thread::get_id(), name, category, system_tid);
?


http://gerrit.cloudera.org:8080/#/c/4418/6/be/src/util/thread.h
File be/src/util/thread.h:

PS6, Line 146: /// Starts a detached thread running 'functor' and returns immediately.
             :   static void StartDetachedThread(con
What's a detached thread? Why doesn't this method register threads so that we can monitor them?


PS6, Line 172: SuperviseDetachedThread
Needs a comment. At this moment I don't see why this exists.


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 11: Verified-1

Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/629/

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 10: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, Internal Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4418

to look at the new patch set (#13).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
67 files changed, 1,177 insertions(+), 782 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/13
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 9:

Looks good. I can take a final look after you do the thread detach fix.

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 7:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS7, Line 938: !
> nit: missing space
Done


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS7, Line 507: fragment_instance_state
> instance_state to match below and the definition.
Done


PS7, Line 511: fragment_instance_state
> same
Done


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

Line 75: // including the final status when execution finishes.
> I had deleted this comment because it wasn't accurate (and had added the ac
rebase accident. removed.


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

Line 115:   // always decrement refcount
> not sure what this comment is trying to tell me.  would be more informative
Done


Line 148:     qs = it->second;
> it's probably worth commenting here explaining why we need to get a new qs 
Done


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

Line 38: /// entry point for gaining refcounted access to a QueryState.
> this doesn't really explain that it's also the thing that executes instance
added comment.


PS7, Line 47: decrement
> this really happens after execution finishes, so would be good to reword.
Done


PS7, Line 53: CancelFInstance
> what's that? (not a method of this class)
Done


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS7, Line 79: desc_tbl
> do we plan to pull this out then? (maybe it's mentioned somewhere i haven't
i expanded the comment.


Line 83:   bool released_resources() const { return released_resources_; }
> who outside this class will care about this? besides, in order to call thes
the idea with explicit resource release (as opposed to implicit, via some d'tor) is that we keep the control structures around (such as QueryState, FIS) but still get to release resources. in other words, we can release resources and then send the final exec report or do whatever else. the control structures will stay around until the last refcnt is gone.


Line 93:   /// Illegal to call any other function of this class after this call returns.
> Isn't it more that a reference must be held when calling the other methods,
see above.

the second sentence is wrong. removed.


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

Line 90:   DCHECK(status.ok()) << status.GetDetail();
> not for this change but we should clean this up.
actually, Init() always returns ok. i'm removing this dcheck.


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/service/impala-internal-service.cc
File be/src/service/impala-internal-service.cc:

Line 68:   fis->Cancel();
> is it intentional that we ignore the status returned by Cancel()?  We used 
oversight. fixed.


http://gerrit.cloudera.org:8080/#/c/4418/7/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

Line 55: /// TODO: Consider renaming to RequestExecState for consistency.
> don't do it as part of this change, but this renaming is more important now
i agree, it's getting confusing.


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 5:

(11 comments)

Did a quick first pass.

http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS5, Line 938: ||
joining condition should go before the line break?


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS5, Line 72: qs->refcnt_.Add(1);
If this is the logic we're going with, I would suggest that this patch and the per-query RPC patch make it in with a relatively short gap between them.

Else we will have the penalty of tearing down and setting up a QS multiple times for the same query.


PS5, Line 125: int32_t cnt = qs->refcnt_.Add(1);
After we have a per-query RPC, I think we should disallow getting the QS if the refcount has already reached 0 (since that would mean all the fragments have completed).

If you agree, you can add that as a TODO here.


PS5, Line 135: qs->refcnt_.Load()
I think it will be better to put this LOG after getting the local 'cnt' value and printing it as "refcnt=" << cnt + 1.

So we can at least via the logs understand what decision was made by this function (attempt to gc vs return) based on the log line.
Eg: If we see "refcnt=1", we will know that this would have attempted a GC.

Also, its good to not lock the bus twice.


Line 141: 
One problem I see with this is, every time we hit refcount=0, there is some leeway for some async call to get a reference.

However, those async calls (late reports, late filters, etc.) will mostly be of no use since if the refcount hit 0, that means all the fragments are done executing. So why keep the QS around unnecessarily?


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS5, Line 66: /// lock order:
            :   /// 1. qs_map_lock_
            :   /// 2. QueryState::refcnt_lock_
Remove now that refcnt is an AtomicInt


PS5, Line 69: boost::mutex qs_map_lock_
This will soon need to be a RWlock, or it could introduce scaling issues. But that will have its own issues of reader or writer starvation, so won't be a trivial change. What do you think?


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS5, Line 338: local_query_ctx_
If we're going to have a local_query_ctx_ anyway, why bother with accessing the query_state_->query_ctx() ?


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS5, Line 49: class ExecEnv;
            : class DataStreamMgr;
            : class HBaseTableFactory;
            : class TPlanFragmentCtx;
            : class TPlanFragmentInstanceCtx;
            : class QueryState;
Alphabetical order?


PS5, Line 329:   // needed?
             :   // static const int DEFAULT_BATCH_SIZE = 1024;
Can remove? since its moved to query-state.h


PS5, Line 351: TUniqueId no_instance_id_;
Who sets this? It doesn't seem to be used now. My guess is it should be set in the second constructor?


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 7:

rebase past the recent pfe fixes

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 4:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS4, Line 502:  QueryState* qs = ExecEnv::GetInstance()->query_exec_mgr()->GetQueryState(query_id_);
> ScopedQueryState ?
that's a bug: you need to keep the reference until the query completes. see the call to ReleaseQueryState() in TearDown().


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS4, Line 59: boost::
> remove
Done


Line 66:       VLOG_QUERY << "new QueryState: query_id=" << query_id;
> Why is there no refcnt increment here?
it's a new one, and the refcnt is initialized to 1 (documented for the c'tor).


PS4, Line 69: boost::
> remove here and below.
Done


PS4, Line 86: Thread::Run("query-exec-mgr",
            :       Substitute("exec-fragment-instance-$0", PrintId(instance_id)),
            :       &QueryExecMgr::StartFInstanceHelper, this, fis);
> Why is this not in the QueryState? It seems like it should be part of Regis
i felt both approaches are equally valid.


PS4, Line 95: StartFInstanceHelper
> Call this ExecFInstance() to make it clear that it blocks until instance co
Done


PS4, Line 115: ReleaseQueryState(fis->query_state());
> You have a bug here - if a fragment instance finishes quickly (perhaps it d
i thought about this, but didn't see it as a bug.

why do you think it's a bug? the second fragment instance will simply re-create the qs.


PS4, Line 124: boost::lock_guard<mutex> l2(qs->refcnt_lock_);
             :   ++qs->refcnt_;
> Prefer atomics rather than heavyweight mutexes for the ref count. Fewer loc
Done


PS4, Line 126: DCHECK_GT(qs->refcnt_, 0);
> What are you checking for here - overflow or that the refcnt_ didn't start 
semantically it's the same to check it here, and this requires less synchronization.


PS4, Line 147: // someone else might have gc'd the entry
> How? Only one caller to ReleaseQueryState() can set refcnt == 0.
after decreasing the refcnt to 0 in l138 someone else could have increased it again before we get the lock in l145.


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS4, Line 47: In both cases it brackets the instance execution with an increment/decrement
            :   /// of the refcount.
> This needs clarification - why is this relevant to callers? I think you mea
it expresses that the qs won't go away while the instance is executing.


PS4, Line 58: Otherwise returns nullptr
> AFAICT, GetQueryState() will return a QueryState even if its refcount == 0.
i'll change this comment to fit the existing behavior.

i like the existing behavior better, because it eliminates the possibility of someone trying to register a qs because this function returned nullptr, despite the qs still being accessible.


PS4, Line 66: /// TODO: isn't this available in std:: now?
> remove, once we switch to std::mutex we'll get this one as well.
Done


PS4, Line 72: QueryState (owned by us)
> Comment on how it's allocated.
you mean that it's calling new?


PS4, Line 75: /// Execute instance and clean up.
> Clean up what? Does this block until instance has finished?
Done


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS4, Line 42: refcnt_(1)
> set this to 0. Callers should always manage the refcount with a balanced nu
Done


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS4, Line 79: /// don't access query_ctx().desc_tbl, it's most likely for a different fragment instance
> I don't understand this comment. Is this going to change after batching? Ot
yes, this will change with the per-query exec rpc. it's broken right now because every fragment instance shows up with its own descriptor table, and they're all slightly different.


PS4, Line 107: bool
> Does access to this need to be synchronised?
not so far.


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state.

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

PS1, Line 32: setting
            : /// that flag to 0 disables periodic reporting altogether.
Possibly a good time to retire that functionality which I don't think is ever used.


PS1, Line 43: TODO:
            : /// - move tear-down logic out of d'tor and into TearDown() function
Why can't we do that in this patch (given it's header only...)?


PS1, Line 56: QueryState* query_state() { return query_state_; }
When would we want to get a mutable query_state() from its FIS? That circumstance would suggest the caller violated the top-down pattern for getting at the FIS in the first place.


PS1, Line 69: /// Set the execution thread, taking ownership of the object.
            :   void set_exec_thread(Thread* exec_thread) { exec_thread_.reset(exec_thread); }
I do find this API weird, always have. Why doesn't this object start its own thread?


PS1, Line 116:   /// Returns true if this query has a limit and it has been reached.
             :   bool ReachedLimit();
FYI, this and GetNext() are likely to be removed with my patch for IMPALA-2905 (subject to review). Might be worth anticipating that here.


http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS1, Line 24: an entry point for
            : /// execution-related rpcs (ExecPlanFragment/CancelPlanFragment)
I don't think this is relevant - we shouldn't comment on what might call this class.


PS1, Line 33: Also creates a QueryState for this query, if none exists, and sets its refcount
            :   /// to 1. If a QueryState already exists for this query, increments the refcount.
Not clear from this who owns that reference, and who is therefore responsible for releasing it.


PS1, Line 43: /// Cancels a particular fragment instance.
            :   Status CancelFInstance(const TUniqueId& finstance_id);
Why is this here, and not accessible through GetQueryState(query_id)->Cancel(...) ?


PS1, Line 55: PublishFilter
Again, don't see that this needs to be in this class.


http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS1, Line 31: threads
too specific: clients of this class must obtain a reference.


PS1, Line 41: class Guard {
comment on usage pattern, which is going to be common. Particularly that callers must check if query_state() is not null, and if it is not null then it will live as long as the guard.

FWIW, I prefer ScopedRef as a name. The 'guard' in lock_guard<> means - I think - that the object guards a critical section by preventing concurrent access.


PS1, Line 44: (ExecEnv::GetQueryExecMgr()
shouldn't there be corresponding changes to ExecEnv then? Or are you leaving those out to make this self-contained?

Either way - prefer to do ExecEnv::GetInstance()->query_exec_mgr(). Under what circumstances would this ever be null?


PS1, Line 49: DCHECK(ExecEnv::GetQueryExecMgr() != nullptr);
I don't think we need this, but presumably it's redundant here if the constructor checked it exists.


PS1, Line 60: query
nit picking here, but it might be helpful to distinguish the query lifetime from the lifetime of the set of fragment instances; since the former can be significantly longer than the latter. Not sure what a good alternative is without introducing a new phrase for a group of fragment instances.


PS1, Line 65: Delete all query state in this class or accessible through this class.
Would just say it releases all resources owned by this class ("accessible through this class" sounds like it could reach out and delete other structures' data).


PS1, Line 69: /// Registers a new FInstanceState.
Either the comment or the method name should change. I would change the comment - the fact that an FInstanceState is registered seems like an implementation detail.


PS1, Line 73: Status
Under what conditions would this return an error?


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has uploaded a new patch set (#5).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

It also adds a static function Thread::Run() to run a function in a
newly created thread. This is now used to execute a fragment
instance, instead of managing and destroying Thread objects via
shared_ptrs, etc.

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.cc
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
67 files changed, 1,235 insertions(+), 753 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 4:

(2 comments)

All tests now pass, please take a look.

The changes in AnalysisContext, impala-server.* and ImpalinternalService.thrift in the latest version are from a rebase.

http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

Line 119:   VLOG_QUERY << "GetQueryState(): query_id=" << PrintId(query_id);
i'll lower this and the other ones back to file.


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/service/impala-internal-service.cc
File be/src/service/impala-internal-service.cc:

Line 41:   VLOG_QUERY << "ExecPlanFragment():"
i'll drop these to vlog_file


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 11: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 11: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/100/

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 13: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 12: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Hello Dan Hecht,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4418

to look at the new patch set (#11).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
66 files changed, 1,174 insertions(+), 778 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/11
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has uploaded a new patch set (#9).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
65 files changed, 1,165 insertions(+), 775 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4418/9/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

Line 89:   Thread("query-exec-mgr",
when this goes out of scope and the d'tor is called, the thread that's underlying the boost::thread object becomes detached.

however, that isn't guaranteed behavior, in a more recent version and for std::thread it needs to be detached explicitly. i'll fix this to make the thread detached.


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Reviewed-on: http://gerrit.cloudera.org:8080/4418
Reviewed-by: Marcel Kornacker <ma...@cloudera.com>
Tested-by: Internal Jenkins
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
67 files changed, 1,177 insertions(+), 782 deletions(-)

Approvals:
  Marcel Kornacker: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, Internal Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4418

to look at the new patch set (#12).

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
67 files changed, 1,178 insertions(+), 782 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/12
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>