You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@impala.apache.org by "Sailesh Mukil (Code Review)" <ge...@cloudera.org> on 2016/09/02 07:34:27 UTC

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Sailesh Mukil has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4301

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................

IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

This patch is a header preview of a query wide execution state for
a backend.

None of the existing headers are replaced, they are just added in
parallel to existing headers and not considered for compilation.
Compiling the code will work and is no different from the current
apache/master branch (i.e. the commit before this patch).

Changes in design are as follows:

FragmentExecState -> FInstanceState
  This is now what was previously the FragmentExecState and the
  PlanFragmentExecutor together. The ReportStatusCallback is removed
  and now SendReport() is in charge of sending the report directly.
  The callback was in place originally only so the PlanFragmentExecutor
  could call into the FragmentExecState without a reference. This is
  not necessary anymore.

PlanFragmentExecutor (collapsed into FInstanceState)

FragmentMgr -> QueryExecMgr
  The QueryExecMgr now receives incoming fragments, creates a new
  QueryState if it is the first fragment to arrive for that
  query, and passes this received fragment along to the QueryState.
  This class is responsible for cleaning up the QueryState.

QueryState (new):
  This is the query wide state for the backend. It is initialized by
  the first fragment to arrive to the QueryExecMgr. This class is now
  responsible for creating FInstanceStates and executing them.
  Its life is protected by a ref counting mechanism and it is
  scheduled for destruction once the ref count reaches zero.
  Once scheduled for destruction, a thread in the QueryExecMgr will
  destroy the QueryState.
  Every user of the QueryState must access it within the scope of a
  'Guard' object which guarantees ref count incrementing and
  decrementing at entry/exit of the scope.

runtime-state-new.h is what runtime-state.h would look like, and is
just a temporary name.

Excludes:
 - All .cc files to make use of changes made.

Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
---
A be/src/runtime/fragment-instance-state.h
A be/src/runtime/query-state.h
A be/src/runtime/runtime-state-new.h
A be/src/service/query-exec-mgr.h
4 files changed, 981 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/4301/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

(37 comments)

I've not addressed the comments in RuntimeState yet because I feel it's tangential to this patch. I will get to addressing them once we've decided on the headers.

http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

Line 38: /// FInstanceState handles all aspects of the execution of a single plan
> nit: line wrapping looks a bit off in this comment (and maybe elsewhere).
Done


PS1, Line 39: fragment
> in comments, please spell it out (as fragment instance)
Done


PS1, Line 39: fragment
> finstance
Done


PS1, Line 44: fragment
> finstance?
Done


Line 62:       exec_params_(params) {
> do we need to copy these? that's a fairly large struct. you already get the
Yes, it looks like we can do without it. The only thing we need it for after Prepare() is for the frarg_instance_id which I can store separately. I removed it.


PS1, Line 68:   /// It is an error to delete a FInstanceState before Open()/GetNext() (depending
            :   /// on whether the fragment has a sink) indicated that execution is finished.
> This part is unclear to me. Maybe it'll get clearer while reading through t
If Cancel() is called before Prepare(), then it would just set the is_cancelled_ flag and return.
Prepare() on seeing this would return Status::CANCELLED.


Line 106:   /// Start execution. Call this prior to GetNext().
> How does calling Cancel() affect Open()?
Calling Cancel() will set the RuntimeState to a cancelled state. It's hard to guarantee if Open() will see it or not. It depends on what point of execution Open() is in when the Cancel() happens.


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 38: /// This object is created once the first fragment instance for a query arrives to the
> arrives at
Done


Line 41: /// The lifetime of this class is maintained by using an explicit reference count that is
> 'is dictated by an explicit ...' or 'is guided by an explicit ...'
Done


Line 67:   /// fragment instances arrive in a single RPC.
> do we need this at all right now?
No we don't yet need this. Removed it.


Line 73:   /// Delete all the FInstanceState objects. This is called when the QueryState
> 'delete all state'. (no need to list that state, in particular since what t
Done


PS1, Line 79: RegisterFInstance
> Should this be renamed to include the fact that it does more than Register 
The execution currently is done by the QueryExecMgr. I forgot to reword the comment above. So this class only creates and registers a fragment instance.


PS1, Line 79: RegisterFInstance
> unless registration is a concept that shows up elsewhere in this class, no 
This function would actually only create a fragment instance and register it with the finstance_exec_state_map_.

The function would return and the QueryExecMgr would do the execution, since that's what we decided.

The comment above was stale, sorry about that.


Line 81:   /// Cancels a plan fragment that is running asynchronously.
> remove references to plan fragments, unless that's what you mean.
Done


PS1, Line 81:   /// Cancels a plan fragment that is running asynchronously.
            :   Status CancelPlanFragment(const TUniqueId& finstance_id);
            : 
            :   /// Publishes a global runtime filter to a local fragment.
            :   Status PublishFilter(const TUniqueId& dst_instance_id, int32_t filter_id,
            :       const TBloomFilter& thrift_bloom_filter);
> these look like forwarding methods. Would it be possible to make GetFInstan
We could but that would offload all the logic of getting the correct QS and FIS from the query_id and frag_id, to the caller (who is ImpalaInternalService).


PS1, Line 111: QueryExecMgr
> mention it's a friend, so it's clear how it works without getters/setters?
Done


PS1, Line 111: controlled
> nit: used
Done


Line 112:   int32_t num_current_references_;
> why is num_active_finstances_ atomic, but not this?
num_active_finstances_ value is modified when not under a lock. It only keeps track of total fragments in flight. It can be used to display in the RuntimeProfile and WebUI, it has no purpose other than that now.

num_current_references_ will always be modified under a lock. It keeps track of total references to the QS (i.e. finstances and everything else)


Line 118:   /// by us and its lifetime lasts as long as the QueryState that owns it.
> we are that qs.
Done


Line 129:   FInstanceState* GetFInstanceState(
> single line
Done


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/runtime-state-new.h
File be/src/runtime/runtime-state-new.h:

Line 1: // Licensed to the Apache Software Foundation (ASF) under one
> could you change runtime-state.h instead (and move the content into a new f
Done


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

Line 31: /// Manages execution of queries by creating a QueryState object which owns
> Mention that exactly one instance of this exists per process? I think that 
Done


PS1, Line 36: passed
> Can we find a better word here? 'passed' seems to imply that it's created o
I used 'associated'.


Line 40: /// offer itself up for destruction when its reference count (i.e. number of references
> drop explanation of 'reference count'.
Done


Line 56:   /// Receives a remote fragment instance and creates or uses (if it already exists) a
> remove "receives a remote fragment instance" (what does that mean?)
I meant an incoming fragment instance from the coordinator. Reworded it now.


Line 58:   /// instance to it to be executed on a separate thread.
> should this really be called 'StartFInstance'?
Yes, that makes more sense. Changed it.


Line 61:   /// was registered (like low memory). Otherwise, returns OK, which guarantees that the
> you talk about the fragment being 'registered', but it's unclear what that 
You're right. I removed the use of the word 'register'. I reworded to use 'start' instead.


Line 74:   /// holds on to it. This is ensured by increasing the internal reference count of the
> don't explain implementation here.
Done


Line 84:   Status CancelFInstance(const TUniqueId& finstanceance_id);
> param spelling
Done


PS1, Line 95: class object
> what does "class object" mean?
Just meant 'class'. Done.


Line 102:   class Guard {
> move this into QueryState. also, this needs to be public.
Done


Line 105:       query_state_ = ExecEnv::GetQueryExecMgr()->GetQueryState(query_id);
> Adding DCHECK(ExecEnv::GetQueryExecMgr() != nullptr) might help spot issues
Done


Line 109:       ExecEnv::GetQueryExecMgr()->ReleaseQueryState(query_state_);
> Same DCHECK here, plus it might catch errors with the order in which object
Done


PS1, Line 119: exits
> nit: is destructed?
An accessor doesn't have to be destroyed to drop a reference. I reworded it to "drops its reference".


Line 140:   /// Runs in the fragment instance' execution thread. Calls ReleaseQueryState() to
> don't comment on where this runs (the caller is in charge of that)
Done


PS1, Line 138: /// Calls finstance_state->Prepare() and then finstance_state->state->Exec().
             :   /// The finstance_state must previously have been registered.
             :   /// Runs in the fragment instance' execution thread. Calls ReleaseQueryState() to
             :   /// decrease the refcount to the QueryState, which was reserved during fragment instance
             :   /// registration time.
             :   void ExecInstance(QueryState* query_state, FInstanceState* finstance_state);
> Should we move the whole method to the QueryState or even FInstanceState? I
I added a comment why it has to be here.


PS1, Line 138: /// Calls finstance_state->Prepare() and then finstance_state->state->Exec().
             :   /// The finstance_state must previously have been registered.
             :   /// Runs in the fragment instance' execution thread. Calls ReleaseQueryState() to
             :   /// decrease the refcount to the QueryState, which was reserved during fragment instance
             :   /// registration time.
             :   void ExecInstance(QueryState* query_state, FInstanceState* finstance_state);
> this is presumably just a wrapper around the execution functionality in qs 
This function does not start the thread. This is the function that gets run by the thread when it is spawned. Currently the thread gets started in StartFInstance().

I changed the name to ExecFinstanceAux().


-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 2:

(29 comments)

http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

PS1, Line 68:   /// on whether the fragment instance has a sink) indicated that execution is finished.
            :   ~FInstanceState() { }
> If Cancel() is called before Prepare(), then it would just set the is_cance
This feels still unclear to me, especially the meaning of "it is an error". What will be the consequence (crash, undef behavior, mem leak)? Is Prepare(), Cancel(), d'tor() allowed?


Line 106:   /// any rows).
> Calling Cancel() will set the RuntimeState to a cancelled state. It's hard 
If Open sees it, it should just do nothing, right?


http://gerrit.cloudera.org:8080/#/c/4301/2/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

PS2, Line 44: This
Does "This" mean, that the profile is created during the teardown? How would I retrieve it? If it just refers to the class, maybe reword "This class maintains..."?


Line 74:     return fragment_instance_id_;
single line?


PS2, Line 109: If this fragment instance has a sink
Does the caller still need to call GetNext() in this case to make the destruction legal? Maybe we should add a comment that outlines the overall lifetime of the instance state as a kind of state diagram (not sure if that would help)? If Cancel() and destruction are the only async signals, we can also make sure that their consequences are mentioned in each step of Prepare()->Open()->GetNext().


PS2, Line 119: underlying plan fragment instance
What is this? I couldn't find it in the member variables. Should it say "Closes this fragment instance state"? Do I have to call this after calling Cancel()? Is it legal to call it before Open()/GetNext()?


PS2, Line 123: If called concurrently with Prepare(), will wait for
             :   /// Prepare() to finish in order to properly tear down Prepare()'d state.
Another bit of the state machine that we seem to implement. Still not sure if we should collect them centrally. Sailesh, Marcel, what's your opinion?


Line 160:   /// set in ReportStatusCb();
Is this only ever set there?


PS2, Line 213: sent to this
"sent by this"? Or is this really the incoming sink? Maybe rename to input_sink_ then?


PS2, Line 285: Does not set status_.
There's no status_ field. Maybe remove the comment altogether?


Line 300: 
nit: remove blank line


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS1, Line 81:     }
            : 
            :     QueryState* query_state() { return query_state_; }
            :    private:
            :     QueryState* query_state_;
            :     DISALLOW_COPY_AND_ASSIGN(Guard);
> We could but that would offload all the logic of getting the correct QS and
Ah, ok. Thanks for the explanation.


Line 112: 
> num_active_finstances_ value is modified when not under a lock. It only kee
Maybe add a comment like "protected by xyz_lock_" and move it under that lock together with the other protected fields?


http://gerrit.cloudera.org:8080/#/c/4301/2/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 65:   ///   1. If the QueryState exists, it will not be destroyed in the scope of
nit: line wrapping


Line 84:    private:
nit: newline above private: (not sure)?


PS2, Line 100: CancelPlanFragment
CancelFinstance?


PS2, Line 103: dst_instance_id
dst_finstance_id?


Line 142:   /// Returns a pointer to the FInstanceState if one can be found for the
Maybe say "non-owning pointer"?


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

PS1, Line 36: 
> I used 'associated'.
Good


http://gerrit.cloudera.org:8080/#/c/4301/2/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

PS2, Line 21: boost
std/unordered_map?


Line 73:   QueryState* GetQueryState(const TUniqueId& query_id);
Do these need to be public at all? Can we make them private and just use QueryState::Guard? If we really need to move these pointers between scopes, then adding and using QueryState::Guard::move() might be more explicit and hence less error prone.


PS2, Line 84: fragments
fragment instances? Or really to all instances of a fragment? Maybe add  (sic) or some other wording to make this more clear.


PS2, Line 85: dst_instance_id
then this might need to become dst_finstance_id.


PS2, Line 96: boost
Can we use std::unordered_map?


Line 105:   QueryState* FindOrCreateQueryState(const TQueryCtx& query_ctx);
Will the caller have to call ReleaseQueryState on it?


Line 107:   /// This function is invoked once query_state is no longer in use, and this
nit: "...once a query_state..."


PS2, Line 112: finstance_state->state->Exec()
finstance_state doesn't seem to have a 'state' field. Should this be finstance_state->Open() or Prepare()?


PS2, Line 116:   /// TODO: Move this responsibility outside this class once we have per-query RPC. This
             :   /// is currently called from here because we want the recounts to be managed in the QEM.
Isn't this also called here because we want the QEM to start the threads? If so, remove the TODO and just keep this function here.


PS2, Line 117: recounts
typo


-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

PS1, Line 39: fragment
> finstance
in comments, please spell it out (as fragment instance)


Line 62:       exec_params_(params) {
do we need to copy these? that's a fairly large struct. you already get them in Prepare(), maybe that's enough (or alternatively, pass them again into Open())


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 38: /// This object is created once the first fragment instance for a query arrives to the
arrives at

don't comment on the qem here. transfer those comments to the qem, if that's needed.


Line 41: /// The lifetime of this class is maintained by using an explicit reference count that is
'is dictated by an explicit ...' or 'is guided by an explicit ...'


Line 67:   /// fragment instances arrive in a single RPC.
do we need this at all right now?


Line 73:   /// Delete all the FInstanceState objects. This is called when the QueryState
'delete all state'. (no need to list that state, in particular since what that is will be changing soon.)

don't comment on the caller.

*do* comment on externally observable behavior (i'm assuming it's not legal to call any other function of this class  after this call).


PS1, Line 79: RegisterFInstance
> Should this be renamed to include the fact that it does more than Register 
unless registration is a concept that shows up elsewhere in this class, no need to mention it here (esp. not in the name).

i'd also avoid 'schedule', we already use that verb in a different context (and i'm not sure what else we're trying to express other than 'execute' or 'start fragment instance').


Line 81:   /// Cancels a plan fragment that is running asynchronously.
remove references to plan fragments, unless that's what you mean.


Line 112:   int32_t num_current_references_;
why is num_active_finstances_ atomic, but not this?


Line 118:   /// by us and its lifetime lasts as long as the QueryState that owns it.
we are that qs.


Line 129:   FInstanceState* GetFInstanceState(
single line


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/runtime-state-new.h
File be/src/runtime/runtime-state-new.h:

Line 1: // Licensed to the Apache Software Foundation (ASF) under one
could you change runtime-state.h instead (and move the content into a new file before check-in)? it's hard to see the diff.


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

Line 40: /// offer itself up for destruction when its reference count (i.e. number of references
drop explanation of 'reference count'.


Line 56:   /// Receives a remote fragment instance and creates or uses (if it already exists) a
remove "receives a remote fragment instance" (what does that mean?)


Line 58:   /// instance to it to be executed on a separate thread.
should this really be called 'StartFInstance'?


Line 61:   /// was registered (like low memory). Otherwise, returns OK, which guarantees that the
you talk about the fragment being 'registered', but it's unclear what that means (because there are no related functions to that in this class - this isn't a concept visible to the caller of this class)


Line 74:   /// holds on to it. This is ensured by increasing the internal reference count of the
don't explain implementation here.

but: it is a requirement that every caller of GetQueryState() eventually calls ReleaseQueryState()


Line 84:   Status CancelFInstance(const TUniqueId& finstanceance_id);
param spelling


Line 102:   class Guard {
move this into QueryState. also, this needs to be public.


Line 140:   /// Runs in the fragment instance' execution thread. Calls ReleaseQueryState() to
don't comment on where this runs (the caller is in charge of that)


PS1, Line 138: /// Calls finstance_state->Prepare() and then finstance_state->state->Exec().
             :   /// The finstance_state must previously have been registered.
             :   /// Runs in the fragment instance' execution thread. Calls ReleaseQueryState() to
             :   /// decrease the refcount to the QueryState, which was reserved during fragment instance
             :   /// registration time.
             :   void ExecInstance(QueryState* query_state, FInstanceState* finstance_state);
> Should we move the whole method to the QueryState or even FInstanceState? I
this is presumably just a wrapper around the execution functionality in qs or fis which handles the 'release' of the qs at the end.

name it accordingly (as an example, if this is an auxiliary function for the function that starts the actual thread, you could name it '..Aux').


-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has abandoned this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Abandoned

Review is continued in https://gerrit.cloudera.org/#/c/4418/

-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has uploaded a new patch set (#4).

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................

IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

This patch is a header preview of a query wide execution state for
a backend.

None of the existing headers are replaced, they are just added in
parallel to existing headers and not considered for compilation.

Changes in design are as follows:

FragmentExecState -> FInstanceState
  This is now what was previously the FragmentExecState and the
  PlanFragmentExecutor together. The ReportStatusCallback is removed
  and now SendReport() is in charge of sending the report directly.
  The callback was in place originally only so the PlanFragmentExecutor
  could call into the FragmentExecState without a reference. This is
  not necessary anymore.

PlanFragmentExecutor (collapsed into FInstanceState)

FragmentMgr -> QueryExecMgr
  The QueryExecMgr now receives incoming fragments, creates a new
  QueryState if it is the first fragment to arrive for that
  query, and passes this received fragment along to the QueryState.
  This class is responsible for cleaning up the QueryState.

QueryState (new):
  This is the query wide state for the backend. It is initialized by
  the first fragment to arrive to the QueryExecMgr. This class is now
  responsible for creating FInstanceStates and executing them.
  Its life is protected by a ref counting mechanism and it is
  scheduled for destruction once the ref count reaches zero.
  Once scheduled for destruction, a thread in the QueryExecMgr will
  destroy the QueryState.
  Every user of the QueryState must access it within the scope of a
  'Guard' object which guarantees ref count incrementing and
  decrementing at entry/exit of the scope.

Excludes:
 - All .cc files to make use of changes made.

Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
---
A be/src/runtime/fragment-instance-state.h
A be/src/runtime/query-state.h
M be/src/runtime/runtime-state.h
A be/src/service/query-exec-mgr.h
4 files changed, 586 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/4301/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

(35 comments)

Please see my inline comments. Regarding the runtime-state I can see how it would be more cache efficient the way it is right now. However I think it is worth cleaning it up, and then addressing any subsequent cache-related performance issues by re-grouping and re-ordering members of QueryState.

http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

Line 38: /// FInstanceState handles all aspects of the execution of a single plan
nit: line wrapping looks a bit off in this comment (and maybe elsewhere).


PS1, Line 39: fragment
finstance


PS1, Line 44: fragment
finstance?


PS1, Line 68:   /// It is an error to delete a FInstanceState before Open()/GetNext() (depending
            :   /// on whether the fragment has a sink) indicated that execution is finished.
This part is unclear to me. Maybe it'll get clearer while reading through the code. What if Cancel() is called before Prepare() even?


Line 106:   /// Start execution. Call this prior to GetNext().
How does calling Cancel() affect Open()?


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS1, Line 79: RegisterFInstance
Should this be renamed to include the fact that it does more than Register the instance (i.e. call methods on it)? ExecuteFInstance()? ScheduleFinstanceExecution()? RegisterAndExecuteFInstance()?


PS1, Line 81:   /// Cancels a plan fragment that is running asynchronously.
            :   Status CancelPlanFragment(const TUniqueId& finstance_id);
            : 
            :   /// Publishes a global runtime filter to a local fragment.
            :   Status PublishFilter(const TUniqueId& dst_instance_id, int32_t filter_id,
            :       const TBloomFilter& thrift_bloom_filter);
these look like forwarding methods. Would it be possible to make GetFInstanceState() public and let the caller call them?


PS1, Line 111: controlled
nit: used


PS1, Line 111: QueryExecMgr
mention it's a friend, so it's clear how it works without getters/setters?


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/runtime-state-new.h
File be/src/runtime/runtime-state-new.h:

PS1, Line 74: cgroup
should we explain what these parameters do? It's not clear to me at all what cgroup does.


PS1, Line 85: fragment
finstance?


PS1, Line 157: one
nit: maybe add 1 to...


Line 164:   /// Returns runtime state profile
I don't think these getter comments are needed here - or we should add them everywhere in the class.


Line 174:   /// Returns codegen_ in 'codegen'. If 'initialize' is true, codegen_ will be created if
Reword both sentences to "if 'initialize' is true and codegen_ has not been initialized, it will be created and returned via 'codegen'." and "If 'initialize' is false and codegen_ has not been initialized, then 'codegen' will be set to NULL."


PS1, Line 200: /// Log an error that will be sent back to the coordinator based on an instance of the
             :   /// ErrorMsg class. The runtime state aggregates log messages based on type with one
             :   /// exception: messages with the GENERAL type are not aggregated but are kept
             :   /// individually.
             :   bool LogError(const ErrorMsg& msg, int vlog_level = 1);
             : 
             :   /// Returns true if the error log has not reached max_errors_.
             :   bool LogHasSpace() {
             :     boost::lock_guard<SpinLock> l(error_log_lock_);
             :     return error_log_.size() < query_options().max_errors;
             :   }
             : 
             :   /// Returns the error log lines as a string joined with '\n'.
             :   std::string ErrorLog();
             : 
             :   /// Copy error_log_ to *errors
             :   void GetErrors(ErrorLogMap* errors);
             : 
             :   /// Append all accumulated errors since the last call to this function to new_errors to
             :   /// be sent back to the coordinator
             :   void GetUnreportedErrors(ErrorLogMap* new_errors);
             : 
             :   /// Given an error message, determine whether execution should be aborted and, if so,
             :   /// return the corresponding error status. Otherwise, log the error and return
             :   /// Status::OK(). Execution is aborted if the ABORT_ON_ERROR query option is set to
             :   /// true or the error is not recoverable and should be handled upstream.
             :   Status LogOrReturnError(const ErrorMsg& message);
Can these go into an own struct/class?


PS1, Line 231: RuntimeProfile::Counter* total_cpu_timer() { return total_cpu_timer_; }
             :   RuntimeProfile::Counter* total_storage_wait_timer() {
             :     return total_storage_wait_timer_;
             :   }
             :   RuntimeProfile::Counter* total_network_send_timer() {
             :     return total_network_send_timer_;
             :   }
             :   RuntimeProfile::Counter* total_network_receive_timer() {
             :     return total_network_receive_timer_;
             :   }
Can the counters go into an own struct/class?


Line 252:   void LogMemLimitExceeded(const MemTracker* tracker, int64_t failed_allocation_size);
This could to into the Log struct/class (above)


Line 269:   Status CheckQueryState();
Can this be const? If not a comment should explain why.


Line 279:   Status Init(ExecEnv* exec_env);
Can this be combined with other init methods to simplify initialization?


Line 290:   static const int DEFAULT_BATCH_SIZE = 1024;
comment? What is the unit?


PS1, Line 292:   FInstanceState* finstance_state_;
             :   DescriptorTbl* desc_tbl_;
             :   boost::scoped_ptr<ObjectPool> obj_pool_;
comments?


PS1, Line 300: ErrorLogMap
type and name mismatch, why is this a Map?


PS1, Line 309:   ExecEnv* exec_env_;
             :   boost::scoped_ptr<LlvmCodeGen> codegen_;
Newline above. Comments?


PS1, Line 312:   /// True if this fragment should force codegen for expr evaluation.
This comment does not match the one at the setter method (force vs use).


PS1, Line 326:   RuntimeProfile profile_;
             : 
             :   /// Total CPU time (across all threads), including all wait times.
             :   RuntimeProfile::Counter* total_cpu_timer_;
             : 
             :   /// Total time waiting in storage (across all threads)
             :   RuntimeProfile::Counter* total_storage_wait_timer_;
             : 
             :   /// Total time spent sending over the network (across all threads)
             :   RuntimeProfile::Counter* total_network_send_timer_;
             : 
             :   /// Total time spent receiving over the network (across all threads)
             :   RuntimeProfile::Counter* total_network_receive_timer_;
These could go to into an own struct/class.


PS1, Line 341: must be released after the instance_mem_tracker_
How do we guarantee this if an instance of this class is the last one holding on to query_mem_tracker_?


Line 360:   /// Reader contexts that need to be closed when the fragment is closed.
Who creates them? Who closes them?


PS1, Line 381:   /// prohibit copies
Use DISALLOW_COPY_AND_ASSIGN instead?


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

Line 31: /// Manages execution of queries by creating a QueryState object which owns
Mention that exactly one instance of this exists per process? I think that was a previous comment by Marcel, too: https://gerrit.cloudera.org/#/c/3817/11/be/src/service/query-exec-mgr.h@31


PS1, Line 36: passed
Can we find a better word here? 'passed' seems to imply that it's created outside and then just registered.


PS1, Line 95: class object
what does "class object" mean?


Line 105:       query_state_ = ExecEnv::GetQueryExecMgr()->GetQueryState(query_id);
Adding DCHECK(ExecEnv::GetQueryExecMgr() != nullptr) might help spot issues with incomplete ExecEnv in test setups.


Line 109:       ExecEnv::GetQueryExecMgr()->ReleaseQueryState(query_state_);
Same DCHECK here, plus it might catch errors with the order in which objects are destroyed during process shutdown.


PS1, Line 119: exits
nit: is destructed?


PS1, Line 138: /// Calls finstance_state->Prepare() and then finstance_state->state->Exec().
             :   /// The finstance_state must previously have been registered.
             :   /// Runs in the fragment instance' execution thread. Calls ReleaseQueryState() to
             :   /// decrease the refcount to the QueryState, which was reserved during fragment instance
             :   /// registration time.
             :   void ExecInstance(QueryState* query_state, FInstanceState* finstance_state);
Should we move the whole method to the QueryState or even FInstanceState? If not, can we explain in a comment why not?


-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has uploaded a new patch set (#2).

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................

IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

This patch is a header preview of a query wide execution state for
a backend.

None of the existing headers are replaced, they are just added in
parallel to existing headers and not considered for compilation.

Changes in design are as follows:

FragmentExecState -> FInstanceState
  This is now what was previously the FragmentExecState and the
  PlanFragmentExecutor together. The ReportStatusCallback is removed
  and now SendReport() is in charge of sending the report directly.
  The callback was in place originally only so the PlanFragmentExecutor
  could call into the FragmentExecState without a reference. This is
  not necessary anymore.

PlanFragmentExecutor (collapsed into FInstanceState)

FragmentMgr -> QueryExecMgr
  The QueryExecMgr now receives incoming fragments, creates a new
  QueryState if it is the first fragment to arrive for that
  query, and passes this received fragment along to the QueryState.
  This class is responsible for cleaning up the QueryState.

QueryState (new):
  This is the query wide state for the backend. It is initialized by
  the first fragment to arrive to the QueryExecMgr. This class is now
  responsible for creating FInstanceStates and executing them.
  Its life is protected by a ref counting mechanism and it is
  scheduled for destruction once the ref count reaches zero.
  Once scheduled for destruction, a thread in the QueryExecMgr will
  destroy the QueryState.
  Every user of the QueryState must access it within the scope of a
  'Guard' object which guarantees ref count incrementing and
  decrementing at entry/exit of the scope.

Excludes:
 - All .cc files to make use of changes made.

Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
---
A be/src/runtime/fragment-instance-state.h
A be/src/runtime/query-state.h
M be/src/runtime/runtime-state.h
A be/src/service/query-exec-mgr.h
4 files changed, 587 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/4301/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

This addresses all the header comments in the non-header only patch:
(i.e. https://gerrit.cloudera.org/#/c/3817/)

I will post a compilable version of this patch in that CR. That will be a lot more tricky mainly because:
 - PlanFragmentExecutor and FragmentExecState are now one class. This makes the coordinator code tricky because the Coordinator had special code which depended only on the PFE and not on the FES; and has redundant FES functionality to manage the PFE.

@Lars: This patch can be used as a guide for the per-query RPC (IMPALA-2550), and also for working out the final kinks in the headers.

-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

> > (1 comment)
 > 
 > @Matt:
 > 
 > The RuntimeState has members that are relatively more frequently
 > accessed than QueryState or FInstanceState members.
 > So to have good cache temporal locality, we could leave it as a
 > separate class.
 > 
 > But to your point, it is a mess. Now that more people are bringing
 > this up, I feel having a RuntimeState per query wouldn't be a bad
 > idea. We can move the fragment instance specific stuff to the
 > FInstanceState and leave the query specific stuff in RuntimeState.
 > 
 > The QueryState on the other hand would just contain state required
 > for execution which relatively isn't accessed that frequently.
 > 
 > Thoughts?

Oh right, now I remember that cache locality had been brought up, but I think we can probably work around it. (I guess one of the issues is that there are larger structures, e.g. thrift query/fragment descriptors? Those we can store on the heap, e.g. via a scoped_ptr/unique_ptr on the QueryState/FInstanceState.) IMO the RuntimeState is pretty gross and messes up your nice new abstractions, so I think it'd be worth us chatting more about. Perhaps we can chat in person next wk?

-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4301/1//COMMIT_MSG
Commit Message:

PS1, Line 47: runtime-state-new.h is what runtime-state.h would look like, and is
            : just a temporary name.
Wouldn't we want to kill RuntimeState rather than update it to use the new QueryState/FInstanceState? I'd think that everything in RuntimeState could fit into either QueryState or FInstanceState. That would be really nice, because then the scope of those members is clear (whereas today RuntimeState is a mess of finstance & query state stuff, and even some wrappers around ExecEnv which may not be needed or maybe could just be accessed via QueryState). Sorry if I missed a discussion about this.


-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

Can you try to find a time for a hangout so I can join the chat? Don't worry too much if it doesn't work out. In that case maybe you could send around a quick summary of what you discussed. Thanks

-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

> (1 comment)

@Matt:

The RuntimeState has members that are relatively more frequently accessed than QueryState or FInstanceState members.
So to have good cache temporal locality, we could leave it as a separate class.

But to your point, it is a mess. Now that more people are bringing this up, I feel having a RuntimeState per query wouldn't be a bad idea. We can move the fragment instance specific stuff to the FInstanceState and leave the query specific stuff in RuntimeState.

The QueryState on the other hand would just contain state required for execution which relatively isn't accessed that frequently.

Thoughts?

-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has uploaded a new patch set (#3).

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................

IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

This patch is a header preview of a query wide execution state for
a backend.

None of the existing headers are replaced, they are just added in
parallel to existing headers and not considered for compilation.

Changes in design are as follows:

FragmentExecState -> FInstanceState
  This is now what was previously the FragmentExecState and the
  PlanFragmentExecutor together. The ReportStatusCallback is removed
  and now SendReport() is in charge of sending the report directly.
  The callback was in place originally only so the PlanFragmentExecutor
  could call into the FragmentExecState without a reference. This is
  not necessary anymore.

PlanFragmentExecutor (collapsed into FInstanceState)

FragmentMgr -> QueryExecMgr
  The QueryExecMgr now receives incoming fragments, creates a new
  QueryState if it is the first fragment to arrive for that
  query, and passes this received fragment along to the QueryState.
  This class is responsible for cleaning up the QueryState.

QueryState (new):
  This is the query wide state for the backend. It is initialized by
  the first fragment to arrive to the QueryExecMgr. This class is now
  responsible for creating FInstanceStates and executing them.
  Its life is protected by a ref counting mechanism and it is
  scheduled for destruction once the ref count reaches zero.
  Once scheduled for destruction, a thread in the QueryExecMgr will
  destroy the QueryState.
  Every user of the QueryState must access it within the scope of a
  'Guard' object which guarantees ref count incrementing and
  decrementing at entry/exit of the scope.

Excludes:
 - All .cc files to make use of changes made.

Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
---
A be/src/runtime/fragment-instance-state.h
A be/src/runtime/query-state.h
M be/src/runtime/runtime-state.h
A be/src/service/query-exec-mgr.h
4 files changed, 585 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/4301/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 4:

(27 comments)

Thanks for the review Lars. I've made the changes.

http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

PS1, Line 68:   /// on whether the fragment instance has a sink) indicated that execution is finished.
            :   ~FInstanceState() { }
> This feels still unclear to me, especially the meaning of "it is an error".
It will be undefined behavior because if execution is not complete, the FInstanceState will be expected to be present in the map.


Line 106:   /// is > 0 and a callback was specified in the c'tor.
> If Open sees it, it should just do nothing, right?
Yes, Open() does nothing special if Cancel() happens.


http://gerrit.cloudera.org:8080/#/c/4301/2/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

PS2, Line 44: This
> Does "This" mean, that the profile is created during the teardown? How woul
Yes, it means this class. Done.


Line 74:   const TNetworkAddress& coord_address() const { return query_ctx_.coord_address; }
> single line?
Done


PS2, Line 109: us Open();
> Does the caller still need to call GetNext() in this case to make the destr
That sounds like a good idea. I unfortunately don't know all the possibilities well enough to draw the diagram. I will sit down with someone who does and add this state diagram.


PS2, Line 119: 
> What is this? I couldn't find it in the member variables. Should it say "Cl
That comment was copied over from the old PFE. Changed it.

This is called in the destructor, so it doesn't need to be explicitly called after Cancel().

It doesn't seem legal to call before Open()/GetNext() but I'm not sure. Will confirm and update.


PS2, Line 123: 
             :   /// Cancel() may be called more than once. Calls after the first will hav
> Another bit of the state machine that we seem to implement. Still not sure 
I don't understand what you mean by "collect them centrally". Could you clarify?


Line 160: 
> Is this only ever set there?
No, it was previously. But now it will be set in multiple places. So I've removed the comment.


PS2, Line 213: k> sink_;
> "sent by this"? Or is this really the incoming sink? Maybe rename to input_
I've rephrased this. "sent to/by this" leaves room for ambiguity.


PS2, Line 285:  sink, the sink will
> There's no status_ field. Maybe remove the comment altogether?
It should be exec_status_. Since we've merged both the classes, I've gotten rid of status_ and kept only exec_status_.


Line 300: 
> nit: remove blank line
Done


http://gerrit.cloudera.org:8080/#/c/4301/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 112:   }
> Maybe add a comment like "protected by xyz_lock_" and move it under that lo
Done


http://gerrit.cloudera.org:8080/#/c/4301/2/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 65:   ///   1. If the QueryState exists, it will not be destroyed in the scope of this class
> nit: line wrapping
Done


Line 84: 
> nit: newline above private: (not sure)?
Done


PS2, Line 100: cels a plan fragme
> CancelFinstance?
Done


PS2, Line 103: to a local frag
> dst_finstance_id?
Done


Line 142:   FInstanceStateMap finstance_exec_state_map_;
> Maybe say "non-owning pointer"?
This pointer is owned by QueryState however.


http://gerrit.cloudera.org:8080/#/c/4301/2/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

PS2, Line 21: unord
> std/unordered_map?
Done


Line 73:   /// Forwards a global runtime filter to the corresponding QueryState to publish to
> Do these need to be public at all? Can we make them private and just use Qu
This doesn't need to be public now. Made both this and ReleaseQS() private.


PS2, Line 84: eryState 
> fragment instances? Or really to all instances of a fragment? Maybe add  (s
Since we have the instance_id as the parameter, this should be fragment instance. Done.


PS2, Line 85: 
> then this might need to become dst_finstance_id.
Done


PS2, Line 96: s on 
> Can we use std::unordered_map?
Done


Line 105:   /// Caller must call ReleaseQueryExecState() once done with this.
> Will the caller have to call ReleaseQueryState on it?
Yes. Done.


Line 107: 
> nit: "...once a query_state..."
Done


PS2, Line 112: 
> finstance_state doesn't seem to have a 'state' field. Should this be finsta
It should become Open(). Changed it.


PS2, Line 117: once we 
> typo
Done


PS2, Line 116:   /// reserved during fragment instance registration time.
             :   /// TODO: Move this responsibility outside this class once we have per-query RPC. This
> Isn't this also called here because we want the QEM to start the threads? I
Yes, but we want the QEM to start the threads because we want the QEM to manage the refcounts.
We will later move this responsibility to the QS.


-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.

Posted by "Sailesh Mukil (Code Review)" <ge...@cloudera.org>.
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: PARALLEL HEADERS ONLY: Introduce query-wide execution state.
......................................................................


Patch Set 1:

> > > (1 comment)
 > >
 > > @Matt:
 > >
 > > The RuntimeState has members that are relatively more frequently
 > > accessed than QueryState or FInstanceState members.
 > > So to have good cache temporal locality, we could leave it as a
 > > separate class.
 > >
 > > But to your point, it is a mess. Now that more people are
 > bringing
 > > this up, I feel having a RuntimeState per query wouldn't be a bad
 > > idea. We can move the fragment instance specific stuff to the
 > > FInstanceState and leave the query specific stuff in
 > RuntimeState.
 > >
 > > The QueryState on the other hand would just contain state
 > required
 > > for execution which relatively isn't accessed that frequently.
 > >
 > > Thoughts?
 > 
 > Oh right, now I remember that cache locality had been brought up,
 > but I think we can probably work around it. (I guess one of the
 > issues is that there are larger structures, e.g. thrift
 > query/fragment descriptors? Those we can store on the heap, e.g.
 > via a scoped_ptr/unique_ptr on the QueryState/FInstanceState.) IMO
 > the RuntimeState is pretty gross and messes up your nice new
 > abstractions, so I think it'd be worth us chatting more about.
 > Perhaps we can chat in person next wk?

Sure! Let's chat in person next week.

-- 
To view, visit http://gerrit.cloudera.org:8080/4301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If58292a5c377660d97e7e7cc0c3122328eba72ed
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sa...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>
Gerrit-HasComments: No