You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Joe McDonnell (Code Review)" <ge...@cloudera.org> on 2017/06/05 16:55:52 UTC

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Joe McDonnell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7079

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................

IMPALA-1575: Yield admission control resources at query end

Currently, a query does not release admission control
resources until the client calls UnregisterQuery. Slow
clients can hold admission control resources even after
the query is finished. Specifically, in the following
cases, the query is completed, but the client must
still call UnregisterQuery to release admission control
resources:

1. The query encounters and error and fails
2. The query is ended due to the idle query timeout
3. The query reaches eos (or the DML completes)

This change releases admission control resources as
soon as the query is complete rather than waiting
for the client.

Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
3 files changed, 40 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/7079/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has uploaded a new patch set (#2).

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................

IMPALA-1575: Yield admission control resources at query end

Currently, a query does not release admission control
resources until the client calls UnregisterQuery. Slow
clients can hold admission control resources even after
the query has reached a state where it will no longer
return any rows. Specifically, in the following
cases, the query is completed, but the client must
still call UnregisterQuery to release admission control
resources:

1. The query encounters an error and fails
2. The query is cancelled due to the idle query timeout
3. The query reaches eos (or the DML completes)
4. The client cancels the query without closing the query

This change releases admission control resources as
soon as the query reaches a state where it cannot
return any rows rather than waiting for the client
to explicitly end the query.

When cancelling a query, the coordinator asynchronously
notifies all fragment instances to cancel. The
coordinator does not wait for the fragment instances
to respond, so the cancel case can release admission
control resources while some fragment instances may
continue to run until the cancel takes effect. The
concern with this behavior is that the fragment
instances may continue to use memory and cause subsequent
admitted queries to fail. This is already possible today,
as a client can directly close a running query (which
cancels the query and unregisters the query immediately).
For example, the session idle timeout does this. However,
this change expands the circumstances where this can
happen.

Admission control based on mem_limit operates differently.
It relies on the reported memory usage of each
QueryState to generate a cumulative memory usage across
all of the instances. Admission control's behavior is
determined by when the QueryState releases its memory.

The existing behavior releases the query's memory on the
destruction of the QueryState, which occurs when the
query is unregistered. This matches the existing behavior
for admission control prior to this change. To support
the new behavior for mem_limit, the QueryState will now
release query resources when the last fragment instance
terminates. This unregisters the query memory tracker,
which results in the admission control memory resources
being freed.

To test both aspects of this change, the admission control
test (custom_cluster/test_admission_controller.py) has
been modified to use four different modes of ending a
query: client cancelling a query, the query hitting an
idle timeout, the query reaching eos, and the client
closing the query. The test uses a mix of all four.
After the query ends, all clients wait for the test
to complete before closing the query or closing the
connection. This ensures that the admission control
decisions are based entirely on the query end behavior.
This test works for both query admission control
and mem_limit admission control.

Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M tests/custom_cluster/test_admission_controller.py
7 files changed, 158 insertions(+), 102 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/7079/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................


Patch Set 1:

Testing shows that the current approach doesn't work for mem_limit based admission control. The reserved memory is not released at the new location, so TestAdmissionControllerStress::test_mem_limit fails.

From what I can tell, the reserved memory is released when QueryState::ReleaseResources is called. The codepath that calls this is ImpalaServer::UnregisterQuery-> ClientRequestState::~ClientRequestState -> Coordinator::~Coordinator -> QueryExecManager::ReleaseQueryState.

-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has abandoned this change. ( http://gerrit.cloudera.org:8080/7079 )

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................


Abandoned

Abandoning this patch since it's no longer up for review. I'll post updated patches.
-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-Change-Number: 7079
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@gmail.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................


Patch Set 1:

(7 comments)

I think there were going to be some functional tests?

http://gerrit.cloudera.org:8080/#/c/7079/1//COMMIT_MSG
Commit Message:

PS1, Line 12: query is finished
query has transitioned to a state in which it will no longer return any more rows.


PS1, Line 17: and
an error


PS1, Line 18: ended
cancelled


Line 19: 3. The query reaches eos (or the DML completes)
Also: The query is cancelled by a client but not closed. (E.g. call HS2::CancelOperation() but not HS2::CloseOperation())


PS1, Line 22: complete
avoid this word, e.g. use the wording I suggested above


Line 24: 
call out the limitations, i.e. this doesn't mean all backends have completed terminated, and while this is already possible today, this can make this scenario more likely.


Line 25: Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
comment on testing? I assume we still need to do the cluster testing.


-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7079/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

Line 81: /// ImpalaServer::CancelInternal()/ClientRequestState::Cancel().)
augment comment


-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/7079/1//COMMIT_MSG
Commit Message:

PS1, Line 12: query is finished
> query has transitioned to a state in which it will no longer return any mor
Done


PS1, Line 17: and
> an error
Done


PS1, Line 18: ended
> cancelled
Done


Line 19: 3. The query reaches eos (or the DML completes)
> Also: The query is cancelled by a client but not closed. (E.g. call HS2::Ca
Done


PS1, Line 22: complete
> avoid this word, e.g. use the wording I suggested above
Done


Line 24: 
> call out the limitations, i.e. this doesn't mean all backends have complete
Done


Line 25: Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
> comment on testing? I assume we still need to do the cluster testing.
Added a test and a description of the changes to the test.


http://gerrit.cloudera.org:8080/#/c/7079/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

Line 81: /// ImpalaServer::CancelInternal()/ClientRequestState::Cancel().)
> augment comment
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7079/3/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS3, Line 82:   if (desc_tbl_ != nullptr) desc_tbl_->ReleaseResources();
This may need to be in ~QueryState(). I'm investigating how desc_tbl_ is used. So far, testing hasn't revealed any issue with having this here.


-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@gmail.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has uploaded a new patch set (#3).

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................

IMPALA-1575: Yield admission control resources at query end

Currently, a query does not release admission control
resources until the client calls UnregisterQuery. Slow
clients can hold admission control resources even after
the query has reached a state where it will no longer
return any rows. Specifically, in the following
cases, the query is completed, but the client must
still call UnregisterQuery to release admission control
resources:

1. The query encounters an error and fails
2. The query is cancelled due to the idle query timeout
3. The query reaches eos (or the DML completes)
4. The client cancels the query without closing the query

This change releases admission control resources as
soon as the query reaches a state where it cannot
return any rows rather than waiting for the client
to explicitly end the query.

When cancelling a query, the coordinator asynchronously
notifies all fragment instances to cancel. The
coordinator does not wait for the fragment instances
to respond, so the cancel case can release admission
control resources while some fragment instances may
continue to run until the cancel takes effect. The
concern with this behavior is that the fragment
instances may continue to use memory and cause subsequent
admitted queries to fail. This is already possible today,
as a client can directly close a running query (which
cancels the query and unregisters the query immediately).
For example, the session idle timeout does this. However,
this change expands the circumstances where this can
happen.

Admission control based on mem_limit operates differently.
It relies on the reported memory usage of each
QueryState to generate a cumulative memory usage across
all of the instances. Admission control's behavior is
determined by when the QueryState releases its memory.

The existing behavior releases the query's memory on the
destruction of the QueryState, which occurs when the
query is unregistered. This matches the existing behavior
for admission control prior to this change. To support
the new behavior for mem_limit, the QueryState will now
release query resources when the last fragment instance
terminates. This unregisters the query memory tracker,
which results in the admission control memory resources
being freed. Backend tests do not spawn fragment instance
threads and are not subject to the new semantics for
releasing resources. Instead, the code detects this case
and releases resources in the old location.

To test both aspects of this change, the admission control
test (custom_cluster/test_admission_controller.py) has
been modified to use four different modes of ending a
query: client cancelling a query, the query hitting an
idle timeout, the query reaching eos, and the client
closing the query. The test uses a mix of all four.
After the query ends, all clients wait for the test
to complete before closing the query or closing the
connection. This ensures that the admission control
decisions are based entirely on the query end behavior.
This test works for both query admission control
and mem_limit admission control.

Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M tests/custom_cluster/test_admission_controller.py
7 files changed, 170 insertions(+), 105 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/7079/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@gmail.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7079/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS1, Line 889: WaitForBackendCompletion())
note that on this path we are waiting for all backends to finish already.


Line 1114:   }
Does Coordinator::ReleaseResources() really mean that all finstances' resources have been freed?  My interpretation was that this method was only for freeing resources for the coordinator object itself, and so I don't see why it makes sense tie admission control to it.


-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7079/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 892:     // release resources here, since we won't be fetching more result rows
explain why this needs to happen after computequerysummary


-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7079/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 1114:   }
> Does Coordinator::ReleaseResources() really mean that all finstances' resou
you could view the admission control "tokens" as a resource, ie, there's a finite amount of it and it's accounted for.

but the comment is a bit misleading, since the admission controller doesn't allocate actual compute resources.

ReleaseResources() doesn't have any control over the instance resources.


-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7079/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 1086: void Coordinator::ReleaseResources() {
I'm having trouble understanding whether this runs before or after QueryState::ReleaseResources(). It seems like we to release memory tracked by filter_mem_tracker_ before QueryState::ReleaseResources(), but then we need to release the admission control resources after QueryState::ReleaseResources() is called.

There's a good chance I'm misunderstanding something but that's what I'm stuck on right now.


-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@gmail.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes