You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@impala.apache.org by "Lars Volker (JIRA)" <ji...@apache.org> on 2019/04/13 04:43:00 UTC

[jira] [Created] (IMPALA-8411) A race during query cancellation by a client can make the client see it as still running

Lars Volker created IMPALA-8411:
-----------------------------------

             Summary: A race during query cancellation by a client can make the client see it as still running
                 Key: IMPALA-8411
                 URL: https://issues.apache.org/jira/browse/IMPALA-8411
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 3.3.0
            Reporter: Lars Volker


I observed that under load the check in https://gerrit.cloudera.org/#/c/12530/8/tests/query_test/test_cancellation.py@229 would sometimes fail. After some poking it appears that the client's cancellation request does not update the operation state (because "cause", the third parameter to Cancelnternal, defaults to nullptr). Then the client disconnects and the connection close triggers another call to CancelInternal, this time with cause = "Session closed". This is also what shows up in the profile. If the server is under load, handling the closing connection can take long enough so that a concurrent request for the state still returns RUNNING. The aim of this fix was to make CancelOperation synchronous, i.e. when it returns the subsequent call to GetOperationStatus would not return RUNNING.

The comment on ClientRequestState::Cancel() suggests that this is omitted when handling client cancellation because we expect that to happen regularly for finished queries. In that case however, eos_ would be set and we wouldn't update the state either, no?

/// Cancels the child queries and the coordinator with the given cause.
/// If cause is NULL, it assume this was deliberately cancelled by the user while in
/// FINISHED operation state. Otherwise, sets state to ERROR_STATE (TODO: IMPALA-1262:
/// use CANCELED_STATE). Does nothing if the query has reached EOS or already cancelled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)