Posted to issues@drill.apache.org by "Chris Westin (JIRA)" <ji...@apache.org> on 2015/03/26 20:24:53 UTC

[jira] [Created] (DRILL-2582) QueryManager shouldn't be manipulating Foreman's state directly

Chris Westin created DRILL-2582:
-----------------------------------

             Summary: QueryManager shouldn't be manipulating Foreman's state directly
                 Key: DRILL-2582
                 URL: https://issues.apache.org/jira/browse/DRILL-2582
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 0.8.0
            Reporter: Chris Westin
            Assignee: Deneche A. Hakim
             Fix For: 0.9.0


We can't reliably report cascading failures that result from a failure or cancellation. This turns out to be because QueryManager manipulates Foreman's state indiscriminately, without paying any attention to Foreman's current state.

For example, suppose we request cancellation of a query, and Foreman calls queryManager.cancelExecutingFragments. In the meantime, suppose a fragment fails. The fragment failure is picked up by QueryManager.statusUpdate(), which then uses stateListener to slam Foreman into the FAILED state. However, Foreman is in CANCELLATION_REQUESTED at that point, waiting for the cancellation acknowledgements. The sudden move to FAILED shuts Foreman down, so it sends out a FAILURE message instead of reaching the expected CANCELED terminal state, and it never reports any cascading failures from the cancellations.

What should happen instead is that QueryManager reports fragment status updates to Foreman, and Foreman decides what transition to make based on the fragment status update and its own current state. In the example above, a fragment failure notification that arrives after we're already in CANCELLATION_REQUESTED shouldn't result in any state transition at all; it should simply be attached to any current suppressed deferred exceptions. This means QueryManager.statusUpdate() and QueryManager.fragmentDone() need to be reworked, and Foreman needs to give QueryManager a listener for reporting fragment status changes, rather than allowing it to manipulate Foreman's state directly.
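
A rough sketch of the proposed shape, to make the intent concrete. This is not Drill code: ForemanSketch, FragmentStatusListener, fragmentStatusChanged(), the simplified state enums, and the pending-fragment counter are all hypothetical names and simplifications chosen for illustration. The only points it tries to show are that QueryManager would report through a listener, and that Foreman alone picks the transition based on its current state.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; every name here is hypothetical, not Drill's API.
public class ForemanSketch {
    // Simplified query states; the real Foreman has more.
    enum QueryState { RUNNING, CANCELLATION_REQUESTED, CANCELED, FAILED }

    // Simplified terminal fragment states reported by the QueryManager side.
    enum FragmentState { FINISHED, CANCELLED, FAILED }

    // Hypothetical listener that Foreman would hand to QueryManager.
    interface FragmentStatusListener {
        void fragmentStatusChanged(FragmentState fragmentState, Exception cause);
    }

    static class Foreman implements FragmentStatusListener {
        private QueryState state = QueryState.RUNNING;
        private int pendingFragments;
        private final List<Exception> deferredExceptions = new ArrayList<>();

        Foreman(int fragmentCount) {
            this.pendingFragments = fragmentCount;
        }

        // A cancellation request only takes effect while the query is running.
        synchronized void cancel() {
            if (state == QueryState.RUNNING) {
                state = QueryState.CANCELLATION_REQUESTED;
            }
        }

        // QueryManager reports here; Foreman decides what, if anything, changes.
        @Override
        public synchronized void fragmentStatusChanged(FragmentState fragmentState, Exception cause) {
            pendingFragments--;
            // Always remember the failure, whatever state we are in.
            if (fragmentState == FragmentState.FAILED && cause != null) {
                deferredExceptions.add(cause);
            }
            if (state == QueryState.RUNNING && fragmentState == FragmentState.FAILED) {
                state = QueryState.FAILED;
            } else if (state == QueryState.CANCELLATION_REQUESTED && pendingFragments == 0) {
                // All fragments have reported back: finish as CANCELED.
                state = QueryState.CANCELED;
            }
            // Otherwise: no transition; a failure during cancellation is only recorded.
        }

        synchronized QueryState getState() {
            return state;
        }

        synchronized List<Exception> getDeferredExceptions() {
            return new ArrayList<>(deferredExceptions);
        }
    }

    public static void main(String[] args) {
        Foreman foreman = new Foreman(2);
        foreman.cancel();
        // A fragment fails while cancellation is in flight: recorded, no transition.
        foreman.fragmentStatusChanged(FragmentState.FAILED, new RuntimeException("fragment 0 failed"));
        System.out.println(foreman.getState()); // prints CANCELLATION_REQUESTED
        // The last fragment acknowledges the cancellation.
        foreman.fragmentStatusChanged(FragmentState.CANCELLED, null);
        System.out.println(foreman.getState()); // prints CANCELED
    }
}
```

In this sketch the failure that arrives during cancellation ends up in the deferred exception list, so it could still be reported alongside the CANCELED result rather than replacing it. What the real Foreman should do on a failure while RUNNING (for example, cancelling the remaining fragments before moving to FAILED) is deliberately left out.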


