Posted to issues@drill.apache.org by "Kunal Khatua (JIRA)" <ji...@apache.org> on 2017/08/02 00:47:00 UTC

[jira] [Updated] (DRILL-4595) FragmentExecutor.fail() should interrupt the fragment thread to avoid possible query hangs

     [ https://issues.apache.org/jira/browse/DRILL-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Khatua updated DRILL-4595:
--------------------------------
    Reviewer: Khurram Faraaz

[~khfaraaz] Can you verify if this issue is resolved with DRILL-5599 (Drill 1.11.0)?

> FragmentExecutor.fail() should interrupt the fragment thread to avoid possible query hangs
> ------------------------------------------------------------------------------------------
>
>                 Key: DRILL-4595
>                 URL: https://issues.apache.org/jira/browse/DRILL-4595
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.4.0
>            Reporter: Deneche A. Hakim
>            Assignee: Deneche A. Hakim
>             Fix For: Future
>
>
> When a fragment fails, it is assumed that it will be able to close itself and send its FAILED state to the foreman, which will cancel any running fragments. FragmentExecutor.cancel() will interrupt the thread, making sure those fragments don't stay blocked.
> However, if a fragment is already blocked when its fail method is called, the foreman may never be notified and the query will hang forever. One such scenario is the following:
> - generally it's a CTAS running on a large cluster (lots of writers running in parallel)
> - logs show that the user channel was closed and UserServer caused the root fragment to move to a FAILED state
> - jstack shows that the root fragment is blocked in its receiver waiting for data
> - jstack also shows that ALL other fragments are no longer running, and the logs show that all of them succeeded
> - the foreman waits *forever* for the root fragment to finish
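
The hang described above can be reproduced in miniature with plain Java threads. The sketch below is not Drill's FragmentExecutor; the class, the fields, and the fail(boolean) signature are all hypothetical and exist only to show the mechanism. A thread blocked in BlockingQueue.take() does not notice a state change made by another thread, but it does wake up with an InterruptedException when it is interrupted.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from Drill's code base.
final class FragmentInterruptSketch {

    // Stand-in for a fragment whose thread waits on a receiver for data.
    static final class Fragment implements Runnable {
        private final BlockingQueue<String> receiver = new LinkedBlockingQueue<>();
        private final CountDownLatch started = new CountDownLatch(1);
        private volatile Thread runner;
        private volatile boolean failed;

        @Override
        public void run() {
            runner = Thread.currentThread();
            started.countDown();
            try {
                receiver.take(); // blocks: no upstream data ever arrives
            } catch (InterruptedException e) {
                // Interrupted while waiting: fall through so the thread can exit.
            }
            // A real fragment would clean up and report its final state here.
        }

        // Marks the fragment as failed; optionally interrupts its thread.
        void fail(boolean interruptThread) {
            failed = true; // a state change alone does not wake a blocked take()
            if (interruptThread) {
                runner.interrupt();
            }
        }
    }

    // Returns true if the fragment thread exits within 500 ms of fail().
    static boolean terminatesAfterFail(boolean interruptThread) {
        Fragment fragment = new Fragment();
        Thread thread = new Thread(fragment, "fragment-sketch");
        thread.setDaemon(true); // a still-blocked thread must not keep the JVM alive
        thread.start();
        try {
            fragment.started.await(); // ensure runner is set before calling fail()
            fragment.fail(interruptThread);
            thread.join(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("state change only: " + terminatesAfterFail(false));
        System.out.println("with interrupt:    " + terminatesAfterFail(true));
    }
}
```

In this sketch the first call leaves the thread blocked in take(), which mirrors the root fragment waiting on its receiver, while the second call lets the thread reach the point where a real fragment could clean up and report its final state.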



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)