You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Sudheesh Katkam <sk...@maprtech.com> on 2015/05/14 07:25:40 UTC

Review Request 34191: DRILL-3052: Canceling a fragment executor before it starts running will cause the Foreman to wait indefinitely for a terminal message from that fragment

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34191/
-----------------------------------------------------------

Review request for drill, Jacques Nadeau and Venki Korukanti.


Repository: drill-git


Description
-------

[DRILL-3052](https://issues.apache.org/jira/browse/DRILL-3052), [DRILL-3066](https://issues.apache.org/jira/browse/DRILL-3066): FragmentExecutor must cleanup exactly once
Cleanup includes, in order:
1) closing out resources,
2) updating to the correct terminal state, and
3) sending the state to QueryManager exactly once

In DRILL-3053 scenario, sendFinalState() is never called
In DRILL-3066 scenario, closeOutResources() is called twice


Diffs
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java 6b44ae3 
  exec/java-exec/src/test/java/org/apache/drill/exec/server/TestDrillbitResilience.java f95fbe1 

Diff: https://reviews.apache.org/r/34191/diff/


Testing
-------

Passes all unit test and regression tests. Started another build.


Thanks,

Sudheesh Katkam


Re: Review Request 34191: DRILL-3052: Canceling a fragment executor before it starts running will cause the Foreman to wait indefinitely for a terminal message from that fragment

Posted by Sudheesh Katkam <sk...@maprtech.com>.

> On May 14, 2015, 1:35 p.m., abdelhakim deneche wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java, line 305
> > <https://reviews.apache.org/r/34191/diff/1/?file=958720#file958720line305>
> >
> >     are we sure closeOutResources(), updateState() and sendFinalState() won't throw exceptions ? what happens if they do ??

closeOutResources() handles throwing exceptions. 
updateState(FINISHED) does not throw exceptions. 
sendFinalState() can throw exceptions if it is unable to connect to the Foreman node. I'll add a block in that case.


- Sudheesh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34191/#review83757
-----------------------------------------------------------


On May 14, 2015, 5:25 a.m., Sudheesh Katkam wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34191/
> -----------------------------------------------------------
> 
> (Updated May 14, 2015, 5:25 a.m.)
> 
> 
> Review request for drill, Jacques Nadeau and Venki Korukanti.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> [DRILL-3052](https://issues.apache.org/jira/browse/DRILL-3052), [DRILL-3066](https://issues.apache.org/jira/browse/DRILL-3066): FragmentExecutor must cleanup exactly once
> Cleanup includes, in order:
> 1) closing out resources,
> 2) updating to the correct terminal state, and
> 3) sending the state to QueryManager exactly once
> 
> In DRILL-3053 scenario, sendFinalState() is never called
> In DRILL-3066 scenario, closeOutResources() is called twice
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java 6b44ae3 
>   exec/java-exec/src/test/java/org/apache/drill/exec/server/TestDrillbitResilience.java f95fbe1 
> 
> Diff: https://reviews.apache.org/r/34191/diff/
> 
> 
> Testing
> -------
> 
> Passes all unit test and regression tests. Started another build.
> 
> 
> Thanks,
> 
> Sudheesh Katkam
> 
>


Re: Review Request 34191: DRILL-3052: Canceling a fragment executor before it starts running will cause the Foreman to wait indefinitely for a terminal message from that fragment

Posted by abdelhakim deneche <ad...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34191/#review83757
-----------------------------------------------------------



exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
<https://reviews.apache.org/r/34191/#comment134819>

    move this to the top of the Class



exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
<https://reviews.apache.org/r/34191/#comment134823>

    are we sure closeOutResources(), updateState() and sendFinalState() won't throw exceptions ? what happens if they do ??



exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
<https://reviews.apache.org/r/34191/#comment134824>

    optional: now that we have a cleanup() method, do we need a separate closeOutResources() ? we could move it's content to cleanup()



exec/java-exec/src/test/java/org/apache/drill/exec/server/TestDrillbitResilience.java
<https://reviews.apache.org/r/34191/#comment134825>

    could you do the following experiment ( but do not include it in the patch):
    
    disable the timeout in TestDrillResilience and call cancelWhenFragmentAreRunning() +20 times in a row and see if it never hangs, I suspect there is some other issue that causes the pause injections to never be resumed.
    
    If it hangs, then you should mark it as @Ignore until we figure out why it's hanging. This shouldn't hold off this patch though, as it's more likely a separate problem


- abdelhakim deneche


On May 14, 2015, 5:25 a.m., Sudheesh Katkam wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34191/
> -----------------------------------------------------------
> 
> (Updated May 14, 2015, 5:25 a.m.)
> 
> 
> Review request for drill, Jacques Nadeau and Venki Korukanti.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> [DRILL-3052](https://issues.apache.org/jira/browse/DRILL-3052), [DRILL-3066](https://issues.apache.org/jira/browse/DRILL-3066): FragmentExecutor must cleanup exactly once
> Cleanup includes, in order:
> 1) closing out resources,
> 2) updating to the correct terminal state, and
> 3) sending the state to QueryManager exactly once
> 
> In DRILL-3053 scenario, sendFinalState() is never called
> In DRILL-3066 scenario, closeOutResources() is called twice
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java 6b44ae3 
>   exec/java-exec/src/test/java/org/apache/drill/exec/server/TestDrillbitResilience.java f95fbe1 
> 
> Diff: https://reviews.apache.org/r/34191/diff/
> 
> 
> Testing
> -------
> 
> Passes all unit test and regression tests. Started another build.
> 
> 
> Thanks,
> 
> Sudheesh Katkam
> 
>


Re: Review Request 34191: DRILL-3052: Canceling a fragment executor before it starts running will cause the Foreman to wait indefinitely for a terminal message from that fragment

Posted by abdelhakim deneche <ad...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34191/#review83870
-----------------------------------------------------------


+1 (non binding)

- abdelhakim deneche


On May 14, 2015, 11:17 p.m., Sudheesh Katkam wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34191/
> -----------------------------------------------------------
> 
> (Updated May 14, 2015, 11:17 p.m.)
> 
> 
> Review request for drill, abdelhakim deneche, Jacques Nadeau, and Venki Korukanti.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> [DRILL-3052](https://issues.apache.org/jira/browse/DRILL-3052), [DRILL-3066](https://issues.apache.org/jira/browse/DRILL-3066), [DRILL-3072](https://issues.apache.org/jira/browse/DRILL-3072): FragmentExecutor must cleanup exactly once (root or non-root)
> 
> In DRILL-3052 scenario, sendFinalState() is never called
> In DRILL-3066 scenario, closeOutResources() is called twice
> In DRILL-3072 scenario, root was interrupted by a cancel signal that the root itself sent
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java 8c49d68 
>   exec/java-exec/src/test/java/org/apache/drill/exec/server/TestDrillbitResilience.java f95fbe1 
> 
> Diff: https://reviews.apache.org/r/34191/diff/
> 
> 
> Testing
> -------
> 
> Runnning unit and regression tests currently.
> 
> 
> Thanks,
> 
> Sudheesh Katkam
> 
>


Re: Review Request 34191: DRILL-3052: Canceling a fragment executor before it starts running will cause the Foreman to wait indefinitely for a terminal message from that fragment

Posted by Sudheesh Katkam <sk...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34191/
-----------------------------------------------------------

(Updated May 14, 2015, 11:17 p.m.)


Review request for drill, abdelhakim deneche, Jacques Nadeau, and Venki Korukanti.


Repository: drill-git


Description (updated)
-------

[DRILL-3052](https://issues.apache.org/jira/browse/DRILL-3052), [DRILL-3066](https://issues.apache.org/jira/browse/DRILL-3066), [DRILL-3072](https://issues.apache.org/jira/browse/DRILL-3072): FragmentExecutor must cleanup exactly once (root or non-root)

In DRILL-3052 scenario, sendFinalState() is never called
In DRILL-3066 scenario, closeOutResources() is called twice
In DRILL-3072 scenario, root was interrupted by a cancel signal that the root itself sent


Diffs (updated)
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java 8c49d68 
  exec/java-exec/src/test/java/org/apache/drill/exec/server/TestDrillbitResilience.java f95fbe1 

Diff: https://reviews.apache.org/r/34191/diff/


Testing (updated)
-------

Runnning unit and regression tests currently.


Thanks,

Sudheesh Katkam


Re: Review Request 34191: DRILL-3052: Canceling a fragment executor before it starts running will cause the Foreman to wait indefinitely for a terminal message from that fragment

Posted by Sudheesh Katkam <sk...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34191/
-----------------------------------------------------------

(Updated May 14, 2015, 9:10 p.m.)


Review request for drill, Jacques Nadeau and Venki Korukanti.


Changes
-------

fixed commit message


Repository: drill-git


Description (updated)
-------

[DRILL-3052](https://issues.apache.org/jira/browse/DRILL-3052), [DRILL-3066](https://issues.apache.org/jira/browse/DRILL-3066): FragmentExecutor must cleanup exactly once
Cleanup includes, in order:
1) closing out resources,
2) updating to the correct terminal state, and
3) sending the state to QueryManager exactly once

In DRILL-3052 scenario, sendFinalState() is never called
In DRILL-3066 scenario, closeOutResources() is called twice


Diffs (updated)
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java 8c49d68 
  exec/java-exec/src/test/java/org/apache/drill/exec/server/TestDrillbitResilience.java f95fbe1 

Diff: https://reviews.apache.org/r/34191/diff/


Testing
-------

Passes all unit test and regression tests. Started another build.


Thanks,

Sudheesh Katkam


Re: Review Request 34191: DRILL-3052: Canceling a fragment executor before it starts running will cause the Foreman to wait indefinitely for a terminal message from that fragment

Posted by Sudheesh Katkam <sk...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34191/
-----------------------------------------------------------

(Updated May 14, 2015, 9:03 p.m.)


Review request for drill, Jacques Nadeau and Venki Korukanti.


Changes
-------

Addressed review changes, and rebased on current master [583ca4a]


Repository: drill-git


Description
-------

[DRILL-3052](https://issues.apache.org/jira/browse/DRILL-3052), [DRILL-3066](https://issues.apache.org/jira/browse/DRILL-3066): FragmentExecutor must cleanup exactly once
Cleanup includes, in order:
1) closing out resources,
2) updating to the correct terminal state, and
3) sending the state to QueryManager exactly once

In DRILL-3053 scenario, sendFinalState() is never called
In DRILL-3066 scenario, closeOutResources() is called twice


Diffs (updated)
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java 8c49d68 
  exec/java-exec/src/test/java/org/apache/drill/exec/server/TestDrillbitResilience.java f95fbe1 

Diff: https://reviews.apache.org/r/34191/diff/


Testing
-------

Passes all unit test and regression tests. Started another build.


Thanks,

Sudheesh Katkam