You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by abdelhakim deneche <ad...@gmail.com> on 2015/07/01 00:37:46 UTC

Re: Review Request 34603: DRILL-3167: When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/
-----------------------------------------------------------

(Updated June 30, 2015, 10:37 p.m.)


Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.


Bugs: DRILL-3167
    https://issues.apache.org/jira/browse/DRILL-3167


Repository: drill-git


Description
-------

- In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
- QueryManager.cancelExecutingFragments() returns false if no fragment available
- Web UI still displays FAILED when Foreman state is FAILING


Diffs
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef 
  exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6 
  exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233 
  exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede 
  exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83 
  protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 
  protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330 
  protocol/src/main/protobuf/UserBitShared.proto 0451fd2 

Diff: https://reviews.apache.org/r/34603/diff/


Testing (updated)
-------

unit tests are passing along with functional and tpch100


Thanks,

abdelhakim deneche


Re: Review Request 34603: DRILL-3167: When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

Posted by abdelhakim deneche <ad...@gmail.com>.

> On July 1, 2015, 5:23 p.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java, line 182
> > <https://reviews.apache.org/r/34603/diff/4/?file=996195#file996195line182>
> >
> >     Doc fix:
> >     
> >     (1) "Not recent" inactive intermediate fragment.
> >     (2) Inactive leaf fragment.
> >     (3) Unknown fragment.

what do you mean by "Not recent" inactive intermediate fragment ?


- abdelhakim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90058
-----------------------------------------------------------


On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
> 
> (Updated June 30, 2015, 10:37 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
> 
> 
> Bugs: DRILL-3167
>     https://issues.apache.org/jira/browse/DRILL-3167
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233 
>   exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede 
>   exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 
>   protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330 
>   protocol/src/main/protobuf/UserBitShared.proto 0451fd2 
> 
> Diff: https://reviews.apache.org/r/34603/diff/
> 
> 
> Testing
> -------
> 
> unit tests are passing along with functional and tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 34603: DRILL-3167: When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

Posted by Sudheesh Katkam <sk...@maprtech.com>.

> On July 1, 2015, 5:23 p.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java, line 182
> > <https://reviews.apache.org/r/34603/diff/4/?file=996195#file996195line182>
> >
> >     Doc fix:
> >     
> >     (1) "Not recent" inactive intermediate fragment.
> >     (2) Inactive leaf fragment.
> >     (3) Unknown fragment.
> 
> abdelhakim deneche wrote:
>     what do you mean by "Not recent" inactive intermediate fragment ?

The work bus keeps track of recently finished fragments. If the intermediate fragment is "not recent" in this sense, we will fail.


- Sudheesh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90058
-----------------------------------------------------------


On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
> 
> (Updated June 30, 2015, 10:37 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
> 
> 
> Bugs: DRILL-3167
>     https://issues.apache.org/jira/browse/DRILL-3167
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233 
>   exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede 
>   exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 
>   protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330 
>   protocol/src/main/protobuf/UserBitShared.proto 0451fd2 
> 
> Diff: https://reviews.apache.org/r/34603/diff/
> 
> 
> Testing
> -------
> 
> unit tests are passing along with functional and tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 34603: DRILL-3167: When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

Posted by Sudheesh Katkam <sk...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90058
-----------------------------------------------------------



exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java (line 161)
<https://reviews.apache.org/r/34603/#comment142995>

    Doc fix: move CANCELLATION_REQUESTED to "active" list



exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java (line 165)
<https://reviews.apache.org/r/34603/#comment142996>

    Doc fix: Add case 3 here (and change the case number to 4 below). 
    
    Case 3: Ignore recently finished intermediate fragment. The work bus keeps track of recently finished fragments.



exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java (line 181)
<https://reviews.apache.org/r/34603/#comment142999>

    Doc fix:
    
    (1) "Not recent" inactive intermediate fragment.
    (2) Inactive leaf fragment.
    (3) Unknown fragment.



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 233)
<https://reviews.apache.org/r/34603/#comment143006>

    Remove TODO. SingalListener is not redundant since you have added cases that are not possible through FragmentStatusListener.



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 269)
<https://reviews.apache.org/r/34603/#comment143000>

    Doc fix:
    
    Remove the sentence "As a result, ..."



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 306)
<https://reviews.apache.org/r/34603/#comment143001>

    extra ","



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 316)
<https://reviews.apache.org/r/34603/#comment143003>

    extra ","


- Sudheesh Katkam


On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
> 
> (Updated June 30, 2015, 10:37 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
> 
> 
> Bugs: DRILL-3167
>     https://issues.apache.org/jira/browse/DRILL-3167
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233 
>   exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede 
>   exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 
>   protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330 
>   protocol/src/main/protobuf/UserBitShared.proto 0451fd2 
> 
> Diff: https://reviews.apache.org/r/34603/diff/
> 
> 
> Testing
> -------
> 
> unit tests are passing along with functional and tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 34603: DRILL-3167: When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

Posted by abdelhakim deneche <ad...@gmail.com>.

> On July 2, 2015, 12:20 a.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java, line 296
> > <https://reviews.apache.org/r/34603/diff/4/?file=996197#file996197line296>
> >
> >     Please add more information to the MinorFragmentProfile, specially an error passed in as a parameter and use .setError(...)
> >     
> >     .setEndTime(...) would also be useful.

I added a new FragmentState.NOT_RESPONDING to make it separate from FAILED


- abdelhakim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90175
-----------------------------------------------------------


On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
> 
> (Updated June 30, 2015, 10:37 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
> 
> 
> Bugs: DRILL-3167
>     https://issues.apache.org/jira/browse/DRILL-3167
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233 
>   exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede 
>   exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 
>   protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330 
>   protocol/src/main/protobuf/UserBitShared.proto 0451fd2 
> 
> Diff: https://reviews.apache.org/r/34603/diff/
> 
> 
> Testing
> -------
> 
> unit tests are passing along with functional and tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 34603: DRILL-3167: When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

Posted by Sudheesh Katkam <sk...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90175
-----------------------------------------------------------



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 290)
<https://reviews.apache.org/r/34603/#comment143169>

    Please add more information to the MinorFragmentProfile, specially an error passed in as a parameter and use .setError(...)
    
    .setEndTime(...) would also be useful.


- Sudheesh Katkam


On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
> 
> (Updated June 30, 2015, 10:37 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
> 
> 
> Bugs: DRILL-3167
>     https://issues.apache.org/jira/browse/DRILL-3167
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233 
>   exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede 
>   exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 
>   protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330 
>   protocol/src/main/protobuf/UserBitShared.proto 0451fd2 
> 
> Diff: https://reviews.apache.org/r/34603/diff/
> 
> 
> Testing
> -------
> 
> unit tests are passing along with functional and tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 34603: DRILL-3167: When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

Posted by abdelhakim deneche <ad...@gmail.com>.

> On July 2, 2015, 6:34 a.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java, line 218
> > <https://reviews.apache.org/r/34603/diff/4/?file=996197#file996197line218>
> >
> >     fragmentDataSet is never empty.
> >     
> >     May be this?
> >     if (fragmentDataSet.size() == finishedFragments.get())

good catch!

will fix


- abdelhakim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90193
-----------------------------------------------------------


On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
> 
> (Updated June 30, 2015, 10:37 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
> 
> 
> Bugs: DRILL-3167
>     https://issues.apache.org/jira/browse/DRILL-3167
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233 
>   exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede 
>   exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 
>   protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330 
>   protocol/src/main/protobuf/UserBitShared.proto 0451fd2 
> 
> Diff: https://reviews.apache.org/r/34603/diff/
> 
> 
> Testing
> -------
> 
> unit tests are passing along with functional and tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 34603: DRILL-3167: When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

Posted by abdelhakim deneche <ad...@gmail.com>.

> On July 2, 2015, 6:34 a.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java, line 97
> > <https://reviews.apache.org/r/34603/diff/4/?file=996192#file996192line97>
> >
> >     What about the race condition that you fixed in DRILL-3061? I thought that had to do with recentlyFinishedFragments. Does this also require synchronization?

the way ControlMessageHandler calls #hasRecentlyFinished() won't cause any race condition, but we may use it in the future in ways that could indeed cause a race condition.

will fix


- abdelhakim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90193
-----------------------------------------------------------


On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
> 
> (Updated June 30, 2015, 10:37 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
> 
> 
> Bugs: DRILL-3167
>     https://issues.apache.org/jira/browse/DRILL-3167
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233 
>   exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede 
>   exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 
>   protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330 
>   protocol/src/main/protobuf/UserBitShared.proto 0451fd2 
> 
> Diff: https://reviews.apache.org/r/34603/diff/
> 
> 
> Testing
> -------
> 
> unit tests are passing along with functional and tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 34603: DRILL-3167: When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client

Posted by Sudheesh Katkam <sk...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90193
-----------------------------------------------------------


Add to my never ending comments on this patch :) 

I was facing similar issues, so I have some more comments.


exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java (line 97)
<https://reviews.apache.org/r/34603/#comment143201>

    What about the race condition that you fixed in DRILL-3061? I thought that had to do with recentlyFinishedFragments. Does this also require synchronization?



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 91)
<https://reviews.apache.org/r/34603/#comment143200>

    The FragmentData objects referenced from this map are not thread-safe. With your changes, I feel like we can cook up cases (however rare) where two RPC threads could modify the same fragment data object in #updateFragmentState through #fragmentDone, because you've created other paths to #fragmentDone.



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 218)
<https://reviews.apache.org/r/34603/#comment143196>

    fragmentDataSet is never empty.
    
    May be this?
    if (fragmentDataSet.size() == finishedFragments.get())


- Sudheesh Katkam


On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
> 
> (Updated June 30, 2015, 10:37 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
> 
> 
> Bugs: DRILL-3167
>     https://issues.apache.org/jira/browse/DRILL-3167
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233 
>   exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede 
>   exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 
>   protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330 
>   protocol/src/main/protobuf/UserBitShared.proto 0451fd2 
> 
> Diff: https://reviews.apache.org/r/34603/diff/
> 
> 
> Testing
> -------
> 
> unit tests are passing along with functional and tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>