You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by abdelhakim deneche <ad...@gmail.com> on 2015/07/01 00:37:46 UTC
Re: Review Request 34603: DRILL-3167: When a query fails,
Foreman should wait for all fragments to finish cleaning up before sending a
FAILED state to the client
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/
-----------------------------------------------------------
(Updated June 30, 2015, 10:37 p.m.)
Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
Bugs: DRILL-3167
https://issues.apache.org/jira/browse/DRILL-3167
Repository: drill-git
Description
-------
- In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
- QueryManager.cancelExecutingFragments() returns false if no fragment available
- Web UI still displays FAILED when Foreman state is FAILING
Diffs
-----
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76
exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233
exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede
exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83
protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748
protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330
protocol/src/main/protobuf/UserBitShared.proto 0451fd2
Diff: https://reviews.apache.org/r/34603/diff/
Testing (updated)
-------
unit tests are passing along with functional and tpch100
Thanks,
abdelhakim deneche
Re: Review Request 34603: DRILL-3167: When a query fails,
Foreman should wait for all fragments to finish cleaning up before sending a
FAILED state to the client
Posted by abdelhakim deneche <ad...@gmail.com>.
> On July 1, 2015, 5:23 p.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java, line 182
> > <https://reviews.apache.org/r/34603/diff/4/?file=996195#file996195line182>
> >
> > Doc fix:
> >
> > (1) "Not recent" inactive intermediate fragment.
> > (2) Inactive leaf fragment.
> > (3) Unknown fragment.
what do you mean by "Not recent" inactive intermediate fragment ?
- abdelhakim
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90058
-----------------------------------------------------------
On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
>
> (Updated June 30, 2015, 10:37 p.m.)
>
>
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
>
>
> Bugs: DRILL-3167
> https://issues.apache.org/jira/browse/DRILL-3167
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76
> exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233
> exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede
> exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83
> protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748
> protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330
> protocol/src/main/protobuf/UserBitShared.proto 0451fd2
>
> Diff: https://reviews.apache.org/r/34603/diff/
>
>
> Testing
> -------
>
> unit tests are passing along with functional and tpch100
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 34603: DRILL-3167: When a query fails,
Foreman should wait for all fragments to finish cleaning up before sending a
FAILED state to the client
Posted by Sudheesh Katkam <sk...@maprtech.com>.
> On July 1, 2015, 5:23 p.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java, line 182
> > <https://reviews.apache.org/r/34603/diff/4/?file=996195#file996195line182>
> >
> > Doc fix:
> >
> > (1) "Not recent" inactive intermediate fragment.
> > (2) Inactive leaf fragment.
> > (3) Unknown fragment.
>
> abdelhakim deneche wrote:
> what do you mean by "Not recent" inactive intermediate fragment ?
The work bus keeps track of recently finished fragments. If the intermediate fragment is "not recent" in this sense, we will fail.
- Sudheesh
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90058
-----------------------------------------------------------
On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
>
> (Updated June 30, 2015, 10:37 p.m.)
>
>
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
>
>
> Bugs: DRILL-3167
> https://issues.apache.org/jira/browse/DRILL-3167
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76
> exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233
> exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede
> exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83
> protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748
> protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330
> protocol/src/main/protobuf/UserBitShared.proto 0451fd2
>
> Diff: https://reviews.apache.org/r/34603/diff/
>
>
> Testing
> -------
>
> unit tests are passing along with functional and tpch100
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 34603: DRILL-3167: When a query fails,
Foreman should wait for all fragments to finish cleaning up before sending a
FAILED state to the client
Posted by Sudheesh Katkam <sk...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90058
-----------------------------------------------------------
exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java (line 161)
<https://reviews.apache.org/r/34603/#comment142995>
Doc fix: move CANCELLATION_REQUESTED to "active" list
exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java (line 165)
<https://reviews.apache.org/r/34603/#comment142996>
Doc fix: Add case 3 here (and change the case number to 4 below).
Case 3: Ignore recently finished intermediate fragment. The work bus keeps track of recently finished fragments.
exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java (line 181)
<https://reviews.apache.org/r/34603/#comment142999>
Doc fix:
(1) "Not recent" inactive intermediate fragment.
(2) Inactive leaf fragment.
(3) Unknown fragment.
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 233)
<https://reviews.apache.org/r/34603/#comment143006>
Remove TODO. SingalListener is not redundant since you have added cases that are not possible through FragmentStatusListener.
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 269)
<https://reviews.apache.org/r/34603/#comment143000>
Doc fix:
Remove the sentence "As a result, ..."
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 306)
<https://reviews.apache.org/r/34603/#comment143001>
extra ","
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 316)
<https://reviews.apache.org/r/34603/#comment143003>
extra ","
- Sudheesh Katkam
On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
>
> (Updated June 30, 2015, 10:37 p.m.)
>
>
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
>
>
> Bugs: DRILL-3167
> https://issues.apache.org/jira/browse/DRILL-3167
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76
> exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233
> exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede
> exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83
> protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748
> protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330
> protocol/src/main/protobuf/UserBitShared.proto 0451fd2
>
> Diff: https://reviews.apache.org/r/34603/diff/
>
>
> Testing
> -------
>
> unit tests are passing along with functional and tpch100
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 34603: DRILL-3167: When a query fails,
Foreman should wait for all fragments to finish cleaning up before sending a
FAILED state to the client
Posted by abdelhakim deneche <ad...@gmail.com>.
> On July 2, 2015, 12:20 a.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java, line 296
> > <https://reviews.apache.org/r/34603/diff/4/?file=996197#file996197line296>
> >
> > Please add more information to the MinorFragmentProfile, specially an error passed in as a parameter and use .setError(...)
> >
> > .setEndTime(...) would also be useful.
I added a new FragmentState.NOT_RESPONDING to make it separate from FAILED
- abdelhakim
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90175
-----------------------------------------------------------
On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
>
> (Updated June 30, 2015, 10:37 p.m.)
>
>
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
>
>
> Bugs: DRILL-3167
> https://issues.apache.org/jira/browse/DRILL-3167
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76
> exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233
> exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede
> exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83
> protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748
> protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330
> protocol/src/main/protobuf/UserBitShared.proto 0451fd2
>
> Diff: https://reviews.apache.org/r/34603/diff/
>
>
> Testing
> -------
>
> unit tests are passing along with functional and tpch100
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 34603: DRILL-3167: When a query fails,
Foreman should wait for all fragments to finish cleaning up before sending a
FAILED state to the client
Posted by Sudheesh Katkam <sk...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90175
-----------------------------------------------------------
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 290)
<https://reviews.apache.org/r/34603/#comment143169>
Please add more information to the MinorFragmentProfile, specially an error passed in as a parameter and use .setError(...)
.setEndTime(...) would also be useful.
- Sudheesh Katkam
On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
>
> (Updated June 30, 2015, 10:37 p.m.)
>
>
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
>
>
> Bugs: DRILL-3167
> https://issues.apache.org/jira/browse/DRILL-3167
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76
> exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233
> exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede
> exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83
> protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748
> protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330
> protocol/src/main/protobuf/UserBitShared.proto 0451fd2
>
> Diff: https://reviews.apache.org/r/34603/diff/
>
>
> Testing
> -------
>
> unit tests are passing along with functional and tpch100
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 34603: DRILL-3167: When a query fails,
Foreman should wait for all fragments to finish cleaning up before sending a
FAILED state to the client
Posted by abdelhakim deneche <ad...@gmail.com>.
> On July 2, 2015, 6:34 a.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java, line 218
> > <https://reviews.apache.org/r/34603/diff/4/?file=996197#file996197line218>
> >
> > fragmentDataSet is never empty.
> >
> > May be this?
> > if (fragmentDataSet.size() == finishedFragments.get())
good catch!
will fix
- abdelhakim
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90193
-----------------------------------------------------------
On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
>
> (Updated June 30, 2015, 10:37 p.m.)
>
>
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
>
>
> Bugs: DRILL-3167
> https://issues.apache.org/jira/browse/DRILL-3167
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76
> exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233
> exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede
> exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83
> protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748
> protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330
> protocol/src/main/protobuf/UserBitShared.proto 0451fd2
>
> Diff: https://reviews.apache.org/r/34603/diff/
>
>
> Testing
> -------
>
> unit tests are passing along with functional and tpch100
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 34603: DRILL-3167: When a query fails,
Foreman should wait for all fragments to finish cleaning up before sending a
FAILED state to the client
Posted by abdelhakim deneche <ad...@gmail.com>.
> On July 2, 2015, 6:34 a.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java, line 97
> > <https://reviews.apache.org/r/34603/diff/4/?file=996192#file996192line97>
> >
> > What about the race condition that you fixed in DRILL-3061? I thought that had to do with recentlyFinishedFragments. Does this also require synchronization?
the way ControlMessageHandler calls #hasRecentlyFinished() won't cause any race condition, but we may use it in the future in ways that could indeed cause a race condition.
will fix
- abdelhakim
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90193
-----------------------------------------------------------
On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
>
> (Updated June 30, 2015, 10:37 p.m.)
>
>
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
>
>
> Bugs: DRILL-3167
> https://issues.apache.org/jira/browse/DRILL-3167
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76
> exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233
> exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede
> exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83
> protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748
> protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330
> protocol/src/main/protobuf/UserBitShared.proto 0451fd2
>
> Diff: https://reviews.apache.org/r/34603/diff/
>
>
> Testing
> -------
>
> unit tests are passing along with functional and tpch100
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 34603: DRILL-3167: When a query fails,
Foreman should wait for all fragments to finish cleaning up before sending a
FAILED state to the client
Posted by Sudheesh Katkam <sk...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34603/#review90193
-----------------------------------------------------------
Add to my never ending comments on this patch :)
I was facing similar issues, so I have some more comments.
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java (line 97)
<https://reviews.apache.org/r/34603/#comment143201>
What about the race condition that you fixed in DRILL-3061? I thought that had to do with recentlyFinishedFragments. Does this also require synchronization?
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 91)
<https://reviews.apache.org/r/34603/#comment143200>
The FragmentData objects referenced from this map are not thread-safe. With your changes, I feel like we can cook up cases (however rare) where two RPC threads could modify the same fragment data object in #updateFragmentState through #fragmentDone, because you've created other paths to #fragmentDone.
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java (line 218)
<https://reviews.apache.org/r/34603/#comment143196>
fragmentDataSet is never empty.
May be this?
if (fragmentDataSet.size() == finishedFragments.get())
- Sudheesh Katkam
On June 30, 2015, 10:37 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34603/
> -----------------------------------------------------------
>
> (Updated June 30, 2015, 10:37 p.m.)
>
>
> Review request for drill, Jacques Nadeau, Jason Altekruse, and Sudheesh Katkam.
>
>
> Bugs: DRILL-3167
> https://issues.apache.org/jira/browse/DRILL-3167
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> - In case of a failure the Foreman will cancel all fragments and move to a FAILING state until all fragments are terminated
> - QueryManager.cancelExecutingFragments() returns false if no fragment available
> - Web UI still displays FAILED when Foreman state is FAILING
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java 3e461ef
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 6656bf6
> exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java dd26a76
> exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/ControlMessageHandler.java 9f302a2
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 716fb66
> exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java 9318233
> exec/java-exec/src/main/resources/rest/profile/list.ftl cf92ede
> exec/java-exec/src/main/resources/rest/profile/profile.ftl 46cdc83
> protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748
> protocol/src/main/java/org/apache/drill/exec/proto/beans/QueryResult.java 474e330
> protocol/src/main/protobuf/UserBitShared.proto 0451fd2
>
> Diff: https://reviews.apache.org/r/34603/diff/
>
>
> Testing
> -------
>
> unit tests are passing along with functional and tpch100
>
>
> Thanks,
>
> abdelhakim deneche
>
>