You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by weijietong <gi...@git.apache.org> on 2017/08/29 16:50:28 UTC
[GitHub] drill pull request #925: DRILL-5749: solve foreman and netty threads deadloc...
GitHub user weijietong opened a pull request:
https://github.com/apache/drill/pull/925
DRILL-5749: solve foreman and netty threads deadlock
break the nest invocation of channelClosed method to avoid nested lock holding
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/weijietong/drill DRILL-5749
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/925.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #925
----
commit b44f780a948c4a0898e7cee042c0590f0713f780
Author: weijietong <to...@gmail.com>
Date: 2017-06-08T08:03:46Z
Merge pull request #1 from apache/master
sync
commit d045c757c80a759b435479cc89f33c749fc16ac2
Author: weijie.tong <we...@alipay.com>
Date: 2017-08-11T08:01:36Z
Merge branch 'master' of github.com:weijietong/drill
commit 08b7006f4c70c45a17ebf7eae6beaa2bdb0d0454
Author: weijie.tong <we...@alipay.com>
Date: 2017-08-20T12:05:51Z
update
commit 9e9ebb497a183e61a72665019e6e04070d912027
Author: weijie.tong <we...@alipay.com>
Date: 2017-08-20T12:07:41Z
revert
commit 837d9fc58440fb584690f93b5f638ddcedf042a1
Author: weijie.tong <we...@alipay.com>
Date: 2017-08-22T10:35:12Z
Merge branch 'master' of github.com:apache/drill
commit b1fc840ad9d0a9959b05a84bfd17f17067def32d
Author: weijie.tong <we...@alipay.com>
Date: 2017-08-29T16:39:48Z
Merge branch 'master' of github.com:apache/drill
commit 03afe8650f76d182b86e2d8141780f002538f2b4
Author: weijie.tong <we...@alipay.com>
Date: 2017-08-29T16:43:21Z
solve deadlock
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] drill pull request #925: DRILL-5749: solve foreman and netty threads deadloc...
Posted by weijietong <gi...@git.apache.org>.
Github user weijietong closed the pull request at:
https://github.com/apache/drill/pull/925
---
[GitHub] drill issue #925: DRILL-5749: solve foreman and netty threads deadlock
Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on the issue:
https://github.com/apache/drill/pull/925
Nice catch @weijietong
Looks like the Netty thread, that is failing all outstanding requests to a certain endpoint, is itself trying to send a message (cancel fragment) to that endpoint. Any ideas on how to fix that?
Apart from Paul's comment, the current patch looks good to me as well. The above fix is a nice-to-have.
---
[GitHub] drill pull request #925: DRILL-5749: solve foreman and netty threads deadloc...
Posted by weijietong <gi...@git.apache.org>.
Github user weijietong commented on a diff in the pull request:
https://github.com/apache/drill/pull/925#discussion_r137953813
--- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java ---
@@ -54,10 +52,14 @@ void channelClosed(Throwable ex) {
isOpen.set(false);
if (ex != null) {
final RpcException e = RpcException.mapException(ex);
+ IntObjectHashMap<RpcOutcome<?>> clonedMap;
synchronized (map) {
- map.forEach(new SetExceptionProcedure(e));
+ clonedMap = map.clone();
map.clear();
}
+ if (clonedMap != null) {
--- End diff --
@paul-rogers clonedMap will not be null from the current code logic ,just for conservative safe. If it seems redundancy ,I will remove the null check.
---
[GitHub] drill issue #925: DRILL-5749: solve foreman and netty threads deadlock
Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:
https://github.com/apache/drill/pull/925
Turns out that I cannot cleanly do the rebase. @weijietong, please rebase this branch onto the latest master, and squash your commits into a single commit.
---
[GitHub] drill issue #925: DRILL-5749: solve foreman and netty threads deadlock
Posted by weijietong <gi...@git.apache.org>.
Github user weijietong commented on the issue:
https://github.com/apache/drill/pull/925
@paul-rogers have refactored the codes. @sudheeshkatkam nothing to fix ,once the netty thread got the re-spawned RPC connection ,it will send out the message.
---
[GitHub] drill pull request #925: DRILL-5749: solve foreman and netty threads deadloc...
Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/925#discussion_r137938116
--- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java ---
@@ -54,10 +52,14 @@ void channelClosed(Throwable ex) {
isOpen.set(false);
if (ex != null) {
final RpcException e = RpcException.mapException(ex);
+ IntObjectHashMap<RpcOutcome<?>> clonedMap;
synchronized (map) {
- map.forEach(new SetExceptionProcedure(e));
+ clonedMap = map.clone();
map.clear();
}
+ if (clonedMap != null) {
--- End diff --
When would `clonedMap` be `null`?
---
[GitHub] drill issue #925: DRILL-5749: solve foreman and netty threads deadlock
Posted by weijietong <gi...@git.apache.org>.
Github user weijietong commented on the issue:
https://github.com/apache/drill/pull/925
anyone can review this PR ? more details about thread stacks see [DRLL-5749](https://issues.apache.org/jira/browse/DRILL-5749)
---
[GitHub] drill pull request #925: DRILL-5749: solve foreman and netty threads deadloc...
Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/925#discussion_r137955415
--- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java ---
@@ -54,10 +52,14 @@ void channelClosed(Throwable ex) {
isOpen.set(false);
if (ex != null) {
final RpcException e = RpcException.mapException(ex);
+ IntObjectHashMap<RpcOutcome<?>> clonedMap;
synchronized (map) {
- map.forEach(new SetExceptionProcedure(e));
+ clonedMap = map.clone();
map.clear();
}
+ if (clonedMap != null) {
--- End diff --
Please do. The if statement is a message to readers that clonedMap could be null and so we must try to sleuth out the conditions under which that occurs. Otherwise, I just go ahead and assume that Java is deterministic and that, once we set variable x in a single-threaded environment, it stays at that value until we change it...
---