Posted to issues@ignite.apache.org by "Vyacheslav Daradur (JIRA)" <ji...@apache.org> on 2019/01/11 15:02:00 UTC

[jira] [Updated] (IGNITE-10899) Service Grid: disconnecting during node stop may lead to deadlock

     [ https://issues.apache.org/jira/browse/IGNITE-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vyacheslav Daradur updated IGNITE-10899:
----------------------------------------
    Description: 
In rare cases, {{onDisconnected}} may be called while the node is stopping, and a deadlock may occur because {{ServiceDeploymentManager#stopProcessing}} blocks busyLock and intentionally does not release it.

The issue has been found on TeamCity in [Zookeeper's suite|https://ci.ignite.apache.org/viewLog.html?buildId=2768270&buildTypeId=IgniteTests24Java8_ZooKeeperDiscovery2] with the following stack trace:
{code:java}
"disco-notifier-worker-#569118%client4%" #609288 prio=5 os_prio=0 tid=0x00007f905b440800 nid=0x3f6fbd sleeping[0x00007f9383efd000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.ignite.internal.util.GridSpinReadWriteLock.writeLock(GridSpinReadWriteLock.java:204)
at org.apache.ignite.internal.util.GridSpinBusyLock.block(GridSpinBusyLock.java:76)
at org.apache.ignite.internal.processors.service.ServiceDeploymentManager.stopProcessing(ServiceDeploymentManager.java:137)
at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.stopProcessor(IgniteServiceProcessor.java:261)
at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.onDisconnected(IgniteServiceProcessor.java:429)
at org.apache.ignite.internal.IgniteKernal.onDisconnected(IgniteKernal.java:4010)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:819)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:602)
 - locked <0x00000000f7ecdfa0> (a java.lang.Object)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4$$Lambda$25/2087171109.run(Unknown Source)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2696)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2734)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
{code}
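
The mechanism described above can be illustrated with a minimal sketch. It is not Ignite's code: {{BusyLock}}, {{tryBlock}} and {{simulate}} are hypothetical names, and a {{ReentrantReadWriteLock}} with a timeout stands in for the spinning write lock so that the demo terminates instead of hanging. The assumption is that the node-stop path blocks the lock first and the disconnect path then tries to block it again from another thread.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class BusyLockDeadlockSketch {
    /** Hypothetical, simplified busy lock; not Ignite's GridSpinBusyLock. */
    static final class BusyLock {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        /** Bounded stand-in for block(): takes the write lock and never releases it. */
        boolean tryBlock(long timeoutMs) {
            try {
                return lock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Runs the task on a named thread and waits for it to finish. */
    private static void runAndJoin(String name, Runnable task) {
        Thread t = new Thread(task, name);
        t.start();
        try {
            t.join();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Returns {first block succeeded, second block succeeded}. */
    static boolean[] simulate() {
        BusyLock busyLock = new BusyLock();
        boolean[] blocked = new boolean[2];

        // Node-stop path: blocks the busy lock and intentionally keeps it.
        runAndJoin("node-stop", () -> blocked[0] = busyLock.tryBlock(200));

        // Disconnect path on another thread: tries to block the same lock again.
        // With an unbounded spin instead of a timeout, this call would never return.
        runAndJoin("disco-notifier", () -> blocked[1] = busyLock.tryBlock(200));

        return blocked;
    }

    public static void main(String[] args) {
        boolean[] blocked = simulate();
        System.out.println("first block acquired: " + blocked[0]);
        System.out.println("second block acquired: " + blocked[1]);
    }
}
```

In this sketch the second attempt times out and reports failure. In the stack trace above, the equivalent call has no timeout, which matches the thread sleeping inside {{GridSpinReadWriteLock.writeLock}}.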

  was:
In rare cases, {{onDisconnected}} may be called while the node is stopping, and a deadlock may occur because {{ServiceDeploymentManager#stopProcessing}} blocks busyLock and intentionally does not release it.

The issue has been found on TeamCity in Zookeeper's suite with the following stack trace:
{code:java}
"disco-notifier-worker-#569118%client4%" #609288 prio=5 os_prio=0 tid=0x00007f905b440800 nid=0x3f6fbd sleeping[0x00007f9383efd000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.ignite.internal.util.GridSpinReadWriteLock.writeLock(GridSpinReadWriteLock.java:204)
at org.apache.ignite.internal.util.GridSpinBusyLock.block(GridSpinBusyLock.java:76)
at org.apache.ignite.internal.processors.service.ServiceDeploymentManager.stopProcessing(ServiceDeploymentManager.java:137)
at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.stopProcessor(IgniteServiceProcessor.java:261)
at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.onDisconnected(IgniteServiceProcessor.java:429)
at org.apache.ignite.internal.IgniteKernal.onDisconnected(IgniteKernal.java:4010)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:819)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:602)
 - locked <0x00000000f7ecdfa0> (a java.lang.Object)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4$$Lambda$25/2087171109.run(Unknown Source)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2696)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2734)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
{code}


> Service Grid: disconnecting during node stop may lead to deadlock
> -----------------------------------------------------------------
>
>                 Key: IGNITE-10899
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10899
>             Project: Ignite
>          Issue Type: Task
>          Components: managed services
>    Affects Versions: 2.7
>            Reporter: Vyacheslav Daradur
>            Assignee: Vyacheslav Daradur
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain
>             Fix For: 2.8
>
>
> In rare cases, {{onDisconnected}} may be called while the node is stopping, and a deadlock may occur because {{ServiceDeploymentManager#stopProcessing}} blocks busyLock and intentionally does not release it.
> The issue has been found on TeamCity in [Zookeeper's suite|https://ci.ignite.apache.org/viewLog.html?buildId=2768270&buildTypeId=IgniteTests24Java8_ZooKeeperDiscovery2] with the following stack trace:
> {code:java}
> "disco-notifier-worker-#569118%client4%" #609288 prio=5 os_prio=0 tid=0x00007f905b440800 nid=0x3f6fbd sleeping[0x00007f9383efd000]
> java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at org.apache.ignite.internal.util.GridSpinReadWriteLock.writeLock(GridSpinReadWriteLock.java:204)
> at org.apache.ignite.internal.util.GridSpinBusyLock.block(GridSpinBusyLock.java:76)
> at org.apache.ignite.internal.processors.service.ServiceDeploymentManager.stopProcessing(ServiceDeploymentManager.java:137)
> at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.stopProcessor(IgniteServiceProcessor.java:261)
> at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.onDisconnected(IgniteServiceProcessor.java:429)
> at org.apache.ignite.internal.IgniteKernal.onDisconnected(IgniteKernal.java:4010)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:819)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:602)
>  - locked <0x00000000f7ecdfa0> (a java.lang.Object)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4$$Lambda$25/2087171109.run(Unknown Source)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2696)
> at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2734)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)