Posted to issues@ignite.apache.org by "Maxim Muzafarov (Jira)" <ji...@apache.org> on 2019/10/02 13:37:00 UTC

[jira] [Commented] (IGNITE-9068) Node fails to stop when CacheObjectBinaryProcessor.addMeta() is executed inside guard()/unguard()

    [ https://issues.apache.org/jira/browse/IGNITE-9068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942818#comment-16942818 ] 

Maxim Muzafarov commented on IGNITE-9068:
-----------------------------------------

[~ilantukh], [~ilyak], [~agoncharuk]

Folks, any updates? Can we move this to the next release, since the issue has had no attention for more than a year?

> Node fails to stop when CacheObjectBinaryProcessor.addMeta() is executed inside guard()/unguard()
> -------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-9068
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9068
>             Project: Ignite
>          Issue Type: Bug
>          Components: binary, managed services
>    Affects Versions: 2.5
>            Reporter: Ilya Kasnacheev
>            Assignee: Ilya Lantukh
>            Priority: Blocker
>              Labels: test
>             Fix For: 2.8
>
>         Attachments: GridServiceDeadlockTest.java, MyService.java
>
>
> When addMeta is called, e.g. during service deployment, it is executed inside guard()/unguard().
> If the node is stopped at this point, Ignite.stop() will hang.
> Consider the following thread dump:
> {code}
> "Thread-1" #57 prio=5 os_prio=0 tid=0x00007f7780005000 nid=0x7f26 runnable [0x00007f766cbef000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x00000005cb7b0468> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> 	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247)
> 	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
> 	at org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.tryLock(StripedCompositeReadWriteLock.java:220)
> 	at org.apache.ignite.internal.GridKernalGatewayImpl.tryWriteLock(GridKernalGatewayImpl.java:143)
>         // Waiting for lock to cancel futures of BinaryMetadataTransport
> 	at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2171)
> 	at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2094)
> 	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2545)
> 	- locked <0x00000005cb423f00> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
> 	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2508)
> 	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.run(IgnitionEx.java:2033)
> "test-runner-#1%service.GridServiceDeadlockTest%" #13 prio=5 os_prio=0 tid=0x00007f77b87d5800 nid=0x7eb8 waiting on condition [0x00007f778cdfc000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>         // May never return if there are discovery problems
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:463)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$2.addMeta(CacheObjectBinaryProcessorImpl.java:188)
> 	at org.apache.ignite.internal.binary.BinaryContext.registerUserClassDescriptor(BinaryContext.java:802)
> 	at org.apache.ignite.internal.binary.BinaryContext.registerClassDescriptor(BinaryContext.java:761)
> 	at org.apache.ignite.internal.binary.BinaryContext.descriptorForClass(BinaryContext.java:627)
> 	at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal0(BinaryWriterExImpl.java:174)
> 	at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:157)
> 	at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:144)
> 	at org.apache.ignite.internal.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:254)
> 	at org.apache.ignite.internal.binary.BinaryMarshaller.marshal0(BinaryMarshaller.java:82)
> 	at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:58)
> 	at org.apache.ignite.internal.util.IgniteUtils.marshal(IgniteUtils.java:10069)
> 	at org.apache.ignite.internal.processors.service.GridServiceProcessor.prepareServiceConfigurations(GridServiceProcessor.java:570)
> 	at org.apache.ignite.internal.processors.service.GridServiceProcessor.deployAll(GridServiceProcessor.java:622)
> 	at org.apache.ignite.internal.processors.service.GridServiceProcessor.deployAll(GridServiceProcessor.java:610)
>         // Lock held here:
> 	at org.apache.ignite.internal.processors.service.GridServiceProcessor.deployMultiple(GridServiceProcessor.java:498)
> 	at org.apache.ignite.internal.IgniteServicesImpl.deployMultiple(IgniteServicesImpl.java:153)
> 	at org.apache.ignite.internal.processors.service.GridServiceDeadlockTest.testMetadataDeadlock(GridServiceDeadlockTest.java:48)
> {code}
> It seems that waiting for futures inside addMeta (and for that matter inside guard()/unguard() for service deploy) is not safe. Some kind of continuation / compound future is desired here.
> I am attaching a reproducing test.
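
To illustrate the hazard described in the issue, here is a minimal sketch that does not use Ignite at all. It assumes the gateway behaves like a read/write lock: guard()/unguard() take and release the read lock, and stop needs the write lock. The class GuardedWaitSketch, the Gateway stand-in, and the method names are hypothetical and only model the pattern visible in the thread dump. They are not Ignite's actual implementation or a proposed patch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GuardedWaitSketch {
    /** Hypothetical stand-in for the kernal gateway: operations hold the read lock, stop needs the write lock. */
    static final class Gateway {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        void guard() { lock.readLock().lock(); }

        void unguard() { lock.readLock().unlock(); }

        /** Tries to take the write lock within the timeout, as a stopping node would. */
        boolean tryStop(long millis) throws InterruptedException {
            if (!lock.writeLock().tryLock(millis, TimeUnit.MILLISECONDS))
                return false;
            lock.writeLock().unlock();
            return true;
        }
    }

    /** Blocking variant: the operation waits for the metadata future while still guarded. */
    static boolean stopSucceedsWithBlockingWait() throws Exception {
        Gateway gw = new Gateway();
        CompletableFuture<Void> metaFut = new CompletableFuture<>(); // models an addMeta future that is not completing
        CountDownLatch guarded = new CountDownLatch(1);

        Thread op = new Thread(() -> {
            gw.guard();
            try {
                guarded.countDown();
                metaFut.join(); // blocks with the read lock held
            } finally {
                gw.unguard();
            }
        });
        op.start();
        guarded.await();

        boolean stopped = gw.tryStop(200); // cannot get the write lock while the reader is parked

        metaFut.complete(null); // release the blocked thread so the sketch terminates
        op.join();
        return stopped;
    }

    /** Continuation variant: the guarded section only registers a callback and returns at once. */
    static CompletableFuture<String> deployAsync(Gateway gw, CompletableFuture<Void> metaFut) {
        gw.guard();
        try {
            return metaFut.thenApply(v -> "deployed"); // no blocking inside guard()/unguard()
        } finally {
            gw.unguard();
        }
    }

    static boolean stopSucceedsWithContinuation() throws Exception {
        Gateway gw = new Gateway();
        CompletableFuture<Void> metaFut = new CompletableFuture<>();
        CompletableFuture<String> deployFut = deployAsync(gw, metaFut);

        boolean stopped = gw.tryStop(200); // read lock is already released
        metaFut.cancel(false);             // a stopping node can then fail outstanding futures
        return stopped && deployFut.isCompletedExceptionally();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocking wait, stop succeeded: " + stopSucceedsWithBlockingWait());
        System.out.println("continuation, stop succeeded:  " + stopSucceedsWithContinuation());
    }
}
```

In this sketch the blocking variant reports that stop could not acquire the lock, while the continuation variant lets stop proceed and then fails the dependent future. Whether the real fix should take this shape depends on how the service deployment code consumes the metadata future.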



--
This message was sent by Atlassian Jira
(v8.3.4#803005)