Posted to issues@ignite.apache.org by "Sergey Chugunov (JIRA)" <ji...@apache.org> on 2017/07/03 09:47:00 UTC

[jira] [Comment Edited] (IGNITE-5401) Investigate hangs in JDBC driver testIndexState()

    [ https://issues.apache.org/jira/browse/IGNITE-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070359#comment-16070359 ] 

Sergey Chugunov edited comment on IGNITE-5401 at 7/3/17 9:46 AM:
-----------------------------------------------------------------

This hang was caused by a very specific scenario that can occur in a multi-node cluster setup.

The scenario arises because marshaller mappings can be added to the local map on each node in two different ways: they may be read from the file system (from the *%IGNITE_HOME%/marshaller* directory), or they may be created during the mapping exchange process (which involves exchanging proposed/accepted Custom Discovery Messages, CDMs).

Loading from the file system is a *local* operation: when a node reads a mapping, it does not notify other nodes, and it creates the mapping in the *accepted* state right away.
At the same time, the exchange protocol has an optimization that marks proposed messages as duplicates to reduce CDM traffic in the ring.

What happened in the test, and what made it hang, is that one node loaded a mapping from disk while another node requested adding the same mapping via the exchange protocol. The second node's proposed message was marked as a duplicate and skipped, so no accepted message was sent, and the second node waited for that accepted message forever.
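
To make the sequence of events easier to follow, here is a minimal, self-contained sketch of the race in plain Java. It is a toy model only: the class MappingExchangeSketch, the Node type, the method names, the type ID 42, and the class name com.example.Person are all hypothetical and are not Ignite's API. The replyOnDuplicate flag is added purely for contrast; it is one conceivable way to keep the proposer from waiting and is not claimed to be the fix applied in Ignite.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model of the hang described above; all names are hypothetical, not Ignite's API. */
public class MappingExchangeSketch {
    /** One node's local view of marshaller mappings. */
    static final class Node {
        /** Mappings in the accepted state: typeId -> class name. */
        final Map<Integer, String> accepted = new ConcurrentHashMap<>();

        /** Futures of proposals this node has sent and is still waiting on. */
        final Map<Integer, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

        /** Local-only load from disk: the mapping is accepted at once and nobody is notified. */
        void loadFromDisk(int typeId, String clsName) {
            accepted.put(typeId, clsName);
        }

        /** Starts an exchange; the proposer waits on the returned future for an accept. */
        CompletableFuture<String> propose(int typeId) {
            return pending.computeIfAbsent(typeId, id -> new CompletableFuture<>());
        }

        /** Handles an accepted message: stores the mapping and releases the waiting proposer. */
        void onAccept(int typeId, String clsName) {
            accepted.put(typeId, clsName);

            CompletableFuture<String> fut = pending.remove(typeId);

            if (fut != null)
                fut.complete(clsName);
        }
    }

    /**
     * Models a node in the ring handling a proposed message.
     * With replyOnDuplicate == false a duplicate is dropped silently, as in the hang.
     */
    static void handleProposal(Node receiver, Node proposer, int typeId, String clsName,
        boolean replyOnDuplicate) {
        String known = receiver.accepted.get(typeId);

        if (known != null) {
            // Duplicate: the receiver already holds this mapping (here, loaded from disk).
            if (replyOnDuplicate)
                proposer.onAccept(typeId, known); // Hypothetical remedy, shown for contrast only.

            return; // No accept is sent when replyOnDuplicate is false.
        }

        receiver.accepted.put(typeId, clsName);
        proposer.onAccept(typeId, clsName);
    }

    /** Returns true if the proposer's future completed before the timeout. */
    static boolean runScenario(boolean replyOnDuplicate) {
        Node a = new Node();
        Node b = new Node();

        a.loadFromDisk(42, "com.example.Person");      // Node A reads the mapping locally.
        CompletableFuture<String> fut = b.propose(42); // Node B proposes the same mapping.
        handleProposal(a, b, 42, "com.example.Person", replyOnDuplicate);

        try {
            // A bounded wait stands in for the unbounded wait seen in the test.
            fut.get(200, TimeUnit.MILLISECONDS);

            return true;
        }
        catch (TimeoutException e) {
            return false;
        }
        catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("duplicate dropped silently, completed: " + runScenario(false));
        System.out.println("duplicate still answered, completed: " + runScenario(true));
    }
}
```

In this model the first scenario times out because nothing ever completes the proposer's future, which corresponds to the second node waiting forever; the second scenario completes because the duplicate proposal is still answered.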


> Investigate hangs in JDBC driver testIndexState()
> -------------------------------------------------
>
>                 Key: IGNITE-5401
>                 URL: https://issues.apache.org/jira/browse/IGNITE-5401
>             Project: Ignite
>          Issue Type: Task
>          Components: jdbc, sql
>            Reporter: Vladimir Ozerov
>            Assignee: Sergey Chugunov
>             Fix For: 2.1
>
>
> Two JDBC tests hang from time to time. The root cause is the same, since the tests are similar.
> 1) org.apache.ignite.jdbc.thin.JdbcThinDynamicIndexAbstractSelfTest#testIndexState
> 2) org.apache.ignite.internal.jdbc2.JdbcDynamicIndexAbstractSelfTest#testIndexState
> Failures happen only in ATOMIC PARTITIONED caches (with and without "near").
> Stack trace:
> {noformat}
> [17:37:00] :	 [Step 4/5] Thread [name="test-runner-#22990%thin.JdbcThinDynamicIndexAtomicPartitionedSelfTest%", id=29018, state=WAITING, blockCnt=0, waitCnt=4]
> [17:37:00] :	 [Step 4/5]         at sun.misc.Unsafe.park(Native Method)
> [17:37:00] :	 [Step 4/5]         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:315)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:176)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:139)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.MarshallerContextImpl.registerClassName(MarshallerContextImpl.java:262)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.binary.BinaryContext.registerUserClassDescriptor(BinaryContext.java:780)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.binary.BinaryContext.registerClassDescriptor(BinaryContext.java:757)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.binary.BinaryContext.descriptorForClass(BinaryContext.java:628)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.binary.BinaryWriterExImpl.marshal0(BinaryWriterExImpl.java:164)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:147)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:134)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:248)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.binary.CacheObjectBinaryProcessorImpl.marshalToBinary(CacheObjectBinaryProcessorImpl.java:371)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.binary.CacheObjectBinaryProcessorImpl.toBinary(CacheObjectBinaryProcessorImpl.java:849)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.binary.CacheObjectBinaryProcessorImpl.toCacheObject(CacheObjectBinaryProcessorImpl.java:799)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.GridCacheContext.toCacheObject(GridCacheContext.java:1769)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapSingleUpdate(GridNearAtomicSingleUpdateFuture.java:546)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:451)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:440)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1161)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put0(GridDhtAtomicCache.java:650)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2329)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.distributed.near.GridNearAtomicCache.put(GridNearAtomicCache.java:444)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2306)
> [17:37:00] :	 [Step 4/5]         at o.a.i.i.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1494)
> [17:37:00] :	 [Step 4/5]         at o.a.i.jdbc.thin.JdbcThinDynamicIndexAbstractSelfTest.testIndexState(JdbcThinDynamicIndexAbstractSelfTest.java:273)
> [17:37:00] :	 [Step 4/5]         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [17:37:00] :	 [Step 4/5]         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> [17:37:00] :	 [Step 4/5]         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [17:37:00] :	 [Step 4/5]         at java.lang.reflect.Method.invoke(Method.java:606)
> [17:37:00] :	 [Step 4/5]         at junit.framework.TestCase.runTest(TestCase.java:176)
> [17:37:00] :	 [Step 4/5]         at o.a.i.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:1963)
> [17:37:00] :	 [Step 4/5]         at o.a.i.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:130)
> [17:37:00] :	 [Step 4/5]         at o.a.i.testframework.junits.GridAbstractTest$5.run(GridAbstractTest.java:1878)
> [17:37:00] :	 [Step 4/5]         at java.lang.Thread.run(Thread.java:745)
> {noformat}


