You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ignite.apache.org by "Stanilovsky Evgeny (JIRA)" <ji...@apache.org> on 2017/05/02 13:54:04 UTC

[jira] [Commented] (IGNITE-5125) Need to improve logging in case of hang

    [ https://issues.apache.org/jira/browse/IGNITE-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992932#comment-15992932 ] 

Stanilovsky Evgeny commented on IGNITE-5125:
--------------------------------------------

Yakov, pt 1 it`s all due to :
{code}
// If exchange is in progress it will dump all hanging operations if any.
           if (lastFut != null && !lastFut.isDone())
               return;
{code} does ignite config property will be proper here ?

pt 2\3 i hope GridLogThrottle is good candidate for this purpose (with some modifications) but found memory leak in this realization, force to create jira ticket.

> Need to improve logging in case of hang
> ---------------------------------------
>
>                 Key: IGNITE-5125
>                 URL: https://issues.apache.org/jira/browse/IGNITE-5125
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Yakov Zhdanov
>            Assignee: Stanilovsky Evgeny
>            Priority: Critical
>             Fix For: 2.1
>
>
> 1. When cache operation hangs on node it is not reported as hanged although partition map exchange cannot finish.
> {noformat}
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x00007fdd260fd7c0> (a org.apache.ignite.internal.util.future.GridEmbeddedFuture)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:159)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:119)
> 	at org.apache.ignite.internal.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2356)
> 	at org.apache.ignite.internal.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2354)
> 	at org.apache.ignite.internal.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4168)
> 	at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put0(GridCacheAdapter.java:2354)
> 	at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2335)
> 	at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2312)
> 	at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1379)
> {noformat}
> {noformat}
>   java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x00007fa0698fee50> (a org.apache.ignite.internal.processors.cache.distributed.dht.GridPartitionedSingleGetFuture)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:161)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:119)
> 	at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get0(GridCacheAdapter.java:4570)
> 	at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4544)
> 	at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:1428)
> 	at org.apache.ignite.internal.processors.cache.GridCacheProxyImpl.get(GridCacheProxyImpl.java:329)
> 	at org.apache.ignite.internal.processors.datastructures.DataStructuresProcessor.getAtomic(DataStructuresProcessor.java:589)
> 	at org.apache.ignite.internal.processors.datastructures.DataStructuresProcessor.sequence(DataStructuresProcessor.java:397)
> {noformat}
> 2. Partition exchnage future dumps objects only limited number of times. I would suggest to switch to mode when we double the delay between dumps each time, but no more than 30min
> 3. If exchange worker is stuck at GridDhtPartitionsExchangeFuture.waitPartitionRelease then unreleased partitions should be reported (same rules as of pt 2 apply) 
> {noformat}
> "exchange-worker-#93%...%" #143 prio=5 os_prio=0 tid=0x00007fd782df3000 nid=0x1526 waiting on condition [0x00007fc2dc9c5000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x00007fcab30006b8> (a org.apache.ignite.internal.util.future.GridCompoundFuture)
> 	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:189)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:139)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.waitPartitionRelease(GridDhtPartitionsExchangeFuture.java:980)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:907)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:617)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1701)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)