You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@ignite.apache.org by bintisepaha <bi...@tudor.com> on 2017/02/09 22:36:22 UTC

Pessimistic TXN did not release lock on a key, all subsequent txns failed

Guys,

We use pessimistic txns in ignite and they have been working great for
months now, however yesterday we ran into an issue, where on multiple
updates to the same key (although in interval of minutes), many updates went
through, but at some point, we started getting TransactionTimeoutException
because the key was locked. After that any update for the same key failed
with the same error.

We checked the logs for the last successful update and all looks good.
Something happened that although the txn was completed, they key was still
not released. We had to actually restart the cluster post business hours to
resolve the issue.

1) Is there a way to avoid this? 
2) Can we monitor objects via visor or jconsole to see if some keys are
locked for a prolonged period of time?
3) Is there a way to free up the lock remotely or by running some salvaging
code?

Please advise, your help is much appreciated.

Thanks,
Binti




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by agura <ag...@apache.org>.
Hi,

I think writeSynchronizationMode doesn't affect key locking. Actually this
mode is just hint for transaction origination node ant it defines when tx
should return control flow to user code: when all nodes responded or only
primary nodes responded (without waiting for reply from backups).



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p13516.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Andrey, It has been very hard for us to reproduce this. 
However, I have a question around writeSynchronizationMode. The caches we
have seen being locked from time to time and not having the key released are
in "FULL_SYNC" mode. All our other caches are also distributed but in
PRIMARY_SYNC mode. 

We are having some different issues with upgrading to 1.9. We are seeing
grid stalling in 1.7 and similar in 1.9 which my co-worker would post on a
new thread. But that is holding us from upgrading.

Let us know if you think different writeSynchronizationModes would affect
key locking or could be hte cause of our issue.

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p13078.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi,

issue mentioned early is related with both pessimistic and optimistic
transactions. So it is still possible that it is your case.

Anyway, I don't have any other ideas about the problem. It would be
great to have reproducer in order to debug issue. But from my point of
view migaration to newer version is the best way.

On Fri, May 19, 2017 at 12:23 PM, yfernando <yo...@tudor.com> wrote:
> The issue you mention doesn't seem to be related to the issue we are having
> as,
> - The entire grid does not lock up as in the situation described in the bug
> - There are no threads blocked or locked when our key lock occurs
> - The bug seems to occur on Optimistic locking whereas our scenario occurs
> on Pessimistic
>
>
> agura wrote
>> There was a problem with incorrect transaction timeout handling [1]
>> that was fixed in Ignite 1.8. It is possible that it is your case.
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-2797
>
> Another observation we had was that the key that was locked occurred on a
> cache that gets loaded at startup and never changes content. The cache does
> however take part in transactions in terms of cache reads (using
> IgniteCache.get() )
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p13021.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by yfernando <yo...@tudor.com>.
The issue you mention doesn't seem to be related to the issue we are having
as,
- The entire grid does not lock up as in the situation described in the bug
- There are no threads blocked or locked when our key lock occurs
- The bug seems to occur on Optimistic locking whereas our scenario occurs
on Pessimistic


agura wrote
> There was a problem with incorrect transaction timeout handling [1]
> that was fixed in Ignite 1.8. It is possible that it is your case.
> 
> [1] https://issues.apache.org/jira/browse/IGNITE-2797

Another observation we had was that the key that was locked occurred on a
cache that gets loaded at startup and never changes content. The cache does
however take part in transactions in terms of cache reads (using
IgniteCache.get() )



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p13021.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi,

There was a problem with incorrect transaction timeout handling [1]
that was fixed in Ignite 1.8. It is possible that it is your case.

[1] https://issues.apache.org/jira/browse/IGNITE-2797

On Thu, May 11, 2017 at 1:51 AM, bintisepaha <bi...@tudor.com> wrote:
> Hey guys, we had a key lock issue again on 1.7.0. here is a suspicious thread
> dump. Is this helpful for tracking down our issue further?
> we did not see any topology changes or any other exceptions.
>
> Attaching the entire thread dump too tdump.zip
> <http://apache-ignite-users.70518.x6.nabble.com/file/n12611/tdump.zip>
>
> "pub-#7%DataGridServer-Production%" Id=47 in WAITING on
> lock=org.apache.ignite.internal.util.future.GridFutureAdapter$ChainFuture@2094df59
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>   at
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:159)
>   at
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:117)
>   at
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4800)
>   at
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4783)
>   at
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:1395)
>   at
> org.apache.ignite.internal.processors.cache.IgniteCacheProxy.get(IgniteCacheProxy.java:956)
>   at
> com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.updatePosition(OrderHolderSaveRunnable.java:790)
>   at
> com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.cancelPosition(OrderHolderSaveRunnable.java:805)
>   at
> com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.cancelExistingTradeOrderForPositionUpdate(OrderHolderSaveRunnable.java:756)
>   at
> com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.processOrderHolders(OrderHolderSaveRunnable.java:356)
>   at
> com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.run(OrderHolderSaveRunnable.java:109)
>   at
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$C4V2.execute(GridClosureProcessor.java:2184)
>   at
> org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:509)
>   at
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6521)
>   at
> org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:503)
>   at
> org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:456)
>   at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at
> org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1161)
>   at
> org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1766)
>   at
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
>   at
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
>   at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
>   at
> org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
>   at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
>
>   Locked synchronizers: count = 1
>      <1237e0be>  - java.util.concurrent.ThreadPoolExecutor$Worker@1237e0be
>
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p12611.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Hey guys, we had a key lock issue again on 1.7.0. here is a suspicious thread
dump. Is this helpful for tracking down our issue further?
we did not see any topology changes or any other exceptions.

Attaching the entire thread dump too tdump.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n12611/tdump.zip>  

"pub-#7%DataGridServer-Production%" Id=47 in WAITING on
lock=org.apache.ignite.internal.util.future.GridFutureAdapter$ChainFuture@2094df59
  at sun.misc.Unsafe.park(Native Method)
  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
  at
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
  at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
  at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:159)
  at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:117)
  at
org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4800)
  at
org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4783)
  at
org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:1395)
  at
org.apache.ignite.internal.processors.cache.IgniteCacheProxy.get(IgniteCacheProxy.java:956)
  at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.updatePosition(OrderHolderSaveRunnable.java:790)
  at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.cancelPosition(OrderHolderSaveRunnable.java:805)
  at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.cancelExistingTradeOrderForPositionUpdate(OrderHolderSaveRunnable.java:756)
  at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.processOrderHolders(OrderHolderSaveRunnable.java:356)
  at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.run(OrderHolderSaveRunnable.java:109)
  at
org.apache.ignite.internal.processors.closure.GridClosureProcessor$C4V2.execute(GridClosureProcessor.java:2184)
  at
org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:509)
  at
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6521)
  at
org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:503)
  at
org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:456)
  at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
  at
org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1161)
  at
org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1766)
  at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
  at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
  at
org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
  at
org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
  at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)

  Locked synchronizers: count = 1
     <1237e0be>  - java.util.concurrent.ThreadPoolExecutor$Worker@1237e0be





--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p12611.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi,

fix merged into ignite-2.0 that will be released soon.

On Tue, Apr 25, 2017 at 4:52 PM, bintisepaha <bi...@tudor.com> wrote:
> We have not been able to migrate to 1.8 yet due to business reasons, and now
> we are planning to migrate to 1.9 directly and are testing it.
>
> Do you have an ETA for the fix? Also do you have the code for reproducing
> it?
>
> We are also changing our clients to become long running servers so that the
> topology does not change a lot.
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p12237.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
We have not been able to migrate to 1.8 yet due to business reasons, and now
we are planning to migrate to 1.9 directly and are testing it.

Do you have an ETA for the fix? Also do you have the code for reproducing
it?

We are also changing our clients to become long running servers so that the
topology does not change a lot.

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p12237.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi Binti,

unfortunately I don't have any other ideas why this NPE happens
(except of two mentioned above). But I believe that fix of this
problem will prevent this NPE in all other cases.

About key remaining future: in some cases mentioned NPE led to it.

BTW, did you migrate to Ignite 1.8?


On Fri, Apr 21, 2017 at 10:39 PM, bintisepaha <bi...@tudor.com> wrote:
> Andrey, we never start a txn on the client side. The key that gets locked on
> our end and stays locked even after a successful txn is never read or
> updated in a txn from the client side. Are you also able to reproduce the
> key remaining locked issue?
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p12164.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Andrey, we never start a txn on the client side. The key that gets locked on
our end and stays locked even after a successful txn is never read or
updated in a txn from the client side. Are you also able to reproduce the
key remaining locked issue?

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p12164.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi,

I have a simple reproducer and JIRA ticket is created:
https://issues.apache.org/jira/browse/IGNITE-5041

Workaround looks very strange but it should work: you should call
getOrCreateCache on client node for all your caches or disable
deadlock detection (see ticket)


On Thu, Apr 20, 2017 at 12:32 AM, bintisepaha <bi...@tudor.com> wrote:
> This is positive news Andrey. Thanks a lot.
>
> Please keep us posted about reproducing this. We are definitely not using
> node filters...and we suspect topology changes to be causing issues, but
> irrespective of that, we are not able to reproduce it. we also do not see
> deadlock issues reported anywhere. the last time we got a key lock last
> week, we did not see the NPE but only topology change for client.
>
> Thanks,
> Binti
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p12094.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
This is positive news Andrey. Thanks a lot.

Please keep us posted about reproducing this. We are definitely not using
node filters...and we suspect topology changes to be causing issues, but
irrespective of that, we are not able to reproduce it. we also do not see
deadlock issues reported anywhere. the last time we got a key lock last
week, we did not see the NPE but only topology change for client.

Thanks,
Binti




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p12094.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi,

I've reproduced the problem and have exatly the same stack traces for
NullPointerException and IgniteTxTimeoutCheckedException that you
mentioned early.

But my case looks too complex. I started three nodes and cache1 on
nodes N1, N2 and N3, and cache2 on nodes N1 and N2. After it deadlock
was created between nodes N1 and N2 where both caches are
participiants (without transaction timeout). And finally I try to
update key (that participiates in deadlock) for cache2 from N3 in
transaction with timeout. As result deadlock detection receives
message from nodes N1 or/and N2 that contains information about cache1
that isn't started on N3. This leads to NPE and timeout.

I think that similar situation can happen in case of deadlock between
two caches on server nodes and attempt to update key from client node.
Will try this idea tomorrow hopefully.

This problem should exist on any Ignite version starting from 1.7.

On Mon, Apr 17, 2017 at 11:46 PM, bintisepaha <bi...@tudor.com> wrote:
> Looking further, I see this in the failed exception stack trace. The topology
> did change but it is only a client that joined, do you think that has any
> correlation to the key being locked?
>
> [INFO ] 2017-04-13 14:15:44.360 [pub-#44%DataGridServer-Production%]
> OrderHolderSaveRunnable - Updating PositionKey: PositionId [fundAbbrev=BVI,
> clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
> valueDate=19000101]
> [14:15:46] Topology snapshot [ver=1980, servers=16, clients=82, CPUs=273,
> heap=850.0GB]
> [ERROR] 2017-04-13 14:15:54.348 [pub-#44%DataGridServer-Production%]
> OrderHolderSaveRunnable - Received Exception - printing on Entry
> javax.cache.CacheException: class
> org.apache.ignite.transactions.TransactionTimeoutException: Failed to
> acquire lock within provided timeout for transaction [timeout=10000,
> tx=GridNearTxLocal [mappings=IgniteTxMappingsImpl [],
> nearLocallyMapped=false, colocatedLocallyMapped=false, needCheckBackup=null,
> hasRemoteLocks=true, thread=pub-#44%DataGridServer-Production%,
> mappings=IgniteTxMappingsImpl [], super=GridDhtTxLocalAdapter
> [nearOnOriginatingNode=false, nearNodes=[], dhtNodes=[], explicitLock=false,
> super=IgniteTxLocalAdapter [completedBase=null, sndTransformedVals=false,
> depEnabled=false, txState=IgniteTxStateImpl [activeCacheIds=GridLongList
> [idx=2, arr=[2062286236,812449097]], txMap={IgniteTxKey
> [key=KeyCacheObjectImpl [val=OrderKey [traderId=6531, orderId=12382604],
> hasValBytes=true], cacheId=2062286236]=IgniteTxEntry [key=KeyCacheObjectImpl
> [val=OrderKey [traderId=6531, orderId=12382604], hasValBytes=true],
> cacheId=2062286236, partId=-1, txKey=IgniteTxKey [key=KeyCacheObjectImpl
> [val=OrderKey [traderId=6531, orderId=12382604], hasValBytes=true],
> cacheId=2062286236], val=[op=READ, val=CacheObjectImpl [val=TradeOrder
> [orderKey=OrderKey [traderId=6531, orderId=12382604], insIid=679675,
> clearBrokerId=12718, strategy=AFI, time=2017-04-13 13:30:00.0,
> settlement=2017-04-19 00:00:00.0, quantity=-6800.0, insType=STK, version=1,
> userId=3081, created=2017-04-13 13:29:47.831, status=open, allocFund=STD,
> isAlloc=Y, clearAgent=MSCOEPB, execBroker=DBKSE, initiate=L,
> notes=ClOrdId[20170413-Y47D580RHH99], allocRule=H2L, comType=T, comTurn=N,
> comImplied=N, trdCur=USD, trdFreeze=N, kindFlag=, lastRepo=, exCpn=,
> generatedTime=Thu Apr 13 14:15:02 EDT 2017, batchMatchFlag=N,
> commission=0.003, trdRate=1.0, gross=, delivInstruct=null, startflys=3,
> parentId=null, linkId=null, repo=N, repoRate=null, repoCalendar=null,
> repoStartDate=null, repoEndDate=null, xiid=null, quantityCurr=null,
> masterOrderId=null, unfilledQty=800.0, avgFillPrice=18.0021324, psRuleId=6,
> origDate=2017-04-13 00:00:00.0, postingId=2, executingUserId=5647,
> repoCloseDate=1900-01-01 00:00:00.0, repoPrice=0.0, directFxFlag=N, tax=0.0,
> fixStatusId=58, txnTypeId=0, yield=null, valueDate=null,
> interestOnlyRepoFlag=null, orderGroupId=0, fundingDate=2017-04-19
> 00:00:00.0, execBrokerId=12038, branchBrokerId=7511, fillOrigUserId=3081,
> initialMargin=null, cmmsnChgUserId=0, cmmsnChgReasonId=0, fixingSourceId=0,
> orderDesignationId=0, riskRewardId=0, placementTime=2017-04-13 13:29:47.657,
> initialInvestment=0.0, equityFxBrokerTypeId=0, execBranchBrokerId=0,
> createUserId=3081, targetAllocFlag=N, pvDate=null, pvFactor=null, pvId=0,
> executionTypeId=0, borrowScheduleId=0, borrowScheduleTypeId=0,
> marketPrice=null, interestAccrualDate=null, sourceAppId=103,
> initiatingUserId=6531, isDiscretionary=Y, traderBsssc=S, clearingBsssc=S,
> executingBsssc=S, shortsellBanApproverUserId=null, intendedQuantity=-7600.0,
> lastUpdated=2017-04-13 14:15:02.147, traderStrategyId=24686,
> businessDate=2017-04-13 00:00:00.0, firstExecutionTime=2017-04-13
> 13:29:47.657, doNotBulkFlag=null, trimDb=trim_grn, trades=[Trade
> [tradeKey=TradeKey [tradeId=263603637, tradeId64=789971421, traderId=6531],
> orderId=12382604, ftbId=2023850, quantity=-985.0, fundAbbrev=TRCP,
> subfundAbbrev=TRCP_EDAB, date=Thu Apr 13 00:00:00 EDT 2017,
> commission=0.003, fillId=1, flyallocNumber=1, pnlTime=Tue Jan 01 00:00:00
> EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan
> 01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0,
> splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey [tradeId=888175445,
> tradeId64=182624390, traderId=6531], orderId=12382604, ftbId=2022525,
> quantity=-4141.0, fundAbbrev=BVI, subfundAbbrev=BVI_EDAB, date=Thu Apr 13
> 00:00:00 EDT 2017, commission=0.003, fillId=1, flyallocNumber=2, pnlTime=Tue
> Jan 01 00:00:00 EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I,
> psSettle=Tue Jan 01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036,
> splitTradeId=0, splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey
> [tradeId=938093803, tradeId64=953318988, traderId=6531], orderId=12382604,
> ftbId=2022524, quantity=-1674.0, fundAbbrev=TGF, subfundAbbrev=TGF_EDAB,
> date=Thu Apr 13 00:00:00 EDT 2017, commission=0.003, fillId=1,
> flyallocNumber=3, pnlTime=Tue Jan 01 00:00:00 EST 2036, price=18.0021324,
> psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan 01 00:00:00 EST 2036,
> psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0, splitTradeId64=0,
> trimDb=]]], hasValBytes=true]], prevVal=[op=READ, val=CacheObjectImpl
> [val=TradeOrder [orderKey=OrderKey [traderId=6531, orderId=12382604],
> insIid=679675, clearBrokerId=12718, strategy=AFI, time=2017-04-13
> 13:30:00.0, settlement=2017-04-19 00:00:00.0, quantity=-6800.0, insType=STK,
> version=1, userId=3081, created=2017-04-13 13:29:47.831, status=open,
> allocFund=STD, isAlloc=Y, clearAgent=MSCOEPB, execBroker=DBKSE, initiate=L,
> notes=ClOrdId[20170413-Y47D580RHH99], allocRule=H2L, comType=T, comTurn=N,
> comImplied=N, trdCur=USD, trdFreeze=N, kindFlag=, lastRepo=, exCpn=,
> generatedTime=Thu Apr 13 14:15:02 EDT 2017, batchMatchFlag=N,
> commission=0.003, trdRate=1.0, gross=, delivInstruct=null, startflys=3,
> parentId=null, linkId=null, repo=N, repoRate=null, repoCalendar=null,
> repoStartDate=null, repoEndDate=null, xiid=null, quantityCurr=null,
> masterOrderId=null, unfilledQty=800.0, avgFillPrice=18.0021324, psRuleId=6,
> origDate=2017-04-13 00:00:00.0, postingId=2, executingUserId=5647,
> repoCloseDate=1900-01-01 00:00:00.0, repoPrice=0.0, directFxFlag=N, tax=0.0,
> fixStatusId=58, txnTypeId=0, yield=null, valueDate=null,
> interestOnlyRepoFlag=null, orderGroupId=0, fundingDate=2017-04-19
> 00:00:00.0, execBrokerId=12038, branchBrokerId=7511, fillOrigUserId=3081,
> initialMargin=null, cmmsnChgUserId=0, cmmsnChgReasonId=0, fixingSourceId=0,
> orderDesignationId=0, riskRewardId=0, placementTime=2017-04-13 13:29:47.657,
> initialInvestment=0.0, equityFxBrokerTypeId=0, execBranchBrokerId=0,
> createUserId=3081, targetAllocFlag=N, pvDate=null, pvFactor=null, pvId=0,
> executionTypeId=0, borrowScheduleId=0, borrowScheduleTypeId=0,
> marketPrice=null, interestAccrualDate=null, sourceAppId=103,
> initiatingUserId=6531, isDiscretionary=Y, traderBsssc=S, clearingBsssc=S,
> executingBsssc=S, shortsellBanApproverUserId=null, intendedQuantity=-7600.0,
> lastUpdated=2017-04-13 14:15:02.147, traderStrategyId=24686,
> businessDate=2017-04-13 00:00:00.0, firstExecutionTime=2017-04-13
> 13:29:47.657, doNotBulkFlag=null, trimDb=trim_grn, trades=[Trade
> [tradeKey=TradeKey [tradeId=263603637, tradeId64=789971421, traderId=6531],
> orderId=12382604, ftbId=2023850, quantity=-985.0, fundAbbrev=TRCP,
> subfundAbbrev=TRCP_EDAB, date=Thu Apr 13 00:00:00 EDT 2017,
> commission=0.003, fillId=1, flyallocNumber=1, pnlTime=Tue Jan 01 00:00:00
> EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan
> 01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0,
> splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey [tradeId=888175445,
> tradeId64=182624390, traderId=6531], orderId=12382604, ftbId=2022525,
> quantity=-4141.0, fundAbbrev=BVI, subfundAbbrev=BVI_EDAB, date=Thu Apr 13
> 00:00:00 EDT 2017, commission=0.003, fillId=1, flyallocNumber=2, pnlTime=Tue
> Jan 01 00:00:00 EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I,
> psSettle=Tue Jan 01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036,
> splitTradeId=0, splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey
> [tradeId=938093803, tradeId64=953318988, traderId=6531], orderId=12382604,
> ftbId=2022524, quantity=-1674.0, fundAbbrev=TGF, subfundAbbrev=TGF_EDAB,
> date=Thu Apr 13 00:00:00 EDT 2017, commission=0.003, fillId=1,
> flyallocNumber=3, pnlTime=Tue Jan 01 00:00:00 EST 2036, price=18.0021324,
> psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan 01 00:00:00 EST 2036,
> psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0, splitTradeId64=0,
> trimDb=]]], hasValBytes=true]], entryProcessorsCol=null, ttl=-1,
> conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
> filters=null, filtersPassed=false, filtersSet=true,
> entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
> [super=GridCacheMapEntry [key=KeyCacheObjectImpl [val=OrderKey
> [traderId=6531, orderId=12382604], hasValBytes=true], val=CacheObjectImpl
> [val=TradeOrder [orderKey=OrderKey [traderId=6531, orderId=12382604],
> insIid=679675, clearBrokerId=12718, strategy=AFI, time=2017-04-13
> 13:30:00.0, settlement=2017-04-19 00:00:00.0, quantity=-6800.0, insType=STK,
> version=1, userId=3081, created=2017-04-13 13:29:47.831, status=open,
> allocFund=STD, isAlloc=Y, clearAgent=MSCOEPB, execBroker=DBKSE, initiate=L,
> notes=ClOrdId[20170413-Y47D580RHH99], allocRule=H2L, comType=T, comTurn=N,
> comImplied=N, trdCur=USD, trdFreeze=N, kindFlag=, lastRepo=, exCpn=,
> generatedTime=Thu Apr 13 14:15:02 EDT 2017, batchMatchFlag=N,
> commission=0.003, trdRate=1.0, gross=, delivInstruct=null, startflys=3,
> parentId=null, linkId=null, repo=N, repoRate=null, repoCalendar=null,
> repoStartDate=null, repoEndDate=null, xiid=null, quantityCurr=null,
> masterOrderId=null, unfilledQty=800.0, avgFillPrice=18.0021324, psRuleId=6,
> origDate=2017-04-13 00:00:00.0, postingId=2, executingUserId=5647,
> repoCloseDate=1900-01-01 00:00:00.0, repoPrice=0.0, directFxFlag=N, tax=0.0,
> fixStatusId=58, txnTypeId=0, yield=null, valueDate=null,
> interestOnlyRepoFlag=null, orderGroupId=0, fundingDate=2017-04-19
> 00:00:00.0, execBrokerId=12038, branchBrokerId=7511, fillOrigUserId=3081,
> initialMargin=null, cmmsnChgUserId=0, cmmsnChgReasonId=0, fixingSourceId=0,
> orderDesignationId=0, riskRewardId=0, placementTime=2017-04-13 13:29:47.657,
> initialInvestment=0.0, equityFxBrokerTypeId=0, execBranchBrokerId=0,
> createUserId=3081, targetAllocFlag=N, pvDate=null, pvFactor=null, pvId=0,
> executionTypeId=0, borrowScheduleId=0, borrowScheduleTypeId=0,
> marketPrice=null, interestAccrualDate=null, sourceAppId=103,
> initiatingUserId=6531, isDiscretionary=Y, traderBsssc=S, clearingBsssc=S,
> executingBsssc=S, shortsellBanApproverUserId=null, intendedQuantity=-7600.0,
> lastUpdated=2017-04-13 14:15:02.147, traderStrategyId=24686,
> businessDate=2017-04-13 00:00:00.0, firstExecutionTime=2017-04-13
> 13:29:47.657, doNotBulkFlag=null, trimDb=trim_grn, trades=[Trade
> [tradeKey=TradeKey [tradeId=263603637, tradeId64=789971421, traderId=6531],
> orderId=12382604, ftbId=2023850, quantity=-985.0, fundAbbrev=TRCP,
> subfundAbbrev=TRCP_EDAB, date=Thu Apr 13 00:00:00 EDT 2017,
> commission=0.003, fillId=1, flyallocNumber=1, pnlTime=Tue Jan 01 00:00:00
> EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan
> 01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0,
> splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey [tradeId=888175445,
> tradeId64=182624390, traderId=6531], orderId=12382604, ftbId=2022525,
> quantity=-4141.0, fundAbbrev=BVI, subfundAbbrev=BVI_EDAB, date=Thu Apr 13
> 00:00:00 EDT 2017, commission=0.003, fillId=1, flyallocNumber=2, pnlTime=Tue
> Jan 01 00:00:00 EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I,
> psSettle=Tue Jan 01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036,
> splitTradeId=0, splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey
> [tradeId=938093803, tradeId64=953318988, traderId=6531], orderId=12382604,
> ftbId=2022524, quantity=-1674.0, fundAbbrev=TGF, subfundAbbrev=TGF_EDAB,
> date=Thu Apr 13 00:00:00 EDT 2017, commission=0.003, fillId=1,
> flyallocNumber=3, pnlTime=Tue Jan 01 00:00:00 EST 2036, price=18.0021324,
> psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan 01 00:00:00 EST 2036,
> psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0, splitTradeId64=0,
> trimDb=]]], hasValBytes=true], startVer=1492027414017, ver=GridCacheVersion
> [topVer=103141923, time=1492107302158, order=1492027411508, nodeOrder=14],
> hash=383868216, extras=null, flags=0]]], prepared=0, locked=true,
> nodeId=89f19ec9-462e-4d99-a267-1189457c124b, locMapped=false,
> expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
> serReadVer=null, xidVer=GridCacheVersion [topVer=103141926,
> time=1492107344341, order=1492027414014, nodeOrder=13]], IgniteTxKey
> [key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
> clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
> valueDate=19000101], hasValBytes=true], cacheId=812449097]=IgniteTxEntry
> [key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
> clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
> valueDate=19000101], hasValBytes=true], cacheId=812449097, partId=-1,
> txKey=IgniteTxKey [key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
> clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
> valueDate=19000101], hasValBytes=true], cacheId=812449097], val=[op=READ,
> val=null], prevVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
> conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
> filters=null, filtersPassed=false, filtersSet=true,
> entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
> [super=GridCacheMapEntry [key=KeyCacheObjectImpl [val=PositionId
> [fundAbbrev=BVI, clearBrokerId=12718, insIid=679675, strategy=AFI,
> traderId=6531, valueDate=19000101], hasValBytes=true], val=null,
> startVer=1492027414018, ver=GridCacheVersion [topVer=103141926,
> time=1492107344360, order=1492027414018, nodeOrder=13], hash=-1282641846,
> extras=null, flags=0]]], prepared=0, locked=false,
> nodeId=63e3e087-15c5-45e8-9420-9ee657a5a871, locMapped=false,
> expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
> serReadVer=null, xidVer=GridCacheVersion [topVer=103141926,
> time=1492107344341, order=1492027414014, nodeOrder=13]]}],
> super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=103141926,
> time=1492107344341, order=1492027414014, nodeOrder=13], writeVer=null,
> implicit=false, loc=true, threadId=83, startTime=1492107344336,
> nodeId=bdd5e4ed-aac9-4769-b241-a2e6f21f7e18, startVer=GridCacheVersion
> [topVer=103141926, time=1492107344341, order=1492027414014, nodeOrder=13],
> endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC,
> timeout=10000, sysInvalidate=false, sys=false, plc=2, commitVer=null,
> finalizing=NONE, preparing=false, invalidParts=null, state=MARKED_ROLLBACK,
> timedOut=false, topVer=AffinityTopologyVersion [topVer=1979,
> minorTopVer=34], duration=10006ms, onePhaseCommit=false], size=2]]]]
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p12009.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Looking further, I see this in the failed exception stack trace. The topology
did change but it is only a client that joined, do you think that has any
correlation to the key being locked?

[INFO ] 2017-04-13 14:15:44.360 [pub-#44%DataGridServer-Production%]
OrderHolderSaveRunnable - Updating PositionKey: PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101]
[14:15:46] Topology snapshot [ver=1980, servers=16, clients=82, CPUs=273,
heap=850.0GB]
[ERROR] 2017-04-13 14:15:54.348 [pub-#44%DataGridServer-Production%]
OrderHolderSaveRunnable - Received Exception - printing on Entry
javax.cache.CacheException: class
org.apache.ignite.transactions.TransactionTimeoutException: Failed to
acquire lock within provided timeout for transaction [timeout=10000,
tx=GridNearTxLocal [mappings=IgniteTxMappingsImpl [],
nearLocallyMapped=false, colocatedLocallyMapped=false, needCheckBackup=null,
hasRemoteLocks=true, thread=pub-#44%DataGridServer-Production%,
mappings=IgniteTxMappingsImpl [], super=GridDhtTxLocalAdapter
[nearOnOriginatingNode=false, nearNodes=[], dhtNodes=[], explicitLock=false,
super=IgniteTxLocalAdapter [completedBase=null, sndTransformedVals=false,
depEnabled=false, txState=IgniteTxStateImpl [activeCacheIds=GridLongList
[idx=2, arr=[2062286236,812449097]], txMap={IgniteTxKey
[key=KeyCacheObjectImpl [val=OrderKey [traderId=6531, orderId=12382604],
hasValBytes=true], cacheId=2062286236]=IgniteTxEntry [key=KeyCacheObjectImpl
[val=OrderKey [traderId=6531, orderId=12382604], hasValBytes=true],
cacheId=2062286236, partId=-1, txKey=IgniteTxKey [key=KeyCacheObjectImpl
[val=OrderKey [traderId=6531, orderId=12382604], hasValBytes=true],
cacheId=2062286236], val=[op=READ, val=CacheObjectImpl [val=TradeOrder
[orderKey=OrderKey [traderId=6531, orderId=12382604], insIid=679675,
clearBrokerId=12718, strategy=AFI, time=2017-04-13 13:30:00.0,
settlement=2017-04-19 00:00:00.0, quantity=-6800.0, insType=STK, version=1,
userId=3081, created=2017-04-13 13:29:47.831, status=open, allocFund=STD,
isAlloc=Y, clearAgent=MSCOEPB, execBroker=DBKSE, initiate=L,
notes=ClOrdId[20170413-Y47D580RHH99], allocRule=H2L, comType=T, comTurn=N,
comImplied=N, trdCur=USD, trdFreeze=N, kindFlag=, lastRepo=, exCpn=,
generatedTime=Thu Apr 13 14:15:02 EDT 2017, batchMatchFlag=N,
commission=0.003, trdRate=1.0, gross=, delivInstruct=null, startflys=3,
parentId=null, linkId=null, repo=N, repoRate=null, repoCalendar=null,
repoStartDate=null, repoEndDate=null, xiid=null, quantityCurr=null,
masterOrderId=null, unfilledQty=800.0, avgFillPrice=18.0021324, psRuleId=6,
origDate=2017-04-13 00:00:00.0, postingId=2, executingUserId=5647,
repoCloseDate=1900-01-01 00:00:00.0, repoPrice=0.0, directFxFlag=N, tax=0.0,
fixStatusId=58, txnTypeId=0, yield=null, valueDate=null,
interestOnlyRepoFlag=null, orderGroupId=0, fundingDate=2017-04-19
00:00:00.0, execBrokerId=12038, branchBrokerId=7511, fillOrigUserId=3081,
initialMargin=null, cmmsnChgUserId=0, cmmsnChgReasonId=0, fixingSourceId=0,
orderDesignationId=0, riskRewardId=0, placementTime=2017-04-13 13:29:47.657,
initialInvestment=0.0, equityFxBrokerTypeId=0, execBranchBrokerId=0,
createUserId=3081, targetAllocFlag=N, pvDate=null, pvFactor=null, pvId=0,
executionTypeId=0, borrowScheduleId=0, borrowScheduleTypeId=0,
marketPrice=null, interestAccrualDate=null, sourceAppId=103,
initiatingUserId=6531, isDiscretionary=Y, traderBsssc=S, clearingBsssc=S,
executingBsssc=S, shortsellBanApproverUserId=null, intendedQuantity=-7600.0,
lastUpdated=2017-04-13 14:15:02.147, traderStrategyId=24686,
businessDate=2017-04-13 00:00:00.0, firstExecutionTime=2017-04-13
13:29:47.657, doNotBulkFlag=null, trimDb=trim_grn, trades=[Trade
[tradeKey=TradeKey [tradeId=263603637, tradeId64=789971421, traderId=6531],
orderId=12382604, ftbId=2023850, quantity=-985.0, fundAbbrev=TRCP,
subfundAbbrev=TRCP_EDAB, date=Thu Apr 13 00:00:00 EDT 2017,
commission=0.003, fillId=1, flyallocNumber=1, pnlTime=Tue Jan 01 00:00:00
EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan
01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0,
splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey [tradeId=888175445,
tradeId64=182624390, traderId=6531], orderId=12382604, ftbId=2022525,
quantity=-4141.0, fundAbbrev=BVI, subfundAbbrev=BVI_EDAB, date=Thu Apr 13
00:00:00 EDT 2017, commission=0.003, fillId=1, flyallocNumber=2, pnlTime=Tue
Jan 01 00:00:00 EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I,
psSettle=Tue Jan 01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036,
splitTradeId=0, splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey
[tradeId=938093803, tradeId64=953318988, traderId=6531], orderId=12382604,
ftbId=2022524, quantity=-1674.0, fundAbbrev=TGF, subfundAbbrev=TGF_EDAB,
date=Thu Apr 13 00:00:00 EDT 2017, commission=0.003, fillId=1,
flyallocNumber=3, pnlTime=Tue Jan 01 00:00:00 EST 2036, price=18.0021324,
psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan 01 00:00:00 EST 2036,
psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0, splitTradeId64=0,
trimDb=]]], hasValBytes=true]], prevVal=[op=READ, val=CacheObjectImpl
[val=TradeOrder [orderKey=OrderKey [traderId=6531, orderId=12382604],
insIid=679675, clearBrokerId=12718, strategy=AFI, time=2017-04-13
13:30:00.0, settlement=2017-04-19 00:00:00.0, quantity=-6800.0, insType=STK,
version=1, userId=3081, created=2017-04-13 13:29:47.831, status=open,
allocFund=STD, isAlloc=Y, clearAgent=MSCOEPB, execBroker=DBKSE, initiate=L,
notes=ClOrdId[20170413-Y47D580RHH99], allocRule=H2L, comType=T, comTurn=N,
comImplied=N, trdCur=USD, trdFreeze=N, kindFlag=, lastRepo=, exCpn=,
generatedTime=Thu Apr 13 14:15:02 EDT 2017, batchMatchFlag=N,
commission=0.003, trdRate=1.0, gross=, delivInstruct=null, startflys=3,
parentId=null, linkId=null, repo=N, repoRate=null, repoCalendar=null,
repoStartDate=null, repoEndDate=null, xiid=null, quantityCurr=null,
masterOrderId=null, unfilledQty=800.0, avgFillPrice=18.0021324, psRuleId=6,
origDate=2017-04-13 00:00:00.0, postingId=2, executingUserId=5647,
repoCloseDate=1900-01-01 00:00:00.0, repoPrice=0.0, directFxFlag=N, tax=0.0,
fixStatusId=58, txnTypeId=0, yield=null, valueDate=null,
interestOnlyRepoFlag=null, orderGroupId=0, fundingDate=2017-04-19
00:00:00.0, execBrokerId=12038, branchBrokerId=7511, fillOrigUserId=3081,
initialMargin=null, cmmsnChgUserId=0, cmmsnChgReasonId=0, fixingSourceId=0,
orderDesignationId=0, riskRewardId=0, placementTime=2017-04-13 13:29:47.657,
initialInvestment=0.0, equityFxBrokerTypeId=0, execBranchBrokerId=0,
createUserId=3081, targetAllocFlag=N, pvDate=null, pvFactor=null, pvId=0,
executionTypeId=0, borrowScheduleId=0, borrowScheduleTypeId=0,
marketPrice=null, interestAccrualDate=null, sourceAppId=103,
initiatingUserId=6531, isDiscretionary=Y, traderBsssc=S, clearingBsssc=S,
executingBsssc=S, shortsellBanApproverUserId=null, intendedQuantity=-7600.0,
lastUpdated=2017-04-13 14:15:02.147, traderStrategyId=24686,
businessDate=2017-04-13 00:00:00.0, firstExecutionTime=2017-04-13
13:29:47.657, doNotBulkFlag=null, trimDb=trim_grn, trades=[Trade
[tradeKey=TradeKey [tradeId=263603637, tradeId64=789971421, traderId=6531],
orderId=12382604, ftbId=2023850, quantity=-985.0, fundAbbrev=TRCP,
subfundAbbrev=TRCP_EDAB, date=Thu Apr 13 00:00:00 EDT 2017,
commission=0.003, fillId=1, flyallocNumber=1, pnlTime=Tue Jan 01 00:00:00
EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan
01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0,
splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey [tradeId=888175445,
tradeId64=182624390, traderId=6531], orderId=12382604, ftbId=2022525,
quantity=-4141.0, fundAbbrev=BVI, subfundAbbrev=BVI_EDAB, date=Thu Apr 13
00:00:00 EDT 2017, commission=0.003, fillId=1, flyallocNumber=2, pnlTime=Tue
Jan 01 00:00:00 EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I,
psSettle=Tue Jan 01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036,
splitTradeId=0, splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey
[tradeId=938093803, tradeId64=953318988, traderId=6531], orderId=12382604,
ftbId=2022524, quantity=-1674.0, fundAbbrev=TGF, subfundAbbrev=TGF_EDAB,
date=Thu Apr 13 00:00:00 EDT 2017, commission=0.003, fillId=1,
flyallocNumber=3, pnlTime=Tue Jan 01 00:00:00 EST 2036, price=18.0021324,
psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan 01 00:00:00 EST 2036,
psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0, splitTradeId64=0,
trimDb=]]], hasValBytes=true]], entryProcessorsCol=null, ttl=-1,
conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
filters=null, filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObjectImpl [val=OrderKey
[traderId=6531, orderId=12382604], hasValBytes=true], val=CacheObjectImpl
[val=TradeOrder [orderKey=OrderKey [traderId=6531, orderId=12382604],
insIid=679675, clearBrokerId=12718, strategy=AFI, time=2017-04-13
13:30:00.0, settlement=2017-04-19 00:00:00.0, quantity=-6800.0, insType=STK,
version=1, userId=3081, created=2017-04-13 13:29:47.831, status=open,
allocFund=STD, isAlloc=Y, clearAgent=MSCOEPB, execBroker=DBKSE, initiate=L,
notes=ClOrdId[20170413-Y47D580RHH99], allocRule=H2L, comType=T, comTurn=N,
comImplied=N, trdCur=USD, trdFreeze=N, kindFlag=, lastRepo=, exCpn=,
generatedTime=Thu Apr 13 14:15:02 EDT 2017, batchMatchFlag=N,
commission=0.003, trdRate=1.0, gross=, delivInstruct=null, startflys=3,
parentId=null, linkId=null, repo=N, repoRate=null, repoCalendar=null,
repoStartDate=null, repoEndDate=null, xiid=null, quantityCurr=null,
masterOrderId=null, unfilledQty=800.0, avgFillPrice=18.0021324, psRuleId=6,
origDate=2017-04-13 00:00:00.0, postingId=2, executingUserId=5647,
repoCloseDate=1900-01-01 00:00:00.0, repoPrice=0.0, directFxFlag=N, tax=0.0,
fixStatusId=58, txnTypeId=0, yield=null, valueDate=null,
interestOnlyRepoFlag=null, orderGroupId=0, fundingDate=2017-04-19
00:00:00.0, execBrokerId=12038, branchBrokerId=7511, fillOrigUserId=3081,
initialMargin=null, cmmsnChgUserId=0, cmmsnChgReasonId=0, fixingSourceId=0,
orderDesignationId=0, riskRewardId=0, placementTime=2017-04-13 13:29:47.657,
initialInvestment=0.0, equityFxBrokerTypeId=0, execBranchBrokerId=0,
createUserId=3081, targetAllocFlag=N, pvDate=null, pvFactor=null, pvId=0,
executionTypeId=0, borrowScheduleId=0, borrowScheduleTypeId=0,
marketPrice=null, interestAccrualDate=null, sourceAppId=103,
initiatingUserId=6531, isDiscretionary=Y, traderBsssc=S, clearingBsssc=S,
executingBsssc=S, shortsellBanApproverUserId=null, intendedQuantity=-7600.0,
lastUpdated=2017-04-13 14:15:02.147, traderStrategyId=24686,
businessDate=2017-04-13 00:00:00.0, firstExecutionTime=2017-04-13
13:29:47.657, doNotBulkFlag=null, trimDb=trim_grn, trades=[Trade
[tradeKey=TradeKey [tradeId=263603637, tradeId64=789971421, traderId=6531],
orderId=12382604, ftbId=2023850, quantity=-985.0, fundAbbrev=TRCP,
subfundAbbrev=TRCP_EDAB, date=Thu Apr 13 00:00:00 EDT 2017,
commission=0.003, fillId=1, flyallocNumber=1, pnlTime=Tue Jan 01 00:00:00
EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan
01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0,
splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey [tradeId=888175445,
tradeId64=182624390, traderId=6531], orderId=12382604, ftbId=2022525,
quantity=-4141.0, fundAbbrev=BVI, subfundAbbrev=BVI_EDAB, date=Thu Apr 13
00:00:00 EDT 2017, commission=0.003, fillId=1, flyallocNumber=2, pnlTime=Tue
Jan 01 00:00:00 EST 2036, price=18.0021324, psId=0, psId64=0, psLiquid=I,
psSettle=Tue Jan 01 00:00:00 EST 2036, psTime=Tue Jan 01 00:00:00 EST 2036,
splitTradeId=0, splitTradeId64=0, trimDb=], Trade [tradeKey=TradeKey
[tradeId=938093803, tradeId64=953318988, traderId=6531], orderId=12382604,
ftbId=2022524, quantity=-1674.0, fundAbbrev=TGF, subfundAbbrev=TGF_EDAB,
date=Thu Apr 13 00:00:00 EDT 2017, commission=0.003, fillId=1,
flyallocNumber=3, pnlTime=Tue Jan 01 00:00:00 EST 2036, price=18.0021324,
psId=0, psId64=0, psLiquid=I, psSettle=Tue Jan 01 00:00:00 EST 2036,
psTime=Tue Jan 01 00:00:00 EST 2036, splitTradeId=0, splitTradeId64=0,
trimDb=]]], hasValBytes=true], startVer=1492027414017, ver=GridCacheVersion
[topVer=103141923, time=1492107302158, order=1492027411508, nodeOrder=14],
hash=383868216, extras=null, flags=0]]], prepared=0, locked=true,
nodeId=89f19ec9-462e-4d99-a267-1189457c124b, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=103141926,
time=1492107344341, order=1492027414014, nodeOrder=13]], IgniteTxKey
[key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true], cacheId=812449097]=IgniteTxEntry
[key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true], cacheId=812449097, partId=-1,
txKey=IgniteTxKey [key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true], cacheId=812449097], val=[op=READ,
val=null], prevVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
filters=null, filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObjectImpl [val=PositionId
[fundAbbrev=BVI, clearBrokerId=12718, insIid=679675, strategy=AFI,
traderId=6531, valueDate=19000101], hasValBytes=true], val=null,
startVer=1492027414018, ver=GridCacheVersion [topVer=103141926,
time=1492107344360, order=1492027414018, nodeOrder=13], hash=-1282641846,
extras=null, flags=0]]], prepared=0, locked=false,
nodeId=63e3e087-15c5-45e8-9420-9ee657a5a871, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=103141926,
time=1492107344341, order=1492027414014, nodeOrder=13]]}],
super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=103141926,
time=1492107344341, order=1492027414014, nodeOrder=13], writeVer=null,
implicit=false, loc=true, threadId=83, startTime=1492107344336,
nodeId=bdd5e4ed-aac9-4769-b241-a2e6f21f7e18, startVer=GridCacheVersion
[topVer=103141926, time=1492107344341, order=1492027414014, nodeOrder=13],
endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC,
timeout=10000, sysInvalidate=false, sys=false, plc=2, commitVer=null,
finalizing=NONE, preparing=false, invalidParts=null, state=MARKED_ROLLBACK,
timedOut=false, topVer=AffinityTopologyVersion [topVer=1979,
minorTopVer=34], duration=10006ms, onePhaseCommit=false], size=2]]]]




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p12009.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Andrey, in the exception above, we see this

 pendingLocks=[KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true]], super=GridCompoundIdentityFuture
[super=GridCompoundFuture [rdc=Bool reducer: true, initFlag=0, lsnrCalls=0,
done=true, cancelled=false, err=null, futs=[]]]]]]]
class
org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException:
Failed to acquire lock within provided timeout for transaction
[timeout=10000,
tx=org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter$1@7597f862]

What does this mean? clearly its a pendingLock and we know that be behavior
because every subsequent update to this key fails with a txn timeout. But up
until this point, all updates were fine.

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11968.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
We got into the key lock situation again today.
Does the below error help in identifying anything?

Apr 13, 2017 2:16:09 PM org.apache.ignite.logger.java.JavaLogger error
SEVERE: <Position> Failed to acquire lock for request: GridNearLockRequest
[topVer=AffinityTopologyVersion [topVer=1980, minorTopVer=34],
miniId=0628e8f4b51-2d0a76c5-854f-4909-90ed-b12db61a1632, implicitTx=false,
implicitSingleTx=false, onePhaseCommit=false, dhtVers=[null],
subjId=bdd5e4ed-aac9-4769-b241-a2e6f21f7e18, taskNameHash=0,
hasTransforms=false, syncCommit=true, accessTtl=-1, retVal=true,
firstClientReq=false, filter=null, super=GridDistributedLockRequest
[nodeId=bdd5e4ed-aac9-4769-b241-a2e6f21f7e18, nearXidVer=GridCacheVersion
[topVer=103141927, time=1492107359249, order=1492027414045, nodeOrder=13],
threadId=83, futId=e528e8f4b51-2d0a76c5-854f-4909-90ed-b12db61a1632,
timeout=10000, isInTx=true, isInvalidate=false, isRead=true,
isolation=REPEATABLE_READ, retVals=[true], txSize=0, flags=0, keysCnt=1,
super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=103141927,
time=1492107359249, order=1492027414045, nodeOrder=13], committedVers=null,
rolledbackVers=null, cnt=0, super=GridCacheMessage [msgId=4632574,
depInfo=null, err=null, skipPrepare=false, cacheId=812449097,
cacheId=812449097]]]]
class
org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException:
Failed to acquire lock within provided timeout for transaction
[timeout=10000,
tx=org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter$1@7597f862]
        at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3924)
        at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3874)
        at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:91)
        at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:297)
        at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:290)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:251)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:381)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:347)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:752)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$600(GridDhtLockFuture.java:79)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1116)
        at
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:159)
        at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:745)

Apr 13, 2017 2:16:09 PM org.apache.ignite.logger.java.JavaLogger error
SEVERE: <Position> Future execution resulted in error: GridDhtEmbeddedFuture
[super=GridEmbeddedFuture [embedded=GridEmbeddedFuture
[embedded=GridDhtLockFuture
[nearNodeId=bdd5e4ed-aac9-4769-b241-a2e6f21f7e18,
nearLockVer=GridCacheVersion [topVer=103141927, time=1492107359249,
order=1492027414045, nodeOrder=13], topVer=AffinityTopologyVersion
[topVer=1980, minorTopVer=34], threadId=83,
futId=ee848215b51-be4c49fc-3ecf-43d8-a667-8bbe34fa045d,
lockVer=GridCacheVersion [topVer=103141927, time=1492107359742,
order=1492027414057, nodeOrder=17], read=true, err=null, timedOut=true,
timeout=10000, tx=GridDhtTxLocal
[nearNodeId=bdd5e4ed-aac9-4769-b241-a2e6f21f7e18,
nearFutId=e528e8f4b51-2d0a76c5-854f-4909-90ed-b12db61a1632,
nearMiniId=0628e8f4b51-2d0a76c5-854f-4909-90ed-b12db61a1632,
nearFinFutId=d088e8f4b51-2d0a76c5-854f-4909-90ed-b12db61a1632,
nearFinMiniId=e088e8f4b51-2d0a76c5-854f-4909-90ed-b12db61a1632,
nearXidVer=GridCacheVersion [topVer=103141927, time=1492107359249,
order=1492027414045, nodeOrder=13], super=GridDhtTxLocalAdapter
[nearOnOriginatingNode=false, nearNodes=[], dhtNodes=[], explicitLock=false,
super=IgniteTxLocalAdapter [completedBase=null, sndTransformedVals=false,
depEnabled=false, txState=IgniteTxStateImpl [activeCacheIds=GridLongList
[idx=1, arr=[812449097]], txMap={IgniteTxKey [key=KeyCacheObjectImpl
[val=PositionId [fundAbbrev=BVI, clearBrokerId=12718, insIid=679675,
strategy=AFI, traderId=6531, valueDate=19000101], hasValBytes=true],
cacheId=812449097]=IgniteTxEntry [key=KeyCacheObjectImpl [val=PositionId
[fundAbbrev=BVI, clearBrokerId=12718, insIid=679675, strategy=AFI,
traderId=6531, valueDate=19000101], hasValBytes=true], cacheId=812449097,
partId=-1, txKey=IgniteTxKey [key=KeyCacheObjectImpl [val=PositionId
[fundAbbrev=BVI, clearBrokerId=12718, insIid=679675, strategy=AFI,
traderId=6531, valueDate=19000101], hasValBytes=true], cacheId=812449097],
val=[op=NOOP, val=null], prevVal=[op=NOOP, val=null],
entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null,
explicitVer=null, dhtVer=null, filters=[], filtersPassed=false,
filtersSet=false, entry=GridDhtColocatedCacheEntry [super=GridDhtCacheEntry
[rdrs=[], locPart=GridDhtLocalPartition [id=950,
map=o.a.i.i.processors.cache.GridCacheConcurrentMapImpl@40e69256,
rmvQueue=GridCircularBuffer [sizeMask=127, idxGen=0], cntr=956,
state=OWNING, reservations=0, empty=false, createTime=04/08/2017 09:59:13],
super=GridDistributedCacheEntry [super=GridCacheMapEntry
[key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true], val=CacheObjectImpl [val=Position
[positionId=PositionId [fundAbbrev=BVI, clearBrokerId=12718, insIid=679675,
strategy=AFI, traderId=6531, valueDate=19000101], quantity=55846.0],
hasValBytes=true], startVer=1491660018438, ver=GridCacheVersion
[topVer=103141923, time=1492107302157, order=1492027411428, nodeOrder=17],
hash=-1282641846, extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc
[locs=[GridCacheMvccCandidate [nodeId=63e3e087-15c5-45e8-9420-9ee657a5a871,
ver=GridCacheVersion [topVer=103141926, time=1492107359128,
order=1492027414021, nodeOrder=17], timeout=9980, ts=1492107359121,
threadId=83, id=2617730, topVer=AffinityTopologyVersion [topVer=1979,
minorTopVer=34], reentry=null,
otherNodeId=bdd5e4ed-aac9-4769-b241-a2e6f21f7e18, otherVer=GridCacheVersion
[topVer=103141926, time=1492107344341, order=1492027414014, nodeOrder=13],
mappedDhtNodes=null, mappedNearNodes=null, ownerVer=null, serOrder=null,
key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI, clearBrokerId=12718,
insIid=679675, strategy=AFI, traderId=6531, valueDate=19000101],
hasValBytes=true],
masks=local=1|owner=1|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0,
prevVer=null, nextVer=null], GridCacheMvccCandidate
[nodeId=63e3e087-15c5-45e8-9420-9ee657a5a871, ver=GridCacheVersion
[topVer=103141927, time=1492107369295, order=1492027414501, nodeOrder=17],
timeout=10000, ts=1492107369289, threadId=83, id=2617925,
topVer=AffinityTopologyVersion [topVer=1980, minorTopVer=34], reentry=null,
otherNodeId=bdd5e4ed-aac9-4769-b241-a2e6f21f7e18, otherVer=GridCacheVersion
[topVer=103141927, time=1492107369294, order=1492027414500, nodeOrder=13],
mappedDhtNodes=null, mappedNearNodes=null, ownerVer=GridCacheVersion
[topVer=103141926, time=1492107359128, order=1492027414021, nodeOrder=17],
serOrder=null, key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true],
masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0,
prevVer=null, nextVer=null]], rmts=null]], flags=0]]]], prepared=0,
locked=false, nodeId=null, locMapped=false, expiryPlc=null,
transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null,
xidVer=GridCacheVersion [topVer=103141927, time=1492107359742,
order=1492027414057, nodeOrder=17]]}], super=IgniteTxAdapter
[xidVer=GridCacheVersion [topVer=103141927, time=1492107359742,
order=1492027414057, nodeOrder=17], writeVer=null, implicit=false, loc=true,
threadId=83, startTime=1492107359735,
nodeId=63e3e087-15c5-45e8-9420-9ee657a5a871, startVer=GridCacheVersion
[topVer=103141927, time=1492107359742, order=1492027414057, nodeOrder=17],
endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC,
timeout=10000, sysInvalidate=false, sys=false, plc=2, commitVer=null,
finalizing=NONE, preparing=false, invalidParts=null, state=ROLLED_BACK,
timedOut=false, topVer=AffinityTopologyVersion [topVer=1980,
minorTopVer=34], duration=10007ms, onePhaseCommit=false], size=1]]],
mapped=false, trackable=true, accessTtl=-1, needReturnVal=true,
skipStore=false, keepBinary=false, innerFuts=[],
pendingLocks=[KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=12718, insIid=679675, strategy=AFI, traderId=6531,
valueDate=19000101], hasValBytes=true]], super=GridCompoundIdentityFuture
[super=GridCompoundFuture [rdc=Bool reducer: true, initFlag=0, lsnrCalls=0,
done=true, cancelled=false, err=null, futs=[]]]]]]]
class
org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException:
Failed to acquire lock within provided timeout for transaction
[timeout=10000,
tx=org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter$1@7597f862]
        at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3924)
        at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3874)
        at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:91)
        at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:297)
        at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:290)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:251)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:381)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:347)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:752)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$600(GridDhtLockFuture.java:79)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1116)
        at
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:159)
        at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:745)





--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11967.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
As far as I know no one has such a problems.

On Wed, Apr 12, 2017 at 4:15 PM, bintisepaha <bi...@tudor.com> wrote:
> Thanks for trying, is the node filter an issue someone else is seeing?
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11905.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Thanks for trying, is the node filter an issue someone else is seeing?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11905.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi,

unfortunatelly I didn't have possibility to reproduce it. But, anyway,
I'm going to reproduce case with node filters. I still don't have any
idea about your case. Hopefully I'll get some ideas during reproducing
and fixing problem that can happen when cache exists only on some
subset of nodes.

On Thu, Apr 6, 2017 at 6:41 PM, bintisepaha <bi...@tudor.com> wrote:
> Andrey,
>
> We start the caches only at server start up time. 2 of our caches are
> replicated, all other caches are partitioned with 1 backup. Do you think
> replicated caches might be causing this issue, when clients leave and join
> the cluster?
>
> All server nodes start with the same cache config. All client nodes start
> with the same config but have no knowledge of cache creation.
>
> We don't use any node filters. For IgniteCompute we start the Runnable on
> server nodes, but for cache creation there are no filters.
>
> Any luck reproducing? Has anyone else had an issue where a txn seems like it
> is finished but the key lock is not released?
>
> Thanks,
> Binti
>
>
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11784.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Andrey, 

We start the caches only at server start up time. 2 of our caches are
replicated, all other caches are partitioned with 1 backup. Do you think
replicated caches might be causing this issue, when clients leave and join
the cluster?

All server nodes start with the same cache config. All client nodes start
with the same config but have no knowledge of cache creation. 

We don't use any node filters. For IgniteCompute we start the Runnable on
server nodes, but for cache creation there are no filters.

Any luck reproducing? Has anyone else had an issue where a txn seems like it
is finished but the key lock is not released?

Thanks,
Binti






--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11784.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
It's obviously bug. NPE occures if node doesn't have cache but got
respone message during deadlock detection about thiscache on toher
node. It's possible for example when you have node filter and caches
start on some subset nodes of cluster. Is it your case?

I'll try to reproduce this problem, create JIRA ticket and answer here.

On Mon, Apr 3, 2017 at 8:23 PM, bintisepaha <bi...@tudor.com> wrote:
> Sorry for the late response. We do not close/destroy caches dynamically.
> could you please explain this NPE?
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11676.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Sorry for the late response. We do not close/destroy caches dynamically.
could you please explain this NPE?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11676.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
It's very strange NPE. It looks like cache is closed or destoryed. Is
it possible that you start and close/destroy caches dynamically?

On Fri, Mar 24, 2017 at 5:43 PM, bintisepaha <bi...@tudor.com> wrote:
> Hey Andrey, Thanks a lot for getting back.
>
> These errors were a result of a bad client connected to grid.
>
> We have been running clients that leave and join the cluster constantly in
> order to see if we can reproduce this issue. Last night we saw this issue
> again. Here is one of the errors that a sys thread has on a client node that
> initiates a transaction. The client node was not restarted or disconnected.
> It kept working fine.
> We do not restart these clients but there are some otehr clietns that leave
> and join the cluster.
>
> Do you think this is helpful in locating the cause?
>
> Exception in thread "sys-#41%DataGridServer-Production%"
> java.lang.NullPointerException
>         at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxKey.finishUnmarshal(IgniteTxKey.java:92)
>         at
> org.apache.ignite.internal.processors.cache.transactions.TxLocksResponse.finishUnmarshal(TxLocksResponse.java:190)
>         at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$DeadlockDetectionListener.unmarshall(IgniteTxManager.java:2427)
>         at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$DeadlockDetectionListener.onMessage(IgniteTxManager.java:2317)
>         at
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
>         at
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
>         at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
>         at
> org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> [20:02:18] Topology snapshot [ver=7551, servers=16, clients=53, CPUs=217,
> heap=740.0GB]
> [20:02:22] Topology snapshot [ver=7552, servers=16, clients=52, CPUs=213,
> heap=740.0GB]
> [20:02:28] Topology snapshot [ver=7553, servers=16, clients=53, CPUs=217,
> heap=740.0GB]
> [20:02:36] Topology snapshot [ver=7554, servers=16, clients=54, CPUs=217,
> heap=740.0GB]
> [20:02:40] Topology snapshot [ver=7555, servers=16, clients=53, CPUs=217,
> heap=740.0GB]
> [20:02:41] Topology snapshot [ver=7556, servers=16, clients=54, CPUs=217,
> heap=740.0GB]
> [20:02:48] Topology snapshot [ver=7557, servers=16, clients=53, CPUs=217,
> heap=740.0GB]
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11433.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Hey Andrey, Thanks a lot for getting back.

These errors were a result of a bad client connected to grid.

We have been running clients that leave and join the cluster constantly in
order to see if we can reproduce this issue. Last night we saw this issue
again. Here is one of the errors that a sys thread has on a client node that
initiates a transaction. The client node was not restarted or disconnected.
It kept working fine.
We do not restart these clients but there are some otehr clietns that leave
and join the cluster.

Do you think this is helpful in locating the cause?

Exception in thread "sys-#41%DataGridServer-Production%"
java.lang.NullPointerException
        at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxKey.finishUnmarshal(IgniteTxKey.java:92)
        at
org.apache.ignite.internal.processors.cache.transactions.TxLocksResponse.finishUnmarshal(TxLocksResponse.java:190)
        at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$DeadlockDetectionListener.unmarshall(IgniteTxManager.java:2427)
        at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$DeadlockDetectionListener.onMessage(IgniteTxManager.java:2317)
        at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
        at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
        at
org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
        at
org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
[20:02:18] Topology snapshot [ver=7551, servers=16, clients=53, CPUs=217,
heap=740.0GB]
[20:02:22] Topology snapshot [ver=7552, servers=16, clients=52, CPUs=213,
heap=740.0GB]
[20:02:28] Topology snapshot [ver=7553, servers=16, clients=53, CPUs=217,
heap=740.0GB]
[20:02:36] Topology snapshot [ver=7554, servers=16, clients=54, CPUs=217,
heap=740.0GB]
[20:02:40] Topology snapshot [ver=7555, servers=16, clients=53, CPUs=217,
heap=740.0GB]
[20:02:41] Topology snapshot [ver=7556, servers=16, clients=54, CPUs=217,
heap=740.0GB]
[20:02:48] Topology snapshot [ver=7557, servers=16, clients=53, CPUs=217,
heap=740.0GB]




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11433.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi,

From your errors list I see many connection errors. Why does it happen?

> how ignite treats cluster topology when transaction are running on server nodes

When Ignite executes some operation in transaction it maps all
involved keys to affinity nodes and then transaction will be
propagated to this nodes. If transaction isn't finished and some of
involved nodes left topology then transaction will be failed and
rolled back.

Client nodes doesn't affect transactions when join or leave cluster
except of cases when client node is transaction initiator or have
local cache that participates in transaction.


On Tue, Mar 21, 2017 at 11:40 PM, bintisepaha <bi...@tudor.com> wrote:
> Sorry for keep following up on this, but this is becoming a major issue for
> us and we need to understand how ignite treats cluster topology when
> transaction are running on server nodes. How do clients joining and leaving
> the cluster affects txns?
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11346.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Sorry for keep following up on this, but this is becoming a major issue for
us and we need to understand how ignite treats cluster topology when
transaction are running on server nodes. How do clients joining and leaving
the cluster affects txns?

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11346.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Andrey, would you be able to look at the attached errors and advice please?

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11322.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Evans, we don't have a grid hanging issue. Also we do have server nodes
leaving the cluster. the toplogy changes are for clients.

I am attaching some errors we saw when a key got locked.
Could someone please look at it and help us to find out why a client
topology change is causing key locks?

Errors.txt
<http://apache-ignite-users.70518.x6.nabble.com/file/n11254/Errors.txt>  



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11254.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Evans Ye <ev...@apache.org>.
I'm not quite sure this is relevant to your issue.
I've experienced the hanging issue on ignite 1.8.0 when topology changed.
https://issues.apache.org/jira/browse/IGNITE-4450 might get to fix the
problem.
FYR.


2017-03-16 1:03 GMT+08:00 bintisepaha <bi...@tudor.com>:

> Andrey, do you think topology changes with client nodes (not server nodes)
> leaving and joining the cluster may have any impact on transaction?
>
> twice out of the 4 times we saw topology changes while the last successful
> update was done to the key that was eventually locked.
>
> Yesterday we saw this same issue on another cache. We are due for 1.8
> upgrade soon, but given that no one else has seen this issue before, we are
> not sure if 1.8 would have fixed it either.
>
> Let us know what you think.
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: http://apache-ignite-users.
> 70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-
> subsequent-txns-failed-tp10536p11203.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Andrey, do you think topology changes with client nodes (not server nodes)
leaving and joining the cluster may have any impact on transaction?

twice out of the 4 times we saw topology changes while the last successful
update was done to the key that was eventually locked.

Yesterday we saw this same issue on another cache. We are due for 1.8
upgrade soon, but given that no one else has seen this issue before, we are
not sure if 1.8 would have fixed it either.

Let us know what you think.

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11203.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
This is helpful and we have seen it earlier in our code too. So you are right
that the reason for this issue we are seeing is not deadlock.

It feels like a strange bug where a pessimistic txn is committing but not
releasing the key. Let's see if 1.8 helps. We will upgrade to it in 2 weeks.

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11100.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Your logs should contain exception with stack trace that looks like
piece of log bellow. You can try find "TransactionDeadlockException"
or "Deadlock detected" patterns in log.

javax.cache.CacheException: class
org.apache.ignite.transactions.TransactionTimeoutException: Failed to
acquire lock within provided timeout for transaction [timeout=800,
tx=org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$10@647b61d9]
    at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1440)
    at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.cacheException(IgniteCacheProxy.java:2183)
    at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.putAll(IgniteCacheProxy.java:1430)
    at org.apache.ignite.internal.processors.cache.transactions.TxPessimisticDeadlockDetectionTest$1.run(TxPessimisticDeadlockDetectionTest.java:322)
    at org.apache.ignite.testframework.GridTestUtils$8.call(GridTestUtils.java:1092)
    at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86)
    Suppressed: class org.apache.ignite.IgniteException: Transaction
has been already completed.
        at org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:924)
        at org.apache.ignite.internal.processors.cache.transactions.TransactionProxyImpl.close(TransactionProxyImpl.java:279)
        at org.apache.ignite.internal.processors.cache.transactions.TxPessimisticDeadlockDetectionTest$1.run(TxPessimisticDeadlockDetectionTest.java:325)
        ... 2 more
    Caused by: class org.apache.ignite.IgniteCheckedException:
Transaction has been already completed.
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:786)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:728)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:687)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:157)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:155)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:827)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:369)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:293)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:95)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:238)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1082)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:710)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:102)
        at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:673)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: class
org.apache.ignite.transactions.TransactionTimeoutException: Failed to
acquire lock within provided timeout for transaction [timeout=800,
tx=org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$10@647b61d9]
    at org.apache.ignite.internal.util.IgniteUtils$13.apply(IgniteUtils.java:831)
    at org.apache.ignite.internal.util.IgniteUtils$13.apply(IgniteUtils.java:828)
    ... 6 more
Caused by: class org.apache.ignite.transactions.TransactionDeadlockException:
Deadlock detected:

K1: TX1 holds lock, TX2 waits lock.
K2: TX3 holds lock, TX1 waits lock.
K3: TX2 holds lock, TX3 waits lock.

Transactions:

TX1 [txId=GridCacheVersion [topVer=100472269, time=1488992349410,
order=1488992271778, nodeOrder=5],
nodeId=52c630e7-3b59-4d58-9291-dd8bd9e00004, threadId=637]
TX2 [txId=GridCacheVersion [topVer=100472269, time=1488992349410,
order=1488992271778, nodeOrder=6],
nodeId=652f18b9-8948-4d66-a1a8-f45346c00005, threadId=636]
TX3 [txId=GridCacheVersion [topVer=100472269, time=1488992349410,
order=1488992271806, nodeOrder=4],
nodeId=5810f06c-f9a5-40b2-91ec-e9ec97a00003, threadId=635]

Keys:

K1 [key=3, cache=cache]
K2 [key=2, cache=cache]
K3 [key=1, cache=cache]

    at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture$LockTimeoutObject$1.apply(GridDhtColocatedLockFuture.java:1355)
    at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture$LockTimeoutObject$1.apply(GridDhtColocatedLockFuture.java:1349)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:264)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:252)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:382)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:348)
    at org.apache.ignite.internal.processors.cache.transactions.TxDeadlockDetection$TxDeadlockFuture.onDone(TxDeadlockDetection.java:537)
    at org.apache.ignite.internal.processors.cache.transactions.TxDeadlockDetection$TxDeadlockFuture.onDone(TxDeadlockDetection.java:158)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:325)
    at org.apache.ignite.internal.processors.cache.transactions.TxDeadlockDetection$TxDeadlockFuture.detect(TxDeadlockDetection.java:313)
    at org.apache.ignite.internal.processors.cache.transactions.TxDeadlockDetection$TxDeadlockFuture.onResult(TxDeadlockDetection.java:513)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$DeadlockDetectionListener.onMessage(IgniteTxManager.java:2525)
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1082)
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:710)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:102)
    at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:673)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

On Wed, Mar 8, 2017 at 12:32 AM, bintisepaha <bi...@tudor.com> wrote:
> Could you please tell us what the logs would say if there was a deadlock?
> we rollback the txn on any exception explicitly.
>
> We are running 1.8 in UAT environment this week. It is not a simple upgrade
> to production, we need to let it burn in. But in UAT we never saw this issue
> with 1.7 either. It already happened in production, so far 3 times.
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11064.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Could you please tell us what the logs would say if there was a deadlock?
we rollback the txn on any exception explicitly.

We are running 1.8 in UAT environment this week. It is not a simple upgrade
to production, we need to let it burn in. But in UAT we never saw this issue
with 1.7 either. It already happened in production, so far 3 times.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11064.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi,

you could to know more about deadlock detection here [1]

Mentioned ticket about optiomistic transactions. Deadlock detection
for pessimistic transactions (your case) is implemented in 1.7
release.

I don't think that it is your case because in case of
TransactionDeadlockException transaction should be rolled back and all
locks should be released. But it is possible that there is bug in this
place.

By the way, do you have the same issue on 1.8?

[1] https://apacheignite.readme.io/docs/transactions#deadlock-detection

On Mon, Mar 6, 2017 at 5:16 PM, bintisepaha <bi...@tudor.com> wrote:
> Andrey, Could you please tell us a little bit more about the deadlock
> detection feature in 1.8?
>
> https://issues.apache.org/jira/browse/IGNITE-2969
>
> How would we able to know that there is a deadlock? Do you think that is the
> case for us now in 1.7 but we can't be sure because we have no such feature?
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11038.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Andrey, Could you please tell us a little bit more about the deadlock
detection feature in 1.8?

https://issues.apache.org/jira/browse/IGNITE-2969

How would we able to know that there is a deadlock? Do you think that is the
case for us now in 1.7 but we can't be sure because we have no such feature?

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p11038.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
> The long pause is between 2 different threads, so isn't that normal?

Usually it is not normal. So need some investigation in order to
understand why this pause happens.

> Also the 9990 ms and 10 ms used earlier for some previous step in the tn, is
this how ignite breaks out the time?

When Ignite starts transaction on initial node it pass timeout to
tranasction. After mapping keys on nodes of cluster Ignite starts
transactions on remote nodes and pass remaining transaction time to
it. So it is possible that it is your case.

> Is there any way we can force release a locked key without restarting the
whole cluster?

There is no such way.

On Mon, Feb 27, 2017 at 5:39 PM, bintisepaha <bi...@tudor.com> wrote:
> Andrey, thanks for getting back.
> The long pause is between 2 different threads, so isn't that normal?
>
> Also the 9990 ms and 10 ms used earlier for some previous step in the tn, is
> this how ignite breaks out the time? we have always seen the timeouts for
> the value in ms that we set in our code. Also the stack trace does not come
> from our code which it usually does on genuine timeouts that do not leave
> the key locked.
>
> We are trying 1.8 in UAT environment and will release to production soon.
> Unfortunately this issue does not happen in UAT and we have no way of
> reproducing it.
>
> Is there any way we can force release a locked key without restarting the
> whole cluster?
>
> Thanks,
> Binti
>
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10910.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Andrey, thanks for getting back.
The long pause is between 2 different threads, so isn't that normal?

Also the 9990 ms and 10 ms used earlier for some previous step in the tn, is
this how ignite breaks out the time? we have always seen the timeouts for
the value in ms that we set in our code. Also the stack trace does not come
from our code which it usually does on genuine timeouts that do not leave
the key locked.

We are trying 1.8 in UAT environment and will release to production soon.
Unfortunately this issue does not happen in UAT and we have no way of
reproducing it.

Is there any way we can force release a locked key without restarting the
whole cluster?

Thanks,
Binti





--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10910.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
I still don't see any cause of timeout in logs. But there is one
suspiciously long pause between two log records:

[INFO ] 2017-02-22 15:45:52.807 [pub-#1%DataGridServer-Production%]
OrderHolderSaveRunnable - Begin TXN for orderHolder save, saving 1
orders
[ERROR] 2017-02-22 15:46:01.406 [pub-#2%DataGridServer-Production%]
OrderHolderSaveRunnable - javax.cache.CacheException: class
org.apache.ignite.transactions.TransactionTimeoutException: Cache
transaction time
d out (was rolled back automatically)

It would be great to get thread dumps during this pause.

Did you try Apache Ignite 1.8 release?


On Thu, Feb 23, 2017 at 5:03 PM, bintisepaha <bi...@tudor.com> wrote:
> This is the actual error that looks like it is not coming from our code
>
> 109714 Feb 22, 2017 3:46:17 PM org.apache.ignite.logger.java.JavaLogger
> error
> 109715 SEVERE: <Position> Failed to acquire lock for request:
> GridNearLockRequest [topVer=AffinityTopologyVersion [topVer=3153,
> minorTopVer=32], miniId=acdced25a51-c5e64ee6-1079-4b90-bb7b-5ec14032a859,
> implicitTx=f       alse, implicitSingleTx=false, onePhaseCommit=false,
> dhtVers=[null], subjId=f6663b00-24fc-4515-91ac-20c3b47d90ec, taskNameHash=0,
> hasTransforms=false, syncCommit=true, accessTtl=-1, retVal=true,
> firstClientRe       q=false, filter=null, super=GridDistributedLockRequest
> [nodeId=f6663b00-24fc-4515-91ac-20c3b47d90ec, nearXidVer=GridCacheVersion
> [topVer=98913254, time=1487796367155, order=1487785731866, nodeOrder=7],
> threa       dId=57, futId=8cdced25a51-c5e64ee6-1079-4b90-bb7b-5ec14032a859,
> timeout=9990, isInTx=true, isInvalidate=false, isRead=true,
> isolation=REPEATABLE_READ, retVals=[true], txSize=0, flags=0, keysCnt=1,
> super=Grid       DistributedBaseMessage [ver=GridCacheVersion
> [topVer=98913254, time=1487796367155, order=1487785731866, nodeOrder=7],
> committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheMessage
> [msgId=2688455, de       pInfo=null, err=null, skipPrepare=false,
> cacheId=812449097, cacheId=812449097]]]]
> 109716 class
> org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException:
> Failed to acquire lock within provided timeout for transaction
> [timeout=9990, tx=org.apache.ignite.internal.processors.cache.dis
> tributed.dht.GridDhtTxLocalAdapter$1@7f2a8c8a]
> 109717         at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3924)
> 109718         at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3874)
> 109719         at
> org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:91)
> 109720         at
> org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:297)
> 109721         at
> org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:290)
> 109722         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
> 109723         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:251)
> 109724         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:381)
> 109725         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:347)
> 109726         at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:752)
> 109727         at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$600(GridDhtLockFuture.java:79)
> 109728         at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1116)
> 109729         at
> org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:159)
> 109730         at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> 109731         at java.lang.Thread.run(Thread.java:745)
> 109732
> 109733 Feb 22, 2017 3:46:17 PM org.apache.ignite.logger.java.JavaLogger
> error
> 109716 class
> org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException:
> Failed to acquire lock within provided timeout for transaction
> [timeout=9990, tx=org.apache.ignite.internal.processors.cache.dis
> tributed.dht.GridDhtTxLocalAdapter$1@7f2a8c8a]
> 109717         at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3924)
> 109718         at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3874)
> 109719         at
> org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:91)
> 109720         at
> org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:297)
> 109721         at
> org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:290)
> 109722         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
> 109723         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:251)
> 109724         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:381)
> 109725         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:347)
> 109726         at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:752)
> 109727         at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$600(GridDhtLockFuture.java:79)
> 109728         at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1116)
> 109729         at
> org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:159)
> 109730         at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> 109731         at java.lang.Thread.run(Thread.java:745)
> 109732
> 109733 Feb 22, 2017 3:46:17 PM org.apache.ignite.logger.java.JavaLogger
> error
> 109734 SEVERE: <Position> Future execution resulted in error:
> GridDhtEmbeddedFuture [super=GridEmbeddedFuture [embedded=GridEmbeddedFuture
> [embedded=GridDhtLockFuture [nearNodeId=f6663b00-24fc-4515-91ac-20c3b47d90e
> c, nearLockVer=GridCacheVersion [topVer=98913254, time=1487796367155,
> order=1487785731866, nodeOrder=7], topVer=AffinityTopologyVersion
> [topVer=3153, minorTopVer=32], threadId=57, futId=7e221e25a51-8c03dfe0-
> 0f6e-4200-aed7-2233c421e0a3, lockVer=GridCacheVersion [topVer=98913254,
> time=1487796367159, order=1487785731867, nodeOrder=14], read=true, err=null,
> timedOut=true, timeout=9990, tx=GridDhtTxLocal [nearNodeId
> =f6663b00-24fc-4515-91ac-20c3b47d90ec,
> nearFutId=8cdced25a51-c5e64ee6-1079-4b90-bb7b-5ec14032a859,
> nearMiniId=acdced25a51-c5e64ee6-1079-4b90-bb7b-5ec14032a859,
> nearFinFutId=null, nearFinMiniId=null, nearXidV       er=GridCacheVersion
> [topVer=98913254, time=1487796367155, order=1487785731866, nodeOrder=7],
> super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false, nearNodes=[],
> dhtNodes=[], explicitLock=false, super=Ign       iteTxLocalAdapter
> [completedBase=null, sndTransformedVals=false, depEnabled=false,
> txState=IgniteTxStateImpl [activeCacheIds=GridLongList [idx=1,
> arr=[812449097]], txMap={IgniteTxKey [key=KeyCacheObjectImpl
> [val=PositionId [fundAbbrev=BVI, clearBrokerId=10032, insIid=681871,
> strategy=, traderId=5142, valueDate=19000101], hasValBytes=true],
> cacheId=812449097]=IgniteTxEntry [key=KeyCacheObjectImpl [val=PositionId
> [fundAbbrev=BVI, clearBrokerId=10032, insIid=681871, strategy=,
> traderId=5142, valueDate=19000101], hasValBytes=true], cacheId=812449097,
> partId=-1, txKey=IgniteTxKey [key=KeyCacheObjectImpl [val=PositionId
> [fundAbbrev=BVI, clearBrokerId=10032, insIid=681871, strategy=,
> traderId=5142, valueDate=19000101], hasValBytes=true], cacheId=812449097],
> val=[op=NOOP, val=null], prevVal=[op=NOOP, val=null], entryProcesso
> rsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null,
> explicitVer=null, dhtVer=null, filters=[], filtersPassed=false,
> filtersSet=false, entry=GridDhtColocatedCacheEntry [super=GridDhtCacheEntry
> [rdrs=       [], locPart=GridDhtLocalPartition [id=961,
> map=o.a.i.i.processors.cache.GridCacheConcurrentMapImpl@4cd4bfee,
> rmvQueue=GridCircularBuffer [sizeMask=127, idxGen=0], cntr=1181,
> state=OWNING, reservations=0, emp       ty=false, createTime=02/18/2017
> 10:01:50], super=GridDistributedCacheEntry [super=GridCacheMapEntry
> [key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
> clearBrokerId=10032, insIid=681871, strategy=, tra       derId=5142,
> valueDate=19000101], hasValBytes=true], val=CacheObjectImpl [val=Position
> [positionId=PositionId [fundAbbrev=BVI, clearBrokerId=10032, insIid=681871,
> strategy=, traderId=5142, valueDate=19000101]       , quantity=104864.0],
> hasValBytes=true], startVer=1487430990114, ver=GridCacheVersion
> [topVer=98913252, time=1487796336445, order=1487785730286, nodeOrder=14],
> hash=534941633, extras=GridCacheMvccEntryExtras        [mvcc=GridCacheMvcc
> [locs=[GridCacheMvccCandidate [nodeId=985b44be-ea4a-4d0b-a2a5-ab2aa441ac62,
> ver=GridCacheVersion [topVer=98913253, time=1487796361851,
> order=1487785731568, nodeOrder=14], timeout=9990, t       s=1487796361850,
> threadId=76, id=1817320, topVer=AffinityTopologyVersion [topVer=3152,
> minorTopVer=41], reentry=null,
> otherNodeId=14cb9e24-0902-48b7-a016-f572db04838f, otherVer=GridCacheVersion
> [topVer=98913       253, time=1487796349651, order=1487785731495,
> nodeOrder=13], mappedDhtNodes=null, mappedNearNodes=null, ownerVer=null,
> serOrder=null, key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
> clearBrokerId=100       32, insIid=681871, strategy=, traderId=5142,
> valueDate=19000101], hasValBytes=true],
> masks=local=1|owner=1|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0,
> prevVer=null, ne       xtVer=null], GridCacheMvccCandidate
> [nodeId=985b44be-ea4a-4d0b-a2a5-ab2aa441ac62, ver=GridCacheVersion
> [topVer=98913254, time=1487796367159, order=1487785731867, nodeOrder=14],
> timeout=9990, ts=1487796367150       , threadId=57, id=1817427,
> topVer=AffinityTopologyVersion [topVer=3153, minorTopVer=32], reentry=null,
> otherNodeId=f6663b00-24fc-4515-91ac-20c3b47d90ec, otherVer=GridCacheVersion
> [topVer=98913254, time=14877       96367155, order=1487785731866,
> nodeOrder=7], mappedDhtNodes=null, mappedNearNodes=null,
> ownerVer=GridCacheVersion [topVer=98913253, time=1487796361851,
> order=1487785731568, nodeOrder=14], serOrder=null, key=
> KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI, clearBrokerId=10032,
> insIid=681871, strategy=, traderId=5142, valueDate=19000101],
> hasValBytes=true], masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|s
> ingle_implicit=0|dht_local=1|near_local=0|removed=0, prevVer=null,
> nextVer=null]], rmts=null]], flags=0]]]], prepared=0, locked=false,
> nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, f
> lags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion
> [topVer=98913254, time=1487796367159, order=1487785731867, nodeOrder=14]]}],
> super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=98913254       ,
> time=1487796367159, order=1487785731867, nodeOrder=14], writeVer=null,
> implicit=false, loc=true, threadId=57, startTime=1487796367150,
> nodeId=985b44be-ea4a-4d0b-a2a5-ab2aa441ac62, startVer=GridCacheVersion
> [topVer=98913254, time=1487796367159, order=1487785731867, nodeOrder=14],
> endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC,
> timeout=9990, sysInvalidate=false, sys=false, plc=2, commitVer=null       ,
> finalizing=NONE, preparing=false, invalidParts=null, state=MARKED_ROLLBACK,
> timedOut=false, topVer=AffinityTopologyVersion [topVer=3153,
> minorTopVer=32], duration=10008ms, onePhaseCommit=false], size=1]]],
> mapped=false, trackable=true, accessTtl=-1, needReturnVal=true,
> skipStore=false, keepBinary=false, innerFuts=[],
> pendingLocks=[KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
> clearBrokerId=10032, insIid       =681871, strategy=, traderId=5142,
> valueDate=19000101], hasValBytes=true]], super=GridCompoundIdentityFuture
> [super=GridCompoundFuture [rdc=Bool reducer: true, initFlag=0, lsnrCalls=0,
> done=true, cancelled=f       alse, err=null, futs=[]]]]]]]
> 109735 class
> org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException:
> Failed to acquire lock within provided timeout for transaction
> [timeout=9990, tx=org.apache.ignite.internal.processors.cache.dis
> tributed.dht.GridDhtTxLocalAdapter$1@7f2a8c8a]
> 109736         at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3924)
> 109737         at
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3874)
> 109738         at
> org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:91)
> 109739         at
> org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:297)
> 109740         at
> org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:290)
> 109741         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
> 109742         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:251)
> 109743         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:381)
> 109744         at
> org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:347)
> 109745         at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:752)
> 109746         at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$600(GridDhtLockFuture.java:79)
> 109747         at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1116)
> 109748         at
> org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:159)
> 109749         at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> 109750         at java.lang.Thread.run(Thread.java:745)
> 109751
> 109752 Feb 22, 2017 3:46:17 PM org.apache.ignite.logger.java.JavaLogger
> error
> 109753 SEVERE: Failed to initialize job
> [jobId=73652976a51-2c4ae429-3284-4588-a96e-7e487f6653a1,
> ses=GridJobSessionImpl [ses=GridTaskSessionImpl
> [taskName=com.tudor.datagridI.server.tradegen.OrderHolderSaveRunn
> able, dep=LocalDeployment [super=GridDeployment [ts=1487430097277,
> depMode=SHARED, clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2,
> clsLdrId=1051eb15a51-985b44be-ea4a-4d0b-a2a5-ab2aa441ac62, userVer=0,
> loc=true, sampleClsName=java.lang.String, pendingUndeploy=false,
> undeployed=false, usage=0]],
> taskClsName=com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable,
> sesId=63652976a51-2c4ae429-3284-4588       -a96e-7e487f6653a1,
> startTime=1487796377472, endTime=9223372036854775807,
> taskNodeId=2c4ae429-3284-4588-a96e-7e487f6653a1,
> clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2, closed=false, cpSpi=null,
> fail       Spi=null, loadSpi=null, usage=1, fullSup=false,
> subjId=2c4ae429-3284-4588-a96e-7e487f6653a1, mapFut=IgniteFuture
> [orig=GridFutureAdapter [resFlag=0, res=null, startTime=1487796377491,
> endTime=0, ignoreIn       terrupts=false, state=INIT]]],
> jobId=73652976a51-2c4ae429-3284-4588-a96e-7e487f6653a1]]
> 109754 class org.apache.ignite.IgniteCheckedException: Failed to deserialize
> object with given class loader: sun.misc.Launcher$AppClassLoader@18b4aac2
> 109755         at
> org.apache.ignite.marshaller.optimized.OptimizedMarshaller.unmarshal(OptimizedMarshaller.java:250)
> 109756         at
> org.apache.ignite.internal.processors.job.GridJobWorker.initialize(GridJobWorker.java:409)
> 109757         at
> org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1089)
> 109758         at
> org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1766)
> 109759         at
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
> 109760         at
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
> 109761         at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
> 109762         at
> org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
> 109763         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 109764         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 109765         at java.lang.Thread.run(Thread.java:745)
> 109766 Caused by: java.io.EOFException: Attempt to read beyond the end of
> the stream [pos=2216, more=1, max=2215]
> 109767         at
> org.apache.ignite.internal.util.io.GridUnsafeDataInput.offset(GridUnsafeDataInput.java:171)
> 109768         at
> org.apache.ignite.internal.util.io.GridUnsafeDataInput.readByte(GridUnsafeDataInput.java:398)
> 109769         at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:197)
> 109770         at
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
> 109771         at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readFields(OptimizedObjectInputStream.java:491)
> 109772         at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readSerializable(OptimizedObjectInputStream.java:579)
> 109773         at
> org.apache.ignite.marshaller.optimized.OptimizedClassDescriptor.read(OptimizedClassDescriptor.java:841)
> 109774         at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:324)
> 109775         at
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
> 109776         at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readFields(OptimizedObjectInputStream.java:491)
> 109777         at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readSerializable(OptimizedObjectInputStream.java:579)
> 109778         at
> org.apache.ignite.marshaller.optimized.OptimizedClassDescriptor.read(OptimizedClassDescriptor.java:841)
> 109779         at
> org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:324)
> 109780         at
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
> 109781         at
> org.apache.ignite.marshaller.optimized.OptimizedMarshaller.unmarshal(OptimizedMarshaller.java:247)
> 109782         ... 10 more
> 109783
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10833.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
This is the actual error that looks like it is not coming from our code

109714 Feb 22, 2017 3:46:17 PM org.apache.ignite.logger.java.JavaLogger
error
109715 SEVERE: <Position> Failed to acquire lock for request:
GridNearLockRequest [topVer=AffinityTopologyVersion [topVer=3153,
minorTopVer=32], miniId=acdced25a51-c5e64ee6-1079-4b90-bb7b-5ec14032a859,
implicitTx=f       alse, implicitSingleTx=false, onePhaseCommit=false,
dhtVers=[null], subjId=f6663b00-24fc-4515-91ac-20c3b47d90ec, taskNameHash=0,
hasTransforms=false, syncCommit=true, accessTtl=-1, retVal=true,
firstClientRe       q=false, filter=null, super=GridDistributedLockRequest
[nodeId=f6663b00-24fc-4515-91ac-20c3b47d90ec, nearXidVer=GridCacheVersion
[topVer=98913254, time=1487796367155, order=1487785731866, nodeOrder=7],
threa       dId=57, futId=8cdced25a51-c5e64ee6-1079-4b90-bb7b-5ec14032a859,
timeout=9990, isInTx=true, isInvalidate=false, isRead=true,
isolation=REPEATABLE_READ, retVals=[true], txSize=0, flags=0, keysCnt=1,
super=Grid       DistributedBaseMessage [ver=GridCacheVersion
[topVer=98913254, time=1487796367155, order=1487785731866, nodeOrder=7],
committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheMessage
[msgId=2688455, de       pInfo=null, err=null, skipPrepare=false,
cacheId=812449097, cacheId=812449097]]]]
109716 class
org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException:
Failed to acquire lock within provided timeout for transaction
[timeout=9990, tx=org.apache.ignite.internal.processors.cache.dis      
tributed.dht.GridDhtTxLocalAdapter$1@7f2a8c8a]
109717         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3924)
109718         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3874)
109719         at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:91)
109720         at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:297)
109721         at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:290)
109722         at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
109723         at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:251)
109724         at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:381)
109725         at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:347)
109726         at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:752)
109727         at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$600(GridDhtLockFuture.java:79)
109728         at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1116)
109729         at
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:159)
109730         at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
109731         at java.lang.Thread.run(Thread.java:745)
109732 
109733 Feb 22, 2017 3:46:17 PM org.apache.ignite.logger.java.JavaLogger
error
109716 class
org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException:
Failed to acquire lock within provided timeout for transaction
[timeout=9990, tx=org.apache.ignite.internal.processors.cache.dis      
tributed.dht.GridDhtTxLocalAdapter$1@7f2a8c8a]
109717         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3924)
109718         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3874)
109719         at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:91)
109720         at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:297)
109721         at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:290)
109722         at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
109723         at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:251)
109724         at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:381)
109725         at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:347)
109726         at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:752)
109727         at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$600(GridDhtLockFuture.java:79)
109728         at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1116)
109729         at
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:159)
109730         at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
109731         at java.lang.Thread.run(Thread.java:745)
109732 
109733 Feb 22, 2017 3:46:17 PM org.apache.ignite.logger.java.JavaLogger
error
109734 SEVERE: <Position> Future execution resulted in error:
GridDhtEmbeddedFuture [super=GridEmbeddedFuture [embedded=GridEmbeddedFuture
[embedded=GridDhtLockFuture [nearNodeId=f6663b00-24fc-4515-91ac-20c3b47d90e      
c, nearLockVer=GridCacheVersion [topVer=98913254, time=1487796367155,
order=1487785731866, nodeOrder=7], topVer=AffinityTopologyVersion
[topVer=3153, minorTopVer=32], threadId=57, futId=7e221e25a51-8c03dfe0-      
0f6e-4200-aed7-2233c421e0a3, lockVer=GridCacheVersion [topVer=98913254,
time=1487796367159, order=1487785731867, nodeOrder=14], read=true, err=null,
timedOut=true, timeout=9990, tx=GridDhtTxLocal [nearNodeId      
=f6663b00-24fc-4515-91ac-20c3b47d90ec,
nearFutId=8cdced25a51-c5e64ee6-1079-4b90-bb7b-5ec14032a859,
nearMiniId=acdced25a51-c5e64ee6-1079-4b90-bb7b-5ec14032a859,
nearFinFutId=null, nearFinMiniId=null, nearXidV       er=GridCacheVersion
[topVer=98913254, time=1487796367155, order=1487785731866, nodeOrder=7],
super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false, nearNodes=[],
dhtNodes=[], explicitLock=false, super=Ign       iteTxLocalAdapter
[completedBase=null, sndTransformedVals=false, depEnabled=false,
txState=IgniteTxStateImpl [activeCacheIds=GridLongList [idx=1,
arr=[812449097]], txMap={IgniteTxKey [key=KeyCacheObjectImpl       
[val=PositionId [fundAbbrev=BVI, clearBrokerId=10032, insIid=681871,
strategy=, traderId=5142, valueDate=19000101], hasValBytes=true],
cacheId=812449097]=IgniteTxEntry [key=KeyCacheObjectImpl [val=PositionId       
[fundAbbrev=BVI, clearBrokerId=10032, insIid=681871, strategy=,
traderId=5142, valueDate=19000101], hasValBytes=true], cacheId=812449097,
partId=-1, txKey=IgniteTxKey [key=KeyCacheObjectImpl [val=PositionId       
[fundAbbrev=BVI, clearBrokerId=10032, insIid=681871, strategy=,
traderId=5142, valueDate=19000101], hasValBytes=true], cacheId=812449097],
val=[op=NOOP, val=null], prevVal=[op=NOOP, val=null], entryProcesso      
rsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null,
explicitVer=null, dhtVer=null, filters=[], filtersPassed=false,
filtersSet=false, entry=GridDhtColocatedCacheEntry [super=GridDhtCacheEntry
[rdrs=       [], locPart=GridDhtLocalPartition [id=961,
map=o.a.i.i.processors.cache.GridCacheConcurrentMapImpl@4cd4bfee,
rmvQueue=GridCircularBuffer [sizeMask=127, idxGen=0], cntr=1181,
state=OWNING, reservations=0, emp       ty=false, createTime=02/18/2017
10:01:50], super=GridDistributedCacheEntry [super=GridCacheMapEntry
[key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=10032, insIid=681871, strategy=, tra       derId=5142,
valueDate=19000101], hasValBytes=true], val=CacheObjectImpl [val=Position
[positionId=PositionId [fundAbbrev=BVI, clearBrokerId=10032, insIid=681871,
strategy=, traderId=5142, valueDate=19000101]       , quantity=104864.0],
hasValBytes=true], startVer=1487430990114, ver=GridCacheVersion
[topVer=98913252, time=1487796336445, order=1487785730286, nodeOrder=14],
hash=534941633, extras=GridCacheMvccEntryExtras        [mvcc=GridCacheMvcc
[locs=[GridCacheMvccCandidate [nodeId=985b44be-ea4a-4d0b-a2a5-ab2aa441ac62,
ver=GridCacheVersion [topVer=98913253, time=1487796361851,
order=1487785731568, nodeOrder=14], timeout=9990, t       s=1487796361850,
threadId=76, id=1817320, topVer=AffinityTopologyVersion [topVer=3152,
minorTopVer=41], reentry=null,
otherNodeId=14cb9e24-0902-48b7-a016-f572db04838f, otherVer=GridCacheVersion
[topVer=98913       253, time=1487796349651, order=1487785731495,
nodeOrder=13], mappedDhtNodes=null, mappedNearNodes=null, ownerVer=null,
serOrder=null, key=KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=100       32, insIid=681871, strategy=, traderId=5142,
valueDate=19000101], hasValBytes=true],
masks=local=1|owner=1|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0,
prevVer=null, ne       xtVer=null], GridCacheMvccCandidate
[nodeId=985b44be-ea4a-4d0b-a2a5-ab2aa441ac62, ver=GridCacheVersion
[topVer=98913254, time=1487796367159, order=1487785731867, nodeOrder=14],
timeout=9990, ts=1487796367150       , threadId=57, id=1817427,
topVer=AffinityTopologyVersion [topVer=3153, minorTopVer=32], reentry=null,
otherNodeId=f6663b00-24fc-4515-91ac-20c3b47d90ec, otherVer=GridCacheVersion
[topVer=98913254, time=14877       96367155, order=1487785731866,
nodeOrder=7], mappedDhtNodes=null, mappedNearNodes=null,
ownerVer=GridCacheVersion [topVer=98913253, time=1487796361851,
order=1487785731568, nodeOrder=14], serOrder=null, key=      
KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI, clearBrokerId=10032,
insIid=681871, strategy=, traderId=5142, valueDate=19000101],
hasValBytes=true], masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|s      
ingle_implicit=0|dht_local=1|near_local=0|removed=0, prevVer=null,
nextVer=null]], rmts=null]], flags=0]]]], prepared=0, locked=false,
nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, f      
lags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion
[topVer=98913254, time=1487796367159, order=1487785731867, nodeOrder=14]]}],
super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=98913254       ,
time=1487796367159, order=1487785731867, nodeOrder=14], writeVer=null,
implicit=false, loc=true, threadId=57, startTime=1487796367150,
nodeId=985b44be-ea4a-4d0b-a2a5-ab2aa441ac62, startVer=GridCacheVersion       
[topVer=98913254, time=1487796367159, order=1487785731867, nodeOrder=14],
endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC,
timeout=9990, sysInvalidate=false, sys=false, plc=2, commitVer=null       ,
finalizing=NONE, preparing=false, invalidParts=null, state=MARKED_ROLLBACK,
timedOut=false, topVer=AffinityTopologyVersion [topVer=3153,
minorTopVer=32], duration=10008ms, onePhaseCommit=false], size=1]]],       
mapped=false, trackable=true, accessTtl=-1, needReturnVal=true,
skipStore=false, keepBinary=false, innerFuts=[],
pendingLocks=[KeyCacheObjectImpl [val=PositionId [fundAbbrev=BVI,
clearBrokerId=10032, insIid       =681871, strategy=, traderId=5142,
valueDate=19000101], hasValBytes=true]], super=GridCompoundIdentityFuture
[super=GridCompoundFuture [rdc=Bool reducer: true, initFlag=0, lsnrCalls=0,
done=true, cancelled=f       alse, err=null, futs=[]]]]]]]
109735 class
org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException:
Failed to acquire lock within provided timeout for transaction
[timeout=9990, tx=org.apache.ignite.internal.processors.cache.dis      
tributed.dht.GridDhtTxLocalAdapter$1@7f2a8c8a]
109736         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3924)
109737         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:3874)
109738         at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:91)
109739         at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:297)
109740         at
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:290)
109741         at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
109742         at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:251)
109743         at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:381)
109744         at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:347)
109745         at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:752)
109746         at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$600(GridDhtLockFuture.java:79)
109747         at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1116)
109748         at
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:159)
109749         at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
109750         at java.lang.Thread.run(Thread.java:745)
109751 
109752 Feb 22, 2017 3:46:17 PM org.apache.ignite.logger.java.JavaLogger
error
109753 SEVERE: Failed to initialize job
[jobId=73652976a51-2c4ae429-3284-4588-a96e-7e487f6653a1,
ses=GridJobSessionImpl [ses=GridTaskSessionImpl
[taskName=com.tudor.datagridI.server.tradegen.OrderHolderSaveRunn      
able, dep=LocalDeployment [super=GridDeployment [ts=1487430097277,
depMode=SHARED, clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2,
clsLdrId=1051eb15a51-985b44be-ea4a-4d0b-a2a5-ab2aa441ac62, userVer=0,       
loc=true, sampleClsName=java.lang.String, pendingUndeploy=false,
undeployed=false, usage=0]],
taskClsName=com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable,
sesId=63652976a51-2c4ae429-3284-4588       -a96e-7e487f6653a1,
startTime=1487796377472, endTime=9223372036854775807,
taskNodeId=2c4ae429-3284-4588-a96e-7e487f6653a1,
clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2, closed=false, cpSpi=null,
fail       Spi=null, loadSpi=null, usage=1, fullSup=false,
subjId=2c4ae429-3284-4588-a96e-7e487f6653a1, mapFut=IgniteFuture
[orig=GridFutureAdapter [resFlag=0, res=null, startTime=1487796377491,
endTime=0, ignoreIn       terrupts=false, state=INIT]]],
jobId=73652976a51-2c4ae429-3284-4588-a96e-7e487f6653a1]]
109754 class org.apache.ignite.IgniteCheckedException: Failed to deserialize
object with given class loader: sun.misc.Launcher$AppClassLoader@18b4aac2
109755         at
org.apache.ignite.marshaller.optimized.OptimizedMarshaller.unmarshal(OptimizedMarshaller.java:250)
109756         at
org.apache.ignite.internal.processors.job.GridJobWorker.initialize(GridJobWorker.java:409)
109757         at
org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1089)
109758         at
org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1766)
109759         at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
109760         at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
109761         at
org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
109762         at
org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
109763         at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
109764         at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
109765         at java.lang.Thread.run(Thread.java:745)
109766 Caused by: java.io.EOFException: Attempt to read beyond the end of
the stream [pos=2216, more=1, max=2215]
109767         at
org.apache.ignite.internal.util.io.GridUnsafeDataInput.offset(GridUnsafeDataInput.java:171)
109768         at
org.apache.ignite.internal.util.io.GridUnsafeDataInput.readByte(GridUnsafeDataInput.java:398)
109769         at
org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:197)
109770         at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
109771         at
org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readFields(OptimizedObjectInputStream.java:491)
109772         at
org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readSerializable(OptimizedObjectInputStream.java:579)
109773         at
org.apache.ignite.marshaller.optimized.OptimizedClassDescriptor.read(OptimizedClassDescriptor.java:841)
109774         at
org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:324)
109775         at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
109776         at
org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readFields(OptimizedObjectInputStream.java:491)
109777         at
org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readSerializable(OptimizedObjectInputStream.java:579)
109778         at
org.apache.ignite.marshaller.optimized.OptimizedClassDescriptor.read(OptimizedClassDescriptor.java:841)
109779         at
org.apache.ignite.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:324)
109780         at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:367)
109781         at
org.apache.ignite.marshaller.optimized.OptimizedMarshaller.unmarshal(OptimizedMarshaller.java:247)
109782         ... 10 more
109783




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10833.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi

Timeout with value 9990 is ok because it is just the next step of
transaction and 10 ms was spend already for the previous steps of
transactions.

On Thu, Feb 23, 2017 at 8:48 AM, bintisepaha <bi...@tudor.com> wrote:
> Andrey,
>
> I finally have an error that might help. this happened again in production
> today for us.
> Ignite-Console-3.zip
> <http://apache-ignite-users.70518.x6.nabble.com/file/n10828/Ignite-Console-3.zip>
>
> This is the last update that threw an error, after this error every update
> just times out.
>
> The timeout=9990 in this error, none of our transactions have this timeout.
> Do you think this is an ignite bug? If you look at the stack trace, this is
> not happening due to our code.
>
> Although there is an error on marshaling. How can we further narrow it down?
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10828.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Andrey,

I finally have an error that might help. this happened again in production
today for us.
Ignite-Console-3.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n10828/Ignite-Console-3.zip>  

This is the last update that threw an error, after this error every update
just times out.

The timeout=9990 in this error, none of our transactions have this timeout.
Do you think this is an ignite bug? If you look at the stack trace, this is
not happening due to our code.

Although there is an error on marshaling. How can we further narrow it down?

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10828.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Attaching the console log file too, it has the above error.
Unfortunately we lost the files for older updates to this key now, they
rolled over.
Ignite-Console-1.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n10779/Ignite-Console-1.zip>  

but the "Transaction has been already completed" error could also happen
because maybe we call explicit rollback in case of any Runtime Exception.

tx.rollback();

We do get genuine timeouts in the code, but at that time, we never get this
exception.

Hope this helps.

Thanks,
Binti





--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10779.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Hi, I will try this code the next time this issue happens.

Attaching one nodes full logs. It has a lot of info.
Ignite-11221.gz
<http://apache-ignite-users.70518.x6.nabble.com/file/n10778/Ignite-11221.gz>  

However, I found this on the console logs on the first exception that
occurred. Is this usual?

 14764 Feb 17, 2017 1:16:04 PM org.apache.ignite.logger.java.JavaLogger
error
 14765 SEVERE: Failed to execute job
[jobId=fcedb9c4a51-9aa7d7c5-f6fa-4bdd-9473-0439b889d46f,
ses=GridJobSessionImpl [ses=GridTaskSessionImpl
[taskName=com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable,
dep=LocalDeployment [super=GridDeployment        [ts=1486832074045,
depMode=SHARED, clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2,
clsLdrId=678f81e2a51-b602d584-5565-434b-9727-94e218108073, userVer=0,
loc=true, sampleClsName=java.lang.String, pendingUndeploy=false,
undeployed=false, usage=0]],       
taskClsName=com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable,
sesId=ecedb9c4a51-9aa7d7c5-f6fa-4bdd-9473-0439b889d46f,
startTime=1487355352478, endTime=9223372036854775807,
taskNodeId=9aa7d7c5-f6fa-4bdd-9473-0439b889d46f, clsLdr=sun.misc      
.Launcher$AppClassLoader@18b4aac2, closed=false, cpSpi=null, failSpi=null,
loadSpi=null, usage=1, fullSup=false,
subjId=9aa7d7c5-f6fa-4bdd-9473-0439b889d46f, mapFut=IgniteFuture
[orig=GridFutureAdapter [resFlag=0, res=null, startTime=1487355352483,       
endTime=0, ignoreInterrupts=false, state=INIT]]],
jobId=fcedb9c4a51-9aa7d7c5-f6fa-4bdd-9473-0439b889d46f]]
 14766 class org.apache.ignite.IgniteException: Transaction has been already
completed.
 14767         at
org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:908)
 14768         at
org.apache.ignite.internal.processors.cache.transactions.TransactionProxyImpl.rollback(TransactionProxyImpl.java:299)
 14769         at
com.tudor.datagridI.server.cache.transaction.IgniteCacheTransaction.rollback(IgniteCacheTransaction.java:19)
 14770         at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.processOrderHolders(OrderHolderSaveRunnable.java:509) 
 14771         at
com.tudor.datagridI.server.tradegen.OrderHolderSaveRunnable.run(OrderHolderSaveRunnable.java:105)
 14772         at
org.apache.ignite.internal.processors.closure.GridClosureProcessor$C4V2.execute(GridClosureProcessor.java:2184)
 14773         at
org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:509)
 14774         at
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6521)
 14775         at
org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:503)
 14776         at
org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:456)
 14777         at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
 14778         at
org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1161) 
 14779         at
org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1766)
 14780         at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
 14781         at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
 14782         at
org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
 14783         at
org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
 14784         at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 14785         at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 14786         at java.lang.Thread.run(Thread.java:745)
 14787 Caused by: class org.apache.ignite.IgniteCheckedException:
Transaction has been already completed.
 14788         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:776)
 14789         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:718) 
 14790         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:681) 
 14791         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:156)
 14792         at
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:154)
 14793         at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:748)
 14794         at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:353)
 14795         at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:277)
 14796         at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:88)
 14797         at
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1$2$1.run(GridCacheIoManager.java:222)
 14798         at
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6577)
 14799         at
org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:781)
 14800         at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
 14801         ... 3 more
 14802 




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10778.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
There are no something suspicious in stack trace.

You can check that key is locked using IgniteCache.isLocalLocked()
method. For remote nodes you can run task that performs this checking.

Could you please provide full logs for analysis?

On Tue, Feb 21, 2017 at 6:48 PM, bintisepaha <bi...@tudor.com> wrote:
> Andrey, thanks for getting back.
> I am attaching the stack trace. Don't think the cause is a deadloc, but the
> trace is long so maybe I am missing out something, let me know if you find
> something useful.
>
> We cannot ourselves reproduce this issue as there are no errors on the prior
> successful update. It feels like the txn was marked successful, but on one
> of the keys the lock was not released. and later when we try to access the
> key, its locked, hence the exceptions.
>
> No messages in the logs for long running txns or futures.
> By killing the node that holds the key, the lock is not released.
>
> Is there a way to query ignite to see if the locks are being held on a
> particular key? Any code we can run to salvage such locks?
>
> Any other suggestions?
>
> Thanks,
> Binti
>
>
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10764.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Andrey, thanks for getting back.
I am attaching the stack trace. Don't think the cause is a deadloc, but the
trace is long so maybe I am missing out something, let me know if you find
something useful.

We cannot ourselves reproduce this issue as there are no errors on the prior
successful update. It feels like the txn was marked successful, but on one
of the keys the lock was not released. and later when we try to access the
key, its locked, hence the exceptions.

No messages in the logs for long running txns or futures. 
By killing the node that holds the key, the lock is not released.

Is there a way to query ignite to see if the locks are being held on a
particular key? Any code we can run to salvage such locks?

Any other suggestions?

Thanks,
Binti






--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10764.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Gura <ag...@apache.org>.
Hi,

Actualy dead lock detection for pessimistic transactions was
introduced in Apache Ignite 1.7 release. But anyway it is good idea to
try the same code on the newest release.

In case of trasaction timeout dead lock detection should be triggered
and TransactionTimoutException should contain
TransactionDeadlockException as cause if dead lock was detected. Is it
your case or not?

If transaction doesn't release any locks (or non transactional locks)
then you will get messages in logs about long running transactions or
futures because exchange can't be finished in this case. Do you have
something like to this messages?

Is it possible to create some reproducer project that we can analyze?


On Sat, Feb 18, 2017 at 1:42 AM, bintisepaha <bi...@tudor.com> wrote:
> Thanks Andrew,
>
> The same thing happened again today. Clearly the key is locked, we get the
> timeout exceptions. But prior update to the same has not thrown any
> exceptions. Suddenly one update fails with timeout exceptions and we are
> notified due to those exceptions that the key is locked.
>
> We will upgrade to 1.8, but in the meantime is there a way to free up this
> locked key using some code?
>
> We try killing nodes, but we have one back up and it looks like the lock is
> carried over too, which would be the right thing to do.
>
> Outside the transaction we can read this key (dirty read). This is becoming
> an issue for us, since its a production system and the only way to free up
> is to restart the cluster.
>
> Please point us in a direction where we can avoid this or free it up.
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10713.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Thanks Andrew,

The same thing happened again today. Clearly the key is locked, we get the
timeout exceptions. But prior update to the same has not thrown any
exceptions. Suddenly one update fails with timeout exceptions and we are
notified due to those exceptions that the key is locked.

We will upgrade to 1.8, but in the meantime is there a way to free up this
locked key using some code?

We try killing nodes, but we have one back up and it looks like the lock is
carried over too, which would be the right thing to do.

Outside the transaction we can read this key (dirty read). This is becoming
an issue for us, since its a production system and the only way to free up
is to restart the cluster.

Please point us in a direction where we can avoid this or free it up.

Thanks,
Binti



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10713.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Mashenkov <an...@gmail.com>.
Hi,

I see nothing in thread dump. Possibly, there is a deadlock, as there are
no known issues with pessimistic TX on stable topology for ignite 1.7.
Deadlock detection implemented in ignite 1.8. You can try to upgrade to 1.8.


On Thu, Feb 16, 2017 at 9:32 PM, bintisepaha <bi...@tudor.com> wrote:

> Hello, any help on this would be very much appreciated.
>
>
>
> --
> View this message in context: http://apache-ignite-users.
> 70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-
> subsequent-txns-failed-tp10536p10683.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>



-- 
Best regards,
Andrey V. Mashenkov

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Hello, any help on this would be very much appreciated.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10683.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by bintisepaha <bi...@tudor.com>.
Hi Andrey,

We are using ignite 1.7.0.
I am attaching the thread dumps. As far as we can tell the grid was working
fine, no threads were hanging. Only subsequent updates to this key were
hung.

The topology was stable. And the client sends an object to be saved to the
server and the transaction is actually only on the server side. 
prodThreadDump20170208.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n10601/prodThreadDump20170208.zip>  

Please let us know if you see something useful.

Thanks,
Binti




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-subsequent-txns-failed-tp10536p10601.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Pessimistic TXN did not release lock on a key, all subsequent txns failed

Posted by Andrey Mashenkov <am...@gridgain.com>.
Hi Binti,

It looks like smth happed while transaction was rollingback and some locks
was not released properly.
What Ignite version do you use?

Is it happened on stable topology or not?
Did you see hanging threads?
Are there any clients on topology that was involved in transaction?

Please provide more information, logs and thread-dumps if possible.

On Fri, Feb 10, 2017 at 1:36 AM, bintisepaha <bi...@tudor.com> wrote:

> Guys,
>
> We use pessimistic txns in ignite and they have been working great for
> months now, however yesterday we ran into an issue, where on multiple
> updates to the same key (although in interval of minutes), many updates
> went
> through, but at some point, we started getting TransactionTimeoutException
> because the key was locked. After that any update for the same key failed
> with the same error.
>
> We checked the logs for the last successful update and all looks good.
> Something happened that although the txn was completed, they key was still
> not released. We had to actually restart the cluster post business hours to
> resolve the issue.
>
> 1) Is there a way to avoid this?
> 2) Can we monitor objects via visor or jconsole to see if some keys are
> locked for a prolonged period of time?
> 3) Is there a way to free up the lock remotely or by running some salvaging
> code?
>
> Please advise, your help is much appreciated.
>
> Thanks,
> Binti
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.
> 70518.x6.nabble.com/Pessimistic-TXN-did-not-release-lock-on-a-key-all-
> subsequent-txns-failed-tp10536.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>