Posted to notifications@asterixdb.apache.org by "Young-Seok Kim (JIRA)" <ji...@apache.org> on 2015/10/14 06:33:05 UTC
[jira] [Assigned] (ASTERIXDB-1138) Abort request triggered by the deadlock from the Feed job is not handled.

     [ https://issues.apache.org/jira/browse/ASTERIXDB-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Young-Seok Kim reassigned ASTERIXDB-1138:
-----------------------------------------

    Assignee: Young-Seok Kim  (was: Abdullah Alamoudi)

> Abort request triggered by the deadlock from the Feed job is not handled.
> -------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1138
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1138
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Young-Seok Kim
>            Assignee: Young-Seok Kim
>            Priority: Critical
>
> I'm observing a deadlock during my spatial index experiment.
> (The deadlock is probably caused by a PK hash value collision between a reader (a query) and a writer (the feed job).)
> When the lock manager declares a deadlock and throws an ACIDException requesting an abort, the exception is swallowed by FeedExceptionHandler without any proper handling.
> After this, every incoming inserted record keeps requesting an abort by throwing the same exception, and each of those exceptions is again swallowed by FeedExceptionHandler.
> As a result, queries hang waiting for locks to be released; those locks were acquired during the record insertions that led to the initial deadlock.
> The following shows the exception thrown:
> push Job 0
> push Resource 1970324836977414
> push Request 562949953421987
> push Job 2814749767106561
> pop Job 2814749767106561
> pop Request 562949953421987
> push Request 4222124650663985
> push Job 281474976710657
> push Resource 281474976711711
> push Request 281474976715806
> push Job 0
> Oct 13, 2015 8:25:10 AM org.apache.asterix.transaction.management.service.locking.ConcurrentLockManager requestAbort
> INFO: Exception: Transaction JID:21 should abort (requested by the Lock Manager):
> Job 0:0:0
> Resource 7:0:b06
> Request f:0:1031
> Job 1:0:1
> Resource 1:0:41f
> Request 1:0:141e
> Job 0:0:0
> Exception: Transaction JID:21 should abort (requested by the Lock Manager):
> Job 0:0:0
> Resource 7:0:b06
> Request f:0:1031
> Job 1:0:1
> Resource 1:0:41f
> Request 1:0:141e
> Job 0:0:0
> Oct 13, 2015 8:25:10 AM org.apache.asterix.transaction.management.service.locking.ConcurrentLockManager requestAbort
> INFO: Exception: Transaction JID:21 should abort (requested by the Lock Manager):
> timeout
> Exception: Transaction JID:21 should abort (requested by the Lock Manager):
> timeout
> org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.asterix.common.exceptions.ACIDException: Transaction JID:21 should abort (requested by the Lock Manager):
> Job 0:0:0
> Resource 7:0:b06
> Request f:0:1031
> Job 1:0:1
> Resource 1:0:41f
> Request 1:0:141e
> Job 0:0:0
>         at org.apache.asterix.transaction.management.opcallbacks.PrimaryIndexModificationOperationCallback.before(PrimaryIndexModificationOperationCallback.java:62)
>         at org.apache.hyracks.storage.am.btree.impls.BTree.upsert(BTree.java:336)
>         at org.apache.hyracks.storage.am.btree.impls.BTree.access$400(BTree.java:74)
>         at org.apache.hyracks.storage.am.btree.impls.BTree$BTreeAccessor.upsertIfConditionElseInsert(BTree.java:938)
>         at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.insert(LSMBTree.java:441)
>         at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.modify(LSMBTree.java:379)
>         at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:351)
>         at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.forceModify(LSMHarness.java:334)
>         at org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.forceInsert(LSMTreeIndexAccessor.java:157)
>         at org.apache.asterix.common.dataflow.AsterixLSMInsertDeleteOperatorNodePushable.nextFrame(AsterixLSMInsertDeleteOperatorNodePushable.java:107)
>         at org.apache.asterix.common.feeds.MonitoredBuffer.processMessage(MonitoredBuffer.java:322)
>         at org.apache.asterix.common.feeds.MonitoredBuffer.processMessage(MonitoredBuffer.java:44)
>         at org.apache.asterix.common.feeds.MessageReceiver$MessageReceiverRunnable.run(MessageReceiver.java:83)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.asterix.common.exceptions.ACIDException: Transaction JID:21 should abort (requested by the Lock Manager):
> Job 0:0:0
> Resource 7:0:b06
> Request f:0:1031
> Job 1:0:1
> Resource 1:0:41f
> Request 1:0:141e
> Job 0:0:0
>         at org.apache.asterix.transaction.management.service.locking.ConcurrentLockManager.requestAbort(ConcurrentLockManager.java:925)
>         at org.apache.asterix.transaction.management.service.locking.ConcurrentLockManager.enqueueWaiter(ConcurrentLockManager.java:180)
>         at org.apache.asterix.transaction.management.service.locking.ConcurrentLockManager.lock(ConcurrentLockManager.java:155)
>         at org.apache.asterix.transaction.management.opcallbacks.PrimaryIndexModificationOperationCallback.before(PrimaryIndexModificationOperationCallback.java:53)
>         ... 15 more
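
The failure mode described in the report (an abort request that is caught and ignored, so the locks are never released) can be illustrated with a small self-contained sketch. This is not AsterixDB code: ToyTxn, AbortRequestedException, runSwallowing and runAborting are hypothetical names made up for illustration, and the "lock manager" is just a set of integer keys. It only contrasts swallowing the exception per record with aborting once and releasing the locks, which is one possible way such a request might be handled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SwallowedAbortSketch {

    /** Hypothetical stand-in for the ACIDException seen in the trace. */
    static class AbortRequestedException extends RuntimeException {
        AbortRequestedException(String msg) {
            super(msg);
        }
    }

    /** Toy transaction: tracks held locks and how often an abort was requested. */
    static class ToyTxn {
        final Set<Integer> heldLocks = new HashSet<>();
        boolean abortRequested = false;
        int abortRequests = 0;

        /**
         * Locks the record's key. 'conflictKey' simulates the key on which the
         * deadlock is detected. Once an abort has been requested, every later
         * insert requests it again, as described in the report.
         */
        void insert(int key, int conflictKey) {
            if (abortRequested || key == conflictKey) {
                abortRequested = true;
                abortRequests++;
                throw new AbortRequestedException("Transaction should abort");
            }
            heldLocks.add(key);
        }

        /** Assumed abort handling: release every lock the transaction holds. */
        void abort() {
            heldLocks.clear();
        }
    }

    /** Mirrors the reported behavior: each exception is caught and ignored. */
    static ToyTxn runSwallowing(List<Integer> keys, int conflictKey) {
        ToyTxn txn = new ToyTxn();
        for (int key : keys) {
            try {
                txn.insert(key, conflictKey);
            } catch (AbortRequestedException e) {
                // Swallowed: no abort happens, locks stay held,
                // and the next record fails the same way.
            }
        }
        return txn;
    }

    /** One possible alternative: stop at the first abort request and release locks. */
    static ToyTxn runAborting(List<Integer> keys, int conflictKey) {
        ToyTxn txn = new ToyTxn();
        try {
            for (int key : keys) {
                txn.insert(key, conflictKey);
            }
        } catch (AbortRequestedException e) {
            txn.abort();
        }
        return txn;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(1, 2, 3, 4, 5);
        ToyTxn swallowed = runSwallowing(keys, 3);
        ToyTxn aborted = runAborting(keys, 3);
        System.out.println("swallowing: held=" + swallowed.heldLocks.size()
                + " abortRequests=" + swallowed.abortRequests);
        System.out.println("aborting:   held=" + aborted.heldLocks.size()
                + " abortRequests=" + aborted.abortRequests);
    }
}
```

With keys 1 to 5 and a simulated conflict on key 3, the swallowing variant ends with two locks still held and three abort requests, while the aborting variant ends with no locks held after a single request. In the toy model, any reader waiting on the two leftover locks would wait indefinitely, which corresponds to the hanging queries in the report.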


