Posted to notifications@asterixdb.apache.org by "Young-Seok Kim (JIRA)" <ji...@apache.org> on 2016/05/17 04:20:12 UTC

[jira] [Closed] (ASTERIXDB-1138) Abort request triggered by the deadlock from the Feed job is not handled.

     [ https://issues.apache.org/jira/browse/ASTERIXDB-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Young-Seok Kim closed ASTERIXDB-1138.
-------------------------------------
    Resolution: Fixed

This issue is fixed by the following commit:

Deadlock-free locking protocol is enabled
- Added the EntityCommitProfiler class in TransactionSubsystem.java:
This profiler takes a report interval (in seconds) as a parameter and
reports the entity-level commit count once per report interval,
but only if IS_PROFILE_MODE is set to true. The profiler runs in a separate
thread. However, the profiler thread doesn't start reporting the count
until entityCommitCount > 0. The profiler can be used to measure
1) IPS (Inserts Per Second) and
2) IIPS (instantaneous IPS) for each report interval.

Change-Id: Ie58ae2f519baa53599e99b51bd61ea5f8366dafd
Reviewed-on: https://asterix-gerrit.ics.uci.edu/825
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: Murtadha Hubail <hu...@gmail.com>
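
The profiler behavior described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual EntityCommitProfiler code: the class name EntityCommitProfilerSketch and the methods onEntityCommit and reportOnce are hypothetical, and the way IPS and IIPS are computed here (total count over elapsed report time, and the per-interval delta over the interval length) is an assumption based only on the description above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of an interval-based commit profiler; not AsterixDB code. */
public class EntityCommitProfilerSketch implements Runnable {
    private final AtomicLong entityCommitCount = new AtomicLong();
    private final long reportIntervalSeconds;
    private long lastCount = 0;         // count observed at the previous report
    private long intervalsReported = 0; // intervals reported since the first commit was seen

    public EntityCommitProfilerSketch(long reportIntervalSeconds) {
        if (reportIntervalSeconds <= 0) {
            throw new IllegalArgumentException("report interval must be positive");
        }
        this.reportIntervalSeconds = reportIntervalSeconds;
    }

    /** Called by committing threads once per entity-level commit. */
    public void onEntityCommit() {
        entityCommitCount.incrementAndGet();
    }

    /** Builds one report line, or returns null while no commit has been seen. */
    public synchronized String reportOnce() {
        long count = entityCommitCount.get();
        if (count == 0) {
            return null; // stay silent until entityCommitCount > 0
        }
        intervalsReported++;
        long elapsedSeconds = intervalsReported * reportIntervalSeconds;
        long ips = count / elapsedSeconds;                         // average since reporting began
        long iips = (count - lastCount) / reportIntervalSeconds;   // rate over this interval only
        lastCount = count;
        return "count=" + count + " IPS=" + ips + " IIPS=" + iips;
    }

    /** Reporting loop, meant to run in its own thread. */
    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Thread.sleep(reportIntervalSeconds * 1000L);
                String line = reportOnce();
                if (line != null) {
                    System.out.println(line);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and exit
        }
    }
}
```

In this sketch the caller decides whether to start the thread at all, for example only when a flag such as IS_PROFILE_MODE is true, so that normal runs pay no more than the cost of the atomic increment.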

> Abort request triggered by the deadlock from the Feed job is not handled.
> -------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1138
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1138
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Young-Seok Kim
>            Assignee: Young-Seok Kim
>            Priority: Critical
>
> I'm observing a deadlock during my spatial index experiment.
> (The deadlock is probably caused by a PK hash value collision between a reader (query) and a writer (feed job).)
> When the lock manager declares a deadlock and throws an ACIDException requesting an abort, the exception is swallowed by FeedExceptionHandler without any proper handling.
> After this, every incoming inserted record keeps requesting an abort by throwing the exception again, and each one is again swallowed by FeedExceptionHandler.
> As a result, queries hang waiting for locks to be released; those locks were acquired during the record insertion that caused the initial deadlock.
> The following shows the exception thrown:
> push Job 0
> push Resource 1970324836977414
> push Request 562949953421987
> push Job 2814749767106561
> pop Job 2814749767106561
> pop Request 562949953421987
> push Request 4222124650663985
> push Job 281474976710657
> push Resource 281474976711711
> push Request 281474976715806
> push Job 0
> Oct 13, 2015 8:25:10 AM org.apache.asterix.transaction.management.service.locking.ConcurrentLockManager requestAbort
> INFO: Exception: Transaction JID:21 should abort (requested by the Lock Manager):
> Job 0:0:0
> Resource 7:0:b06
> Request f:0:1031
> Job 1:0:1
> Resource 1:0:41f
> Request 1:0:141e
> Job 0:0:0
> Exception: Transaction JID:21 should abort (requested by the Lock Manager):
> Job 0:0:0
> Resource 7:0:b06
> Request f:0:1031
> Job 1:0:1
> Resource 1:0:41f
> Request 1:0:141e
> Job 0:0:0
> Oct 13, 2015 8:25:10 AM org.apache.asterix.transaction.management.service.locking.ConcurrentLockManager requestAbort
> INFO: Exception: Transaction JID:21 should abort (requested by the Lock Manager):
> timeout
> Exception: Transaction JID:21 should abort (requested by the Lock Manager):
> timeout
> org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.asterix.common.exceptions.ACIDException: Transaction JID:21 should abort (requested by the Lock Manager):
> Job 0:0:0
> Resource 7:0:b06
> Request f:0:1031
> Job 1:0:1
> Resource 1:0:41f
> Request 1:0:141e
> Job 0:0:0
>         at org.apache.asterix.transaction.management.opcallbacks.PrimaryIndexModificationOperationCallback.before(PrimaryIndexModificationOperationCallback.java:62)
>         at org.apache.hyracks.storage.am.btree.impls.BTree.upsert(BTree.java:336)
>         at org.apache.hyracks.storage.am.btree.impls.BTree.access$400(BTree.java:74)
>         at org.apache.hyracks.storage.am.btree.impls.BTree$BTreeAccessor.upsertIfConditionElseInsert(BTree.java:938)
>         at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.insert(LSMBTree.java:441)
>         at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.modify(LSMBTree.java:379)
>         at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:351)
>         at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.forceModify(LSMHarness.java:334)
>         at org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.forceInsert(LSMTreeIndexAccessor.java:157)
>         at org.apache.asterix.common.dataflow.AsterixLSMInsertDeleteOperatorNodePushable.nextFrame(AsterixLSMInsertDeleteOperatorNodePushable.java:107)
>         at org.apache.asterix.common.feeds.MonitoredBuffer.processMessage(MonitoredBuffer.java:322)
>         at org.apache.asterix.common.feeds.MonitoredBuffer.processMessage(MonitoredBuffer.java:44)
>         at org.apache.asterix.common.feeds.MessageReceiver$MessageReceiverRunnable.run(MessageReceiver.java:83)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.asterix.common.exceptions.ACIDException: Transaction JID:21 should abort (requested by the Lock Manager):
> Job 0:0:0
> Resource 7:0:b06
> Request f:0:1031
> Job 1:0:1
> Resource 1:0:41f
> Request 1:0:141e
> Job 0:0:0
>         at org.apache.asterix.transaction.management.service.locking.ConcurrentLockManager.requestAbort(ConcurrentLockManager.java:925)
>         at org.apache.asterix.transaction.management.service.locking.ConcurrentLockManager.enqueueWaiter(ConcurrentLockManager.java:180)
>         at org.apache.asterix.transaction.management.service.locking.ConcurrentLockManager.lock(ConcurrentLockManager.java:155)
>         at org.apache.asterix.transaction.management.opcallbacks.PrimaryIndexModificationOperationCallback.before(PrimaryIndexModificationOperationCallback.java:53)
>         ... 15 more
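
The failure mode reported above, where an abort request is caught and dropped so that every later record fails the same way, can be illustrated with a small sketch. Everything here is hypothetical: AbortRequestedException, RecordSink, and both process methods are stand-ins invented for illustration and do not mirror FeedExceptionHandler or the lock manager. The propagating variant is shown only for contrast; the commit above resolves the issue by enabling a deadlock-free locking protocol.

```java
import java.util.List;

/** Hypothetical illustration of swallowing vs. propagating an abort request. */
public class AbortHandlingSketch {

    /** Stand-in for an abort request raised by a lock manager (assumed type). */
    static class AbortRequestedException extends Exception {
        AbortRequestedException(String message) {
            super(message);
        }
    }

    /** Stand-in for the operator that inserts one record under a lock (assumed type). */
    interface RecordSink {
        void insert(String record) throws AbortRequestedException;
    }

    /**
     * Swallowing handler: the exception is counted and dropped, the loop goes on,
     * and nothing aborts the transaction, so any locks it holds stay held.
     * Returns the number of abort requests that were ignored.
     */
    static int processSwallowing(List<String> records, RecordSink sink) {
        int ignoredAborts = 0;
        for (String record : records) {
            try {
                sink.insert(record);
            } catch (AbortRequestedException e) {
                ignoredAborts++; // abort request lost here
            }
        }
        return ignoredAborts;
    }

    /**
     * Propagating handler: the first abort request stops processing and reaches
     * the caller, which can then abort the transaction and release its locks.
     * Returns the number of records inserted if no abort was requested.
     */
    static int processPropagating(List<String> records, RecordSink sink)
            throws AbortRequestedException {
        int inserted = 0;
        for (String record : records) {
            sink.insert(record);
            inserted++;
        }
        return inserted;
    }
}
```

With a sink that always requests an abort, the swallowing variant runs through every record and reports one ignored abort per record, which matches the repeated "should abort" messages in the log above.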



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)