Posted to issues@ignite.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/11/14 14:24:02 UTC
[jira] [Commented] (IGNITE-6527) Deadlock detection works
incorrectly with some timeouts that haven't caused by deadlocks.
[ https://issues.apache.org/jira/browse/IGNITE-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251454#comment-16251454 ]
ASF GitHub Bot commented on IGNITE-6527:
----------------------------------------
GitHub user BiryukovVA opened a pull request:
https://github.com/apache/ignite/pull/3033
IGNITE-6527: Solution 2.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/BiryukovVA/ignite IGNITE-6527_secondSolution
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/ignite/pull/3033.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3033
----
commit dc6e337c4663e977abd1c56e60837a9335f26215
Author: Vitaliy Biryukov <bi...@gmail.com>
Date: 2017-11-14T12:58:58Z
IGNITE-6527: Example solution 2.
----
> Deadlock detection works incorrectly with some timeouts that haven't caused by deadlocks.
> -----------------------------------------------------------------------------------------
>
> Key: IGNITE-6527
> URL: https://issues.apache.org/jira/browse/IGNITE-6527
> Project: Ignite
> Issue Type: Bug
> Reporter: Vitaliy Biryukov
> Assignee: Vitaliy Biryukov
> Attachments: TxOptimisticDeadlockDetectionIncorrectMessageTest.java
>
>
> Deadlock detection works incorrectly with timeouts that were not caused by deadlocks. It can report a deadlock that would only have occurred later, or detect another deadlock that was not the cause of the timeout.
> *requested keys:* keys that are primary for the same node and are blocking in sequential order during the timeout (or, in the case of a near cache, all keys that have not been locked by an optimistic transaction).
> *candidates:* keys that are candidates to be locked on a primary node (entries contained in GridDhtTxLocal).
> While the Wait-For-Graph is being updated, requested keys are used as candidates. But the "TxDeadlock.toString" method uses the candidates that were received from messages.
> 1) It causes an incorrect error message.
> Example:
> K1: TX1 holds lock, TX2 waits lock.
> K2: TX3 holds lock, TX1 waits lock.
> Transactions:
> TX1 [txId=GridCacheVersion [topVer=118090802, order=1506610794980, nodeOrder=1], nodeId=f03b1ae3-a100-479c-9671-11d5cef00000, threadId=455]
> TX2 [txId=GridCacheVersion [topVer=118090802, order=1506610794980, nodeOrder=2], nodeId=2c0c0e78-cab2-4b23-a985-4965e4200001, threadId=456]
> TX3 [txId=GridCacheVersion [topVer=118090802, order=1506610794980, nodeOrder=3], nodeId=3340dc48-f1a1-4ea8-8742-19b314300002, threadId=457]
> Keys:
> K1 [key=6, cache=cache]
> K2 [key=1, cache=cache]
> 2) Deadlock detection (DD) can detect another deadlock that was not the cause of the timeout but would have been the cause if the current deadlock had not happened.
> These are very rare situations, but they can happen.
> I see several solutions:
> * Just produce a correct message.
> * Log a warning and continue detecting.
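
To illustrate the example under point 1 of the quoted description, here is a minimal Java sketch of a wait-for graph. It is a toy model and not Ignite's deadlock detection code: the class WaitForGraphSketch and its methods are hypothetical names introduced only for this illustration. If the K1/K2 listing is read literally, the edges are TX2 -> TX1 -> TX3, which is a chain and not a cycle, so the printed keys alone do not describe a deadlock.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy wait-for graph (hypothetical; not part of Ignite). */
class WaitForGraphSketch {
    /** Edges: waiting transaction -> transactions holding the locks it waits for. */
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    /** Records that {@code waiter} waits for a lock held by {@code holder}. */
    void addWait(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    /** Returns true if a cycle is reachable from {@code start}. */
    boolean hasCycleFrom(String start) {
        return dfs(start, new HashSet<>(), new HashSet<>());
    }

    /** Depth-first search that tracks the current path to spot back edges. */
    private boolean dfs(String tx, Set<String> onPath, Set<String> done) {
        if (onPath.contains(tx))
            return true;  // Back edge: tx is already on the current path.
        if (done.contains(tx))
            return false; // Already fully explored without finding a cycle.

        onPath.add(tx);
        for (String holder : waitsFor.getOrDefault(tx, Collections.<String>emptySet())) {
            if (dfs(holder, onPath, done))
                return true;
        }
        onPath.remove(tx);
        done.add(tx);
        return false;
    }

    /** Builds the graph from the example in the issue description. */
    static WaitForGraphSketch fromIssueExample() {
        WaitForGraphSketch g = new WaitForGraphSketch();
        g.addWait("TX2", "TX1"); // K1: TX1 holds lock, TX2 waits lock.
        g.addWait("TX1", "TX3"); // K2: TX3 holds lock, TX1 waits lock.
        return g;
    }

    public static void main(String[] args) {
        WaitForGraphSketch g = fromIssueExample();
        System.out.println("cycle in reported keys: " + g.hasCycleFrom("TX2"));

        // Assumption for illustration only: an extra edge TX3 -> TX2 closes the loop.
        g.addWait("TX3", "TX2");
        System.out.println("cycle with extra edge: " + g.hasCycleFrom("TX2"));
    }
}
```

A check of this kind over the keys that are actually printed would be one way to notice that a message does not describe a real cycle. Whether either proposed solution does this is not stated in the issue.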
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)