Posted to issues@ignite.apache.org by "Ivan Pavlukhin (JIRA)" <ji...@apache.org> on 2019/01/30 08:25:00 UTC
[jira] [Commented] (IGNITE-11121) MVCC TX: AssertionError in discovery manager on BLT change.

    [ https://issues.apache.org/jira/browse/IGNITE-11121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755842#comment-16755842 ] 

Ivan Pavlukhin commented on IGNITE-11121:
-----------------------------------------

I was not able to reproduce the issue. The fix in the patch addresses the problem technically by adding one more check, but it would be better to reproduce the issue and find the root cause. At the moment the failure looks impossible, which might mean that we make wrong assumptions during _mvcc coordinator_ object initialization.
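
For illustration only, a check of this kind might look like the self-contained sketch below. It is not the code from the patch. All class and method names here (Discovery, Coordinator, resolveTarget) are hypothetical stand-ins rather than Ignite APIs, and the sketch assumes the assertion fires because a null node ID of an uninitialized coordinator reaches the discovery lookup.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class CoordinatorGuardSketch {
    /** Hypothetical stand-in for a discovery lookup that rejects a null node ID. */
    static final class Discovery {
        private final Map<UUID, String> nodes = new ConcurrentHashMap<>();

        void add(UUID id, String name) {
            nodes.put(id, name);
        }

        String node(UUID id) {
            // Thrown explicitly so the sketch behaves the same without the -ea JVM flag.
            if (id == null)
                throw new AssertionError("nodeId is null");

            return nodes.get(id);
        }
    }

    /** Hypothetical coordinator holder: nodeId stays null until a coordinator is assigned. */
    static final class Coordinator {
        final UUID nodeId;

        Coordinator(UUID nodeId) {
            this.nodeId = nodeId;
        }

        boolean initialized() {
            return nodeId != null;
        }
    }

    /** The extra check: skip the lookup when the coordinator is not initialized. */
    static Optional<String> resolveTarget(Discovery disco, Coordinator crd) {
        if (crd == null || !crd.initialized())
            return Optional.empty();

        return Optional.ofNullable(disco.node(crd.nodeId));
    }

    public static void main(String[] args) {
        Discovery disco = new Discovery();
        UUID id = UUID.randomUUID();
        disco.add(id, "srv-1");

        // An uninitialized coordinator is skipped instead of reaching the failing lookup.
        System.out.println(resolveTarget(disco, new Coordinator(null)).isPresent());
        System.out.println(resolveTarget(disco, new Coordinator(id)).orElse("none"));
    }
}
```

A guard like this only hides the symptom. It does not explain how recovery could run against an uninitialized coordinator, which is why finding the root cause still matters.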

> MVCC TX: AssertionError in discovery manager on BLT change.
> -----------------------------------------------------------
>
>                 Key: IGNITE-11121
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11121
>             Project: Ignite
>          Issue Type: Bug
>          Components: mvcc
>            Reporter: Igor Seliverstov
>            Assignee: Ivan Pavlukhin
>            Priority: Critical
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following exception appeared in the logs on a BLT change.
> {noformat}
> [12:11:36,912][SEVERE][sys-#87][GridClosureProcessor] Closure execution failed with error.
> java.lang.AssertionError
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.node(GridDiscoveryManager.java:1794)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1693)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject.lambda$onTimeout0$16553d7$1(IgniteTxManager.java:2592)
>         at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
>         at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:354)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject.onTimeout0(IgniteTxManager.java:2588)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject.access$3300(IgniteTxManager.java:2505)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject$1.run(IgniteTxManager.java:2623)
>         at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6874)
>         at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
>         at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {noformat}
> From the stack trace I see that a node failure triggered transaction recovery while the Mvcc coordinator was uninitialized (which means that there are no server nodes, or there is a coordinatorAssignClosure that returns no result, or the recovering node was not activated).
> The scenario in which the exception may be observed:
> # Start a cluster
> # Load some data from a client node (the client node is shut down after that)
> # Calculate hash
> # Add new server node
> # Change BLT
> # Wait for rebalance
> # Calculate a new hash and check that it matches the previously calculated one



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)