Posted to issues@ignite.apache.org by "Semen Boikov (JIRA)" <ji...@apache.org> on 2015/12/23 11:26:47 UTC

[jira] [Comment Edited] (IGNITE-647) org.apache.ignite.IgniteCacheAffinitySelfTest.testAffinity() hangs

    [ https://issues.apache.org/jira/browse/IGNITE-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069394#comment-15069394 ] 

Semen Boikov edited comment on IGNITE-647 at 12/23/15 10:26 AM:
----------------------------------------------------------------

There is one more race here:
- node 1 starts, exchange is finished
- node 2 starts, exchange is finished
- node 2 starts a cache with the fair affinity function, starts the exchange and sends GridDhtAffinityAssignmentRequest to node 1. At this point node 1 has started the exchange process, but has not started the cache yet and has not registered the required message handler, so the message will be ignored.

So any dynamic cache start with fair affinity that is initiated from a non-oldest node can easily hang.
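
The race above can be illustrated with a toy model. This is only a sketch: none of the types below are Ignite classes, and the names (Node, registerHandler, "fairCache") are hypothetical. It shows just the ordering problem, where a request that arrives before the handler is registered is silently dropped and the sender never gets a response.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Toy model of the race; none of these types are Ignite classes. */
public class DroppedRequestSketch {
    /** Hypothetical stand-in for a node's message dispatcher. */
    static final class Node {
        private final Map<String, Consumer<String>> handlers = new HashMap<>();
        int handled;
        int ignored;

        /** Registers the per-cache handler (in the model: happens when the cache starts). */
        void registerHandler(String cacheName, Consumer<String> handler) {
            handlers.put(cacheName, handler);
        }

        /** Delivers a message; if no handler is registered yet, it is silently dropped. */
        void receive(String cacheName, String msg) {
            Consumer<String> h = handlers.get(cacheName);
            if (h == null) {
                ignored++;
                return;
            }
            h.accept(msg);
            handled++;
        }
    }

    /** Returns true if the request was handled, i.e. the sender would get a response. */
    static boolean requestHandled(boolean handlerRegisteredFirst) {
        Node node1 = new Node();
        if (handlerRegisteredFirst)
            node1.registerHandler("fairCache", m -> { });
        // "node 2" sends its request to node 1.
        node1.receive("fairCache", "affinity assignment request");
        // Registering the handler after the request arrived does not recover the message.
        if (!handlerRegisteredFirst)
            node1.registerHandler("fairCache", m -> { });
        return node1.handled == 1;
    }

    public static void main(String[] args) {
        System.out.println("handler first: " + requestHandled(true));
        System.out.println("request first: " + requestHandled(false));
    }
}
```

In the second ordering the requester would wait forever for a reply, which matches the hang described above.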

This behaviour is caused by wrong logic in GridDhtPartitionsExchangeFuture#canCalculateAffinity: when a cache is being started, all nodes can calculate affinity themselves, so there is no need to send GridDhtAffinityAssignmentRequest.
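
As a rough sketch of the intended rule (not the actual body of canCalculateAffinity; the method name and both parameters below are hypothetical), the decision could look like this:

```java
/** Sketch of the decision described above; not Ignite code. */
public class AffinityDecisionSketch {
    /**
     * Decides whether a node has to ask a remote node for the affinity assignment.
     *
     * @param cacheStartsInThisExchange true if the cache is started by the current exchange.
     * @param otherwiseComputableLocally hypothetical placeholder for any other conditions
     *        under which the node can compute the assignment itself.
     */
    static boolean mustRequestAffinity(boolean cacheStartsInThisExchange,
                                       boolean otherwiseComputableLocally) {
        // Per the comment above: on cache start every node can calculate affinity,
        // so no request should be sent.
        if (cacheStartsInThisExchange)
            return false;

        return !otherwiseComputableLocally;
    }

    public static void main(String[] args) {
        System.out.println(mustRequestAffinity(true, false));
        System.out.println(mustRequestAffinity(false, false));
    }
}
```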


was (Author: sboikov):
There is one more race here:
- node 1 starts, exchange is finished
- node 2 starts, exchange is finished
- node 2 starts a cache with the fair affinity function, starts the exchange and sends GridDhtAffinityAssignmentRequest to node 1. At this point node 1 has started the exchange process, but has not started the cache yet and has not registered the required message handler, so the message will be ignored.

So any dynamic cache start with fair affinity that is initiated from a non-oldest node can easily hang.

> org.apache.ignite.IgniteCacheAffinitySelfTest.testAffinity() hangs
> ------------------------------------------------------------------
>
>                 Key: IGNITE-647
>                 URL: https://issues.apache.org/jira/browse/IGNITE-647
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Yakov Zhdanov
>            Assignee: Semen Boikov
>            Priority: Blocker
>              Labels: Muted_test
>         Attachments: FairAffinityDynamicCacheSelfTest.testStartStopCache.txt, threaddump.txt
>
>
> 1-2 out of ~10 local runs hung for me



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)