Posted to issues@ignite.apache.org by "Semen Boikov (JIRA)" <ji...@apache.org> on 2016/04/05 15:06:25 UTC

[jira] [Resolved] (IGNITE-324) Partition exchange: node should be assigned as primary only after preloading is finished

     [ https://issues.apache.org/jira/browse/IGNITE-324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Semen Boikov resolved IGNITE-324.
---------------------------------
    Resolution: Fixed

Implemented late affinity assignment mode (enabled by default; it can be disabled using the new property IgniteConfiguration.lateAffinityAssignment).

Implementation details:
- The coordinator should maintain affinity information about all caches (even caches not started on the coordinator).
- A joining node should always request affinity from some other node (since the current assignment can differ from the one calculated by the affinity function).
- When a new server joins, all existing nodes are able to calculate affinity locally (if the affinity function assigns the joined node as primary, it is temporarily assigned as backup).
The coordinator knows about all primaries waiting for rebalancing. When the coordinator receives a partition state update, it checks whether all new primaries have rebalanced the required partitions and, if so, sends a special discovery message (CacheAffinityChangeMessage). This message initiates a partition exchange on all nodes; during this exchange the new affinity assignment is applied.
- When a server node fails, the new affinity is calculated on the coordinator after it receives GridDhtPartitionsSingleMessage from the other nodes.
If the affinity function assigns a primary node that does not own the partition, the coordinator tries to find an existing owner; if an owner node is found, it is temporarily assigned as primary.
The coordinator should then pass the calculated affinity to the other nodes. If the exchange is completed by GridDhtPartitionsFullMessage, the following scenario is possible:
the coordinator sends GridDhtPartitionsFullMessage to some nodes and fails; the nodes that received this message complete the exchange, and the new coordinator for the exchange may compute a different affinity (since its local information about current partition owners can differ).
To avoid this issue, an exchange started for a server-node-left event is completed by a discovery message that contains the new affinity assignments (the same CacheAffinityChangeMessage is used).
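
The join and failure handling described above can be sketched in a few lines of Java. This is a simplified, hypothetical model and not Ignite's implementation: LateAffinitySketch, temporaryAssignment and readyToSwitch are made-up names, node IDs are plain strings, and the affinity function's result for a partition is assumed to be an ordered list with the primary first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of late affinity assignment; none of these names exist in Ignite. */
class LateAffinitySketch {
    /**
     * Returns the assignment to use until rebalancing finishes.
     * If the ideal primary (first element) does not own the partition yet,
     * the first assigned node that does own it is moved to the front,
     * so the ideal primary temporarily acts as a backup.
     */
    static List<String> temporaryAssignment(List<String> ideal, Set<String> owners) {
        if (ideal.isEmpty() || owners.contains(ideal.get(0)))
            return ideal; // Ideal primary already holds the data.

        for (String node : ideal) {
            if (owners.contains(node)) {
                List<String> res = new ArrayList<>(ideal);
                res.remove(node); // Drop the owner from its backup position...
                res.add(0, node); // ...and make it the temporary primary.
                return res;
            }
        }

        return ideal; // No assigned node owns the partition: keep the ideal order.
    }

    /**
     * Coordinator-side check: true once the ideal primary of every partition
     * is among that partition's current owners, i.e. the point at which the
     * affinity change message could be sent in this model.
     */
    static boolean readyToSwitch(Map<Integer, List<String>> ideal, Map<Integer, Set<String>> owners) {
        for (Map.Entry<Integer, List<String>> e : ideal.entrySet()) {
            if (e.getValue().isEmpty())
                continue; // Nothing assigned for this partition.

            Set<String> partOwners = owners.get(e.getKey());

            if (partOwners == null || !partOwners.contains(e.getValue().get(0)))
                return false; // A new primary is still rebalancing.
        }

        return true;
    }

    public static void main(String[] args) {
        // Join case: the affinity function picks newNode as primary, but only nodeA owns the data.
        List<String> ideal = Arrays.asList("newNode", "nodeA");
        Set<String> owners = new HashSet<>(Collections.singleton("nodeA"));

        System.out.println(temporaryAssignment(ideal, owners)); // [nodeA, newNode]

        Map<Integer, List<String>> idealMap = new HashMap<>();
        Map<Integer, Set<String>> ownerMap = new HashMap<>();
        idealMap.put(0, ideal);
        ownerMap.put(0, owners);

        System.out.println(readyToSwitch(idealMap, ownerMap)); // false

        owners.add("newNode"); // Rebalancing of partition 0 finished on newNode.

        System.out.println(readyToSwitch(idealMap, ownerMap)); // true
    }
}
```

The same temporaryAssignment helper covers the node-failure case in this model: if the recalculated primary does not own the partition, an existing owner is placed first until the data has been rebalanced.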


> Partition exchange: node should be assigned as primary only after preloading is finished
> ----------------------------------------------------------------------------------------
>
>                 Key: IGNITE-324
>                 URL: https://issues.apache.org/jira/browse/IGNITE-324
>             Project: Ignite
>          Issue Type: Task
>          Components: cache
>    Affects Versions: sprint-2
>            Reporter: Alexey Goncharuk
>            Assignee: Semen Boikov
>            Priority: Critical
>             Fix For: 1.6
>
>
> After a node joins the topology, the affinity assignment should not be changed immediately. The new node is assigned as a backup node even for those partitions for which it is supposed to be primary. The node becomes primary only when all partitions are loaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)