Posted to issues@ignite.apache.org by "Semen Boikov (JIRA)" <ji...@apache.org> on 2015/08/24 10:17:45 UTC

[jira] [Commented] (IGNITE-1027) Possible data loss in replicated cache on unstable topology.

    [ https://issues.apache.org/jira/browse/IGNITE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708944#comment-14708944 ] 

Semen Boikov commented on IGNITE-1027:
--------------------------------------

From the rebalance code I can see that SYNC rebalance is broken if multiple nodes start concurrently:
- the method 'GridDhtPartitionDemandPool.assign' returns empty assignments if there are pending exchanges
- DemandWorkers receive the empty assignments, finish their loop, and complete the SyncFuture
- Ignite exits from the start method before rebalancing has actually finished

Also, DemandWorkers can stop the rebalance process and complete the SyncFuture if the topology changed during rebalancing (see usages of DemandWorker.topologyChanged()).
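To make the race concrete, here is a minimal, self-contained sketch (plain Java; all names are simplified stand-ins for the Ignite internals mentioned above, not the actual implementation) of how empty assignments let the SyncFuture complete before any data is transferred:

import java.util.Collections;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

public class RebalanceRaceSketch {
    // Models "there are pending exchanges": another node is still joining.
    static volatile boolean pendingExchanges = true;

    // Stand-in for GridDhtPartitionDemandPool.assign(): returns no
    // assignments while exchanges are pending, per the report above.
    static Set<Integer> assign() {
        return pendingExchanges ? Collections.emptySet() : Set.of(0, 1, 2);
    }

    public static void main(String[] args) {
        CompletableFuture<Void> syncFuture = new CompletableFuture<>();

        // Stand-in for a DemandWorker: with empty assignments it falls
        // through its loop and completes the sync future immediately.
        Thread demandWorker = new Thread(() -> {
            for (int part : assign())
                System.out.println("demanding partition " + part); // never runs
            syncFuture.complete(null); // SYNC rebalance reported "done"
        });
        demandWorker.start();

        // Stand-in for node start() waiting on SYNC rebalance: it returns
        // even though no partition data was ever demanded.
        syncFuture.join();
        System.out.println("start() returned, but nothing was rebalanced");
    }
}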

> Possible data loss in replicated cache on unstable topology.
> ------------------------------------------------------------
>
>                 Key: IGNITE-1027
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1027
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergi Vladykin
>            Assignee: Semen Boikov
>             Fix For: ignite-1.4
>
>
> In the test IgniteCacheClientQueryReplicatedNodeRestartSelfTest we have 4 data nodes with replicated caches and a single client-only node, which runs SQL queries against those data nodes. Background threads restart the data nodes. When we restart 2 of the 4 data nodes everything is fine, but when we restart 3 of the 4, the query eventually returns inconsistent results and the cache size returns smaller values than expected. Since we use rebalance mode SYNC, such data loss should not happen.
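For reference, a minimal sketch of the configuration the report relies on (cache name and wiring are assumed, not taken from the actual test): a REPLICATED cache with SYNC rebalance mode, where Ignition.start() is expected to block until rebalancing completes:

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheRebalanceMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class SyncRebalanceConfigSketch {
    public static void main(String[] args) {
        CacheConfiguration<Integer, Integer> ccfg = new CacheConfiguration<>("replicated");
        ccfg.setCacheMode(CacheMode.REPLICATED);
        // SYNC mode is documented to make cache start wait for rebalancing.
        ccfg.setRebalanceMode(CacheRebalanceMode.SYNC);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCacheConfiguration(ccfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            // With SYNC rebalance, the cache should already hold the full
            // replicated data set here; the reported bug is that it may not.
            System.out.println("size: " + ignite.cache("replicated").size());
        }
    }
}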


