Posted to issues@ignite.apache.org by "Dmitriy Govorukhin (JIRA)" <ji...@apache.org> on 2018/05/10 12:19:00 UTC

[jira] [Updated] (IGNITE-8459) Searching checkpoint history for WAL rebalance is broken

     [ https://issues.apache.org/jira/browse/IGNITE-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Govorukhin updated IGNITE-8459:
---------------------------------------
    Fix Version/s: 2.6

> Searching checkpoint history for WAL rebalance is broken
> --------------------------------------------------------
>
>                 Key: IGNITE-8459
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8459
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 2.5
>            Reporter: Pavel Kovalenko
>            Assignee: Pavel Kovalenko
>            Priority: Critical
>             Fix For: 2.6
>
>
> The mechanism that searches the WAL for available checkpoint records, which provide the history for WAL rebalance, is currently broken. As a result, WAL (historical) rebalance never finds the history it needs, and full rebalance is always used instead.
> The mechanism was broken by https://github.com/apache/ignite/commit/ec04cd174ed5476fba83e8682214390736321b37 for reasons that are not yet clear.
> If we swap the following two code blocks (the cctx.database().beforeExchange() call and the exchCtx if block):
> {noformat}
>         /* It is necessary to run database callback before all topology callbacks.
>            In case of persistent store is enabled we first restore partitions presented on disk.
>            We need to guarantee that there are no partition state changes logged to WAL before this callback
>            to make sure that we correctly restored last actual states. */
>         cctx.database().beforeExchange(this);
>         if (!exchCtx.mergeExchanges()) {
>             for (CacheGroupContext grp : cctx.cache().cacheGroups()) {
>                 if (grp.isLocal() || cacheGroupStopping(grp.groupId()))
>                     continue;
>                 // It is possible affinity is not initialized yet if node joins to cluster.
>                 if (grp.affinity().lastVersion().topologyVersion() > 0)
>                     grp.topology().beforeExchange(this, !centralizedAff && !forceAffReassignment, false);
>             }
>         }
> {noformat}
> the search mechanism starts to work correctly. It is currently unclear why this happens.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)