Posted to dev@ignite.apache.org by "Pavel Kovalenko (JIRA)" <ji...@apache.org> on 2018/05/08 18:05:00 UTC

[jira] [Created] (IGNITE-8459) Searching checkpoint history for WAL rebalance is broken

Pavel Kovalenko created IGNITE-8459:
---------------------------------------

             Summary: Searching checkpoint history for WAL rebalance is broken
                 Key: IGNITE-8459
                 URL: https://issues.apache.org/jira/browse/IGNITE-8459
             Project: Ignite
          Issue Type: Bug
          Components: cache
    Affects Versions: 2.5
            Reporter: Pavel Kovalenko
            Assignee: Pavel Kovalenko


Currently, the mechanism that searches the WAL for available checkpoint records, which provide the history needed for WAL rebalance, is broken. As a result, WAL (historical) rebalance never finds the history it needs, and full rebalance is always used.

This mechanism was broken by commit https://github.com/apache/ignite/commit/ec04cd174ed5476fba83e8682214390736321b37, for reasons that are not yet clear.

If we swap the following two code blocks (the cctx.database().beforeExchange() call and the if (!exchCtx.mergeExchanges()) block):

{noformat}
        /* It is necessary to run database callback before all topology callbacks.
           In case of persistent store is enabled we first restore partitions presented on disk.
           We need to guarantee that there are no partition state changes logged to WAL before this callback
           to make sure that we correctly restored last actual states. */
        cctx.database().beforeExchange(this);

        if (!exchCtx.mergeExchanges()) {
            for (CacheGroupContext grp : cctx.cache().cacheGroups()) {
                if (grp.isLocal() || cacheGroupStopping(grp.groupId()))
                    continue;

                // It is possible affinity is not initialized yet if node joins to cluster.
                if (grp.affinity().lastVersion().topologyVersion() > 0)
                    grp.topology().beforeExchange(this, !centralizedAff && !forceAffReassignment, false);
            }
        }
{noformat}

the search mechanism starts working correctly. It is currently unclear why this happens.
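
To make the proposed swap concrete, here is a minimal toy sketch of the two orderings. It does not use Ignite classes and does not reproduce the bug. The class name, method names, and string labels are all hypothetical stand-ins for the database callback and the per-group topology callbacks quoted above; the sketch only records the order in which they would run.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of callback ordering only. None of these names are Ignite classes. */
public class ExchangeOrderSketch {
    /** Order quoted in the report: database callback first, then per-group topology callbacks. */
    public static List<String> quotedOrder(List<String> groups, boolean mergeExchanges) {
        List<String> calls = new ArrayList<>();

        // Stand-in for cctx.database().beforeExchange(this).
        calls.add("database.beforeExchange");

        // Stand-in for the if (!exchCtx.mergeExchanges()) block.
        if (!mergeExchanges) {
            for (String grp : groups)
                calls.add("topology.beforeExchange:" + grp);
        }

        return calls;
    }

    /** Swapped order described in the report: topology callbacks first, database callback last. */
    public static List<String> swappedOrder(List<String> groups, boolean mergeExchanges) {
        List<String> calls = new ArrayList<>();

        if (!mergeExchanges) {
            for (String grp : groups)
                calls.add("topology.beforeExchange:" + grp);
        }

        calls.add("database.beforeExchange");

        return calls;
    }

    public static void main(String[] args) {
        List<String> groups = new ArrayList<>();
        groups.add("grpA");
        groups.add("grpB");

        System.out.println("quoted:  " + quotedOrder(groups, false));
        System.out.println("swapped: " + swappedOrder(groups, false));
    }
}
```

Note that the swapped order runs the topology callbacks before the database callback, which contradicts the requirement stated in the code comment above, so the swap is a diagnostic observation rather than a proposed fix.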



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)