Posted to dev@ignite.apache.org by "Ivan (Jira)" <ji...@apache.org> on 2020/04/26 08:22:00 UTC

[jira] [Created] (IGNITE-12950) Partitions validator must check sizes even if update counters are different

Ivan created IGNITE-12950:
-----------------------------

             Summary: Partitions validator must check sizes even if update counters are different
                 Key: IGNITE-12950
                 URL: https://issues.apache.org/jira/browse/IGNITE-12950
             Project: Ignite
          Issue Type: Improvement
          Components: cache
            Reporter: Ivan
             Fix For: 2.9


GridDhtPartitionsStateValidator has the following method:
{code:java}
    public void validatePartitionCountersAndSizes(
        GridDhtPartitionsExchangeFuture fut,
        GridDhtPartitionTopology top,
        Map<UUID, GridDhtPartitionsSingleMessage> messages
    ) throws IgniteCheckedException {
        final Set<UUID> ignoringNodes = new HashSet<>();

        // Ignore just joined nodes.
        for (DiscoveryEvent evt : fut.events().events()) {
            if (evt.type() == EVT_NODE_JOINED)
                ignoringNodes.add(evt.eventNode().id());
        }

        AffinityTopologyVersion topVer = fut.context().events().topologyVersion();

        // Validate update counters.
        Map<Integer, Map<UUID, Long>> result = validatePartitionsUpdateCounters(top, messages, ignoringNodes);

        if (!result.isEmpty())
            throw new IgniteCheckedException("Partitions update counters are inconsistent for " + fold(topVer, result));

        // For sizes validation ignore also nodes which are not able to send cache sizes.
        for (UUID id : messages.keySet()) {
            ClusterNode node = cctx.discovery().node(id);
            if (node != null && node.version().compareTo(SIZES_VALIDATION_AVAILABLE_SINCE) < 0)
                ignoringNodes.add(id);
        }

        if (!cctx.cache().cacheGroup(top.groupId()).mvccEnabled()) { // TODO: Remove "if" clause in IGNITE-9451.
            // Validate cache sizes.
            result = validatePartitionsSizes(top, messages, ignoringNodes);

            if (!result.isEmpty())
                throw new IgniteCheckedException("Partitions cache sizes are inconsistent for " + fold(topVer, result));
        }
    }
{code}
We should check partition sizes even if the update counters are different. This could help when debugging problems in production.
If a partition is in an inconsistent state, we must print information about all of its copies. Currently, for a cache group with 3 backups, we can get a message like this:
{code:java}
Partition states validation has failed for group: CACHEGROUP_PARTICLE_union-module_com.sbt.processing.data.partition.dpl.PartitionKey. Partitions update counters are inconsistent for Part 3415: [10.104.6.10:47500=2577263 10.104.6.12:47500=2577263 10.104.6.23:47500=2577262 10.104.6.9:47500=2577263 ] Part 4960: [10.104.6.11:47500=2560994 10.104.6.23:47500=2560993 ]
{code}
(The entry for part 4960 contains information about only 2 copies.)
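
To illustrate the requested behaviour, here is a minimal standalone sketch. It is not Ignite code: the class PartitionValidationSketch, its methods, and the plain maps of node name to per-partition values are all hypothetical stand-ins for the real validator and GridDhtPartitionsSingleMessage. It assumes that each node reports one value per partition. The sketch runs the size check regardless of the counter check result, and for every inconsistent partition it keeps the values of all copies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical, simplified model; none of these names are Ignite APIs. */
final class PartitionValidationSketch {
    /**
     * Returns every partition whose copies disagree, keeping the value of
     * every node that reported the partition, not only the mismatching ones.
     *
     * @param valuesByNode node name -> (partition id -> update counter or size)
     */
    static Map<Integer, Map<String, Long>> findInconsistent(Map<String, Map<Integer, Long>> valuesByNode) {
        // Group values by partition first so that all copies are retained.
        Map<Integer, Map<String, Long>> byPart = new TreeMap<>();

        for (Map.Entry<String, Map<Integer, Long>> nodeEntry : valuesByNode.entrySet()) {
            for (Map.Entry<Integer, Long> partEntry : nodeEntry.getValue().entrySet())
                byPart.computeIfAbsent(partEntry.getKey(), p -> new TreeMap<>())
                    .put(nodeEntry.getKey(), partEntry.getValue());
        }

        // Drop partitions where all copies report the same value.
        byPart.values().removeIf(copies -> copies.values().stream().distinct().count() <= 1);

        return byPart;
    }

    /** Runs both checks and returns one combined report; an empty string means consistent. */
    static String validate(Map<String, Map<Integer, Long>> counters, Map<String, Map<Integer, Long>> sizes) {
        List<String> problems = new ArrayList<>();

        Map<Integer, Map<String, Long>> badCntrs = findInconsistent(counters);

        if (!badCntrs.isEmpty())
            problems.add("Partitions update counters are inconsistent for " + badCntrs);

        // Sizes are checked regardless of the update counters result.
        Map<Integer, Map<String, Long>> badSizes = findInconsistent(sizes);

        if (!badSizes.isEmpty())
            problems.add("Partitions cache sizes are inconsistent for " + badSizes);

        return String.join(" ", problems);
    }
}
```

In the real validator the combined report would presumably become the text of a single IgniteCheckedException thrown after both checks have run. The sketch returns a string only to stay self-contained.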



--
This message was sent by Atlassian Jira
(v8.3.4#803005)