Posted to issues@ignite.apache.org by "Ignite TC Bot (Jira)" <ji...@apache.org> on 2021/01/12 02:47:00 UTC

[jira] [Commented] (IGNITE-12950) Partitions validator must check sizes even if update counters are different

    [ https://issues.apache.org/jira/browse/IGNITE-12950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263035#comment-17263035 ] 

Ignite TC Bot commented on IGNITE-12950:
----------------------------------------

{panel:title=Branch: [pull/8645/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/8645/head] Base: [master] : New Tests (4)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#00008b}Cache 1{color} [[tests 4|https://ci.ignite.apache.org/viewLog.html?buildId=5824047]]
* {color:#013220}IgniteBinaryCacheTestSuite: GridCachePartitionsUpdateCountersAndSizeTest.testValidationfPartitionSizeInconsistent - PASSED{color}
* {color:#013220}IgniteBinaryCacheTestSuite: GridCachePartitionsUpdateCountersAndSizeTest.testValidationfPartitionCountersInconsistent - PASSED{color}
* {color:#013220}IgniteBinaryCacheTestSuite: GridCachePartitionsUpdateCountersAndSizeTest.testValidationBothPatririonSixeAndCountersAreConsistent - PASSED{color}
* {color:#013220}IgniteBinaryCacheTestSuite: GridCachePartitionsUpdateCountersAndSizeTest.testValidationBothPartitionSizesAndCountersAreInconsistent - PASSED{color}

{panel}
[TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=5824091&buildTypeId=IgniteTests24Java8_RunAll]

> Partitions validator must check sizes even if update counters are different
> ---------------------------------------------------------------------------
>
>                 Key: IGNITE-12950
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12950
>             Project: Ignite
>          Issue Type: Improvement
>          Components: cache
>            Reporter: Ivan Mironovich
>            Assignee: Ivan Mironovich
>            Priority: Major
>             Fix For: 2.10
>
>   Original Estimate: 336h
>          Time Spent: 0.5h
>  Remaining Estimate: 335.5h
>
> We have the following method in GridDhtPartitionsStateValidator:
> {code:java}
>     public void validatePartitionCountersAndSizes(
>         GridDhtPartitionsExchangeFuture fut,
>         GridDhtPartitionTopology top,
>         Map<UUID, GridDhtPartitionsSingleMessage> messages
>     ) throws IgniteCheckedException {
>         final Set<UUID> ignoringNodes = new HashSet<>();
>         // Ignore just joined nodes.
>         for (DiscoveryEvent evt : fut.events().events()) {
>             if (evt.type() == EVT_NODE_JOINED)
>                 ignoringNodes.add(evt.eventNode().id());
>         }
>         AffinityTopologyVersion topVer = fut.context().events().topologyVersion();
>         // Validate update counters.
>         Map<Integer, Map<UUID, Long>> result = validatePartitionsUpdateCounters(top, messages, ignoringNodes);
>         if (!result.isEmpty())
>             throw new IgniteCheckedException("Partitions update counters are inconsistent for " + fold(topVer, result));
>         // For sizes validation ignore also nodes which are not able to send cache sizes.
>         for (UUID id : messages.keySet()) {
>             ClusterNode node = cctx.discovery().node(id);
>             if (node != null && node.version().compareTo(SIZES_VALIDATION_AVAILABLE_SINCE) < 0)
>                 ignoringNodes.add(id);
>         }
>         if (!cctx.cache().cacheGroup(top.groupId()).mvccEnabled()) { // TODO: Remove "if" clause in IGNITE-9451.
>             // Validate cache sizes.
>             result = validatePartitionsSizes(top, messages, ignoringNodes);
>             if (!result.isEmpty())
>                 throw new IgniteCheckedException("Partitions cache sizes are inconsistent for " + fold(topVer, result));
>         }
>     }
> {code}
>  We should check partition sizes even if update counters are different. This could help when debugging problems in production.
>  We must also print information about all copies of a partition that is in an inconsistent state. Currently, on a cache group with 3 backups, we can get a message like this:
> {code:java}
> // Partition states validation has failed for group: CACHEGROUP. Partitions update counters are inconsistent for Part 3415: [10.104.6.10:47500=2577263 10.104.6.12:47500=2577263 10.104.6.23:47500=2577262 10.104.6.9:47500=2577263 ] Part 4960: [10.104.6.11:47500=2560994 10.104.6.23:47500=2560993 ]
> {code}
> (Part 4960 contains information about only 2 copies.)
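
The behaviour requested in the description can be sketched as follows. This is a minimal, self-contained illustration and not the Ignite implementation or the change made in the pull request: the class PartitionValidationSketch, its methods, and the use of plain strings as node IDs are all hypothetical. It runs both checks before reporting anything, and for each inconsistent partition it lists the values from every copy that was passed in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified stand-in for the validator; not the Ignite implementation.
final class PartitionValidationSketch {
    // Input: partition id -> (node id -> reported value).
    // Output: only the partitions whose copies disagree, each with the values of ALL its copies.
    static Map<Integer, Map<String, Long>> findInconsistent(Map<Integer, Map<String, Long>> reported) {
        Map<Integer, Map<String, Long>> res = new TreeMap<>();

        for (Map.Entry<Integer, Map<String, Long>> e : reported.entrySet()) {
            long distinct = e.getValue().values().stream().distinct().count();

            if (distinct > 1)
                res.put(e.getKey(), new TreeMap<>(e.getValue()));
        }

        return res;
    }

    // Formats the result in the same shape as the log message quoted above.
    static String fold(Map<Integer, Map<String, Long>> inconsistent) {
        StringBuilder sb = new StringBuilder();

        for (Map.Entry<Integer, Map<String, Long>> part : inconsistent.entrySet()) {
            sb.append("Part ").append(part.getKey()).append(": [");

            for (Map.Entry<String, Long> copy : part.getValue().entrySet())
                sb.append(copy.getKey()).append('=').append(copy.getValue()).append(' ');

            sb.append("] ");
        }

        return sb.toString().trim();
    }

    // Runs BOTH checks and collects every problem instead of stopping at the first one.
    static List<String> validate(
        Map<Integer, Map<String, Long>> updateCounters,
        Map<Integer, Map<String, Long>> sizes
    ) {
        List<String> errors = new ArrayList<>();

        Map<Integer, Map<String, Long>> badCounters = findInconsistent(updateCounters);

        if (!badCounters.isEmpty())
            errors.add("Partitions update counters are inconsistent for " + fold(badCounters));

        // Sizes are checked even when the counters check above has already failed.
        Map<Integer, Map<String, Long>> badSizes = findInconsistent(sizes);

        if (!badSizes.isEmpty())
            errors.add("Partitions cache sizes are inconsistent for " + fold(badSizes));

        return errors;
    }
}
```

A caller could join the returned messages into a single exception text, so that one failed validation reports both the counter and the size mismatches.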



--
This message was sent by Atlassian Jira
(v8.3.4#803005)