Posted to issues@ignite.apache.org by "Ignite TC Bot (Jira)" <ji...@apache.org> on 2020/05/01 06:58:00 UTC
[jira] [Commented] (IGNITE-12950) Partitions validator must check
sizes even if update counters are different
[ https://issues.apache.org/jira/browse/IGNITE-12950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17097201#comment-17097201 ]
Ignite TC Bot commented on IGNITE-12950:
----------------------------------------
{panel:title=Branch: [pull/7735/head] Base: [master] : Possible Blockers (2)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Platform .NET (NuGet)*{color} [[tests 0 Exit Code , Compilation Error |https://ci.ignite.apache.org/viewLog.html?buildId=5274903]]
{color:#d04437}Platform .NET (Inspections)*{color} [[tests 0 Failure on metric |https://ci.ignite.apache.org/viewLog.html?buildId=5274905]]
{panel}
[TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=5273102&buildTypeId=IgniteTests24Java8_RunAll]
> Partitions validator must check sizes even if update counters are different
> ---------------------------------------------------------------------------
>
> Key: IGNITE-12950
> URL: https://issues.apache.org/jira/browse/IGNITE-12950
> Project: Ignite
> Issue Type: Improvement
> Components: cache
> Reporter: Ivan Mironovich
> Assignee: Ivan Mironovich
> Priority: Major
> Fix For: 2.9
>
> Original Estimate: 336h
> Time Spent: 10m
> Remaining Estimate: 335h 50m
>
> We have the following method in GridDhtPartitionsStateValidator:
> {code:java}
> public void validatePartitionCountersAndSizes(
>     GridDhtPartitionsExchangeFuture fut,
>     GridDhtPartitionTopology top,
>     Map<UUID, GridDhtPartitionsSingleMessage> messages
> ) throws IgniteCheckedException {
>     final Set<UUID> ignoringNodes = new HashSet<>();
>
>     // Ignore just joined nodes.
>     for (DiscoveryEvent evt : fut.events().events()) {
>         if (evt.type() == EVT_NODE_JOINED)
>             ignoringNodes.add(evt.eventNode().id());
>     }
>
>     AffinityTopologyVersion topVer = fut.context().events().topologyVersion();
>
>     // Validate update counters.
>     Map<Integer, Map<UUID, Long>> result = validatePartitionsUpdateCounters(top, messages, ignoringNodes);
>
>     if (!result.isEmpty())
>         throw new IgniteCheckedException("Partitions update counters are inconsistent for " + fold(topVer, result));
>
>     // For sizes validation ignore also nodes which are not able to send cache sizes.
>     for (UUID id : messages.keySet()) {
>         ClusterNode node = cctx.discovery().node(id);
>
>         if (node != null && node.version().compareTo(SIZES_VALIDATION_AVAILABLE_SINCE) < 0)
>             ignoringNodes.add(id);
>     }
>
>     if (!cctx.cache().cacheGroup(top.groupId()).mvccEnabled()) { // TODO: Remove "if" clause in IGNITE-9451.
>         // Validate cache sizes.
>         result = validatePartitionsSizes(top, messages, ignoringNodes);
>
>         if (!result.isEmpty())
>             throw new IgniteCheckedException("Partitions cache sizes are inconsistent for " + fold(topVer, result));
>     }
> }
> {code}
> We should check partition sizes even if update counters are different, because the current method throws on a counter mismatch before sizes are ever validated. Having both results could be helpful for debugging problems in production.
> If a partition is in an inconsistent state, we must print information about all of its copies. Currently, for a cache group with 3 backups, we can get a message like this:
> {code:java}
> Partition states validation has failed for group: CACHEGROUP. Partitions update counters are inconsistent for Part 3415: [10.104.6.10:47500=2577263 10.104.6.12:47500=2577263 10.104.6.23:47500=2577262 10.104.6.9:47500=2577263 ] Part 4960: [10.104.6.11:47500=2560994 10.104.6.23:47500=2560993 ]
> {code}
> (Part 4960 contains information about only 2 of its copies.)
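
The two requests in the description (run the size check even when counters already differ, and report every copy of an inconsistent partition) could be sketched roughly as below. This is only an illustration under simplifying assumptions, not the Ignite implementation: the class PartitionValidationSketch and the methods findInconsistent and validate are hypothetical, nodes are identified by plain strings, and a partition is treated as inconsistent whenever any two copies report different values.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified stand-in for the validator; it does not use any Ignite classes.
final class PartitionValidationSketch {
    /**
     * Input: node -> (partition -> value), where the value is an update counter or a size.
     * Output: for each partition whose copies disagree, the values reported by ALL nodes.
     */
    static Map<Integer, Map<String, Long>> findInconsistent(Map<String, Map<Integer, Long>> perNode) {
        // Regroup the data as partition -> (node -> value).
        Map<Integer, Map<String, Long>> byPart = new TreeMap<>();

        for (Map.Entry<String, Map<Integer, Long>> nodeEntry : perNode.entrySet()) {
            for (Map.Entry<Integer, Long> partEntry : nodeEntry.getValue().entrySet()) {
                byPart.computeIfAbsent(partEntry.getKey(), p -> new TreeMap<>())
                    .put(nodeEntry.getKey(), partEntry.getValue());
            }
        }

        // Drop consistent partitions; inconsistent ones keep every copy, not just the differing ones.
        byPart.values().removeIf(copies -> new HashSet<>(copies.values()).size() <= 1);

        return byPart;
    }

    /** Runs both checks unconditionally and returns one combined message (empty if all is consistent). */
    static String validate(Map<String, Map<Integer, Long>> counters, Map<String, Map<Integer, Long>> sizes) {
        StringBuilder err = new StringBuilder();

        Map<Integer, Map<String, Long>> badCounters = findInconsistent(counters);

        if (!badCounters.isEmpty())
            err.append("Partitions update counters are inconsistent for ").append(badCounters);

        // The size check is not skipped when the counter check has already failed.
        Map<Integer, Map<String, Long>> badSizes = findInconsistent(sizes);

        if (!badSizes.isEmpty()) {
            if (err.length() > 0)
                err.append("; ");

            err.append("Partitions cache sizes are inconsistent for ").append(badSizes);
        }

        return err.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data: three copies of partition 1, one of which lags behind.
        Map<String, Map<Integer, Long>> counters =
            Map.of("nodeA", Map.of(1, 10L), "nodeB", Map.of(1, 10L), "nodeC", Map.of(1, 9L));
        Map<String, Map<Integer, Long>> sizes =
            Map.of("nodeA", Map.of(1, 100L), "nodeB", Map.of(1, 100L), "nodeC", Map.of(1, 99L));

        System.out.println(validate(counters, sizes));
    }
}
```

In the real validator the caller would presumably throw a single IgniteCheckedException when the combined message is non-empty, and the existing filters (just joined nodes, nodes too old to send sizes, MVCC-enabled groups) would still apply before each check.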
--
This message was sent by Atlassian Jira
(v8.3.4#803005)