Posted to issues@ignite.apache.org by "Amelchev Nikita (JIRA)" <ji...@apache.org> on 2019/06/24 11:43:00 UTC

[jira] [Commented] (IGNITE-9913) Prevent data updates blocking in case of backup BLT server node leave

    [ https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871096#comment-16871096 ] 

Amelchev Nikita commented on IGNITE-9913:
-----------------------------------------

Hi, [~ivan.glukos].
I found two possible blockers to performing such a lightweight PME without blocking updates:

1. Finalizing partition update counters. It seems that we can't correctly collect gaps and process them until all transactions have completed. See the {{GridDhtPartitionTopologyImpl#finalizeUpdateCounters}} method.

2. Applying update counters. We can't correctly set the {{HWM}} counter if the primary left the cluster after sending updates to only some of the backups. Such updates can be processed later and break the guarantee that {{LWM<=HWM}}.
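
To make both concerns concrete, here is a minimal sketch under a simplified counter model. The class {{UpdateCounterSketch}} and all of its methods are hypothetical names introduced for illustration; this is not Ignite's actual update counter implementation, and the assumption that the exchange sets {{HWM}} independently of later updates is mine.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical, simplified model of a partition update counter (not Ignite code). */
class UpdateCounterSketch {
    /** LWM: every update with a counter below this value has been applied. */
    private long lwm;

    /** HWM: upper bound assumed to be fixed by the exchange (assumption of this sketch). */
    private long hwm;

    /** Ranges applied out of order above LWM: range start -> range length. */
    private final TreeMap<Long, Long> outOfOrder = new TreeMap<>();

    /** Applies the update range [start, start + delta) and advances LWM over contiguous ranges. */
    void apply(long start, long delta) {
        outOfOrder.merge(start, delta, Long::max);

        Map.Entry<Long, Long> first;

        while ((first = outOfOrder.firstEntry()) != null && first.getKey() <= lwm) {
            lwm = Math.max(lwm, first.getKey() + first.getValue());
            outOfOrder.pollFirstEntry();
        }
    }

    /** Sets HWM, as a lightweight exchange might do from the state it can see locally. */
    void setHwm(long hwm) {
        this.hwm = hwm;
    }

    long lwm() {
        return lwm;
    }

    /** A gap exists while some range above LWM is applied but an earlier one is still missing. */
    boolean hasGaps() {
        return !outOfOrder.isEmpty();
    }

    boolean invariantHolds() {
        return lwm <= hwm;
    }

    public static void main(String[] args) {
        // Concern 1: a gap stays open while the transaction owning [2, 4) is still in flight.
        UpdateCounterSketch cntr = new UpdateCounterSketch();
        cntr.apply(0, 2);
        cntr.apply(4, 2);
        System.out.println("lwm=" + cntr.lwm() + ", gaps=" + cntr.hasGaps());

        // Concern 2: HWM is fixed at 3, then a late update from the departed primary arrives.
        UpdateCounterSketch backup = new UpdateCounterSketch();
        backup.apply(0, 3);
        backup.setHwm(backup.lwm());
        backup.apply(3, 2);
        System.out.println("LWM<=HWM holds: " + backup.invariantHolds()); // prints false
    }
}
```

In this model the gap at {{[2, 4)}} cannot be classified as "lost" or "still coming" while its transaction is unfinished, and the late range {{[3, 5)}} pushes {{LWM}} to 5 after {{HWM}} was fixed at 3.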

Could you take a look?

> Prevent data updates blocking in case of backup BLT server node leave
> ---------------------------------------------------------------------
>
>                 Key: IGNITE-9913
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9913
>             Project: Ignite
>          Issue Type: Improvement
>          Components: general
>            Reporter: Ivan Rakov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.8
>
>         Attachments: 9913_yardstick.png, master_yardstick.png
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> An Ignite cluster performs a distributed partition map exchange (PME) when any server node leaves or joins the topology.
> Distributed PME blocks all updates and may take a long time. If all partitions are assigned according to the baseline topology and a server node leaves, there's no actual need to perform distributed PME: every cluster node is able to recalculate the new affinity assignments and partition states locally. If we implement such a lightweight PME and handle mapping and lock requests on the new topology version correctly, updates won't be stopped (except updates to partitions that lost their primary copy).