
Posted to issues@ignite.apache.org by "Anton Vinogradov (Jira)" <ji...@apache.org> on 2022/05/05 16:40:00 UTC

[jira] (IGNITE-15316) Read Repair may see inconsistent entry at tx cache when it is consistent but updated right before the check

    [ https://issues.apache.org/jira/browse/IGNITE-15316 ]


    Anton Vinogradov deleted comment on IGNITE-15316:
    -------------------------------------------

was (Author: av):
It would be a good idea to consider (as part of this issue) replacing
{noformat}
for (KeyCacheObject key : keys) {
    List<ClusterNode> nodes = ctx.affinity().nodesByKey(key, topVer); // affinity

    primaryNodes.put(key, nodes.get(0));
...
{noformat}
to
{noformat}
for (KeyCacheObject key : keys) {
    List<ClusterNode> nodes = ctx.topology().nodes(key.partition(), topVer); // topology

    primaryNodes.put(key, nodes.get(0));
...
{noformat}
at {{org.apache.ignite.internal.processors.cache.distributed.near.consistency.GridNearReadRepairAbstractFuture#map}}.
This may help reduce the number of remaps on an unstable topology, but it needs to be researched carefully first.

It looks like using affinity mapping instead of topology mapping may leave some copies unchecked on an unstable topology.

> Read Repair may see inconsistent entry at tx cache when it is consistent but updated right before the check
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-15316
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15316
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Anton Vinogradov
>            Assignee: Anton Vinogradov
>            Priority: Major
>              Labels: iep-31
>
> Even in FULL_SYNC mode, stale reads from backups are possible after the lock is obtained by the "Read Repair" tx.
> This is possible because the entry (updated by the previous tx) becomes unlocked (committed) on the primary before the tx is committed on the backups.
> This is not a problem for Ignite (since backups keep locks until updated) but produces false-positive "inconsistency state found" events and repairs.
> Relocating the unlock does not seem to be a proper fix, since it would cause a performance drop.
> So, we should recheck values several times if an inconsistency is found, even when the lock is already obtained by "Read Repair".
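
The recheck idea from the description could be sketched roughly as follows. This is only an illustration under assumptions, not the actual Ignite code: RecheckSketch, consistent and confirmedInconsistent are hypothetical names, and readCopies stands in for whatever reads the value from the primary and the backups.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical helper, not part of Ignite: confirms an inconsistency only after several re-reads. */
final class RecheckSketch {
    /** Returns true if every copy equals the first one (an empty list counts as consistent). */
    static <V> boolean consistent(List<V> copies) {
        for (V v : copies) {
            if (!Objects.equals(v, copies.get(0)))
                return false;
        }

        return true;
    }

    /**
     * Re-reads all copies up to maxAttempts times.
     * Returns true only if the copies differed on every attempt,
     * so a backup that is merely lagging behind the primary is not reported.
     */
    static <V> boolean confirmedInconsistent(Supplier<List<V>> readCopies, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            if (consistent(readCopies.get()))
                return false; // Transient mismatch: the backups have caught up.
        }

        return true; // Still different after all attempts.
    }
}
```

With such a helper, an "inconsistency state found" event would be raised only when confirmedInconsistent(...) returns true. The number of attempts and any delay between them are tuning choices this sketch leaves open.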



--
This message was sent by Atlassian Jira
(v8.20.7#820007)