Posted to issues@ignite.apache.org by "Alexey Kukushkin (Jira)" <ji...@apache.org> on 2022/09/08 13:09:00 UTC

[jira] [Created] (IGNITE-17657) Partition data loss after rolling restart

Alexey Kukushkin created IGNITE-17657:
-----------------------------------------

             Summary: Partition data loss after rolling restart
                 Key: IGNITE-17657
                 URL: https://issues.apache.org/jira/browse/IGNITE-17657
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.13
            Reporter: Alexey Kukushkin


*Setup*

An active Apache Ignite 2.13 cluster of 3 or more nodes, hosting a cache with 1 backup, persistence enabled, and the default partition loss policy.

*Actions*
 # An application continuously writes data to the cache.
 # The nodes are restarted sequentially, one at a time, while data is being written to the cache. The next node is restarted only after data rebalancing is complete. Rebalancing is monitored using the {{KeysToRebalanceLeft}} metric (see the [documentation|https://ignite.apache.org/docs/latest/monitoring-metrics/metrics#monitoring-rebalancing] for more details).
 # After all the nodes have been restarted, the application reads some of the data.
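
The restart procedure in the actions above can be sketched roughly as follows. This is only an illustration of the sequencing, not the tooling actually used: {{RestartableNode}}, {{RollingRestart}} and their methods are hypothetical names rather than Ignite API, and the sketch assumes that a {{KeysToRebalanceLeft}} reading of zero on every node means rebalancing is complete.

```java
import java.util.List;

/** Hypothetical handle for one server node; NOT part of the Ignite API. */
interface RestartableNode {
    /** Stops and starts the node, for example through an orchestration tool. */
    void restart();

    /** Value of the KeysToRebalanceLeft metric currently reported for the cache on this node. */
    long keysToRebalanceLeft();
}

/** Hypothetical helper that restarts nodes one at a time. */
final class RollingRestart {
    private RollingRestart() {
    }

    /**
     * Restarts each node in turn. After every restart it polls all nodes until
     * each one reports KeysToRebalanceLeft == 0 (assumed here to mean that
     * rebalancing is complete) or until the per-node timeout expires.
     */
    static void run(List<? extends RestartableNode> nodes, long pollMillis, long timeoutMillis) {
        for (RestartableNode node : nodes) {
            node.restart();

            long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;

            while (!rebalanceFinished(nodes)) {
                if (System.nanoTime() - deadline > 0)
                    throw new IllegalStateException("Rebalancing did not finish within the timeout");

                try {
                    Thread.sleep(pollMillis);
                }
                catch (InterruptedException e) {
                    // Preserve the interrupt flag and abort the rolling restart.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for rebalancing", e);
                }
            }
        }
    }

    /** Returns true only if no node reports keys left to rebalance. */
    static boolean rebalanceFinished(List<? extends RestartableNode> nodes) {
        for (RestartableNode n : nodes) {
            if (n.keysToRebalanceLeft() > 0)
                return false;
        }

        return true;
    }
}
```

A call such as {{RollingRestart.run(nodes, 1000, 600000)}} would poll once per second and give up on a node after ten minutes; both values are arbitrary placeholders.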

*Expected*

No data is lost since there is 1 backup and the nodes are restarted sequentially after rebalancing is complete.

*Actual*

Sometimes (in more than 50% of our runs) the attempt to read the data fails with a "partition data has been lost" exception.

*Notes*

An attempt to create a JUnit reproducer (all nodes within the same JVM) for the above scenario has not succeeded so far.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)