Posted to dev@ignite.apache.org by 18624049226 <18...@163.com> on 2023/05/25 13:06:52 UTC

The ConsistentId verification logic for the recovery process of incremental and full snapshots is different

Hi team,

According to my testing, restoring a full snapshot does not verify that
the node's consistentId matches the one the snapshot was created with.
For example:

1. Start a node with consistentId A.
2. Insert some data.
3. Run ./control.sh --snapshot create full
4. Kill A.
5. Start a node with consistentId B.
6. Run ./control.sh --snapshot restore full

This works.

However, if an incremental snapshot is created as well, the restore
reports an error:

1. Start a node with consistentId A.
2. Insert some data.
3. Run ./control.sh --snapshot create full --incremental
4. Kill A.
5. Start a node with consistentId B.
6. Run ./control.sh --snapshot restore full --increment 1

The log then shows:

[2023-05-25T19:34:43,592][ERROR][rest-#68%B%][SnapshotRestoreProcess] Failed to restore snapshot cache groups [reqId=0eb27a3c-854d-43b7-81f2-2729bd0233f4].
org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotVerifyException: Failed to find snapshot metafile [metas=[], snpName=full, snpPath=null]
    at org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManager.lambda$checkSnapshot$299baa4c$1(IgniteSnapshotManager.java:1944) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:464) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:355) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManager.checkSnapshot(IgniteSnapshotManager.java:1846) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotRestoreProcess.start(SnapshotRestoreProcess.java:336) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManager.restoreSnapshot(IgniteSnapshotManager.java:2349) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.visor.snapshot.VisorSnapshotRestoreTask$VisorSnapshotStartRestoreJob.run(VisorSnapshotRestoreTask.java:70) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.visor.snapshot.VisorSnapshotRestoreTask$VisorSnapshotStartRestoreJob.run(VisorSnapshotRestoreTask.java:56) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.visor.VisorJob.execute(VisorJob.java:73) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.job.GridJobWorker$1.call(GridJobWorker.java:628) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7431) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:622) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:547) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1366) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1440) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:669) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:533) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:753) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:507) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler.handleAsyncUnsafe(GridTaskCommandHandler.java:220) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler.handleAsync(GridTaskCommandHandler.java:161) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.rest.GridRestProcessor.handleRequest0(GridRestProcessor.java:331) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.rest.GridRestProcessor.handleRequest(GridRestProcessor.java:309) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.rest.GridRestProcessor.access$000(GridRestProcessor.java:108) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.rest.GridRestProcessor$2.body(GridRestProcessor.java:192) [ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) [ignite-core-2.15.0.jar:2.15.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
    at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: org.apache.ignite.IgniteException: Failed to find snapshot metafile [metas=[], snpName=full, snpPath=null]
    at org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotMetadataVerificationTask$MetadataVerificationJob.execute(SnapshotMetadataVerificationTask.java:103) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotMetadataVerificationTask$MetadataVerificationJob.execute(SnapshotMetadataVerificationTask.java:70) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.job.GridJobWorker$1.call(GridJobWorker.java:628) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7431) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:622) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:547) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1366) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1440) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:669) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:533) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:753) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:415) ~[ignite-core-2.15.0.jar:2.15.0]
    at org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManager.checkSnapshot(IgniteSnapshotManager.java:1842) ~[ignite-core-2.15.0.jar:2.15.0]

from the code:

I believe it is unreasonable for the verification and processing logic
of the recovery process to differ between full and incremental
snapshots. Is this a bug?


Re: The ConsistentId verification logic for the recovery process of incremental and full snapshots is different

Posted by Maksim Timonin <ti...@apache.org>.
Hi,

I reproduced your behavior. Yes, I confirm that this is a bug - there is a
wrong check. It verifies that increments match the full snapshot. But
instead of getting consistentId from the full snapshot metafile, it gets it
from the local node.
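
To make the description above concrete, here is a simplified illustration
of the two variants of the check. It is not the actual Ignite code: the
class name ConsistentIdCheckSketch, the method incrementsMatch and the
idea of representing each increment by a single consistentId string are
all assumptions made only for this sketch.

```java
import java.util.List;

public class ConsistentIdCheckSketch {
    /**
     * Hypothetical helper: returns true when every increment was written
     * by a node whose consistentId equals the expected one.
     */
    static boolean incrementsMatch(String expectedConsistentId, List<String> incrementConsistentIds) {
        return incrementConsistentIds.stream().allMatch(expectedConsistentId::equals);
    }

    public static void main(String[] args) {
        // Values from the scenario in this thread: snapshot taken on A, restored on B.
        String fullSnapshotConsistentId = "A";   // as recorded in the full snapshot metafile
        List<String> incrementConsistentIds = List.of("A");
        String localNodeConsistentId = "B";      // the node performing the restore

        // Variant described as the bug: the expected ID comes from the local node.
        System.out.println(incrementsMatch(localNodeConsistentId, incrementConsistentIds));    // prints "false"

        // Variant described as intended: the expected ID comes from the full snapshot metafile.
        System.out.println(incrementsMatch(fullSnapshotConsistentId, incrementConsistentIds)); // prints "true"
    }
}
```

The comparison itself is the same in both calls; only the source of the
expected consistentId differs, which is why the restore fails only when
the restoring node was started with a different consistentId.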

The workaround is to configure the node used for the restore with the
same consistentId that the original node had when the snapshot was created.
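
As a minimal sketch of that workaround, assuming the node is started from
Java code and that the original node used the consistentId "A" as in the
scenario above, the restoring node could be started like this. The class
name RestoreNodeStartup is hypothetical, ignite-core must be on the
classpath, and all other settings (persistence, snapshot path, discovery)
are assumed to be the same as on the original node and are omitted here.

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class RestoreNodeStartup {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Reuse the consistentId of the node that created the snapshot
        // ("A" in this thread) instead of a new one such as "B".
        cfg.setConsistentId("A");

        // Remaining configuration is assumed to match the original node.
        Ignition.start(cfg);
    }
}
```

With the node started this way, the restore with --increment 1 from the
steps above should pass the check, since the local consistentId and the
one the snapshot was created with are then the same.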

I created a ticket for fixing that:
https://issues.apache.org/jira/browse/IGNITE-19567.

Thank you for your bug report!