Posted to issues@ignite.apache.org by "Pavel Pereslegin (Jira)" <ji...@apache.org> on 2021/10/05 12:55:00 UTC
[jira] [Updated] (IGNITE-14794) Add JMX command and metrics for automatic snapshot restore operation.
[ https://issues.apache.org/jira/browse/IGNITE-14794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Pereslegin updated IGNITE-14794:
--------------------------------------
Description:
Add JMX command to restore a cache group from the snapshot.
Suggested methods:
{code:java}
@MXBeanDescription("Restore cluster-wide snapshot.")
public void restoreSnapshot(
    @MXBeanParameter(name = "snpName", description = "Snapshot name.") String name,
    @MXBeanParameter(name = "cacheGroupNames", description = "Optional comma-separated list of cache group names.") String cacheGroupNames);

@MXBeanDescription("Cancel previously started snapshot restore operation.")
public void cancelSnapshotRestore(@MXBeanParameter(name = "snpName", description = "Snapshot name.") String name);
{code}
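For illustration only, here is a minimal sketch of how a client could call operations with these two signatures through the standard JMX API. Everything except the method names and parameter order is an assumption: the stub bean, the class names, and the ObjectName "org.example:type=SnapshotRestoreStub" are hypothetical stand-ins, not the bean or name that Ignite registers.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class SnapshotRestoreJmxSketch {
    /** Hypothetical management interface mirroring the two suggested methods. */
    public interface SnapshotRestoreStubMBean {
        void restoreSnapshot(String name, String cacheGroupNames);

        void cancelSnapshotRestore(String name);
    }

    /** Stub that only records the last call; a real node would start or cancel the restore. */
    public static class SnapshotRestoreStub implements SnapshotRestoreStubMBean {
        private volatile String lastCall = "";

        @Override public void restoreSnapshot(String name, String cacheGroupNames) {
            lastCall = "restore:" + name + ":" + cacheGroupNames;
        }

        @Override public void cancelSnapshotRestore(String name) {
            lastCall = "cancel:" + name;
        }

        public String lastCall() {
            return lastCall;
        }
    }

    /** Registers the stub, invokes both operations by name, and returns what the stub recorded. */
    public static String demo() {
        MBeanServer srv = ManagementFactory.getPlatformMBeanServer();
        SnapshotRestoreStub stub = new SnapshotRestoreStub();

        try {
            // Hypothetical object name, chosen for this sketch only.
            ObjectName beanName = new ObjectName("org.example:type=SnapshotRestoreStub");

            srv.registerMBean(stub, beanName);

            try {
                String str = String.class.getName();

                // JMX operations are addressed by name plus a signature of parameter type names.
                srv.invoke(beanName, "restoreSnapshot",
                    new Object[] {"snapshot_25052021", "default"}, new String[] {str, str});

                String afterRestore = stub.lastCall();

                srv.invoke(beanName, "cancelSnapshotRestore",
                    new Object[] {"snapshot_25052021"}, new String[] {str});

                return afterRestore + " | " + stub.lastCall();
            }
            finally {
                srv.unregisterMBean(beanName);
            }
        }
        catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The cache group list is passed as a single comma-separated string, as the parameter description suggests, so the operation stays callable from generic JMX consoles that only accept simple types.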
Since the automatic snapshot restore operation can take a long time, we must be able to track its progress using metrics.
Suggested metrics:
{noformat}
start time
partitions (processed/total)
bytes (processed/total)
end time
{noformat}
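As a rough sketch of how the processed/total pairs above could be turned into a progress line, consider the helper below. The class and method names are hypothetical, and two choices are assumptions rather than anything stated in this issue: the percentage is derived from bytes, and MB means 1024 * 1024 bytes.

```java
import java.util.Locale;

public class RestoreProgressSketch {
    /** Integer percentage of processed out of total; 0 when the total is not yet known. */
    public static int percent(long processed, long total) {
        return total <= 0 ? 0 : (int)(processed * 100 / total);
    }

    /** Formats a progress line from partition and byte counters. */
    public static String format(int partsDone, int partsTotal, long bytesDone, long bytesTotal) {
        double mb = 1024.0 * 1024.0; // Assumption: binary megabytes.

        return String.format(Locale.ROOT, "%d%% completed (%d/%d partitions, %.1f/%.1f MB)",
            percent(bytesDone, bytesTotal), partsDone, partsTotal, bytesDone / mb, bytesTotal / mb);
    }

    public static void main(String[] args) {
        // Prints: 50% completed (33/66 partitions, 1.0/2.0 MB)
        System.out.println(format(33, 66, 1024L * 1024, 2L * 1024 * 1024));
    }
}
```

Keeping the raw counters as metrics and formatting them only at display time leaves monitoring tools free to compute their own rates from the start time and the processed values.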
Suggested status command output:
[in progress]
{noformat}
Restore operation for snapshot "snapshot_25052021" is still in progress (requestId=0e2d8c06-d44a-4ade-91bf-2b84b367499a). Progress: 100% completed (66/66 partitions, 3.8/3.8 MB)
Started: 2021-10-05 15:47:47.942
Cache groups: default
Node 11faec83-a304-48f7-aac7-e67bf8800001: 100% completed (33/33 partitions, 1.9/1.9 MB)
Node 99066100-890f-41a3-b0cd-4a3d59600000: 100% completed (33/33 partitions, 1.9/1.9 MB)
Command [SNAPSHOT] finished with code: 0
{noformat}
[finished]
{noformat}
Restore operation for snapshot "snapshot_25052021" completed successfully (requestId=6adeea86-1ee2-4664-8d7d-3383a484a00a). Progress: 100% completed (66/66 partitions, 3.8/3.8 MB)
Started: 2021-10-05 15:53:03.352
Finished: 2021-10-05 15:53:03.443
Cache groups: default
Node cc69e33f-de95-42b4-99af-86cf83900001: 100% completed (33/33 partitions, 1.9/1.9 MB)
Node b4f3bb36-aef3-4813-a3e9-9f7773600000: 100% completed (33/33 partitions, 1.9/1.9 MB)
{noformat}
[missing snapshot name]
{noformat}
No information about restoring snapshot "snapshot_MISSING" is available.
{noformat}
[error]
{noformat}
Restore operation for snapshot "snapshot_25052021" failed (requestId=b9b312f5-ba34-40e9-bb94-35daacd552c0). Error: Operation has been canceled by the user.
Started: 2021-10-05 15:51:52.255
Finished: 2021-10-05 15:51:52.782
Cache groups: default
Node e3c8d45b-2ccd-43ba-81ab-ea3bb9e00001: 100% completed (33/33 partitions, 1.9/1.9 MB)
Node 884cd446-38c2-4538-9dcd-81509eb00000: 100% completed (33/33 partitions, 1.9/1.9 MB)
{noformat}
was:
Add JMX command to restore a cache group from the snapshot.
Suggested methods:
{code:java}
@MXBeanDescription("Restore cluster-wide snapshot.")
public void restoreSnapshot(
    @MXBeanParameter(name = "snpName", description = "Snapshot name.") String name,
    @MXBeanParameter(name = "cacheGroupNames", description = "Optional comma-separated list of cache group names.") String cacheGroupNames);

@MXBeanDescription("Cancel previously started snapshot restore operation.")
public void cancelSnapshotRestore(@MXBeanParameter(name = "snpName", description = "Snapshot name.") String name);
{code}
Since the automatic snapshot restore operation can take a long time, we must be able to track its progress using metrics.
Suggested metrics:
{noformat}
start time
total partitions
copied partitions
end time
{noformat}
> Add JMX command and metrics for automatic snapshot restore operation.
> ----------------------------------------------------------------------
>
> Key: IGNITE-14794
> URL: https://issues.apache.org/jira/browse/IGNITE-14794
> Project: Ignite
> Issue Type: Improvement
> Reporter: Pavel Pereslegin
> Assignee: Pavel Pereslegin
> Priority: Major
> Labels: iep-43
> Fix For: 2.12
>
> Time Spent: 10m
> Remaining Estimate: 0h
--
This message was sent by Atlassian Jira
(v8.3.4#803005)