Posted to commits@doris.apache.org by mo...@apache.org on 2022/04/09 11:05:01 UTC

[incubator-doris] branch master updated: [refactor][doc] Add data backup, data restore and data delete recovery (#8865)

This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 2c1c7f40b6 [refactor][doc] Add data backup, data restore and data delete recovery (#8865)
2c1c7f40b6 is described below

commit 2c1c7f40b6f9aa9fe031232a1c790382269aa83e
Author: wudi <67...@qq.com>
AuthorDate: Sat Apr 9 19:04:57 2022 +0800

    [refactor][doc] Add data backup, data restore and data delete recovery (#8865)
    
    1.Add data backup doc,
    2.add data restore doc,
    3.add data delete recovery doc
---
 new-docs/en/admin-manual/data-admin/backup.md      | 181 ++++++++++++++++++++
 .../en/admin-manual/data-admin/delete-recover.md   |  28 +++-
 new-docs/en/admin-manual/data-admin/restore.md     | 158 +++++++++++++++++-
 new-docs/zh-CN/admin-manual/data-admin/backup.md   | 184 ++++++++++++++++++++-
 .../admin-manual/data-admin/delete-recover.md      |  28 +++-
 new-docs/zh-CN/admin-manual/data-admin/restore.md  | 159 +++++++++++++++++-
 .../Backup-and-Restore/RECOVER.md}                 |  15 +-
 7 files changed, 746 insertions(+), 7 deletions(-)

diff --git a/new-docs/en/admin-manual/data-admin/backup.md b/new-docs/en/admin-manual/data-admin/backup.md
index 047e4832a9..dd00156599 100644
--- a/new-docs/en/admin-manual/data-admin/backup.md
+++ b/new-docs/en/admin-manual/data-admin/backup.md
@@ -26,3 +26,184 @@ under the License.
 
 # Data Backup
 
+Doris supports backing up the current data in the form of files to the remote storage system through the broker. Afterwards, you can restore data from the remote storage system to any Doris cluster through the restore command. Through this function, Doris can support periodic snapshot backup of data. You can also use this function to migrate data between different clusters.
+
+This feature requires Doris version 0.8.2 or later.
+
+To use this function, you need to deploy the broker corresponding to the remote storage, such as BOS or HDFS. You can view the currently deployed brokers through `SHOW BROKER;`.
+
+## A brief explanation of the principle
+
+The backup operation is to upload the data of the specified table or partition directly to the remote warehouse for storage in the form of files stored by Doris. When a user submits a Backup request, the system will perform the following operations:
+
+1. Snapshot and snapshot upload
+
+   The snapshot phase takes a snapshot of the specified table or partition data file. After that, backups are all operations on snapshots. After the snapshot, changes, imports, etc. to the table no longer affect the results of the backup. Snapshots only generate a hard link to the current data file, which takes very little time. After the snapshot is completed, the snapshot files will be uploaded one by one. Snapshot uploads are done concurrently by each Backend.
+
+2. Metadata preparation and upload
+
+   After the data file snapshots have been uploaded, the Frontend first writes the corresponding metadata to a local file, and then uploads the local metadata file to the remote warehouse through the broker. This completes the backup job.
+
+3. Dynamic Partition Table Description
+
+   If the table is a dynamic partition table, the dynamic partition attribute will be automatically disabled after backup. When restoring, you need to manually enable the dynamic partition attribute of the table. The command is as follows:
+
+   ```sql
+   ALTER TABLE tbl1 SET ("dynamic_partition.enable"="true");
+   ```
+
+## Start Backup
+
+1. Create a hdfs remote warehouse example_repo:
+
+   ```sql
+   CREATE REPOSITORY `example_repo`
+   WITH BROKER `hdfs_broker`
+   ON LOCATION "hdfs://hadoop-name-node:54310/path/to/repo/"
+   PROPERTIES
+   (
+      "username" = "user",
+      "password" = "password"
+   );
+   ```
+
+2. Full backup of table example_tbl under example_db to warehouse example_repo:
+
+   ```sql
+   BACKUP SNAPSHOT example_db.snapshot_label1
+   TO example_repo
+   ON (example_tbl)
+   PROPERTIES ("type" = "full");
+   ```
+
+3. Full backup of partitions p1 and p2 of table example_tbl, and of table example_tbl2, under example_db to warehouse example_repo:
+
+   ```sql
+   BACKUP SNAPSHOT example_db.snapshot_label2
+   TO example_repo
+   ON 
+   (
+      example_tbl PARTITION (p1,p2),
+      example_tbl2
+   );
+   ```
+
+4. View the execution of the most recent backup job:
+
+   ```sql
+   mysql> show BACKUP\G;
+   *************************** 1. row ***************************
+                  JobId: 17891847
+           SnapshotName: snapshot_label1
+                 DbName: example_db
+                  State: FINISHED
+             BackupObjs: [default_cluster:example_db.example_tbl]
+             CreateTime: 2022-04-08 15:52:29
+   SnapshotFinishedTime: 2022-04-08 15:52:32
+     UploadFinishedTime: 2022-04-08 15:52:38
+           FinishedTime: 2022-04-08 15:52:44
+        UnfinishedTasks: 
+               Progress: 
+             TaskErrMsg: 
+                 Status: [OK]
+                Timeout: 86400
+   1 row in set (0.01 sec)
+   ```
+
+5. View existing backups in remote repositories:
+
+   ```sql
+   mysql> SHOW SNAPSHOT ON example_repo WHERE SNAPSHOT = "snapshot_label1";
+   +-----------------+---------------------+--------+
+   | Snapshot        | Timestamp           | Status |
+   +-----------------+---------------------+--------+
+   | snapshot_label1 | 2022-04-08-15-52-29 | OK     |
+   +-----------------+---------------------+--------+
+   1 row in set (0.15 sec)
+   ```
+
+For the detailed usage of BACKUP, please refer to [here](../../sql-manual/sql-reference-v2/Show-Statements/BACKUP.html).
+
+## Best Practices
+
+### Backup
+
+Currently, we support full backups at a minimum granularity of one partition (incremental backup may be supported in future versions). If you need to back up data regularly, first plan the partitioning and bucketing of the table sensibly when creating it, for example by partitioning by time. Then, during operation, perform regular backups partition by partition.
+
+### Data Migration
+
+Users can back up the data to the remote warehouse first, and then restore the data to another cluster from the remote warehouse to complete the data migration. Because a backup is taken as a snapshot, data imported after the snapshot phase of the backup job is not backed up. Therefore, any data imported on the original cluster between the completion of the snapshot and the completion of the restore job needs to be imported again on the new cluster.
+
+It is recommended to import into both the new and old clusters in parallel for a period of time after the migration is complete. After verifying the correctness of data and services, migrate services to the new cluster.
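+
+The migration flow above can be sketched as follows. This is a minimal sketch that reuses the `example_repo`, `hdfs_broker`, `example_db` and `example_tbl` names from the examples in this document; the label `migrate_label1`, the target database `example_db_new` (assumed to exist already) and the timestamp value are hypothetical placeholders.
+
+```sql
+-- On the original cluster: back up the table to the shared repository.
+BACKUP SNAPSHOT example_db.migrate_label1
+TO example_repo
+ON (example_tbl);
+
+-- On the new cluster: register the same remote location as a repository.
+CREATE REPOSITORY `example_repo`
+WITH BROKER `hdfs_broker`
+ON LOCATION "hdfs://hadoop-name-node:54310/path/to/repo/"
+PROPERTIES
+(
+   "username" = "user",
+   "password" = "password"
+);
+
+-- On the new cluster: look up the snapshot timestamp, then restore.
+SHOW SNAPSHOT ON example_repo WHERE SNAPSHOT = "migrate_label1";
+
+-- The timestamp below is a placeholder; use the value reported by SHOW SNAPSHOT.
+RESTORE SNAPSHOT example_db_new.`migrate_label1`
+FROM `example_repo`
+ON (`example_tbl`)
+PROPERTIES
+(
+    "backup_timestamp" = "2022-04-08-15-52-29"
+);
+```
+
+Data imported on the original cluster after the snapshot was taken is not part of the backup and still has to be loaded into the new cluster separately.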
+
+## Highlights
+
+1. Operations related to backup and recovery are currently only allowed to be performed by users with ADMIN privileges.
+2. Within a database, only one backup or restore job is allowed to be executing at a time.
+3. Both backup and restore support operations down to the partition level. When the amount of data in the table is large, it is recommended to operate partition by partition to reduce the cost of retrying after a failure.
+4. Backup and restore operate on the actual data files. Therefore, when a table has too many shards, or a shard has too many small versions, a backup or restore may take a long time even if the total amount of data is small. Users can use `SHOW PARTITIONS FROM table_name;` and `SHOW TABLET FROM table_name;` to view the number of shards in each partition and the number of file versions in each shard, and so estimate the job execution time. The number of files [...]
+5. When checking job status via the `SHOW BACKUP` or `SHOW RESTORE` command, you may see error messages in the `TaskErrMsg` column. As long as the `State` column is not `CANCELLED`, the job is still running, and the failed tasks may succeed on retry. Some task errors, however, directly cause the job to fail.
+6. If the recovery job is an overwrite operation (specifying the recovery data to an existing table or partition), then from the `COMMIT` phase of the recovery job, the overwritten data on the current cluster may no longer be restored. If the restore job fails or is canceled at this time, the previous data may be damaged and inaccessible. In this case, the only way to do it is to perform the recovery operation again and wait for the job to complete. Therefore, we recommend that if unnece [...]
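+
+A minimal sketch of the checks described in items 4 and 5, assuming the `example_tbl` table from the earlier examples and a session whose current database is the one being backed up or restored:
+
+```sql
+-- Inspect the partitions and their shard counts to gauge how many files a job must move.
+SHOW PARTITIONS FROM example_tbl;
+-- Inspect the individual shards and their versions.
+SHOW TABLET FROM example_tbl;
+
+-- Poll the job status; as long as State is not CANCELLED the job is still running,
+-- even if TaskErrMsg shows errors.
+SHOW BACKUP;
+SHOW RESTORE;
+```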
+
+## Related Commands
+
+1. The commands related to the backup and restore function are as follows. For the following commands, you can use `help cmd;` to view detailed help after connecting to Doris through mysql-client.
+
+   1. CREATE REPOSITORY
+
+      Create a remote repository path for backup or restore. This command uses the Broker process to access the remote storage, and different brokers require different parameters. For details, please refer to the [Broker documentation](../../advanced/broker.html). Alternatively, you can back up directly to remote storage that supports the AWS S3 protocol; please refer to [Create Remote Warehouse Documentation](../../sql-manual/sql-reference-v2/Data-Definition-Stateme [...]
+
+   2. BACKUP
+
+      Perform a backup operation.
+
+   3. SHOW BACKUP
+
+      View the execution of the most recent backup job, including:
+
+      - JobId: The id of this backup job.
+      - SnapshotName: The name (Label) of this backup job specified by the user.
+      - DbName: Database corresponding to the backup job.
+      - State: The current stage of the backup job:
+        - PENDING: The initial status of the job.
+        - SNAPSHOTING: A snapshot operation is in progress.
+        - UPLOAD_SNAPSHOT: The snapshot is over, ready to upload.
+        - UPLOADING: Uploading snapshot.
+        - SAVE_META: The metadata file is being generated locally.
+        - UPLOAD_INFO: Upload metadata files and information about this backup job.
+        - FINISHED: The backup is complete.
+        - CANCELLED: Backup failed or was canceled.
+      - BackupObjs: List of tables and partitions involved in this backup.
+      - CreateTime: Job creation time.
+      - SnapshotFinishedTime: Snapshot completion time.
+      - UploadFinishedTime: Snapshot upload completion time.
+      - FinishedTime: The completion time of this job.
+      - UnfinishedTasks: During the `SNAPSHOTING`, `UPLOADING` and other stages, multiple subtasks run at the same time. This column shows the task ids of the subtasks that are still unfinished in the current stage.
+      - TaskErrMsg: If there is an error in the execution of a subtask, the error message of the corresponding subtask will be displayed here.
+      - Status: Used to record some status information that may appear during the entire job process.
+      - Timeout: The timeout period of the job, in seconds.
+
+   4. SHOW SNAPSHOT
+
+      View existing backups in the remote repository.
+
+      - Snapshot: The name (Label) of the backup specified during backup.
+      - Timestamp: Timestamp of the backup.
+      - Status: Whether the backup is normal.
+
+      More detailed backup information can be displayed if a where clause is specified after `SHOW SNAPSHOT`.
+
+      - Database: The database corresponding to the backup.
+      - Details: Shows the complete data directory structure of the backup.
+
+   5. CANCEL BACKUP
+
+      Cancel the currently executing backup job.
+
+   6. DROP REPOSITORY
+
+      Delete the created remote repository. Deleting a warehouse only deletes the mapping of the warehouse in Doris, and does not delete the actual warehouse data.
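+
+The two cleanup commands above can be sketched as follows, assuming the `example_db` database and `example_repo` repository from the earlier examples; `HELP CANCEL BACKUP;` and `HELP DROP REPOSITORY;` give the authoritative syntax:
+
+```sql
+-- Cancel the backup job currently running in example_db.
+CANCEL BACKUP FROM example_db;
+
+-- Remove the repository mapping from Doris; the files in remote storage are kept.
+DROP REPOSITORY `example_repo`;
+```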
+
+## More Help
+
+For more detailed syntax and best practices for BACKUP, please refer to the [BACKUP](../../sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/BACKUP.html) command manual. You can also type `HELP BACKUP` on the MySQL client command line for more help.
diff --git a/new-docs/en/admin-manual/data-admin/delete-recover.md b/new-docs/en/admin-manual/data-admin/delete-recover.md
index deb9aa6e62..6f4330baf5 100644
--- a/new-docs/en/admin-manual/data-admin/delete-recover.md
+++ b/new-docs/en/admin-manual/data-admin/delete-recover.md
@@ -24,4 +24,30 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Data Recover
+# Data Deletion Recovery
+
+To avoid disasters caused by misoperation, Doris supports recovery of accidentally deleted databases, tables and partitions. After a table or database is dropped, Doris does not physically delete the data immediately, but keeps it in Trash for a period of time (1 day by default, configurable through the `catalog_trash_expire_second` parameter in fe.conf). The administrator can use the RECOVER command to restore accidentally deleted data.
+
+## Start Data Recovery
+
+1. Restore the database named example_db
+
+```sql
+RECOVER DATABASE example_db;
+```
+
+2. Restore the table named example_tbl
+
+```sql
+RECOVER TABLE example_db.example_tbl;
+```
+
+3. Restore the partition named p1 in table example_tbl
+
+```sql
+RECOVER PARTITION p1 FROM example_tbl;
+```
+
+## More Help
+
+For more detailed syntax and best practices for RECOVER, please refer to the [RECOVER](../../sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/RECOVER.html) command manual. You can also type `HELP RECOVER` on the MySQL client command line for more help.
diff --git a/new-docs/en/admin-manual/data-admin/restore.md b/new-docs/en/admin-manual/data-admin/restore.md
index f223a8caa3..46199bde73 100644
--- a/new-docs/en/admin-manual/data-admin/restore.md
+++ b/new-docs/en/admin-manual/data-admin/restore.md
@@ -24,4 +24,160 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Data Restore
+# Data Recovery
+
+Doris supports backing up the current data in the form of files to the remote storage system through the broker. Afterwards, you can restore data from the remote storage system to any Doris cluster through the restore command. Through this function, Doris can support periodic snapshot backup of data. You can also use this function to migrate data between different clusters.
+
+This feature requires Doris version 0.8.2 or later.
+
+To use this function, you need to deploy the broker corresponding to the remote storage, such as BOS or HDFS. You can view the currently deployed brokers through `SHOW BROKER;`.
+
+## Brief principle description
+
+The restore operation needs to specify an existing backup in the remote warehouse, and then restore the content of the backup to the local cluster. When the user submits the Restore request, the system will perform the following operations:
+
+1. Create the corresponding metadata locally
+
+   This step first creates the tables, partitions and other structures to be restored in the local cluster. After creation, the table is visible, but not accessible.
+
+2. Local snapshot
+
+   This step is to take a snapshot of the table created in the previous step. This is actually an empty snapshot (because the table just created has no data), and its purpose is to generate the corresponding snapshot directory on the Backend for later receiving the snapshot file downloaded from the remote warehouse.
+
+3. Download snapshot
+
+   The snapshot files in the remote warehouse will be downloaded to the corresponding snapshot directory generated in the previous step. This step is done concurrently by each Backend.
+
+4. Make the snapshot take effect
+
+   After the snapshot download is complete, we need to map each snapshot to the metadata of the current local table. These snapshots are then reloaded to take effect, completing the final recovery job.
+
+## Start Restore
+
+1. Restore the table backup_tbl in backup snapshot_1 from example_repo to database example_db1, using the time version "2022-04-08-15-52-29", with 1 replica:
+
+   ```sql
+   RESTORE SNAPSHOT example_db1.`snapshot_1`
+   FROM `example_repo`
+   ON ( `backup_tbl` )
+   PROPERTIES
+   (
+       "backup_timestamp"="2022-04-08-15-52-29",
+       "replication_num" = "1"
+   );
+   ```
+
+2. Restore partitions p1 and p2 of table backup_tbl, and table backup_tbl2 (renamed to new_tbl), in backup snapshot_2 from example_repo to database example_db1, using the time version "2022-04-08-15-55-43". By default, the data is restored with 3 replicas:
+
+   ```sql
+   RESTORE SNAPSHOT example_db1.`snapshot_2`
+   FROM `example_repo`
+   ON
+   (
+       `backup_tbl` PARTITION (`p1`, `p2`),
+       `backup_tbl2` AS `new_tbl`
+   )
+   PROPERTIES
+   (
+       "backup_timestamp"="2022-04-08-15-55-43"
+   );
+   ```
+
+3. View the execution of the restore job:
+
+   ```sql
+   mysql> SHOW RESTORE\G;
+   *************************** 1. row ***************************
+                  JobId: 17891851
+                  Label: snapshot_label1
+              Timestamp: 2022-04-08-15-52-29
+                 DbName: default_cluster:example_db1
+                  State: FINISHED
+              AllowLoad: false
+         ReplicationNum: 3
+            RestoreObjs: {
+     "name": "snapshot_label1",
+     "database": "example_db",
+     "backup_time": 1649404349050,
+     "content": "ALL",
+     "olap_table_list": [
+       {
+         "name": "backup_tbl",
+         "partition_names": [
+           "p1",
+           "p2"
+         ]
+       }
+     ],
+     "view_list": [],
+     "odbc_table_list": [],
+     "odbc_resource_list": []
+   }
+             CreateTime: 2022-04-08 15:59:01
+       MetaPreparedTime: 2022-04-08 15:59:02
+   SnapshotFinishedTime: 2022-04-08 15:59:05
+   DownloadFinishedTime: 2022-04-08 15:59:12
+           FinishedTime: 2022-04-08 15:59:18
+        UnfinishedTasks: 
+               Progress: 
+             TaskErrMsg: 
+                 Status: [OK]
+                Timeout: 86400
+   1 row in set (0.01 sec)
+   ```
+
+For detailed usage of RESTORE, please refer to [here](../../sql-manual/sql-reference-v2/Show-Statements/RESTORE.html).
+
+## Related Commands
+
+The commands related to the backup and restore function are as follows. For the following commands, you can use `help cmd;` to view detailed help after connecting to Doris through mysql-client.
+
+1. CREATE REPOSITORY
+
+   Create a remote repository path for backup or restore. This command uses the Broker process to access the remote storage, and different brokers require different parameters. For details, please refer to the [Broker documentation](../../advanced/broker.html). Alternatively, you can back up directly to remote storage that supports the AWS S3 protocol; please refer to [Create Remote Warehouse Documentation](../../sql-manual/sql-reference-v2/Data-Definition-Statements [...]
+
+2. RESTORE
+
+   Perform a restore operation.
+
+3. SHOW RESTORE
+
+   View the execution of the most recent restore job, including:
+
+   - JobId: The id of the current recovery job.
+   - Label: The name (Label) of the backup in the warehouse specified by the user.
+   - Timestamp: The timestamp of the backup in the user-specified repository.
+   - DbName: Database corresponding to the restore job.
+   - State: The current stage of the recovery job:
+     - PENDING: The initial status of the job.
+     - SNAPSHOTING: The snapshot operation of the newly created table is in progress.
+     - DOWNLOAD: Sending download snapshot task.
+     - DOWNLOADING: Snapshot is downloading.
+     - COMMIT: Prepare the downloaded snapshot to take effect.
+     - COMMITTING: The downloaded snapshots are being made effective.
+     - FINISHED: Recovery is complete.
+     - CANCELLED: Recovery failed or was canceled.
+   - AllowLoad: Whether to allow import during restore.
+   - ReplicationNum: The number of replicas specified for the restore.
+   - RestoreObjs: List of tables and partitions involved in this restore.
+   - CreateTime: Job creation time.
+   - MetaPreparedTime: Local metadata generation completion time.
+   - SnapshotFinishedTime: The local snapshot completion time.
+   - DownloadFinishedTime: The time when the remote snapshot download is completed.
+   - FinishedTime: The completion time of this job.
+   - UnfinishedTasks: During the `SNAPSHOTING`, `DOWNLOADING`, `COMMITTING` and other stages, multiple subtasks run at the same time. This column shows the task ids of the subtasks that are still unfinished in the current stage.
+   - TaskErrMsg: If there is an error in the execution of a subtask, the error message of the corresponding subtask will be displayed here.
+   - Status: Used to record some status information that may appear during the entire job process.
+   - Timeout: The timeout period of the job, in seconds.
+
+4. CANCEL RESTORE
+
+   Cancel the currently executing restore job.
+
+5. DROP REPOSITORY
+
+   Delete the created remote repository. Deleting a warehouse only deletes the mapping of the warehouse in Doris, and does not delete the actual warehouse data.
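+
+As a minimal sketch of the two cleanup commands above, assuming the `example_db1` database and `example_repo` repository from the earlier examples (`HELP CANCEL RESTORE;` and `HELP DROP REPOSITORY;` give the authoritative syntax):
+
+```sql
+-- Cancel the restore job currently running in example_db1.
+CANCEL RESTORE FROM example_db1;
+
+-- Remove the repository mapping from Doris; the files in remote storage are kept.
+DROP REPOSITORY `example_repo`;
+```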
+
+## More Help
+
+For more detailed syntax and best practices for RESTORE, please refer to the [RESTORE](../../sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/RESTORE.html) command manual. You can also type `HELP RESTORE` on the MySQL client command line for more help.
diff --git a/new-docs/zh-CN/admin-manual/data-admin/backup.md b/new-docs/zh-CN/admin-manual/data-admin/backup.md
index 5efe7f9545..372f5cfd7d 100644
--- a/new-docs/zh-CN/admin-manual/data-admin/backup.md
+++ b/new-docs/zh-CN/admin-manual/data-admin/backup.md
@@ -24,4 +24,186 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# 数据备份
\ No newline at end of file
+# 数据备份
+
+Doris 支持将当前数据以文件的形式,通过 broker 备份到远端存储系统中。之后可以通过 恢复 命令,从远端存储系统中将数据恢复到任意 Doris 集群。通过这个功能,Doris 可以支持将数据定期的进行快照备份。也可以通过这个功能,在不同集群间进行数据迁移。
+
+该功能需要 Doris 版本 0.8.2+
+
+使用该功能,需要部署对应远端存储的 broker。如 BOS、HDFS 等。可以通过 `SHOW BROKER;` 查看当前部署的 broker。
+
+## 简要原理说明
+
+备份操作是将指定表或分区的数据,直接以 Doris 存储的文件的形式,上传到远端仓库中进行存储。当用户提交 Backup 请求后,系统内部会做如下操作:
+
+1. 快照及快照上传
+
+   快照阶段会对指定的表或分区数据文件进行快照。之后,备份都是对快照进行操作。在快照之后,对表进行的更改、导入等操作都不再影响备份的结果。快照只是对当前数据文件产生一个硬链,耗时很少。快照完成后,会开始对这些快照文件进行逐一上传。快照上传由各个 Backend 并发完成。
+
+2. 元数据准备及上传
+
+   数据文件快照上传完成后,Frontend 会首先将对应元数据写成本地文件,然后通过 broker 将本地元数据文件上传到远端仓库。完成最终备份作业
+
+3. 动态分区表说明
+
+   如果该表是动态分区表,备份之后会自动禁用动态分区属性,在做恢复的时候需要手动将该表的动态分区属性启用,命令如下:
+
+   ```sql
+   ALTER TABLE tbl1 SET ("dynamic_partition.enable"="true")
+   ```
+
+## 开始备份
+
+1. 创建一个hdfs的远程仓库example_repo:
+
+   ```sql
+   CREATE REPOSITORY `example_repo`
+   WITH BROKER `hdfs_broker`
+   ON LOCATION "hdfs://hadoop-name-node:54310/path/to/repo/"
+   PROPERTIES
+   (
+      "username" = "user",
+      "password" = "password"
+   );
+   ```
+
+2. 全量备份 example_db 下的表 example_tbl 到仓库 example_repo 中:
+
+   ```sql
+   BACKUP SNAPSHOT example_db.snapshot_label1
+   TO example_repo
+   ON (example_tbl)
+   PROPERTIES ("type" = "full");
+   ```
+
+3. 全量备份 example_db 下,表 example_tbl 的 p1, p2 分区,以及表 example_tbl2 到仓库 example_repo 中:
+
+   ```sql
+   BACKUP SNAPSHOT example_db.snapshot_label2
+   TO example_repo
+   ON 
+   (
+      example_tbl PARTITION (p1,p2),
+      example_tbl2
+   );
+   ```
+
+4. 查看最近 backup 作业的执行情况:
+
+   ```sql
+   mysql> show BACKUP\G;
+   *************************** 1. row ***************************
+                  JobId: 17891847
+           SnapshotName: snapshot_label1
+                 DbName: example_db
+                  State: FINISHED
+             BackupObjs: [default_cluster:example_db.example_tbl]
+             CreateTime: 2022-04-08 15:52:29
+   SnapshotFinishedTime: 2022-04-08 15:52:32
+     UploadFinishedTime: 2022-04-08 15:52:38
+           FinishedTime: 2022-04-08 15:52:44
+        UnfinishedTasks: 
+               Progress: 
+             TaskErrMsg: 
+                 Status: [OK]
+                Timeout: 86400
+   1 row in set (0.01 sec)
+   ```
+
+5. 查看远端仓库中已存在的备份
+
+   ```sql
+   mysql> SHOW SNAPSHOT ON example_repo WHERE SNAPSHOT = "snapshot_label1";
+   +-----------------+---------------------+--------+
+   | Snapshot        | Timestamp           | Status |
+   +-----------------+---------------------+--------+
+   | snapshot_label1 | 2022-04-08-15-52-29 | OK     |
+   +-----------------+---------------------+--------+
+   1 row in set (0.15 sec)
+   ```
+
+BACKUP的更多用法可参考 [这里](../../sql-manual/sql-reference-v2/Show-Statements/BACKUP.html)。
+
+## 最佳实践
+
+### 备份
+
+当前我们支持最小分区(Partition)粒度的全量备份(增量备份有可能在未来版本支持)。如果需要对数据进行定期备份,首先需要在建表时,合理的规划表的分区及分桶,比如按时间进行分区。然后在之后的运行过程中,按照分区粒度进行定期的数据备份。
+
+### 数据迁移
+
+用户可以先将数据备份到远端仓库,再通过远端仓库将数据恢复到另一个集群,完成数据迁移。因为数据备份是通过快照的形式完成的,所以,在备份作业的快照阶段之后的新的导入数据,是不会备份的。因此,在快照完成后,到恢复作业完成这期间,在原集群上导入的数据,都需要在新集群上同样导入一遍。
+
+建议在迁移完成后,对新旧两个集群并行导入一段时间。完成数据和业务正确性校验后,再将业务迁移到新的集群。
+
+## 重点说明
+
+1. 备份恢复相关的操作目前只允许拥有 ADMIN 权限的用户执行。
+2. 一个 Database 内,只允许有一个正在执行的备份或恢复作业。
+3. 备份和恢复都支持最小分区(Partition)级别的操作,当表的数据量很大时,建议按分区分别执行,以降低失败重试的代价。
+4. 因为备份恢复操作,操作的都是实际的数据文件。所以当一个表的分片过多,或者一个分片有过多的小版本时,可能即使总数据量很小,依然需要备份或恢复很长时间。用户可以通过 `SHOW PARTITIONS FROM table_name;` 和 `SHOW TABLET FROM table_name;` 来查看各个分区的分片数量,以及各个分片的文件版本数量,来预估作业执行时间。文件数量对作业执行的时间影响非常大,所以建议在建表时,合理规划分区分桶,以避免过多的分片。
+5. 当通过 `SHOW BACKUP` 或者 `SHOW RESTORE` 命令查看作业状态时。有可能会在 `TaskErrMsg` 一列中看到错误信息。但只要 `State` 列不为 `CANCELLED`,则说明作业依然在继续。这些 Task 有可能会重试成功。当然,有些 Task 错误,也会直接导致作业失败。
+6. 如果恢复作业是一次覆盖操作(指定恢复数据到已经存在的表或分区中),那么从恢复作业的 `COMMIT` 阶段开始,当前集群上被覆盖的数据有可能不能再被还原。此时如果恢复作业失败或被取消,有可能造成之前的数据已损坏且无法访问。这种情况下,只能通过再次执行恢复操作,并等待作业完成。因此,我们建议,如无必要,尽量不要使用覆盖的方式恢复数据,除非确认当前数据已不再使用。
+
+## 相关命令
+
+和备份恢复功能相关的命令如下。以下命令,都可以通过 mysql-client 连接 Doris 后,使用 `help cmd;` 的方式查看详细帮助。
+
+1. CREATE REPOSITORY
+
+   创建一个远端仓库路径,用于备份或恢复。该命令需要借助 Broker 进程访问远端存储,不同的 Broker 需要提供不同的参数,具体请参阅 [Broker文档](../../advanced/broker.html),也可以直接通过S3 协议备份到支持AWS S3协议的远程存储上去,具体参考 [创建远程仓库文档](../../sql-manual/sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/CREATE-REPOSITORY.md)
+
+2. BACKUP
+
+   执行一次备份操作。
+
+3. SHOW BACKUP
+
+   查看最近一次 backup 作业的执行情况,包括:
+
+   - JobId:本次备份作业的 id。
+   - SnapshotName:用户指定的本次备份作业的名称(Label)。
+   - DbName:备份作业对应的 Database。
+   - State:备份作业当前所在阶段:
+     - PENDING:作业初始状态。
+     - SNAPSHOTING:正在进行快照操作。
+     - UPLOAD_SNAPSHOT:快照结束,准备上传。
+     - UPLOADING:正在上传快照。
+     - SAVE_META:正在本地生成元数据文件。
+     - UPLOAD_INFO:上传元数据文件和本次备份作业的信息。
+     - FINISHED:备份完成。
+     - CANCELLED:备份失败或被取消。
+   - BackupObjs:本次备份涉及的表和分区的清单。
+   - CreateTime:作业创建时间。
+   - SnapshotFinishedTime:快照完成时间。
+   - UploadFinishedTime:快照上传完成时间。
+   - FinishedTime:本次作业完成时间。
+   - UnfinishedTasks:在 `SNAPSHOTING`,`UPLOADING` 等阶段,会有多个子任务在同时进行,这里展示的当前阶段,未完成的子任务的 task id。
+   - TaskErrMsg:如果有子任务执行出错,这里会显示对应子任务的错误信息。
+   - Status:用于记录在整个作业过程中,可能出现的一些状态信息。
+   - Timeout:作业的超时时间,单位是秒。
+
+4. SHOW SNAPSHOT
+
+   查看远端仓库中已存在的备份。
+
+   - Snapshot:备份时指定的该备份的名称(Label)。
+   - Timestamp:备份的时间戳。
+   - Status:该备份是否正常。
+
+   如果在 `SHOW SNAPSHOT` 后指定了 where 子句,则可以显示更详细的备份信息。
+
+   - Database:备份时对应的 Database。
+   - Details:展示了该备份完整的数据目录结构。
+
+5. CANCEL BACKUP
+
+   取消当前正在执行的备份作业。
+
+6. DROP REPOSITORY
+
+   删除已创建的远端仓库。删除仓库,仅仅是删除该仓库在 Doris 中的映射,不会删除实际的仓库数据。
+
+## 更多帮助
+
+ 关于 BACKUP 使用的更多详细语法及最佳实践,请参阅 [BACKUP](../../sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/BACKUP.html) 命令手册,你也可以在 MySql 客户端命令行下输入 `HELP BACKUP` 获取更多帮助信息。
\ No newline at end of file
diff --git a/new-docs/zh-CN/admin-manual/data-admin/delete-recover.md b/new-docs/zh-CN/admin-manual/data-admin/delete-recover.md
index 9f936f2654..69688fb70d 100644
--- a/new-docs/zh-CN/admin-manual/data-admin/delete-recover.md
+++ b/new-docs/zh-CN/admin-manual/data-admin/delete-recover.md
@@ -24,4 +24,30 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# 数据删除恢复
\ No newline at end of file
+# 数据删除恢复
+
+Doris为了避免误操作造成的灾难,支持对误删除的数据库/表/分区进行数据恢复,在drop table或者 drop database之后,Doris不会立刻对数据进行物理删除,而是在 Trash 中保留一段时间(默认1天,可通过fe.conf中`catalog_trash_expire_second`参数配置),管理员可以通过RECOVER命令对误删除的数据进行恢复。
+
+## 开始数据恢复
+
+1.恢复名为 example_db 的 database
+
+```sql
+RECOVER DATABASE example_db;
+```
+
+2.恢复名为 example_tbl 的 table
+
+```sql
+RECOVER TABLE example_db.example_tbl;
+```
+
+3.恢复表 example_tbl 中名为 p1 的 partition
+
+```sql
+RECOVER PARTITION p1 FROM example_tbl;
+```
+
+## 更多帮助
+
+关于 RECOVER 使用的更多详细语法及最佳实践,请参阅 [RECOVER](../../sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/RECOVER.html) 命令手册,你也可以在 MySql 客户端命令行下输入 `HELP RECOVER` 获取更多帮助信息。
\ No newline at end of file
diff --git a/new-docs/zh-CN/admin-manual/data-admin/restore.md b/new-docs/zh-CN/admin-manual/data-admin/restore.md
index 16b966d27e..bcdc6ae744 100644
--- a/new-docs/zh-CN/admin-manual/data-admin/restore.md
+++ b/new-docs/zh-CN/admin-manual/data-admin/restore.md
@@ -24,4 +24,161 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# 数据还原
\ No newline at end of file
+# 数据恢复
+
+Doris 支持将当前数据以文件的形式,通过 broker 备份到远端存储系统中。之后可以通过 恢复 命令,从远端存储系统中将数据恢复到任意 Doris 集群。通过这个功能,Doris 可以支持将数据定期的进行快照备份。也可以通过这个功能,在不同集群间进行数据迁移。
+
+该功能需要 Doris 版本 0.8.2+
+
+使用该功能,需要部署对应远端存储的 broker。如 BOS、HDFS 等。可以通过 `SHOW BROKER;` 查看当前部署的 broker。
+
+## 简要原理说明
+
+恢复操作需要指定一个远端仓库中已存在的备份,然后将这个备份的内容恢复到本地集群中。当用户提交 Restore 请求后,系统内部会做如下操作:
+
+1. 在本地创建对应的元数据
+
+   这一步首先会在本地集群中,创建恢复对应的表分区等结构。创建完成后,该表可见,但是不可访问。
+
+2. 本地snapshot
+
+   这一步是将上一步创建的表做一个快照。这其实是一个空快照(因为刚创建的表是没有数据的),其目的主要是在 Backend 上产生对应的快照目录,用于之后接收从远端仓库下载的快照文件。
+
+3. 下载快照
+
+   远端仓库中的快照文件,会被下载到对应的上一步生成的快照目录中。这一步由各个 Backend 并发完成。
+
+4. 生效快照
+
+   快照下载完成后,我们要将各个快照映射为当前本地表的元数据。然后重新加载这些快照,使之生效,完成最终的恢复作业。
+
+## 开始恢复
+
+1. 从 example_repo 中恢复备份 snapshot_1 中的表 backup_tbl 到数据库 example_db1,时间版本为 "2022-04-08-15-52-29"。恢复为 1 个副本:
+   
+    ```sql
+    RESTORE SNAPSHOT example_db1.`snapshot_1`
+    FROM `example_repo`
+    ON ( `backup_tbl` )
+    PROPERTIES
+    (
+        "backup_timestamp"="2022-04-08-15-52-29",
+        "replication_num" = "1"
+    );
+    ```
+    
+2. 从 example_repo 中恢复备份 snapshot_2 中的表 backup_tbl 的分区 p1,p2,以及表 backup_tbl2 到数据库 example_db1,并重命名为 new_tbl,时间版本为 "2022-04-08-15-55-43"。默认恢复为 3 个副本:
+   
+    ```sql
+    RESTORE SNAPSHOT example_db1.`snapshot_2`
+    FROM `example_repo`
+    ON
+    (
+        `backup_tbl` PARTITION (`p1`, `p2`),
+        `backup_tbl2` AS `new_tbl`
+    )
+    PROPERTIES
+    (
+        "backup_timestamp"="2022-04-08-15-55-43"
+    );
+    ```
+
+3. 查看 restore 作业的执行情况:
+
+   ```sql
+   mysql> SHOW RESTORE\G;
+   *************************** 1. row ***************************
+                  JobId: 17891851
+                  Label: snapshot_label1
+              Timestamp: 2022-04-08-15-52-29
+                 DbName: default_cluster:example_db1
+                  State: FINISHED
+              AllowLoad: false
+         ReplicationNum: 3
+            RestoreObjs: {
+     "name": "snapshot_label1",
+     "database": "example_db",
+     "backup_time": 1649404349050,
+     "content": "ALL",
+     "olap_table_list": [
+       {
+         "name": "backup_tbl",
+         "partition_names": [
+           "p1",
+           "p2"
+         ]
+       }
+     ],
+     "view_list": [],
+     "odbc_table_list": [],
+     "odbc_resource_list": []
+   }
+             CreateTime: 2022-04-08 15:59:01
+       MetaPreparedTime: 2022-04-08 15:59:02
+   SnapshotFinishedTime: 2022-04-08 15:59:05
+   DownloadFinishedTime: 2022-04-08 15:59:12
+           FinishedTime: 2022-04-08 15:59:18
+        UnfinishedTasks: 
+               Progress: 
+             TaskErrMsg: 
+                 Status: [OK]
+                Timeout: 86400
+   1 row in set (0.01 sec)
+   ```
+
+RESTORE的更多用法可参考 [这里](../../sql-manual/sql-reference-v2/Show-Statements/RESTORE.html)。
+
+## 相关命令
+
+和备份恢复功能相关的命令如下。以下命令,都可以通过 mysql-client 连接 Doris 后,使用 `help cmd;` 的方式查看详细帮助。
+
+1. CREATE REPOSITORY
+
+   创建一个远端仓库路径,用于备份或恢复。该命令需要借助 Broker 进程访问远端存储,不同的 Broker 需要提供不同的参数,具体请参阅 [Broker文档](../../advanced/broker.html),也可以直接通过S3 协议备份到支持AWS S3协议的远程存储上去,具体参考 [创建远程仓库文档](../../sql-manual/sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/CREATE-REPOSITORY.md)
+
+2. RESTORE
+
+   执行一次恢复操作。
+
+3. SHOW RESTORE
+
+   查看最近一次 restore 作业的执行情况,包括:
+
+   - JobId:本次恢复作业的 id。
+   - Label:用户指定的仓库中备份的名称(Label)。
+   - Timestamp:用户指定的仓库中备份的时间戳。
+   - DbName:恢复作业对应的 Database。
+   - State:恢复作业当前所在阶段:
+     - PENDING:作业初始状态。
+     - SNAPSHOTING:正在进行本地新建表的快照操作。
+     - DOWNLOAD:正在发送下载快照任务。
+     - DOWNLOADING:快照正在下载。
+     - COMMIT:准备生效已下载的快照。
+     - COMMITTING:正在生效已下载的快照。
+     - FINISHED:恢复完成。
+     - CANCELLED:恢复失败或被取消。
+   - AllowLoad:恢复期间是否允许导入。
+   - ReplicationNum:恢复指定的副本数。
+   - RestoreObjs:本次恢复涉及的表和分区的清单。
+   - CreateTime:作业创建时间。
+   - MetaPreparedTime:本地元数据生成完成时间。
+   - SnapshotFinishedTime:本地快照完成时间。
+   - DownloadFinishedTime:远端快照下载完成时间。
+   - FinishedTime:本次作业完成时间。
+   - UnfinishedTasks:在 `SNAPSHOTING`,`DOWNLOADING`, `COMMITTING` 等阶段,会有多个子任务在同时进行,这里展示的当前阶段,未完成的子任务的 task id。
+   - TaskErrMsg:如果有子任务执行出错,这里会显示对应子任务的错误信息。
+   - Status:用于记录在整个作业过程中,可能出现的一些状态信息。
+   - Timeout:作业的超时时间,单位是秒。
+
+4. CANCEL RESTORE
+
+   取消当前正在执行的恢复作业。
+
+5. DROP REPOSITORY
+
+   删除已创建的远端仓库。删除仓库,仅仅是删除该仓库在 Doris 中的映射,不会删除实际的仓库数据。
+
+## 更多帮助
+
+关于 RESTORE 使用的更多详细语法及最佳实践,请参阅 [RESTORE](../../sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/RESTORE.html) 命令手册,你也可以在 MySql 客户端命令行下输入 `HELP RESTORE` 获取更多帮助信息。
+
diff --git a/new-docs/zh-CN/admin-manual/data-admin/restore.md b/new-docs/zh-CN/sql-manual/sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/RECOVER.md
similarity index 87%
copy from new-docs/zh-CN/admin-manual/data-admin/restore.md
copy to new-docs/zh-CN/sql-manual/sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/RECOVER.md
index 16b966d27e..adca245140 100644
--- a/new-docs/zh-CN/admin-manual/data-admin/restore.md
+++ b/new-docs/zh-CN/sql-manual/sql-reference-v2/Data-Definition-Statements/Backup-and-Restore/RECOVER.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "数据恢复",
+    "title": "RECOVER",
     "language": "zh-CN"
 }
 ---
@@ -24,4 +24,15 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# 数据还原
\ No newline at end of file
+## RECOVER
+
+### Description
+
+### Example
+
+### Keywords
+
+    RECOVER
+
+### Best Practice
+

