You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pegasus.apache.org by GitBox <gi...@apache.org> on 2022/08/12 07:56:36 UTC

[GitHub] [incubator-pegasus] hycdong opened a new pull request, #1112: feat(backup): 4. meta server send backup_request to replica

hycdong opened a new pull request, #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112

   https://github.com/apache/incubator-pegasus/issues/1081


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947418226


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   why need two error_code? using one-error_code + hint is not ok?



##########
src/rdsn/src/common/backup_restore_common.cpp:
##########
@@ -36,5 +44,41 @@ const std::string backup_restore_constant::BACKUP_ID("restore.backup_id");
 const std::string backup_restore_constant::SKIP_BAD_PARTITION("restore.skip_bad_partition");
 const std::string backup_restore_constant::RESTORE_PATH("restore.restore_path");
 
+std::string get_backup_root(const std::string &backup_root,
+                            const std::string &user_defined_root_path)
+{
+    if (user_defined_root_path.empty()) {
+        return backup_root;
+    }
+    return utils::filesystem::path_combine(user_defined_root_path, backup_root);

Review Comment:
   if user specify root path, we should return the value, but not combine `system` + `user` path



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947645088


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   In addition, hint_ message can also explain the error code, so there is no need to distinguish `common` and `backup`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947815671


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   With your design, the sender needs to judge the response as follows:
   ```c++
   if (rpc == ok) {
        if ( resp.err == ok) {
            if (resp.checkpoint_err == ok) {
                /** if you define `more clear err code as you say`, maybe need:**/
                 /** if (err_a == ...) {
                           if (err_b == ...)
   }
   }
   }
   }
   ```
   This logic is also redundant. don't suggest this design, you can refer the origin rpc define, I think the origin is elegant



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948619765


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request

Review Comment:
   Rolling update is a common case which many user have to face. We will not ensure every feature should work well at this case, but at least avoid the server crash.
   If we rewrite the RPC message, the related RPC code would better to add a new one, and left the old one as deprecated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948693789


##########
src/rdsn/src/meta/test/meta_backup_engine_test.cpp:
##########
@@ -0,0 +1,107 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+#include <dsn/dist/fmt_logging.h>
+#include <dsn/utility/fail_point.h>
+
+#include "meta_test_base.h"
+#include "meta/meta_backup_engine.h"
+
+namespace dsn {
+namespace replication {
+
+class meta_backup_engine_test : public meta_test_base
+{
+public:
+    void SetUp()

Review Comment:
   Done, and I resolve the following similar conversation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947617039


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   So I think that just `err` is still OK. In other rpc, I remember that we have always defined it using one `err`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947511356


##########
src/rdsn/src/common/backup_restore_common.cpp:
##########
@@ -36,5 +44,41 @@ const std::string backup_restore_constant::BACKUP_ID("restore.backup_id");
 const std::string backup_restore_constant::SKIP_BAD_PARTITION("restore.skip_bad_partition");
 const std::string backup_restore_constant::RESTORE_PATH("restore.restore_path");
 
+std::string get_backup_root(const std::string &backup_root,
+                            const std::string &user_defined_root_path)
+{
+    if (user_defined_root_path.empty()) {
+        return backup_root;
+    }
+    return utils::filesystem::path_combine(user_defined_root_path, backup_root);

Review Comment:
   If user specify root_path, we will return `root_path/backup_root`, otherwise return `backup_root`. This is compatible for old version.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947635660


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   I got your point. You think checkpoint_upload_err is unecessary, all errors can be shown by err. I think it is not clear if we combine two different error into one. `checkpoint_upload_err` is only used to show if checkpoint or upload meet error during backup. `err` is used for common error for backup request. In some cases, such as backup is succeed or canceled, `checkpoint_upload_err` is meanless, so this field is optional.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947656523


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   I don't agree with that one error code should identify all error cases. `checkpoint_upload_err` is only valid in backup checkpoint and upload stage, other stage it should be set. `err=ok` and `checkpoint_upload_err=ok` means backup not meet error during checkpoint and upload. It is clear to have two errors.
   Besides, this rpc is sent from meta server and replica server periodically during backup, meta server will do different actions according to the err and checkpoint_upload_err, not only show the hint_message.
   I still inisist on my previous design, seperate another error code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948693789


##########
src/rdsn/src/meta/test/meta_backup_engine_test.cpp:
##########
@@ -0,0 +1,107 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+#include <dsn/dist/fmt_logging.h>
+#include <dsn/utility/fail_point.h>
+
+#include "meta_test_base.h"
+#include "meta/meta_backup_engine.h"
+
+namespace dsn {
+namespace replication {
+
+class meta_backup_engine_test : public meta_test_base
+{
+public:
+    void SetUp()

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947514254


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   Good idea, meta server can distinguish error with its backup_status.
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947642881


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   If your error definition is appropriate enough, for example, I don't think a single err `ERR_ CHEKCPOINT_ FAILED` has different from  two err `err = ERR_ OK but err_checnpoint=ERR_ CHEKCPOINT_ FAILED`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948628738


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   If I didn't miss anything, I didn't find where `checkpoint_upload_err` is used currently. We keep this point in mind and see how will it be used later, if found it's not so elegant, we can point it out then.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#issuecomment-1220286283

   > LGTM But still doubt if there is any compatablity problem.
   
   Okay, I will add compatiable explaination after all code merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948690495


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request

Review Comment:
   Rolling update is a common case, but it is not recommended to update meta firstly, we recommendly update replica server firstly, so it is a seldom case that meta is new version but replica is old version.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947539858


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   Ok, what the different between  `1:dsn.error_code    err` and `5:optional dsn.error_code   checkpoint_upload_err;`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947815671


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   With your design, the sender needs to judge the response as follows:
   ```c++
   if (rpc == ok) {
        if ( resp.err == ok) {
            if (resp.checkpoint_err == ok) {
                /** if you define `more clear err code as you say`, maybe need:**/
                 /** if (err_a == ...) {
                           if (err_b == ...)
   }
   }
   }
   }
   ```
   This logic is also redundant. don't suggest this design, you can refer the origin rpc defination, I think the origin is elegant



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947656523


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   I don't agree with that one error code should identify all error cases. `checkpoint_upload_err` is only valid in backup checkpoint and upload stage, other stage it should not be set. `err=ok` and `checkpoint_upload_err=ok` means backup not meet error during checkpoint and upload. It is clear to have two errors.
   Besides, this rpc is sent from meta server and replica server periodically during backup, meta server will do different actions according to the err and checkpoint_upload_err, not only show the hint_message.
   I still inisist on my previous design, seperate another error code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947656523


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   I don't agree with that one error code should identify all error cases. `checkpoint_upload_err` is only valid in backup checkpoint and upload stage, other stage it should not be set. 
   Besides, this rpc is sent from meta server and replica server periodically during backup, meta server will do different actions according to the err and checkpoint_upload_err, not only show the hint_message.
   I still inisist on my previous design, seperate another error code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r949776222


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request

Review Comment:
   No matter which one is new version. The new replica server will still crash if receive a old backup request from old versio meta server?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947808862


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   I still think `checkpoint_upload_err` is redundant. if `ERR_CHECKPOINT_FAILED` can clearly indicates that an error occurred in the checkpoint stage , why define `check_point_err` and set it value is `ERR_CHECKPOINT_FAILED` to explain?
   
   In the past rpc, we haven't this design. Include bulkload, duplication, for example, we don't need add `ingest_err` to show the error of ingesting, or add `download_err` to show the err of downloading and so on, which is redundant, but just use `ERR_INGEST_FAILD`, `ERR_DOWNLOAD` is clear enough. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947619897


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   No, error is not always OK. For example, current partition is not primary, backup should be failed, err is ERR_INVALID_STATE.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947632929


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   > No, error is not always OK
   
   I mean that just need one variable `err` is enough



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947436885


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request

Review Comment:
   Will you use another RPC code? Is there any problem if a user use old version shell-tool  attempt to control the cluster?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948691663


##########
src/rdsn/src/common/backup_restore_common.h:
##########
@@ -52,5 +57,69 @@ class backup_restore_constant
     static const std::string RESTORE_PATH;
 };
 
+/// The directory structure on block service
+///
+/// (<root> = <user_define_root>/<cluster>)
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/<pidx>/chkpt_<ip>_<port>/***.sst
+///                                        /<pidx>/chkpt_<ip>_<port>/CURRENT
+///                                        /<pidx>/chkpt_<ip>_<port>/IDENTITY
+///                                        /<pidx>/chkpt_<ip>_<port>/MANIFEST
+///                                        /<pidx>/chkpt_<ip>_<port>/OPTIONS
+///                                        /<pidx>/chkpt_<ip>_<port>/LOG
+///                                        /<pidx>/chkpt_<ip>_<port>/backup_metadata
+///                                        /<pidx>/current_checkpoint
+///                                        /<pidx>/data_version
+///
+///  ......other partitions......
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/meta/app_metadata
+///  <root>/<app_name>_<app_id>/<backup_id>/backup_info
+///
+
+///
+/// The usage of files:
+///      1, app_metadata : the metadata of the app, the same with the app's app_info
+///      2, backup_metadata : the file to statistic the information of a checkpoint, include all the
+///         file's name, size and md5
+///      3, current_checkpoint : specifing which checkpoint directory is valid
+///      4, data_version: partition data_version
+///      5, backup_info : recording the information of this backup
+
+// TODO(heyuchen): add other common functions
+// get_compatible_backup_root is only used for restore compatible backup
+
+// The backup root path on block service
+// if user_defined_root_path is not empty
+// - return <user_defined_root_path>/<backup_root>
+// else
+// - return <backup_root>
+std::string get_backup_root(const std::string &backup_root,

Review Comment:
   All functions in `backup_restore_common.h` will be used in meta and replica files, backup and restore logic, so I move them into `backup_restore_common.h`. Besides, it is meanful to have a single function. I am very sure about it, this file is added in my last round refactor. I will resolve the following similar conversation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948610419


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request

Review Comment:
   OK, thanks.
   I meant how to keep compatablity, if the meta servers are in new version, and the replica server are in old version, what will happen if we ask the cluster to do backup? Is it neccessary to add new rpc code for the new implemention?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948580634


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   Actually, bulk load still seperete rpc_error, download_error and ingestion_err, you can reference sturcture `partition_bulk_load_state`.
   
   And in my current design, won't have the too many if condition in your example. In my design, it just like
   ```
   if (rpc != ok) { // handle this condition }
   if (resp.err != ok) { // handle this condition }
   if (resp.checkpoint_upload_err != ok) { // handle this condition }
   ```
   If I combine error and checkpoint_upload_err into one, the code will be like:
   ```
   if (rpc != ok) { // handle this condition }
   if (resp.err != ok) { 
       if (original resp.err != ok){
           // handle this condition
       }
       if (original resp.checkpoint_upload_err != ok ) {
           // handle this condition
       }
   }
   ```
   
   I think it is okay to have two errors to distinguish different errors, if there is only one field redundant, but can make structure clear. In my previous design, checkpoint error and upload error also should be seperated, I have already compromised my logic :-)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948691663


##########
src/rdsn/src/common/backup_restore_common.h:
##########
@@ -52,5 +57,69 @@ class backup_restore_constant
     static const std::string RESTORE_PATH;
 };
 
+/// The directory structure on block service
+///
+/// (<root> = <user_define_root>/<cluster>)
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/<pidx>/chkpt_<ip>_<port>/***.sst
+///                                        /<pidx>/chkpt_<ip>_<port>/CURRENT
+///                                        /<pidx>/chkpt_<ip>_<port>/IDENTITY
+///                                        /<pidx>/chkpt_<ip>_<port>/MANIFEST
+///                                        /<pidx>/chkpt_<ip>_<port>/OPTIONS
+///                                        /<pidx>/chkpt_<ip>_<port>/LOG
+///                                        /<pidx>/chkpt_<ip>_<port>/backup_metadata
+///                                        /<pidx>/current_checkpoint
+///                                        /<pidx>/data_version
+///
+///  ......other partitions......
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/meta/app_metadata
+///  <root>/<app_name>_<app_id>/<backup_id>/backup_info
+///
+
+///
+/// The usage of files:
+///      1, app_metadata : the metadata of the app, the same with the app's app_info
+///      2, backup_metadata : the file to statistic the information of a checkpoint, include all the
+///         file's name, size and md5
+///      3, current_checkpoint : specifing which checkpoint directory is valid
+///      4, data_version: partition data_version
+///      5, backup_info : recording the information of this backup
+
+// TODO(heyuchen): add other common functions
+// get_compatible_backup_root is only used for restore compatible backup
+
+// The backup root path on block service
+// if user_defined_root_path is not empty
+// - return <user_defined_root_path>/<backup_root>
+// else
+// - return <backup_root>
+std::string get_backup_root(const std::string &backup_root,

Review Comment:
   All functions in `backup_restore_common.h` will be used in meta and replica files, backup and restore logic, so I move them into `backup_restore_common.h`. Besides, it is meanful to have a single function. This file is added in my last round refactor, if the function is only used in one .h or .cpp file, it won't be added into `backup_restore_common.h`. I will resolve the following similar conversation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947808862


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   I still think `checkpoint_upload_err` is redundant. if `ERR_CHECKPOINT_FAILED` can clearly indicates that an error occurred in the checkpoint stage , why define `check_point_err` and set it value is `ERR_CHECKPOINT_FAILED` to explain?
   
   In the past rpc, we haven't this design. Include bulkload, duplication, for example, we don't need add `ingest_err` to show the error of ingesting, or add `download_err` to show the err of downloading and so on, which is redundant, bu just use `ERR_INGEST_FAILD`, `ERR_DOWNLOAD` is clear enough. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947660451


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   > meta server will do different actions
   
   One err_code can't meet this requirement?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947514254


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   Good idea, meta server can distinguish error with its backup_status.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947635660


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   I got your point. You think checkpoint_upload_err is unecessary, all errors can be shown by err. I think it is not clear if we combine two different error into one. `checkpoint_upload_err` is only used to show if checkpoint or upload meet error during backup. `err` is used for common error for backup request.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948617240


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   @acelyc111 Please give us your suggestion about this comment~ @foreverneverer has different opion with me, and we can not persude each other.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948628738


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   If I didn't miss anything, I didn't find where `checkpoint_upload_err` is used currently. We keep this point in mind and see how will it be used later, if find it's not so elegant, we can point it out then.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948691663


##########
src/rdsn/src/common/backup_restore_common.h:
##########
@@ -52,5 +57,69 @@ class backup_restore_constant
     static const std::string RESTORE_PATH;
 };
 
+/// The directory structure on block service
+///
+/// (<root> = <user_define_root>/<cluster>)
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/<pidx>/chkpt_<ip>_<port>/***.sst
+///                                        /<pidx>/chkpt_<ip>_<port>/CURRENT
+///                                        /<pidx>/chkpt_<ip>_<port>/IDENTITY
+///                                        /<pidx>/chkpt_<ip>_<port>/MANIFEST
+///                                        /<pidx>/chkpt_<ip>_<port>/OPTIONS
+///                                        /<pidx>/chkpt_<ip>_<port>/LOG
+///                                        /<pidx>/chkpt_<ip>_<port>/backup_metadata
+///                                        /<pidx>/current_checkpoint
+///                                        /<pidx>/data_version
+///
+///  ......other partitions......
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/meta/app_metadata
+///  <root>/<app_name>_<app_id>/<backup_id>/backup_info
+///
+
+///
+/// The usage of files:
+///      1, app_metadata : the metadata of the app, the same with the app's app_info
+///      2, backup_metadata : the file to statistic the information of a checkpoint, include all the
+///         file's name, size and md5
+///      3, current_checkpoint : specifing which checkpoint directory is valid
+///      4, data_version: partition data_version
+///      5, backup_info : recording the information of this backup
+
+// TODO(heyuchen): add other common functions
+// get_compatible_backup_root is only used for restore compatible backup
+
+// The backup root path on block service
+// if user_defined_root_path is not empty
+// - return <user_defined_root_path>/<backup_root>
+// else
+// - return <backup_root>
+std::string get_backup_root(const std::string &backup_root,

Review Comment:
   All functions in `backup_restore_common.h` will be used in meta and replica files, backup and restore logic, so I move them into `backup_restore_common.h`. This file is added in my last round refactor.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948692632


##########
src/rdsn/src/meta/meta_backup_engine.h:
##########
@@ -40,6 +40,19 @@ struct app_backup_info
     DEFINE_JSON_SERIALIZATION(backup_id, start_time_ms, end_time_ms, app_id, app_name)
 };
 
+///
+/// backup path on remote storage
+///
+/// Onetime backup:
+/// <cluster_root>/backup/<app_id>/once/<backup_id>/<backup_item>
+///
+/// Periodic backup:
+/// <cluster_root>/backup/<app_id>/periodic/<periodic_backup_policy>
+/// <cluster_root>/backup/<app_id>/periodic/<backup_id>/<backup_item>
+///
+static const std::string ONETIME_PATH = "once";

Review Comment:
   Yes, it will be used in other cpp files.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947583573


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   `err` is used for backup_request rpc, such as ERR_OK, ERR_INVALID_STATE. `checkpoint_upload_err` is used for backup checkpoint  and upload. If backup checkpoint failed, checkpoint_upload_err won't be error_ok.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947513564


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request

Review Comment:
   This rpc is not used from shell-tool to meta server, is meta server to replica server, won't trigger the control problem. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947511356


##########
src/rdsn/src/common/backup_restore_common.cpp:
##########
@@ -36,5 +44,41 @@ const std::string backup_restore_constant::BACKUP_ID("restore.backup_id");
 const std::string backup_restore_constant::SKIP_BAD_PARTITION("restore.skip_bad_partition");
 const std::string backup_restore_constant::RESTORE_PATH("restore.restore_path");
 
+std::string get_backup_root(const std::string &backup_root,
+                            const std::string &user_defined_root_path)
+{
+    if (user_defined_root_path.empty()) {
+        return backup_root;
+    }
+    return utils::filesystem::path_combine(user_defined_root_path, backup_root);

Review Comment:
   If user specify root_path, we will return `root_path/backup_root`, otherwise return backup_root, backup_root is actually the cluster name.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947627933


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   It's doesn't matter. If all passed, you return `ERR_OK`, otherwise, return `ERR_XXX`, you don't need return the case : `err = ERR_OK`, but `checkpoint=ERR_CHECKPOINT_FAILED`, just return `ERR_CHECKPOINT_FAILED` is ok.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947808862


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   I still think `checkpoint_upload_err` is redundant. if `ERR_CHECKPOINT_FAILED` can be clearly indicates that an error occurred in the checkpoint stage , why define `check_point_err` and set it value is `ERR_CHECKPOINT_FAILED` to explain?
   
   In the past rpc, we haven't this design. Include bulkload, duplication, for example, we don't need add `ingest_err` to show the error of ingesting, or add `download_err` to show the err of downloading and so on, which is redundant, bu just use `ERR_INGEST_FAILD`, `ERR_DOWNLOAD` is clear enough. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r949776222


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request

Review Comment:
   No matter which one is new version. The new replica server will still crash if receive a old backup request from old version meta server?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong merged pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong merged PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948616659


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request

Review Comment:
   If meta server is in new version, replica server in old version. I don't consider this condition compatible, I think it is a dangerous case for meta server and replica server has different version, because new version especially a feature version, meta server will provide mamy new rpc which replica server can not recognize, and it is not necessary to add new rpc code for compatible only when new meta and old replica case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947697630


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   In fact, only one error code can meet the requirement, but as I mentioned before I thought it was not clear enough, they are different error, should not be combined.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948610419


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request

Review Comment:
   OK, thanks.
   I meant how to keep compatablity, if the meta servers are in new version, and the replica server are in old version, what will happen if we ask the cluster to do backup? Is it neccessary to add new rpc code for the new implemention.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948691663


##########
src/rdsn/src/common/backup_restore_common.h:
##########
@@ -52,5 +57,69 @@ class backup_restore_constant
     static const std::string RESTORE_PATH;
 };
 
+/// The directory structure on block service
+///
+/// (<root> = <user_define_root>/<cluster>)
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/<pidx>/chkpt_<ip>_<port>/***.sst
+///                                        /<pidx>/chkpt_<ip>_<port>/CURRENT
+///                                        /<pidx>/chkpt_<ip>_<port>/IDENTITY
+///                                        /<pidx>/chkpt_<ip>_<port>/MANIFEST
+///                                        /<pidx>/chkpt_<ip>_<port>/OPTIONS
+///                                        /<pidx>/chkpt_<ip>_<port>/LOG
+///                                        /<pidx>/chkpt_<ip>_<port>/backup_metadata
+///                                        /<pidx>/current_checkpoint
+///                                        /<pidx>/data_version
+///
+///  ......other partitions......
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/meta/app_metadata
+///  <root>/<app_name>_<app_id>/<backup_id>/backup_info
+///
+
+///
+/// The usage of files:
+///      1, app_metadata : the metadata of the app, the same with the app's app_info
+///      2, backup_metadata : the file to statistic the information of a checkpoint, include all the
+///         file's name, size and md5
+///      3, current_checkpoint : specifing which checkpoint directory is valid
+///      4, data_version: partition data_version
+///      5, backup_info : recording the information of this backup
+
+// TODO(heyuchen): add other common functions
+// get_compatible_backup_root is only used for restore compatible backup
+
+// The backup root path on block service
+// if user_defined_root_path is not empty
+// - return <user_defined_root_path>/<backup_root>
+// else
+// - return <backup_root>
+std::string get_backup_root(const std::string &backup_root,

Review Comment:
   All functions in `backup_restore_common.h` will be used in meta and replica files, backup and restore logic, so I move them into `backup_restore_common.h`. This file is added in my last round refactor. I will resolve the following similar conversation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948614410


##########
src/rdsn/src/common/backup_restore_common.cpp:
##########
@@ -36,5 +44,41 @@ const std::string backup_restore_constant::BACKUP_ID("restore.backup_id");
 const std::string backup_restore_constant::SKIP_BAD_PARTITION("restore.skip_bad_partition");
 const std::string backup_restore_constant::RESTORE_PATH("restore.restore_path");
 
+std::string get_backup_root(const std::string &backup_root,
+                            const std::string &user_defined_root_path)
+{
+    if (user_defined_root_path.empty()) {
+        return backup_root;
+    }
+    return utils::filesystem::path_combine(user_defined_root_path, backup_root);
+}
+
+std::string get_backup_path(const std::string &root,
+                            const std::string &app_name,
+                            const int32_t app_id,
+                            const int64_t backup_id,
+                            const bool is_compatible)
+{
+    std::string str_app = app_name + "_" + std::to_string(app_id);
+    std::stringstream ss;
+    if (!is_compatible) {
+        ss << root << "/" << str_app << "/" << backup_id;
+    } else {
+        ss << root << "/" << backup_id << "/" << str_app;
+    }
+    return ss.str();

Review Comment:
   how about use fmt::format instead?



##########
src/rdsn/src/common/backup_restore_common.h:
##########
@@ -52,5 +57,69 @@ class backup_restore_constant
     static const std::string RESTORE_PATH;
 };
 
+/// The directory structure on block service
+///
+/// (<root> = <user_define_root>/<cluster>)
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/<pidx>/chkpt_<ip>_<port>/***.sst
+///                                        /<pidx>/chkpt_<ip>_<port>/CURRENT
+///                                        /<pidx>/chkpt_<ip>_<port>/IDENTITY
+///                                        /<pidx>/chkpt_<ip>_<port>/MANIFEST
+///                                        /<pidx>/chkpt_<ip>_<port>/OPTIONS
+///                                        /<pidx>/chkpt_<ip>_<port>/LOG
+///                                        /<pidx>/chkpt_<ip>_<port>/backup_metadata
+///                                        /<pidx>/current_checkpoint
+///                                        /<pidx>/data_version
+///
+///  ......other partitions......
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/meta/app_metadata
+///  <root>/<app_name>_<app_id>/<backup_id>/backup_info
+///
+
+///
+/// The usage of files:
+///      1, app_metadata : the metadata of the app, the same with the app's app_info
+///      2, backup_metadata : the file to statistic the information of a checkpoint, include all the
+///         file's name, size and md5
+///      3, current_checkpoint : specifing which checkpoint directory is valid
+///      4, data_version: partition data_version
+///      5, backup_info : recording the information of this backup
+
+// TODO(heyuchen): add other common functions
+// get_compatible_backup_root is only used for restore compatible backup
+
+// The backup root path on block service
+// if user_defined_root_path is not empty
+// - return <user_defined_root_path>/<backup_root>
+// else
+// - return <backup_root>
+std::string get_backup_root(const std::string &backup_root,
+                            const std::string &user_defined_root_path);
+
+// This backup path on block service
+// if is_compatible = false (root is the return value of get_backup_root function)
+// - return <root>/<app_name>_<app_id>/<backup_id>
+// else (only used for restore compatible backup, root is the return value of
+// get_compatible_backup_root function)
+// - return <root>/<backup_id>/<app_name>_<app_id>
+std::string get_backup_path(const std::string &root,

Review Comment:
   if it is only used in backup_restore_common.cpp, how about define it as a local anonymous function there?



##########
src/rdsn/src/common/backup_restore_common.cpp:
##########
@@ -36,5 +44,41 @@ const std::string backup_restore_constant::BACKUP_ID("restore.backup_id");
 const std::string backup_restore_constant::SKIP_BAD_PARTITION("restore.skip_bad_partition");
 const std::string backup_restore_constant::RESTORE_PATH("restore.restore_path");
 
+std::string get_backup_root(const std::string &backup_root,
+                            const std::string &user_defined_root_path)
+{
+    if (user_defined_root_path.empty()) {
+        return backup_root;
+    }
+    return utils::filesystem::path_combine(user_defined_root_path, backup_root);
+}
+
+std::string get_backup_path(const std::string &root,
+                            const std::string &app_name,
+                            const int32_t app_id,
+                            const int64_t backup_id,
+                            const bool is_compatible)
+{
+    std::string str_app = app_name + "_" + std::to_string(app_id);
+    std::stringstream ss;
+    if (!is_compatible) {
+        ss << root << "/" << str_app << "/" << backup_id;
+    } else {
+        ss << root << "/" << backup_id << "/" << str_app;
+    }
+    return ss.str();
+}
+
+std::string get_backup_meta_path(const std::string &root,
+                                 const std::string &app_name,
+                                 const int32_t app_id,
+                                 const int64_t backup_id,
+                                 const bool is_compatible)
+{
+    std::stringstream ss;
+    ss << get_backup_path(root, app_name, app_id, backup_id, is_compatible) << "/meta";

Review Comment:
   how about use fmt::format instead?



##########
src/rdsn/src/meta/test/meta_backup_engine_test.cpp:
##########
@@ -0,0 +1,107 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+#include <dsn/dist/fmt_logging.h>
+#include <dsn/utility/fail_point.h>
+
+#include "meta_test_base.h"
+#include "meta/meta_backup_engine.h"
+
+namespace dsn {
+namespace replication {
+
+class meta_backup_engine_test : public meta_test_base
+{
+public:
+    void SetUp()
+    {
+        meta_test_base::SetUp();
+        create_app(APP_NAME, PARTITION_COUNT);
+        std::shared_ptr<app_state> app = find_app(APP_NAME);
+        _app_id = app->app_id;
+        _backup_engine = std::make_shared<meta_backup_engine>(_ms.get(), false);
+        fail::setup();
+    }
+
+    void TearDown()

Review Comment:
   same



##########
src/rdsn/src/meta/test/meta_backup_engine_test.cpp:
##########
@@ -0,0 +1,107 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+#include <dsn/dist/fmt_logging.h>
+#include <dsn/utility/fail_point.h>
+
+#include "meta_test_base.h"
+#include "meta/meta_backup_engine.h"
+
+namespace dsn {
+namespace replication {
+
+class meta_backup_engine_test : public meta_test_base
+{
+public:
+    void SetUp()

Review Comment:
   better to add `override`



##########
src/rdsn/src/meta/meta_backup_engine.h:
##########
@@ -92,12 +111,28 @@ class meta_backup_engine
     void retry_backup(const dsn::gpid pid);
     void handle_replica_backup_failed(const backup_response &response, const gpid pid);
 
-    error_code write_backup_file(const std::string &file_name, const dsn::blob &write_buffer);
+    error_code write_backup_file(const std::string &remote_dir,
+                                 const std::string &file_name,
+                                 const blob &write_buffer);
     error_code write_app_info();
     void write_backup_info();
 
     void update_backup_item_on_remote_storage(backup_status::type new_status, int64_t end_time = 0);
 
+    std::string get_remote_storage_root() const
+    {
+        return meta_options::concat_path_unix_style(_meta_svc->get_cluster_root(), "backup");
+    }
+
+    std::string get_remote_backup_path() const
+    {
+        auto type_path = _is_periodic_backup ? PERIODIC_PATH : ONETIME_PATH;
+        std::stringstream ss;
+        ss << get_remote_storage_root() << "/" << _cur_backup.app_id << "/" << type_path << "/"
+           << _cur_backup.backup_id;
+        return ss.str();

Review Comment:
   how about use `fmt::format`?



##########
src/rdsn/src/meta/meta_backup_engine.h:
##########
@@ -40,6 +40,19 @@ struct app_backup_info
     DEFINE_JSON_SERIALIZATION(backup_id, start_time_ms, end_time_ms, app_id, app_name)
 };
 
+///
+/// backup path on remote storage
+///
+/// Onetime backup:
+/// <cluster_root>/backup/<app_id>/once/<backup_id>/<backup_item>
+///
+/// Periodic backup:
+/// <cluster_root>/backup/<app_id>/periodic/<periodic_backup_policy>
+/// <cluster_root>/backup/<app_id>/periodic/<backup_id>/<backup_item>
+///
+static const std::string ONETIME_PATH = "once";

Review Comment:
   Will these variables be used in some other files? if not, better to hide in cpp.



##########
src/rdsn/src/common/backup_restore_common.h:
##########
@@ -52,5 +57,69 @@ class backup_restore_constant
     static const std::string RESTORE_PATH;
 };
 
+/// The directory structure on block service
+///
+/// (<root> = <user_define_root>/<cluster>)
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/<pidx>/chkpt_<ip>_<port>/***.sst
+///                                        /<pidx>/chkpt_<ip>_<port>/CURRENT
+///                                        /<pidx>/chkpt_<ip>_<port>/IDENTITY
+///                                        /<pidx>/chkpt_<ip>_<port>/MANIFEST
+///                                        /<pidx>/chkpt_<ip>_<port>/OPTIONS
+///                                        /<pidx>/chkpt_<ip>_<port>/LOG
+///                                        /<pidx>/chkpt_<ip>_<port>/backup_metadata
+///                                        /<pidx>/current_checkpoint
+///                                        /<pidx>/data_version
+///
+///  ......other partitions......
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/meta/app_metadata
+///  <root>/<app_name>_<app_id>/<backup_id>/backup_info
+///
+
+///
+/// The usage of files:
+///      1, app_metadata : the metadata of the app, the same with the app's app_info
+///      2, backup_metadata : the file to statistic the information of a checkpoint, include all the
+///         file's name, size and md5
+///      3, current_checkpoint : specifing which checkpoint directory is valid
+///      4, data_version: partition data_version
+///      5, backup_info : recording the information of this backup
+
+// TODO(heyuchen): add other common functions
+// get_compatible_backup_root is only used for restore compatible backup
+
+// The backup root path on block service
+// if user_defined_root_path is not empty
+// - return <user_defined_root_path>/<backup_root>
+// else
+// - return <backup_root>
+std::string get_backup_root(const std::string &backup_root,

Review Comment:
   where will this function been called? if only called in src/rdsn/src/meta/meta_backup_engine.cpp, how about define it as a local funciton there?



##########
src/rdsn/src/meta/meta_backup_engine.cpp:
##########
@@ -86,87 +86,57 @@ void meta_backup_engine::start()
 // ThreadPool: THREAD_POOL_DEFAULT
 error_code meta_backup_engine::write_app_info()
 {
-    // TODO(heyuchen): TBD
-    return ERR_OK;
-}
-
-// ThreadPool: THREAD_POOL_DEFAULT
-void meta_backup_engine::update_backup_item_on_remote_storage(backup_status::type new_status,
-                                                              int64_t end_time)
-{
-    // TODO(heyuchen): TBD
-}
-
-// TODO(heyuchen): update following functions
-
-error_code meta_backup_engine::write_backup_file(const std::string &file_name,
-                                                 const dsn::blob &write_buffer)
-{
-    dist::block_service::create_file_request create_file_req;
-    create_file_req.ignore_metadata = true;
-    create_file_req.file_name = file_name;
-
-    dsn::error_code err;
-    dist::block_service::block_file_ptr remote_file;
-    _block_service
-        ->create_file(create_file_req,
-                      TASK_CODE_EXEC_INLINED,
-                      [&err, &remote_file](const dist::block_service::create_file_response &resp) {
-                          err = resp.err;
-                          remote_file = resp.file_handle;
-                      })
-        ->wait();
-    if (err != dsn::ERR_OK) {
-        ddebug_f("create file {} failed", file_name);
-        return err;
-    }
-    dassert_f(remote_file != nullptr,
-              "create file {} succeed, but can't get handle",
-              create_file_req.file_name);
-    remote_file
-        ->write(dist::block_service::write_request{write_buffer},
-                TASK_CODE_EXEC_INLINED,
-                [&err](const dist::block_service::write_response &resp) { err = resp.err; })
-        ->wait();
-    return err;
-}
-
-error_code meta_backup_engine::backup_app_meta()
-{
-    dsn::blob app_info_buffer;
+    blob app_info_buffer;
     {
         zauto_read_lock l;
-        _backup_service->get_state()->lock_read(l);
-        std::shared_ptr<app_state> app = _backup_service->get_state()->get_app(_cur_backup.app_id);
+        _meta_svc->get_server_state()->lock_read(l);
+        std::shared_ptr<app_state> app = _meta_svc->get_server_state()->get_app(_cur_backup.app_id);
         if (app == nullptr || app->status != app_status::AS_AVAILABLE) {
             derror_f("app {} is not available, couldn't do backup now.", _cur_backup.app_id);
             return ERR_INVALID_STATE;
         }
         app_state tmp = *app;
-        // Because we don't restore app envs, so no need to write app envs to backup file.
-        // TODO(zhangyifan): backup and restore app envs when needed.
-        tmp.envs.clear();
-        app_info_buffer = dsn::json::json_forwarder<app_info>::encode(tmp);
+        app_info_buffer = json::json_forwarder<app_info>::encode(tmp);
     }
 
-    std::string backup_root =
-        dsn::utils::filesystem::path_combine(_backup_path, _backup_service->backup_root());
-    // TODO(heyuchen): refactor and update it in future
-    std::string file_name = "todo";
-    return write_backup_file(file_name, app_info_buffer);
+    std::string remote_meta_dir =
+        get_backup_meta_path(get_backup_root(FLAGS_cold_backup_root, _cur_backup.backup_path),

Review Comment:
   can we move `get_backup_root` into `get_backup_meta_path`, and only left `_cur_backup.backup_path` here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948685106


##########
src/rdsn/src/common/backup_restore_common.h:
##########
@@ -52,5 +57,69 @@ class backup_restore_constant
     static const std::string RESTORE_PATH;
 };
 
+/// The directory structure on block service
+///
+/// (<root> = <user_define_root>/<cluster>)
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/<pidx>/chkpt_<ip>_<port>/***.sst
+///                                        /<pidx>/chkpt_<ip>_<port>/CURRENT
+///                                        /<pidx>/chkpt_<ip>_<port>/IDENTITY
+///                                        /<pidx>/chkpt_<ip>_<port>/MANIFEST
+///                                        /<pidx>/chkpt_<ip>_<port>/OPTIONS
+///                                        /<pidx>/chkpt_<ip>_<port>/LOG
+///                                        /<pidx>/chkpt_<ip>_<port>/backup_metadata
+///                                        /<pidx>/current_checkpoint
+///                                        /<pidx>/data_version
+///
+///  ......other partitions......
+///
+///  <root>/<app_name>_<app_id>/<backup_id>/meta/app_metadata
+///  <root>/<app_name>_<app_id>/<backup_id>/backup_info
+///
+
+///
+/// The usage of files:
+///      1, app_metadata : the metadata of the app, the same with the app's app_info
+///      2, backup_metadata : the file to statistic the information of a checkpoint, include all the
+///         file's name, size and md5
+///      3, current_checkpoint : specifing which checkpoint directory is valid
+///      4, data_version: partition data_version
+///      5, backup_info : recording the information of this backup
+
+// TODO(heyuchen): add other common functions
+// get_compatible_backup_root is only used for restore compatible backup
+
+// The backup root path on block service
+// if user_defined_root_path is not empty
+// - return <user_defined_root_path>/<backup_root>
+// else
+// - return <backup_root>
+std::string get_backup_root(const std::string &backup_root,

Review Comment:
   There are many logic will call this function, including replica and meta.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948702695


##########
src/rdsn/src/common/backup_restore_common.cpp:
##########
@@ -36,5 +44,41 @@ const std::string backup_restore_constant::BACKUP_ID("restore.backup_id");
 const std::string backup_restore_constant::SKIP_BAD_PARTITION("restore.skip_bad_partition");
 const std::string backup_restore_constant::RESTORE_PATH("restore.restore_path");
 
+std::string get_backup_root(const std::string &backup_root,
+                            const std::string &user_defined_root_path)
+{
+    if (user_defined_root_path.empty()) {
+        return backup_root;
+    }
+    return utils::filesystem::path_combine(user_defined_root_path, backup_root);
+}
+
+std::string get_backup_path(const std::string &root,
+                            const std::string &app_name,
+                            const int32_t app_id,
+                            const int64_t backup_id,
+                            const bool is_compatible)
+{
+    std::string str_app = app_name + "_" + std::to_string(app_id);
+    std::stringstream ss;
+    if (!is_compatible) {
+        ss << root << "/" << str_app << "/" << backup_id;
+    } else {
+        ss << root << "/" << backup_id << "/" << str_app;
+    }
+    return ss.str();

Review Comment:
   Done, and I resolve the following similar conversation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] foreverneverer commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
foreverneverer commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r948599472


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   Well, wait other's opinion, if they have no have no objections, I will be willing think it is a reasonable design



##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   Well, wait other's opinion, if they have no objections, I will be willing think it is a reasonable design



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a diff in pull request #1112: feat(backup): 4. meta server send backup_request to replica

Posted by GitBox <gi...@apache.org>.
hycdong commented on code in PR #1112:
URL: https://github.com/apache/incubator-pegasus/pull/1112#discussion_r947656523


##########
src/rdsn/src/common/backup.thrift:
##########
@@ -64,24 +64,28 @@ struct configuration_restore_request
     9:optional string   restore_path;
 }
 
+// meta -> replica
 struct backup_request
 {
-    1:dsn.gpid              pid;
-    2:policy_info           policy;
-    3:string                app_name;
-    4:i64                   backup_id;
+    1:dsn.gpid          pid;
+    2:string            app_name;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:string            backup_provider_type;
     // user specified backup_path.
-    5:optional string       backup_path;
+    6:optional string   backup_root_path;
 }
 
 struct backup_response
 {
     1:dsn.error_code    err;
     2:dsn.gpid          pid;
-    3:i32               progress;  // the progress of the cold_backup
-    4:string            policy_name;
-    5:i64               backup_id;
-    6:i64               checkpoint_total_size;
+    3:i64               backup_id;
+    4:backup_status     status;
+    5:optional dsn.error_code   checkpoint_err;
+    6:optional dsn.error_code   upload_err;

Review Comment:
   I don't agree with that one error code should identify all error cases. `checkpoint_upload_err` is only valid in backup checkpoint and upload stage, other stage it should not be set. 
   Besides, this rpc is sent from meta server to replica server periodically during backup, meta server will do different actions according to the err and checkpoint_upload_err, not only show the hint_message.
   I still inisist on my previous design, seperate another error code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org