You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pegasus.apache.org by "ninsmiracle (via GitHub)" <gi...@apache.org> on 2023/09/01 11:39:25 UTC

[GitHub] [incubator-pegasus] ninsmiracle opened a new pull request, #1597: fix(duplication): fix slog gc crash

ninsmiracle opened a new pull request, #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597

   ### What problem does this PR solve? <!--add issue link with summary if exists-->
   #1596
   
   ### What is changed and how does it work?
   Add a parameter to protect a plog avoid from gc when it's been checking by duplication.
   
   ##### Tests <!-- At least one of them must be included. -->
   - Unit test
   -cluster test
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


Re: [PR] fix(duplication): fix plog gc crash [incubator-pegasus]

Posted by "empiredan (via GitHub)" <gi...@apache.org>.
empiredan commented on code in PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#discussion_r1538767939


##########
src/replica/duplication/load_from_private_log.cpp:
##########
@@ -146,6 +146,8 @@ void load_from_private_log::run()
 
 void load_from_private_log::find_log_file_to_start()
 {
+    _duplicator->set_duplication_plog_checking(true);

Review Comment:
   Would it be fine to return from the middle of the function without roll back by `_duplicator->set_duplication_plog_checking(false);` ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


Re: [PR] fix(duplication): fix plog gc crash [incubator-pegasus]

Posted by "empiredan (via GitHub)" <gi...@apache.org>.
empiredan commented on code in PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#discussion_r1538927914


##########
src/replica/duplication/load_from_private_log.cpp:
##########
@@ -157,6 +159,7 @@ void load_from_private_log::find_log_file_to_start()
         error_s es = log_utils::open_read(pr.second->path(), file);
         if (!es.is_ok()) {
             LOG_ERROR_PREFIX("{}", es);
+            _duplicator->set_duplication_plog_checking(false);

Review Comment:
   There's `return` statement in function `find_log_file_to_start(log_file_map)`, which should be also set false:
   
   ```c++
   void load_from_private_log::find_log_file_to_start(
       const mutation_log::log_file_map_by_index &log_file_map)
   {
       _current = nullptr;
       if (dsn_unlikely(log_file_map.empty())) {
           LOG_ERROR_PREFIX("unable to start duplication since no log file is available");
           return;
       }
   ```
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


Re: [PR] fix(duplication): fix plog gc crash [incubator-pegasus]

Posted by "acelyc111 (via GitHub)" <gi...@apache.org>.
acelyc111 commented on code in PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#discussion_r1355361210


##########
src/replica/replica.h:
##########
@@ -633,6 +638,7 @@ class replica : public serverlet<replica>, public ref_counter, public replica_ba
     bool _is_manual_emergency_checkpointing{false};
     bool _is_duplication_master{false};
     bool _is_duplication_follower{false};
+    std::atomic<bool> _is_duplication_plog_checking{false};

Review Comment:
   Could you please to add some comments about this variable?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on pull request #1597: fix(duplication): fix slog gc crash

Posted by "acelyc111 (via GitHub)" <gi...@apache.org>.
acelyc111 commented on PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#issuecomment-1715836510

   @ninsmiracle Thanks for the contribution!
   
   This pull request mention "slog" but I didn't see how does slog been operated in the patch, could you please describe the relationship? Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on pull request #1597: fix(duplication): fix slog gc crash

Posted by "acelyc111 (via GitHub)" <gi...@apache.org>.
acelyc111 commented on PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#issuecomment-1725343363

   > > @ninsmiracle Thanks for the contribution!
   > > This pull request mention "slog" but I didn't see how does slog been operated in the patch, could you please describe the relationship? Thanks!
   > 
   > In my opinion, this PR should called "plog gc crash",however in our inside gitlab repository,this commit called this name.I guess it's may be related to old version Slog ? But I'm not familiar with old version pegasus. So could I change the PR's name?
   
   Feel free to rename.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a diff in pull request #1597: fix(duplication): fix slog gc crash

Posted by "acelyc111 (via GitHub)" <gi...@apache.org>.
acelyc111 commented on code in PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#discussion_r1323127988


##########
src/replica/replica.h:
##########
@@ -633,6 +639,7 @@ class replica : public serverlet<replica>, public ref_counter, public replica_ba
     bool _is_manual_emergency_checkpointing{false};
     bool _is_duplication_master{false};
     bool _is_duplication_follower{false};
+    std::atomic<bool> _is_duplication_plog_checking{false};

Review Comment:
   Where will the variable be used?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


Re: [PR] fix(duplication): fix plog gc crash [incubator-pegasus]

Posted by "empiredan (via GitHub)" <gi...@apache.org>.
empiredan commented on code in PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#discussion_r1539095227


##########
src/replica/duplication/load_from_private_log.cpp:
##########
@@ -168,6 +171,8 @@ void load_from_private_log::find_log_file_to_start()
 void load_from_private_log::find_log_file_to_start(
     const mutation_log::log_file_map_by_index &log_file_map)
 {
+    auto cleanup = dsn::defer([this]() { _duplicator->set_duplication_plog_checking(false); });

Review Comment:
   Is this line necessary ? `_duplicator->set_duplication_plog_checking(false);` would be called twice ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] ninsmiracle commented on pull request #1597: fix(duplication): fix slog gc crash

Posted by "ninsmiracle (via GitHub)" <gi...@apache.org>.
ninsmiracle commented on PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#issuecomment-1717100630

   > @ninsmiracle Thanks for the contribution!
   > 
   > This pull request mention "slog" but I didn't see how does slog been operated in the patch, could you please describe the relationship? Thanks!
   
   I think this PR should called "plog gc crash",however in our inside gitlab repository,this commit called this name.I guess it's may be related to old version Slog ? But I'm not familiar with old version pegasus. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


Re: [PR] fix(duplication): prevent plog files from being removed by GC while they are being checked by duplication [incubator-pegasus]

Posted by "empiredan (via GitHub)" <gi...@apache.org>.
empiredan merged PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


Re: [PR] fix(duplication): fix plog gc crash [incubator-pegasus]

Posted by "acelyc111 (via GitHub)" <gi...@apache.org>.
acelyc111 commented on code in PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#discussion_r1515625884


##########
src/replica/replica.h:
##########
@@ -625,6 +630,8 @@ class replica : public serverlet<replica>, public ref_counter, public replica_ba
     bool _is_manual_emergency_checkpointing{false};
     bool _is_duplication_master{false};
     bool _is_duplication_follower{false};
+    // replica is finding some private logs to load for duplication,avoid unexpected plog gc

Review Comment:
   ```suggestion
       // Indicate whether the replica is during finding out some private logs to
       // load for duplication. It useful to prevent plog GCed unexpectedly.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


Re: [PR] fix(duplication): fix plog gc crash [incubator-pegasus]

Posted by "empiredan (via GitHub)" <gi...@apache.org>.
empiredan commented on code in PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#discussion_r1538927914


##########
src/replica/duplication/load_from_private_log.cpp:
##########
@@ -157,6 +159,7 @@ void load_from_private_log::find_log_file_to_start()
         error_s es = log_utils::open_read(pr.second->path(), file);
         if (!es.is_ok()) {
             LOG_ERROR_PREFIX("{}", es);
+            _duplicator->set_duplication_plog_checking(false);

Review Comment:
   There's `return` statement in function `find_log_file_to_start(log_file_map)`, which should be also set false. Use `dsn::defer` to set false ?
   
   ```c++
   void load_from_private_log::find_log_file_to_start(
       const mutation_log::log_file_map_by_index &log_file_map)
   {
       _current = nullptr;
       if (dsn_unlikely(log_file_map.empty())) {
           LOG_ERROR_PREFIX("unable to start duplication since no log file is available");
           return;
       }
   ```
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


Re: [PR] fix(duplication): fix plog gc crash [incubator-pegasus]

Posted by "empiredan (via GitHub)" <gi...@apache.org>.
empiredan commented on code in PR #1597:
URL: https://github.com/apache/incubator-pegasus/pull/1597#discussion_r1538933903


##########
src/replica/duplication/load_from_private_log.cpp:
##########
@@ -146,6 +146,8 @@ void load_from_private_log::run()
 
 void load_from_private_log::find_log_file_to_start()
 {
+    _duplicator->set_duplication_plog_checking(true);

Review Comment:
   ```suggestion
       _duplicator->set_duplication_plog_checking(true);
       auto cleanup = dsn::defer([this]() { _duplicator->set_duplication_plog_checking(false); });
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org