Posted to issues@kvrocks.apache.org by "guoxiangCN (via GitHub)" <gi...@apache.org> on 2023/03/14 02:10:56 UTC

[GitHub] [incubator-kvrocks] guoxiangCN opened a new issue, #1319: Instantly a large number of writes may cause the slave to enter an infinite loop of full sync

guoxiangCN opened a new issue, #1319:
URL: https://github.com/apache/incubator-kvrocks/issues/1319

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-kvrocks/issues) and found no similar issues.
   
   
   ### Version
   
   Linux 3.10.0-1160.42.2.el7.x86_64
   Kvrocks v2.3.0
   
   ### Minimal reproduce step
   
   1. Set config rocksdb.wal_ttl_seconds to 10800 (3 hours)
   2. Set config rocksdb.wal_size_limit_mb to 51200 (50GiB)
   
   According to the checkpoint save/reuse logic, the share time works out to 3600s:
   ```
       // Replicas can share the checkpoint for replication if the checkpoint has
       // existed for less than half of the WAL ttl.
       int64_t can_shared_time = storage->config_->RocksDB.WAL_ttl_seconds / 2;
       if (can_shared_time > 60 * 60) can_shared_time = 60 * 60;
       if (can_shared_time < 10 * 60) can_shared_time = 10 * 60;
       auto now = static_cast<time_t>(Util::GetTimeStamp());
       if (now - storage->GetCheckpointCreateTime() > can_shared_time) {
         LOG(WARNING) << "[storage] Can't use current checkpoint, waiting next checkpoint";
         return Status(Status::NotOK, "Can't use current checkpoint, waiting for next checkpoint");
       }
       LOG(INFO) << "[storage] Use current existing checkpoint";
   ```
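   
   To make the arithmetic explicit, here is a minimal sketch of the clamp in the snippet above, assuming C++17 for `std::clamp`. `CanSharedTime` and the two constants are hypothetical names used only for illustration; they are not part of Kvrocks.
   
   ```cpp
   #include <algorithm>
   #include <cstdint>
   
   // Hypothetical bounds mirroring the snippet above: 10 minutes and 60 minutes.
   constexpr int64_t kMinShareSecs = 10 * 60;
   constexpr int64_t kMaxShareSecs = 60 * 60;
   
   // Hypothetical helper: half of the WAL TTL, bounded to [10 min, 60 min].
   constexpr int64_t CanSharedTime(int64_t wal_ttl_seconds) {
     return std::clamp(wal_ttl_seconds / 2, kMinShareSecs, kMaxShareSecs);
   }
   ```
   
   With rocksdb.wal_ttl_seconds = 10800, half of the TTL is 5400s, which is capped to 3600s, so an existing checkpoint stays eligible for reuse for a full hour.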
   
   To reproduce this bug, we only need to write more than 50GiB of data in a short time, so that the WAL is rotated. At that point there is a gap between MaxLsnInCheckpoint and MinLsnInWAL, which triggers a full sync. However, because the checkpoint share time (can_shared_time) is capped at 1 hour, the old checkpoint is reused and sent to the slave again, so the gap is still there after the slave loads it. From then on we are stuck in a full sync loop until 1 hour has elapsed.
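   
   The loop can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Kvrocks code: `MasterState`, `CanPartialSync`, `ReusesOldCheckpoint` and `CountFullSyncs` are hypothetical names, the master is assumed to receive no further writes, and every full sync round is assumed to take a fixed amount of time.
   
   ```cpp
   #include <cstdint>
   
   // Hypothetical, simplified view of the master (not a Kvrocks type).
   struct MasterState {
     uint64_t checkpoint_max_seq;   // newest sequence contained in the checkpoint
     uint64_t wal_min_seq;          // oldest sequence still present in the WAL
     uint64_t latest_seq;           // newest sequence written on the master
     int64_t checkpoint_age_secs;   // time since the checkpoint was created
     int64_t can_shared_time_secs;  // reuse window, e.g. 3600
   };
   
   // Incremental sync is only possible if the WAL still holds the entry
   // that follows the replica's last applied sequence.
   constexpr bool CanPartialSync(const MasterState& m, uint64_t replica_seq) {
     return replica_seq + 1 >= m.wal_min_seq;
   }
   
   // Same comparison as the quoted snippet: reuse unless the age exceeds the window.
   constexpr bool ReusesOldCheckpoint(const MasterState& m) {
     return m.checkpoint_age_secs <= m.can_shared_time_secs;
   }
   
   // Counts the full syncs performed within at most `rounds` attempts.
   constexpr int CountFullSyncs(MasterState m, uint64_t replica_seq, int rounds,
                                int64_t secs_per_round) {
     int full_syncs = 0;
     for (int i = 0; i < rounds; ++i) {
       if (CanPartialSync(m, replica_seq)) break;  // replica has caught up
       ++full_syncs;
       if (!ReusesOldCheckpoint(m)) {
         // Window expired: model a fresh checkpoint covering all writes.
         m.checkpoint_max_seq = m.latest_seq;
         m.checkpoint_age_secs = 0;
       }
       replica_seq = m.checkpoint_max_seq;  // replica loads the checkpoint
       m.checkpoint_age_secs += secs_per_round;
     }
     return full_syncs;
   }
   ```
   
   For example, with a checkpoint ending at sequence 100, a WAL starting at 500, a 3600s window and 600s per round, `CountFullSyncs(MasterState{100, 500, 900, 0, 3600}, 0, 10, 600)` performs 8 full syncs in this model: the first 7 resend the stale checkpoint, and only the 8th uses a fresh one and lets the replica catch up.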
   
   ### What did you expect to see?
   
   The slave keeps replicating normally.
   
   ### What did you see instead?
   
   The slave was stuck in an infinite loop of full syncs. It could not serve requests, and the master wasted a lot of bandwidth re-synchronizing data to the slaves.
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@kvrocks.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-kvrocks] git-hulk commented on issue #1319: Instantly a large number of writes may cause the slave to enter an infinite loop of full sync

Posted by "git-hulk (via GitHub)" <gi...@apache.org>.
git-hulk commented on issue #1319:
URL: https://github.com/apache/incubator-kvrocks/issues/1319#issuecomment-1467239167

   @guoxiangCN Thanks for your feedback, it's indeed a problem. I have prepared a commit to resolve this issue. Could you give it a try when you're free?
   
   https://github.com/apache/incubator-kvrocks/commit/339532bc636cbd7836bd4c88dd0c14493a51331e




[GitHub] [incubator-kvrocks] git-hulk closed issue #1319: Instantly a large number of writes may cause the slave to enter an infinite loop of full sync

Posted by "git-hulk (via GitHub)" <gi...@apache.org>.
git-hulk closed issue #1319: Instantly a large number of writes may cause the slave to enter an infinite loop of full sync 
URL: https://github.com/apache/incubator-kvrocks/issues/1319

