You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2021/09/10 03:16:35 UTC

[GitHub] [couchdb] nickva opened a new pull request #3739: Fix splitting shards with large purge sequences

nickva opened a new pull request #3739:
URL: https://github.com/apache/couchdb/pull/3739


   Previously, if the source db purge sequence > `purge_infos_limit`, shard splitting would crash with the `{{invalid_start_purge_seq,0}, [{couch_bt_engine,fold_purge_infos,5...` error. That was because purge sequences where always copied starting from 0. That would only work as long as the total number of purges stayed below the purge_infos_limit threshold. In order to correctly gather and the purge sequences, the start sequence must be based off of the actual oldest sequence currently available.
   
   An example of how it should be done is in the `mem_rpc` module, when [loading purge infos](https://github.com/apache/couchdb/blob/4ea9f1ea1a2078162d0e281948b56469228af3f7/src/mem3/src/mem3_rpc.erl#L206-L207), so here we do exactly the same. The `MinSeq - 1` logic is also evident by inspecting the [fold_purge_infos](https://github.com/apache/couchdb/blob/3.x/src/couch/src/couch_bt_engine.erl#L625-L637) function.
   
   The test sets up the exact scenario as described above: reduces the purge info limit to 10 then purges 20 documents. By purging more than the limit, we ensure the starting sequence is now != 0. However, the purge sequence btree is actually trimmed down during compaction. That is why there are a few extra helper functions to ensure compaction runs and finishes before shard splitting starts.
   
   Fixes: https://github.com/apache/couchdb/issues/3738
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] nickva commented on a change in pull request #3739: Fix splitting shards with large purge sequences

Posted by GitBox <gi...@apache.org>.
nickva commented on a change in pull request #3739:
URL: https://github.com/apache/couchdb/pull/3739#discussion_r706242479



##########
File path: src/mem3/test/eunit/mem3_reshard_test.erl
##########
@@ -140,6 +141,56 @@ split_one_shard(#{db1 := Db}) ->
     end)}.
 
 
+% Test to check that shard with high number of purges can be split
+split_shard_with_lots_of_purges(#{db1 := Db}) ->
+    {timeout, ?TIMEOUT, ?_test(begin
+        % Set a low purge infos limit, we are planning of overunning it

Review comment:
       Thanks for noticing!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] nickva merged pull request #3739: Fix splitting shards with large purge sequences

Posted by GitBox <gi...@apache.org>.
nickva merged pull request #3739:
URL: https://github.com/apache/couchdb/pull/3739


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] kocolosk commented on pull request #3739: Fix splitting shards with large purge sequences

Posted by GitBox <gi...@apache.org>.
kocolosk commented on pull request #3739:
URL: https://github.com/apache/couchdb/pull/3739#issuecomment-916925011


   Code looks great to me. I saw you mention a shard splitting PR on IRC and figured I'd need an extra cup of coffee this morning, then saw a 2 line changes with detailed tests 🎉 
   
   Couple of small typos in the commit message, "purge sequences *were* always copied", "In order to correctly gather and *(split?)* the purge sequences". Nice work! 👍 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] nickva commented on pull request #3739: Fix splitting shards with large purge sequences

Posted by GitBox <gi...@apache.org>.
nickva commented on pull request #3739:
URL: https://github.com/apache/couchdb/pull/3739#issuecomment-916980553


   Thank you for taking a look, Adam!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] kocolosk commented on a change in pull request #3739: Fix splitting shards with large purge sequences

Posted by GitBox <gi...@apache.org>.
kocolosk commented on a change in pull request #3739:
URL: https://github.com/apache/couchdb/pull/3739#discussion_r706201726



##########
File path: src/mem3/test/eunit/mem3_reshard_test.erl
##########
@@ -140,6 +141,56 @@ split_one_shard(#{db1 := Db}) ->
     end)}.
 
 
+% Test to check that shard with high number of purges can be split
+split_shard_with_lots_of_purges(#{db1 := Db}) ->
+    {timeout, ?TIMEOUT, ?_test(begin
+        % Set a low purge infos limit, we are planning of overunning it

Review comment:
       Trivial typo in the comment here, suggest "we are planning to overrun it"




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org