Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2021/10/21 22:37:18 UTC

[GitHub] [couchdb] nickva opened a new pull request #3796: [WIP] Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

nickva opened a new pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796


   Move custodian VDU to BDU
   
   Add a test for _all_dbs + limit params and a few end-to-end shard map update tests which exercise the new BDU functionality.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] nickva commented on pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
nickva commented on pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796#issuecomment-951141425


   @jaydoane hmm, apparently the db isn't guaranteed to synchronously appear right after we get a 201 from a shard doc creation.
   
   I couldn't reproduce the flakiness locally, but I can imagine it occurring. I think https://github.com/apache/couchdb/pull/3802 should fix it. The "can read db info" assertion was an extra check anyway, unrelated to whether the BDU itself worked, which is the only thing these tests should be checking.
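
The race described here (a 201 for the shard doc, with the db becoming readable slightly later) is the kind of thing a test can tolerate by polling instead of asserting immediately. A minimal sketch of such a helper follows; the module name `wait_sketch` and the function `wait_until/3` are hypothetical, and this is not necessarily what the linked fix does.

```erlang
-module(wait_sketch).
-export([wait_until/3]).

%% Hypothetical test helper: call Fun up to Retries times, sleeping
%% DelayMsec between attempts, until it returns true.
-spec wait_until(fun(() -> boolean()), non_neg_integer(), non_neg_integer()) ->
    ok | timeout.
wait_until(_Fun, 0, _DelayMsec) ->
    % No attempts left
    timeout;
wait_until(Fun, Retries, DelayMsec) when Retries > 0 ->
    case Fun() of
        true ->
            ok;
        false ->
            timer:sleep(DelayMsec),
            wait_until(Fun, Retries - 1, DelayMsec)
    end.
```

The failing assertion was `req(get, Top ++ Db)` matching `{200, _}`; with a helper like this it could be wrapped as `wait_until(fun() -> element(1, req(get, Top ++ Db)) =:= 200 end, 50, 100)`, where the retry count and delay are arbitrary values.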





[GitHub] [couchdb] nickva commented on a change in pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
nickva commented on a change in pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796#discussion_r735072812



##########
File path: src/mem3/src/mem3_bdu.erl
##########
@@ -0,0 +1,106 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(mem3_bdu).
+
+
+-export([
+    before_doc_update/3
+]).
+
+
+-include_lib("couch/include/couch_db.hrl").
+
+
+-spec before_doc_update(#doc{}, Db::any(), couch_db:update_type()) -> #doc{}.
+before_doc_update(#doc{id = <<?DESIGN_DOC_PREFIX, _/binary>>} = Doc, _Db, _UpdateType) ->
+    % Skip design docs
+    Doc;
+
+before_doc_update(#doc{deleted = true} = Doc, _Db, _UpdateType) ->
+    % Skip deleted
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, replicated_changes) ->
+    % Skip internal replicator updates
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, _UpdateType) ->
+    Body1 = couch_util:json_encode(Doc#doc.body),
+    Body2 = couch_util:json_decode(Body1, [return_maps]),
+    validate(Body2),
+    Doc.
+
+
+validate(#{} = Body) ->
+    validate_key(<<"by_node">>, Body, ["by_node is mandatory"]),
+    validate_key(<<"by_range">>, Body, ["by_range is mandatory"]),
+    ByNode = maps:get(<<"by_node">>, Body),
+    ByRange = maps:get(<<"by_range">>, Body),
+    % "by_node": {
+    %    "node1@xxx.xxx.xxx.xxx": ["00000000-1fffffff",...]
+    % ]}
+    maps:map(fun(Node, Ranges) ->
+        validate_by_node(Node, Ranges, ByRange)
+    end, ByNode),
+    % "by_range": {
+    %   "00000000-1fffffff": ["node1@xxx.xxx.xxx.xxx", ...]
+    % ]}
+    maps:map(fun(Range, Nodes) ->
+        validate_by_range(Range, Nodes, ByNode)
+    end, ByRange).
+
+
+validate_by_node(Node, Ranges, ByRange) ->
+    validate_array(Ranges, ["by_node", Ranges, "value not an array"]),
+    lists:foreach(fun(Range) ->
+        validate_key(Range, ByRange, ["by_range for", Range, "missing"]),
+        Nodes = maps:get(Range, ByRange),
+        validate_member(Node, Nodes, ["by_range for", Range, "missing", Node])
+    end, Ranges).
+
+
+validate_by_range(Range, Nodes, ByNode) ->
+    validate_array(Nodes, ["by_range", Nodes, "value not an array"]),
+    lists:foreach(fun(Node) ->
+        validate_key(Node, ByNode, ["by_node for", Node, "missing"]),
+        Ranges = maps:get(Node, ByNode),
+        validate_member(Range, Ranges, ["by_node for", Node, "missing", Range])
+    end, Nodes).
+
+
+validate_array(Val, _ErrMsg) when is_list(Val) ->
+    ok;
+validate_array(_Val, ErrMsg) ->
+    throw({forbidden, errmsg(ErrMsg)}).
+
+
+validate_key(Key, #{} = Map, ErrMsg) ->
+    case maps:is_key(Key, Map) of
+        true -> ok;
+        false -> throw({forbidden, errmsg(ErrMsg)})
+    end;
+validate_key(_Key, _Map, ErrMsg) ->
+    throw({forbidden, errmsg(ErrMsg)}).

Review comment:
       Good catch. This would be when the "by_node" key is present but its value is not a map but something else, say an integer. Will add a test for it.







[GitHub] [couchdb] nickva commented on a change in pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
nickva commented on a change in pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796#discussion_r735072812



##########
File path: src/mem3/src/mem3_bdu.erl
##########
@@ -0,0 +1,106 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(mem3_bdu).
+
+
+-export([
+    before_doc_update/3
+]).
+
+
+-include_lib("couch/include/couch_db.hrl").
+
+
+-spec before_doc_update(#doc{}, Db::any(), couch_db:update_type()) -> #doc{}.
+before_doc_update(#doc{id = <<?DESIGN_DOC_PREFIX, _/binary>>} = Doc, _Db, _UpdateType) ->
+    % Skip design docs
+    Doc;
+
+before_doc_update(#doc{deleted = true} = Doc, _Db, _UpdateType) ->
+    % Skip deleted
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, replicated_changes) ->
+    % Skip internal replicator updates
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, _UpdateType) ->
+    Body1 = couch_util:json_encode(Doc#doc.body),
+    Body2 = couch_util:json_decode(Body1, [return_maps]),
+    validate(Body2),
+    Doc.
+
+
+validate(#{} = Body) ->
+    validate_key(<<"by_node">>, Body, ["by_node is mandatory"]),
+    validate_key(<<"by_range">>, Body, ["by_range is mandatory"]),
+    ByNode = maps:get(<<"by_node">>, Body),
+    ByRange = maps:get(<<"by_range">>, Body),
+    % "by_node": {
+    %    "node1@xxx.xxx.xxx.xxx": ["00000000-1fffffff",...]
+    % ]}
+    maps:map(fun(Node, Ranges) ->
+        validate_by_node(Node, Ranges, ByRange)
+    end, ByNode),
+    % "by_range": {
+    %   "00000000-1fffffff": ["node1@xxx.xxx.xxx.xxx", ...]
+    % ]}
+    maps:map(fun(Range, Nodes) ->
+        validate_by_range(Range, Nodes, ByNode)
+    end, ByRange).
+
+
+validate_by_node(Node, Ranges, ByRange) ->
+    validate_array(Ranges, ["by_node", Ranges, "value not an array"]),
+    lists:foreach(fun(Range) ->
+        validate_key(Range, ByRange, ["by_range for", Range, "missing"]),
+        Nodes = maps:get(Range, ByRange),
+        validate_member(Node, Nodes, ["by_range for", Range, "missing", Node])
+    end, Ranges).
+
+
+validate_by_range(Range, Nodes, ByNode) ->
+    validate_array(Nodes, ["by_range", Nodes, "value not an array"]),
+    lists:foreach(fun(Node) ->
+        validate_key(Node, ByNode, ["by_node for", Node, "missing"]),
+        Ranges = maps:get(Node, ByNode),
+        validate_member(Range, Ranges, ["by_node for", Node, "missing", Range])
+    end, Nodes).
+
+
+validate_array(Val, _ErrMsg) when is_list(Val) ->
+    ok;
+validate_array(_Val, ErrMsg) ->
+    throw({forbidden, errmsg(ErrMsg)}).
+
+
+validate_key(Key, #{} = Map, ErrMsg) ->
+    case maps:is_key(Key, Map) of
+        true -> ok;
+        false -> throw({forbidden, errmsg(ErrMsg)})
+    end;
+validate_key(_Key, _Map, ErrMsg) ->
+    throw({forbidden, errmsg(ErrMsg)}).

Review comment:
       Good catch. The intent was to check for the case where "by_node" / "by_range" are present but are not maps. But I think in this case we'd never reach the clause, as we'd blow up in `maps:map/2` first. I'll add another check for invalid maps and some tests for it.
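
To make the failure mode above concrete: `maps:map/2` raises `error:{badmap, Val}` when handed a non-map, so a shard doc with `"by_node": 42` would crash rather than be rejected with a `forbidden` error. The sketch below shows one way an explicit check could sit in front of the `maps:*` calls. The module name `badmap_sketch`, the function `validate_map/2`, and the error message text are hypothetical and are not taken from the PR.

```erlang
-module(badmap_sketch).
-export([validate_map/2, demo/0]).

%% Hypothetical guard: reject any non-map value with a forbidden error
%% instead of letting a later maps:* call crash with {badmap, Val}.
validate_map(#{}, _ErrMsg) ->
    ok;
validate_map(_Val, ErrMsg) ->
    throw({forbidden, ErrMsg}).

demo() ->
    % maps:map/2 on a non-map raises error:{badmap, Val}
    Crash =
        try
            maps:map(fun(_K, V) -> V end, 42)
        catch
            error:{badmap, 42} -> badmap
        end,
    % With the guard in front, the caller sees a forbidden throw instead
    Rejected =
        try
            validate_map(42, <<"by_node not an object">>)
        catch
            throw:{forbidden, Msg} -> {forbidden, Msg}
        end,
    {Crash, Rejected}.
```

Calling the guard on `ByNode` and `ByRange` before iterating over them would turn the crash into an ordinary validation failure.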







[GitHub] [couchdb] jaydoane commented on a change in pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
jaydoane commented on a change in pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796#discussion_r735064868



##########
File path: src/mem3/src/mem3_bdu.erl
##########
@@ -0,0 +1,106 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(mem3_bdu).
+
+
+-export([
+    before_doc_update/3
+]).
+
+
+-include_lib("couch/include/couch_db.hrl").
+
+
+-spec before_doc_update(#doc{}, Db::any(), couch_db:update_type()) -> #doc{}.
+before_doc_update(#doc{id = <<?DESIGN_DOC_PREFIX, _/binary>>} = Doc, _Db, _UpdateType) ->
+    % Skip design docs
+    Doc;
+
+before_doc_update(#doc{deleted = true} = Doc, _Db, _UpdateType) ->
+    % Skip deleted
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, replicated_changes) ->
+    % Skip internal replicator updates
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, _UpdateType) ->
+    Body1 = couch_util:json_encode(Doc#doc.body),
+    Body2 = couch_util:json_decode(Body1, [return_maps]),
+    validate(Body2),
+    Doc.
+
+
+validate(#{} = Body) ->
+    validate_key(<<"by_node">>, Body, ["by_node is mandatory"]),
+    validate_key(<<"by_range">>, Body, ["by_range is mandatory"]),
+    ByNode = maps:get(<<"by_node">>, Body),
+    ByRange = maps:get(<<"by_range">>, Body),
+    % "by_node": {
+    %    "node1@xxx.xxx.xxx.xxx": ["00000000-1fffffff",...]
+    % ]}
+    maps:map(fun(Node, Ranges) ->
+        validate_by_node(Node, Ranges, ByRange)
+    end, ByNode),
+    % "by_range": {
+    %   "00000000-1fffffff": ["node1@xxx.xxx.xxx.xxx", ...]
+    % ]}
+    maps:map(fun(Range, Nodes) ->
+        validate_by_range(Range, Nodes, ByNode)
+    end, ByRange).
+
+
+validate_by_node(Node, Ranges, ByRange) ->
+    validate_array(Ranges, ["by_node", Ranges, "value not an array"]),
+    lists:foreach(fun(Range) ->
+        validate_key(Range, ByRange, ["by_range for", Range, "missing"]),
+        Nodes = maps:get(Range, ByRange),
+        validate_member(Node, Nodes, ["by_range for", Range, "missing", Node])
+    end, Ranges).
+
+
+validate_by_range(Range, Nodes, ByNode) ->
+    validate_array(Nodes, ["by_range", Nodes, "value not an array"]),
+    lists:foreach(fun(Node) ->
+        validate_key(Node, ByNode, ["by_node for", Node, "missing"]),
+        Ranges = maps:get(Node, ByNode),
+        validate_member(Range, Ranges, ["by_node for", Node, "missing", Range])
+    end, Nodes).
+
+
+validate_array(Val, _ErrMsg) when is_list(Val) ->
+    ok;
+validate_array(_Val, ErrMsg) ->
+    throw({forbidden, errmsg(ErrMsg)}).
+
+
+validate_key(Key, #{} = Map, ErrMsg) ->
+    case maps:is_key(Key, Map) of
+        true -> ok;
+        false -> throw({forbidden, errmsg(ErrMsg)})
+    end;
+validate_key(_Key, _Map, ErrMsg) ->
+    throw({forbidden, errmsg(ErrMsg)}).

Review comment:
       This is another line that doesn't have test coverage. Is it possible for this to happen?

##########
File path: src/mem3/src/mem3_bdu.erl
##########
@@ -0,0 +1,106 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(mem3_bdu).
+
+
+-export([
+    before_doc_update/3
+]).
+
+
+-include_lib("couch/include/couch_db.hrl").
+
+
+-spec before_doc_update(#doc{}, Db::any(), couch_db:update_type()) -> #doc{}.
+before_doc_update(#doc{id = <<?DESIGN_DOC_PREFIX, _/binary>>} = Doc, _Db, _UpdateType) ->
+    % Skip design docs
+    Doc;
+
+before_doc_update(#doc{deleted = true} = Doc, _Db, _UpdateType) ->
+    % Skip deleted
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, replicated_changes) ->
+    % Skip internal replicator updates
+    Doc;

Review comment:
       This is the other BDU clause not covered by eunit tests, and I believe it is also the only functional difference from the VDU implementation. Why was it added, and do you think it's worth adding a test case?

##########
File path: src/mem3/src/mem3_bdu.erl
##########
@@ -0,0 +1,106 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(mem3_bdu).
+
+
+-export([
+    before_doc_update/3
+]).
+
+
+-include_lib("couch/include/couch_db.hrl").
+
+
+-spec before_doc_update(#doc{}, Db::any(), couch_db:update_type()) -> #doc{}.
+before_doc_update(#doc{id = <<?DESIGN_DOC_PREFIX, _/binary>>} = Doc, _Db, _UpdateType) ->
+    % Skip design docs
+    Doc;
+
+before_doc_update(#doc{deleted = true} = Doc, _Db, _UpdateType) ->
+    % Skip deleted
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, replicated_changes) ->
+    % Skip internal replicator updates
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, _UpdateType) ->
+    Body1 = couch_util:json_encode(Doc#doc.body),
+    Body2 = couch_util:json_decode(Body1, [return_maps]),

Review comment:
       It seems a pity to go through this presumably expensive dance just to get a map here. I guess adding a `#doc.body_map` field would be tricky, but could `#doc.meta` be (ab)used for something sneaky like e.g.
   ```erlang
   {body_map, #{} = BodyMap}
   ```
   ?

##########
File path: src/mem3/src/mem3_bdu.erl
##########
@@ -0,0 +1,106 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(mem3_bdu).
+
+
+-export([
+    before_doc_update/3
+]).
+
+
+-include_lib("couch/include/couch_db.hrl").
+
+
+-spec before_doc_update(#doc{}, Db::any(), couch_db:update_type()) -> #doc{}.
+before_doc_update(#doc{id = <<?DESIGN_DOC_PREFIX, _/binary>>} = Doc, _Db, _UpdateType) ->
+    % Skip design docs
+    Doc;

Review comment:
       I added eunit coverage to mem3 and this module has 91% test coverage, but this clause is one exception. Do you think it's worth testing this case?







[GitHub] [couchdb] jaydoane commented on a change in pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
jaydoane commented on a change in pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796#discussion_r735064610



##########
File path: src/mem3/src/mem3_bdu.erl
##########
@@ -0,0 +1,106 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(mem3_bdu).
+
+
+-export([
+    before_doc_update/3
+]).
+
+
+-include_lib("couch/include/couch_db.hrl").
+
+
+-spec before_doc_update(#doc{}, Db::any(), couch_db:update_type()) -> #doc{}.
+before_doc_update(#doc{id = <<?DESIGN_DOC_PREFIX, _/binary>>} = Doc, _Db, _UpdateType) ->
+    % Skip design docs
+    Doc;

Review comment:
       I added eunit coverage to mem3 and this module has 91% test coverage, but this clause is one exception. Do you think it's worth testing this case?
   
   I also pushed the `mem3-coverage` branch in case you want to include it in this PR. Otherwise I can make a separate PR after this is merged.







[GitHub] [couchdb] jaydoane commented on pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
jaydoane commented on pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796#issuecomment-951109770


   Seeing [this failure](https://ci-couchdb.apache.org/job/jenkins-cm1/job/PullRequests/job/PR-3799/2/execution/node/112/log/?cloudbees-analytics-link=scm-reporting%2Ferror%2Fstep):
   ```
   22:30:31  module 'mem3_bdu_test'
   22:30:31    mem3 bdu shard doc tests
   22:30:31      mem3_bdu_test:62: mem3_bdu_shard_doc_test_ (t_can_insert_shard_map_doc)...*failed*
   22:30:31  in function mem3_bdu_test:'-t_can_insert_shard_map_doc/1-fun-2-'/2 (test/eunit/mem3_bdu_test.erl, line 93)
   22:30:31  in call from eunit_test:run_testfun/1 (eunit_test.erl, line 71)
   22:30:31  in call from eunit_proc:run_test/1 (eunit_proc.erl, line 510)
   22:30:31  in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 335)
   22:30:31  in call from eunit_proc:handle_test/2 (eunit_proc.erl, line 493)
   22:30:31  in call from eunit_proc:tests_inorder/3 (eunit_proc.erl, line 435)
   22:30:31  in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 325)
   22:30:31  in call from eunit_proc:run_group/2 (eunit_proc.erl, line 549)
   22:30:31  **error:{assertMatch,[{module,mem3_bdu_test},
   22:30:31                {line,93},
   22:30:31                {expression,"req ( get , Top ++ Db )"},
   22:30:31                {pattern,"{ 200 , _ }"},
   22:30:31                {value,{500,
   22:30:31                        #{<<"error">> => <<"internal_server_error">>,
   22:30:31                          <<"reason">> => <<"No DB shards could be opened"...>>,
   22:30:31                          <<"ref">> => 2416970314}}}]}
   22:30:31    output:<<"">>
   ```





[GitHub] [couchdb] nickva commented on a change in pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
nickva commented on a change in pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796#discussion_r735074287



##########
File path: src/mem3/src/mem3_bdu.erl
##########
@@ -0,0 +1,106 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(mem3_bdu).
+
+
+-export([
+    before_doc_update/3
+]).
+
+
+-include_lib("couch/include/couch_db.hrl").
+
+
+-spec before_doc_update(#doc{}, Db::any(), couch_db:update_type()) -> #doc{}.
+before_doc_update(#doc{id = <<?DESIGN_DOC_PREFIX, _/binary>>} = Doc, _Db, _UpdateType) ->
+    % Skip design docs
+    Doc;
+
+before_doc_update(#doc{deleted = true} = Doc, _Db, _UpdateType) ->
+    % Skip deleted
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, replicated_changes) ->
+    % Skip internal replicator updates
+    Doc;

Review comment:
       The idea is to avoid performing the extra validation step again when the internal replicator pushes the updates around, since we already performed this validation once during the initial update on the node where we added the change. Users can't typically replicate into the _dbs database, as it's not a clustered database, so it shouldn't be an issue.
   
   I think this was also the behavior with VDUs previously: at https://github.com/apache/couchdb/blob/3.x/src/couch/src/couch_db.erl#L889-L891 we use the io_priority to skip the VDU check if the update came from the internal replicator.

##########
File path: src/mem3/src/mem3_bdu.erl
##########
@@ -0,0 +1,106 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(mem3_bdu).
+
+
+-export([
+    before_doc_update/3
+]).
+
+
+-include_lib("couch/include/couch_db.hrl").
+
+
+-spec before_doc_update(#doc{}, Db::any(), couch_db:update_type()) -> #doc{}.
+before_doc_update(#doc{id = <<?DESIGN_DOC_PREFIX, _/binary>>} = Doc, _Db, _UpdateType) ->
+    % Skip design docs
+    Doc;
+
+before_doc_update(#doc{deleted = true} = Doc, _Db, _UpdateType) ->
+    % Skip deleted
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, replicated_changes) ->
+    % Skip internal replicator updates
+    Doc;
+
+before_doc_update(#doc{} = Doc, _Db, _UpdateType) ->
+    Body1 = couch_util:json_encode(Doc#doc.body),
+    Body2 = couch_util:json_decode(Body1, [return_maps]),

Review comment:
       Hmm, interesting idea. Adding a field to `meta` could mean re-encoding every single doc body, while in this case we're doing it only when the shard map is updated.
   
   Compared to a VDU we should still be a bit ahead: previously we encoded the doc, sent it to an external process, then decoded, re-encoded, and decoded the response again, so this should still be a bit cheaper.
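
For comparison with the encode/decode round trip discussed in this thread, a direct conversion from the EJSON term shape (objects as `{Props}` tuples, arrays as lists) to nested maps can be sketched as below. This is an illustrative alternative only, not what the PR does; the module name `ejson_map_sketch` and the function `to_map/1` are hypothetical, and the sketch assumes object keys are already binaries.

```erlang
-module(ejson_map_sketch).
-export([to_map/1]).

%% Convert an EJSON term to nested maps without going through JSON text.
%% Objects are {Props} tuples, arrays are lists, anything else is a scalar.
to_map({Props}) when is_list(Props) ->
    maps:from_list([{K, to_map(V)} || {K, V} <- Props]);
to_map(List) when is_list(List) ->
    [to_map(V) || V <- List];
to_map(Scalar) ->
    Scalar.
```

One behavioral difference to keep in mind: the JSON round trip also normalizes the body through the encoder, whereas this walk passes scalars through untouched, and with duplicate keys `maps:from_list/1` keeps the last value.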







[GitHub] [couchdb] nickva commented on pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
nickva commented on pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796#issuecomment-950346033


   @jaydoane thanks for taking a look.
   
   I added tests for the _design and replicated_changes clauses (it turns out we accept shard docs with just `by_node` perfectly fine, who knew...). I also added the cover logic so we get coverage reports from mem3 tests.
   
   Coverage for `mem3_bdu` is now 100%:
   
   ```
   
   Code Coverage:
   mem3                           :  52%
   mem3_app                       : 100%
   mem3_bdu                       : 100%
   ...
   ```





[GitHub] [couchdb] jaydoane commented on pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
jaydoane commented on pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796#issuecomment-951150950


   @nickva I couldn't repro locally either, but it seems the CI could, multiple times in a row, for some reason. Thanks for the quick fix!





[GitHub] [couchdb] nickva commented on a change in pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
nickva commented on a change in pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796#discussion_r735072919



##########
File path: src/mem3/src/mem3_bdu.erl
##########
@@ -0,0 +1,106 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(mem3_bdu).
+
+
+-export([
+    before_doc_update/3
+]).
+
+
+-include_lib("couch/include/couch_db.hrl").
+
+
+-spec before_doc_update(#doc{}, Db::any(), couch_db:update_type()) -> #doc{}.
+before_doc_update(#doc{id = <<?DESIGN_DOC_PREFIX, _/binary>>} = Doc, _Db, _UpdateType) ->
+    % Skip design docs
+    Doc;

Review comment:
       Good catch, I'll add the coverage setup from mem3-coverage and add a test for the design doc case.







[GitHub] [couchdb] nickva merged pull request #3796: Move custodian VDU to a BDU and fix _all_dbs off-by-one limit bug

Posted by GitBox <gi...@apache.org>.
nickva merged pull request #3796:
URL: https://github.com/apache/couchdb/pull/3796


   

