You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@couchdb.apache.org by nickva <gi...@git.apache.org> on 2017/01/24 06:00:28 UTC

[GitHub] couchdb-couch-replicator pull request #52: Use mem3 to discover all _replica...

GitHub user nickva opened a pull request:

    https://github.com/apache/couchdb-couch-replicator/pull/52

    Use mem3 to discover all _replicator shards in replicator manager

    Previously this was done via recursive db directory traversal, looking for
    shards names ending in `_replicator`. However, if there are orphanned shard
    files (not associated with a clustered db), replicator manager crashes. It
    restarts eventually, but as long as the orphanned shard file
    without an entry in dbs db is present on the file system, replicator manager
    will keep crashing and never reach some replication documents in shards which
    would be traversed after the problematic shard. The user-visible effect of this
    is some replication documents are never triggered.
    
    To fix, use mem3 to traverse and discover `_replicator` shards. This was used
    Cloudant's production code for many years it is battle-tested and it doesn't
    suffer from file system vs mem3 inconsistency.
    
    Local `_replicator` db is a special case. Since it is not clustered it will
    not appear in the clustered db list. However it is already handled as a special
    case in `init(_)` so that behavior is not affected by this change.
    
    COUCHDB-3277

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3277

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch-replicator/pull/52.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #52
    
----
commit 8205420d4249cea98ec5568344c43ccf11bbc9b1
Author: Nick Vatamaniuc <va...@apache.org>
Date:   2017-01-24T05:35:32Z

    Use mem3 to discover all _replicator shards in replicator manager
    
    Previously this was done via recursive db directory traversal, looking for
    shards names ending in `_replicator`. However, if there are orphanned shard
    files (not associated with a clustered db), replicator manager crashes. It
    restarts eventually, but as long as the orphanned shard file
    without an entry in dbs db is present on the file system, replicator manager
    will keep crashing and never reach some replication documents in shards which
    would be traversed after the problematic shard. The user-visible effect of this
    is some replication documents are never triggered.
    
    To fix, use mem3 to traverse and discover `_replicator` shards. This was used
    Cloudant's production code for many years it is battle-tested and it doesn't
    suffer from file system vs mem3 inconsistency.
    
    Local `_replicator` db is a special case. Since it is not clustered it will
    not appear in the clustered db list. However it is already handled as a special
    case in `init(_)` so that behavior is not affected by this change.
    
    COUCHDB-3277

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator issue #52: Use mem3 to discover all _replicator sha...

Posted by rnewson <gi...@git.apache.org>.
Github user rnewson commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/52
  
    +1.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator pull request #52: Use mem3 to discover all _replica...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/couchdb-couch-replicator/pull/52


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator pull request #52: Use mem3 to discover all _replica...

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch-replicator/pull/52#discussion_r97555698
  
    --- Diff: src/couch_replicator_manager.erl ---
    @@ -980,3 +990,34 @@ get_json_value(Key, Props, Default) when is_binary(Key) ->
             Else ->
                 Else
         end.
    +
    +
    --- End diff --
    
    The following is missing which leads to compilation error about undefined macro `?_test`
    ```
    -ifdef(TEST).
    -include_lib("couch/include/couch_eunit.hrl").
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator issue #52: Use mem3 to discover all _replicator sha...

Posted by kxepal <gi...@git.apache.org>.
Github user kxepal commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/52
  
    @nickva 
    Great numbers then! 
    P.S. \U0001f596  (;


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator issue #52: Use mem3 to discover all _replicator sha...

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/52
  
    Did an unscientific performance benchmark. Created 280 replicator replicator dbs. Then logged how long it took scan_all_dbs to scan all dbs in the file system and from dbs:
    
    ```xxxxxxxx scan_all_dbs done in 0.330834 sec```
    vs 
    ``` xxxxxxx scan all_dbs done in 2.277361 sec```  with a total of 2249 directories and 2540 files


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator issue #52: Use mem3 to discover all _replicator sha...

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/52
  
    @kxepal Oh good point. And Hey! It's been a while :-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator issue #52: Use mem3 to discover all _replicator sha...

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/52
  
    Tests pass locally. +1
    ```
        Running test function(s):
          couch_replicator_manager:replicator_shards_test_/0
    ======================== EUnit ========================
    couch_replicator_manager:1012: t_pass_replicator_shard...[0.219 s] ok
    couch_replicator_manager:1022: t_fail_non_replicator_shard...[0.072 s] ok
    =======================================================
      All 2 tests passed.
    ``` 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator issue #52: Use mem3 to discover all _replicator sha...

Posted by kxepal <gi...@git.apache.org>.
Github user kxepal commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/52
  
    @nickva I hope these values should be read  as "after" and "before", not in reverse?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator pull request #52: Use mem3 to discover all _replica...

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch-replicator/pull/52#discussion_r97558473
  
    --- Diff: src/couch_replicator_manager.erl ---
    @@ -980,3 +990,34 @@ get_json_value(Key, Props, Default) when is_binary(Key) ->
             Else ->
                 Else
         end.
    +
    +
    --- End diff --
    
    @iilyak good catch, I messed up a merge


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator pull request #52: Use mem3 to discover all _replica...

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch-replicator/pull/52#discussion_r97556211
  
    --- Diff: src/couch_replicator_manager.erl ---
    @@ -931,24 +931,34 @@ strip_credentials({Props}) ->
         {lists:keydelete(<<"oauth">>, 1, Props)}.
     
     scan_all_dbs(Server) when is_pid(Server) ->
    -    Root = config:get("couchdb", "database_dir", "."),
    -    NormRoot = couch_util:normpath(Root),
    -    filelib:fold_files(Root, "_replicator(\\.[0-9]{10,})?.couch$", true,
    -        fun(Filename, Acc) ->
    -	    % shamelessly stolen from couch_server.erl
    -            NormFilename = couch_util:normpath(Filename),
    -            case NormFilename -- NormRoot of
    -                [$/ | RelativeFilename] -> ok;
    -                RelativeFilename -> ok
    -            end,
    -            DbName = ?l2b(filename:rootname(RelativeFilename, ".couch")),
    -            Jitter = jitter(Acc),
    -            spawn_link(fun() ->
    -                timer:sleep(Jitter),
    -                gen_server:cast(Server, {resume_scan, DbName})
    -            end),
    -	    Acc + 1
    -	end, 1).
    +    {ok, Db} = mem3_util:ensure_exists(
    +        config:get("mem3", "shard_db", "dbs")),
    +    ChangesFun = couch_changes:handle_changes(#changes_args{}, nil, Db, nil),
    +    ChangesFun(fun({change, {Change}, _}, _) ->
    +        DbName = couch_util:get_value(<<"id">>, Change),
    +        case DbName of <<"_design/", _/binary>> -> ok; _Else ->
    +            case couch_replicator_utils:is_deleted(Change) of
    +            true ->
    +                ok;
    +            false ->
    +                [gen_server:cast(Server, {resume_scan, ShardName})
    --- End diff --
    
    I stopped here for a bit because original code calls resume_scan for the DbName and not ShardName. However we don't use fabric here and open shards directly. It looks good to me. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---