Posted to notifications@couchdb.apache.org by iilyak <gi...@git.apache.org> on 2016/07/29 21:42:37 UTC

[GitHub] couchdb-couch-replicator pull request #44: Inject random delays in scan_all_...

GitHub user iilyak opened a pull request:

    https://github.com/apache/couchdb-couch-replicator/pull/44

    Inject random delays in scan_all_dbs

    couch_replication_server scans the filesystem to find all _replication
    databases. For every database found it does
    
        gen_server:cast(Server, {resume_scan, DbName})
    
    Extract an independent process that does the gen_server:cast after a random
    delay. This effectively removes the stampede and randomizes the order in
    which we process _replication databases.
    
    COUCHDB-3088

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch-replicator 69914-insert-random-delays

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch-replicator/pull/44.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #44
    
----
commit 5715c5e25dba61442834b08d7d7202b185341a87
Author: ILYA Khlopotov <ii...@ca.ibm.com>
Date:   2016-07-29T21:32:02Z

    Inject random delays in scan_all_dbs
    
    couch_replication_server scans the filesystem to find all _replication
    databases. For every database found it does
    
        gen_server:cast(Server, {resume_scan, DbName})
    
    Extract an independent process that does the gen_server:cast after a random
    delay. This effectively removes the stampede and randomizes the order in
    which we process _replication databases.
    
    COUCHDB-3088

----
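
For readers following the thread, the pattern described in the commit message can be sketched roughly as below. This is an illustration only, not the code from the pull request: the module name `jitter_sketch`, the function `cast_after_random_delay/3`, and the `MaxDelayMs` parameter are all hypothetical, and `rand:uniform/1` is simply one standard way to pick the delay.

```erlang
-module(jitter_sketch).
-export([cast_after_random_delay/3]).

%% Hypothetical helper (not taken from the pull request).
%% Spawns a short-lived linked process that sleeps for a random
%% interval between 1 and MaxDelayMs milliseconds and then casts
%% {resume_scan, DbName} to Server. The caller returns immediately,
%% so the scanner itself never sleeps.
cast_after_random_delay(Server, DbName, MaxDelayMs)
  when is_integer(MaxDelayMs), MaxDelayMs > 0 ->
    spawn_link(fun() ->
        %% rand:uniform(N) returns an integer in 1..N.
        timer:sleep(rand:uniform(MaxDelayMs)),
        gen_server:cast(Server, {resume_scan, DbName})
    end).
```

Under these assumptions the scanner would call `jitter_sketch:cast_after_random_delay(Server, DbName, 60000)` once per database found; the value 60000 is only an example, chosen to match the one-minute maximum mentioned later in the thread.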


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    +1 with maybe switching to spawn_link


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by kxepal <gi...@git.apache.org>.
Github user kxepal commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    That is the [second](https://github.com/apache/couchdb-couch-replicator/pull/37) jitter implementation within the same module. Maybe generalize that concept somehow?


---

[GitHub] couchdb-couch-replicator pull request #44: Inject random delays in scan_all_...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/couchdb-couch-replicator/pull/44


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    Maybe. I tried to, but it is not that straightforward. I'll try again using a different approach.


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    @iilyak agreed, it would block it. There are other instances as well, for example calculating replication ids when turning a replication doc into a replication record (the id needs a hash of the user filter's JavaScript code, but if the network is slow, that would block the gen_server and prevent all other requests there from making progress).
    
    We're working on improving that as a separate effort, but that is a bigger restructure. In this case the random sleep approach makes sense -- it should let replicator_manager process some other requests interleaved with `resume` messages.


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by kxepal <gi...@git.apache.org>.
Github user kxepal commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    @iilyak Just a side thought: would moving scan_all_dbs out of init give any benefit?


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    @kxepal: I'm trying to avoid complex changes. Otherwise there is a risk that the fix won't be merged because of the code freeze.


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    Right. 


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    Oops


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    Thinking about it a bit more, it is nicer to have a random sleep in a spawned process as opposed to an inline sleep in the scanner, so let's keep that pattern. With an inline sleep it would be hard to reason about the maximum time the scanner would run. With spawned processes the maximum would be the total time to traverse the fs plus at most 1 minute.


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    @kxepal: Updated to avoid doing the same calculation in two places.


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    @kxepal: The commit has been updated


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by kxepal <gi...@git.apache.org>.
Github user kxepal commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    Ok.


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    > @nickva: Given that replicator_manager is a single bottleneck and will process requests serially anyway (so if it is blocked, resume,... messages will be queued up in its mailbox)
    
    This would be a problem since couch_replication_manager wouldn't be able to handle other types of messages until it has cleared all resume_scan messages. It was not a problem before because we didn't do much in `handle_cast({resume_scan`. But now we are reading from the db and possibly updating a document.


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    @iilyak it is not called and waited on, it is spawn_link-ed from there and shouldn't delay init in either case
    
     `ScanPid = spawn_link(fun() -> scan_all_dbs(Server) end),`


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    Given that replicator_manager is a single bottleneck and will process requests serially anyway (so if it is blocked, `resume,...` messages will be queued up in its mailbox), is it worth just inserting a predictable sleep based on Acc right inside scan_all_dbs? Or do you think there is an advantage to quickly parsing the file system and spawning waiter processes?


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    @nickva: We cannot put delays in scan_all_dbs because it is called from the init function. Otherwise the supervisor will kill us.


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    @iilyak why not use spawn_link? I wonder if that would be better.



---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by kxepal <gi...@git.apache.org>.
Github user kxepal commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    Thanks! LGFM.


---

[GitHub] couchdb-couch-replicator issue #44: Inject random delays in scan_all_dbs

Posted by kxepal <gi...@git.apache.org>.
Github user kxepal commented on the issue:

    https://github.com/apache/couchdb-couch-replicator/pull/44
  
    @iilyak Looks good except that [here](https://github.com/cloudant/couchdb-couch-replicator/blob/a3b7158a9dc9f1243d84c2e77dc18b8ef24eae30/src/couch_replicator_manager.erl#L130) we'll crash calling an undefined function.


---