Posted to dev@couchdb.apache.org by "Alex Markham (JIRA)" <ji...@apache.org> on 2011/07/21 17:13:57 UTC

[jira] [Created] (COUCHDB-1231) Replication times out sporadically

Replication times out sporadically 
-----------------------------------

                 Key: COUCHDB-1231
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1231
             Project: CouchDB
          Issue Type: Bug
          Components: Replication
    Affects Versions: 1.0.2, 1.0.3
         Environment: CentOS 5.6 64 bit, XFS HDD drive. Spidermonkey 1.9.2 or 1.7
            Reporter: Alex Markham
         Attachments: Couchdb Filtered replication source timeout .txt, Couchdb Filtered replication target timeout .txt

We have a setup replicating 7 databases from a master to a slave; 2 of the databases use filters. Replication is failing for one of these databases (the infrequently updated one). A cron job triggers replication once per minute, and these stack traces appear often in the logs.

The network is a gigabit LAN, or 2 VMs on the same host (the same result is seen on both).
The replication job is started by SSHing into the target and using curl to POST to the local CouchDB's _replicate endpoint, with the source database given as a remote URL:
Source -> Target

 ssh TargetServer 'curl -sX POST -H "Content-Type: application/json" http://localhost:5984/_replicate -d "{\"source\":\"http://SourceServer:5984/DataBase\",\"target\":\"DataBase\",\"continuous\":true,\"filter\":\"productionfilter/notProcessingJob\"}"'
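
For readers who want to reproduce the request without shell quoting concerns, here is a minimal Python sketch that builds the same POST to _replicate. The function name build_replication_request and its parameters are hypothetical, chosen for illustration; the URLs, database name, and filter name are the placeholders from the report. The sketch only constructs the request and does not send it.

```python
import json
import urllib.request


def build_replication_request(source_url, target_db, filter_name,
                              couch_url="http://localhost:5984"):
    """Build (but do not send) a POST to the target's _replicate endpoint.

    Hypothetical helper: the name and signature are assumptions made for
    this example, not part of CouchDB or of the original report.
    """
    body = {
        "source": source_url,
        "target": target_db,
        "continuous": True,
        "filter": filter_name,
    }
    return urllib.request.Request(
        couch_url + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_replication_request(
    "http://SourceServer:5984/DataBase",
    "DataBase",
    "productionfilter/notProcessingJob",
)
# To actually start the replication, this would be run on the target host:
# urllib.request.urlopen(req)
```

Serializing the body with json.dumps avoids the nested shell quoting that the ssh/curl form requires.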


changes_timeout is not defined in the ini files.

Logs with the stack traces from the source couch and the target couch are attached.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (COUCHDB-1231) Replication times out sporadically

Posted by "Alex Markham (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Markham updated COUCHDB-1231:
----------------------------------

    Attachment: Couchdb Filtered replication target timeout .txt
                Couchdb Filtered replication source timeout .txt

Log snippets attached.


[jira] [Commented] (COUCHDB-1231) Replication times out sporadically

Posted by "Hans-D. Böhlau (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072757#comment-13072757 ] 

Hans-D. Böhlau commented on COUCHDB-1231:
-----------------------------------------

We noticed exactly the same effect in our project (using CouchDB 1.0.2). Using filtered replication to regularly update a second database is not stable as soon as we have a few hundred documents in the source database. We see the same timeout entries in our log file.

Issuing the (filtered) changes request manually shows that the response takes longer and longer to arrive as the number of documents (and/or changes) in the database increases.

Maybe it's important: We have a lot of documents with attachments.

Best regards,
Hans


[jira] [Commented] (COUCHDB-1231) Replication times out sporadically

Posted by "Robert Newson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069020#comment-13069020 ] 

Robert Newson commented on COUCHDB-1231:
----------------------------------------

The 'Reason for termination ==  changes_timeout' points at the internal use of the timer module rather than anything network related. I took a quick look at how the timer is set (and cancelled) and it looks OK. It does appear to be reset when a heartbeat is received, even if there's a filter.
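
To make the expected behaviour concrete, here is a small Python sketch of an inactivity timer that is reset by any feed activity, heartbeats included. This is not CouchDB's Erlang code; the class name InactivityTimer, its methods, and the 30-second timeout are assumptions made purely for illustration. The clock is injectable so the behaviour can be checked without waiting.

```python
import time


class InactivityTimer:
    """Illustrative watchdog: expires when no activity is seen for timeout_s.

    Hypothetical class for this example; CouchDB itself uses Erlang's
    timer module, which is not modelled here.
    """

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_activity = clock()

    def touch(self):
        # Called for every line read from the feed, heartbeat or change row.
        self._last_activity = self._clock()

    def expired(self):
        # True once timeout_s has elapsed since the last touch().
        return self._clock() - self._last_activity >= self._timeout_s
```

If heartbeats arrive more often than the timeout and each one calls touch(), the timer should never expire on an idle but healthy feed. A changes_timeout under those conditions would suggest that heartbeats are not arriving in time, for example because the source is busy evaluating the filter.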


[jira] [Commented] (COUCHDB-1231) Replication times out sporadically

Posted by "Alex Markham (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069019#comment-13069019 ] 

Alex Markham commented on COUCHDB-1231:
---------------------------------------

I should add that if I manually poll the changes URL with the filter on using curl, it seems to work fine (though not tested for long periods):
http://SourceServer:5984/DataBase/_changes?filter=productionfilter/notProcessingJob&style=all_docs&heartbeat=10000&since=40034&feed=continuous
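
As a rough sketch of the same manual check in Python, the snippet below builds that changes URL and tells heartbeats apart from change rows in a continuous feed. The helper names changes_url and parse_feed_line are hypothetical; the query parameters are the ones from the URL above. It relies on the documented behaviour that a heartbeat in a continuous feed is sent as an empty line.

```python
import json
from urllib.parse import urlencode


def changes_url(db_url, filter_name, since, heartbeat_ms=10000):
    """Build the filtered continuous _changes URL (hypothetical helper)."""
    query = {
        "filter": filter_name,
        "style": "all_docs",
        "heartbeat": heartbeat_ms,
        "since": since,
        "feed": "continuous",
    }
    # safe="/" keeps the slash in "designdoc/filtername" unescaped.
    return db_url + "/_changes?" + urlencode(query, safe="/")


def parse_feed_line(line):
    """Return the decoded change row, or None for a heartbeat (blank line)."""
    line = line.strip()
    if not line:
        return None
    return json.loads(line)


url = changes_url(
    "http://SourceServer:5984/DataBase",
    "productionfilter/notProcessingJob",
    since=40034,
)
```

Logging the arrival time of each line, heartbeats included, over a longer run would show whether the gaps between lines ever exceed the replicator's timeout.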
