Posted to user@couchdb.apache.org by Oleg Cohen <ol...@assurebridge.com> on 2016/05/06 18:18:06 UTC

CouchDB 2.0 cluster problem when first node is down

Greetings,

We ran into an issue when testing CouchDB 2.0 clustering. We ran a 2-node
cluster using the dev/run -n 2 command.

If we bring node2 down, everything on node1 still works fine. When we bring
down node1 by connecting with remsh and issuing the init:stop(). command, the
databases on node2 are no longer readable.

We test this by trying to read the _users database using the following
command:

curl -X GET "http://127.0.0.1:25984/_users" --user admin:xxxxxxx

We get the following error:

{"error":"nodedown","reason":"progress not possible"}

If node1 is restarted, the problem goes away.
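
For anyone repeating the test above, the check can be scripted roughly as
follows. This is only a sketch: the function names and the COUCH_AUTH
variable are made up for illustration, the credentials are placeholders, and
the error match relies on the compact JSON shown in the error above.

```shell
#!/bin/sh
# Sketch of a readability check for one database URL.
# ASSUMPTION: COUCH_AUTH and both function names are hypothetical helpers,
# not part of CouchDB; replace the placeholder credentials with real ones.
COUCH_AUTH="${COUCH_AUTH:-admin:xxxxxxx}"

# Succeeds (exit 0) when a response body carries the "nodedown" error.
is_nodedown() {
    printf '%s' "$1" | grep -q '"error":"nodedown"'
}

# Fetch a database URL and report whether it is readable.
# Returns 0 if readable, 1 on nodedown, 2 if the node did not answer at all.
check_db() {
    body=$(curl -sS --max-time 5 --user "$COUCH_AUTH" "$1") || return 2
    if is_nodedown "$body"; then
        echo "nodedown: $1"
        return 1
    fi
    echo "readable: $1"
}
```

After stopping node1, running check_db "http://127.0.0.1:25984/_users" against
node2 should print the nodedown line if the problem is present.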

We experienced the same issue running a 3-node cluster across 3 different
servers.

Has anyone run into the same issue, and is there a workaround or a way to
fix it?

Thank you,
Oleg


-- 
*Oleg Cohen  |  Principal  |  **A S S U R E B R I D G E*
*Office: +1 617 564 0737  |  Mobile: +1 617 455 7927  |  Fax: +1 888 409
6995*
*Email: Oleg.Cohen@assurebridge.com <Ol...@assurebridge.com>  **|
 www.assurebridge.com <http://www.assurebridge.com>*

Re: CouchDB 2.0 cluster problem when first node is down

Posted by Robert Samuel Newson <rn...@apache.org>.
It would be useful to get the response to GET :5984/_membership from both nodes during this period.

B.
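
One way to collect that from a dev/run cluster is sketched below. The port
for node2 (25984) comes from the original report; 15984 for node1 is an
assumption based on the usual dev/run port layout, and the variable and
function names are made up for illustration. Adjust host, ports, and
credentials to the actual setup.

```shell
#!/bin/sh
# Sketch: dump GET /_membership from every node of a dev/run cluster.
# ASSUMPTIONS: 15984 for node1 is assumed (dev/run layout); 25984 for node2
# is taken from the report. COUCH_* names and the credentials are placeholders.
COUCH_HOST="${COUCH_HOST:-127.0.0.1}"
COUCH_PORTS="${COUCH_PORTS:-15984 25984}"
COUCH_AUTH="${COUCH_AUTH:-admin:xxxxxxx}"

# Build the _membership URL for one port.
membership_url() {
    printf 'http://%s:%s/_membership\n' "$COUCH_HOST" "$1"
}

# Print each node's view of the cluster, or a marker if the node is unreachable.
dump_membership() {
    for port in $COUCH_PORTS; do
        url=$(membership_url "$port")
        echo "== $url"
        curl -sS --max-time 5 --user "$COUCH_AUTH" "$url" \
            || echo "(no response from port $port)"
        echo
    done
}
```

Calling dump_membership while node1 is stopped captures node2's view; the
response lists all_nodes and cluster_nodes, which can then be compared with
the output taken while both nodes are up.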

> On 6 May 2016, at 23:33, Jan Lehnardt <ja...@apache.org> wrote:
> 
> Oleg, thanks for the report! This was reported before, but only very
> recently and we are still looking into it. Can you maybe add your setup
> details to the ticket at:
> 
>    https://issues.apache.org/jira/browse/COUCHDB-3009?
> 
> Thank you!
> 
> Best
> Jan
> --
> 


Re: CouchDB 2.0 cluster problem when first node is down

Posted by Jan Lehnardt <ja...@apache.org>.
Oleg, thanks for the report! This was reported before, but only very
recently and we are still looking into it. Can you maybe add your setup
details to the ticket at:

    https://issues.apache.org/jira/browse/COUCHDB-3009?

Thank you!

Best
Jan
--


-- 
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/