Posted to user@accumulo.apache.org by Brendan Mahoney <bt...@gmail.com> on 2015/09/11 14:28:29 UTC
Client fails due to single Zookeeper node failure
Hi,
We have an Accumulo v1.6.1 cluster backed by a 5-node ZooKeeper v3.4.5
ensemble. One of the ZooKeeper nodes crashed, and all Accumulo client
connections (including the shell) now fail with:
ERROR: java.lang.RuntimeException: Failed to connect to zookeeper
(node_11:2181) within 2x zookeeper timeout period 30000.
If we move the bad ZooKeeper node (node_11) to the end of the ZooKeeper
node list in accumulo-site.xml, clients connect successfully. Is the first
ZooKeeper node in the list a single point of failure for our Accumulo
cluster?
Thanks,
Brendan
Re: Client fails due to single Zookeeper node failure
Posted by Eric Newton <er...@gmail.com>.
Specifically: https://issues.apache.org/jira/browse/ACCUMULO-3218
-Eric
On Fri, Sep 11, 2015 at 8:49 AM, Christopher <ct...@apache.org> wrote:
> I believe this was one of the bugs fixed in either 1.6.2 or 1.6.3. There
> was an error parsing the configuration as a list.
Re: Client fails due to single Zookeeper node failure
Posted by Christopher <ct...@apache.org>.
I believe this was one of the bugs fixed in either 1.6.2 or 1.6.3. There
was an error parsing the configuration as a list.
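For anyone hitting the same error before upgrading, the workaround described above (moving the dead node to the end of the list) depends on knowing which ZooKeeper server is down. The following is a minimal Python sketch of one way to check, not anything taken from Accumulo itself. It assumes the node list is a comma-separated host:port string like the one stored in accumulo-site.xml (typically under instance.zookeeper.host), and it uses ZooKeeper's "ruok" four-letter command, which a healthy 3.4.x server answers with "imok". The function names and the default port are illustrative choices, and IPv6 literals are not handled.

```python
import socket


def parse_zk_hosts(value, default_port=2181):
    """Split a comma-separated ZooKeeper host list into (host, port) pairs."""
    hosts = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas or whitespace
        host, sep, port = entry.partition(":")
        hosts.append((host, int(port) if sep else default_port))
    return hosts


def zk_is_ok(host, port, timeout=2.0):
    """Send 'ruok' to one server; return True only if it replies 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:  # server closed the connection early
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def unhealthy_hosts(value):
    """Return the (host, port) pairs from the list that fail the 'ruok' check."""
    return [(h, p) for h, p in parse_zk_hosts(value) if not zk_is_ok(h, p)]
```

Calling unhealthy_hosts("node_11:2181,node_12:2181") would report node_11 if it is still down; node_12 here is a made-up name used only for illustration. Any server reported this way is a candidate for moving to the end of the list until the cluster is upgraded to a release containing the ACCUMULO-3218 fix.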