Posted to user@accumulo.apache.org by Brendan Mahoney <bt...@gmail.com> on 2015/09/11 14:28:29 UTC

Client fails due to single Zookeeper node failure

Hi,
  We have an Accumulo v1.6.1 cluster with a 5-node ZooKeeper v3.4.5
cluster. One of the ZooKeeper nodes crashed, and all Accumulo client
connections (including the shell) now fail with:

ERROR: java.lang.RuntimeException: Failed to connect to zookeeper
(node_11:2181) within 2x zookeeper timeout period 30000.

If we move the bad ZooKeeper node (node_11) to the end of the ZooKeeper
node list in accumulo-site.xml, clients connect successfully. Is the first
ZooKeeper node in the list a single point of failure for our Accumulo
cluster?

Thanks,
   Brendan

Re: Client fails due to single Zookeeper node failure

Posted by Eric Newton <er...@gmail.com>.
Specifically: https://issues.apache.org/jira/browse/ACCUMULO-3218

-Eric


On Fri, Sep 11, 2015 at 8:49 AM, Christopher <ct...@apache.org> wrote:

> I believe this was one of the bugs fixed in either 1.6.2 or 1.6.3. There
> was an error parsing the configuration as a list.

Re: Client fails due to single Zookeeper node failure

Posted by Christopher <ct...@apache.org>.
I believe this was one of the bugs fixed in either 1.6.2 or 1.6.3. There
was an error parsing the configuration as a list.
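
To illustrate what treating the setting "as a list" means, here is a minimal sketch of splitting a comma-separated ZooKeeper host string into individual servers so that no single entry is the only one considered. This is not the Accumulo code or the actual fix; the class name QuorumParser and the hosts node_12 and node_13 are hypothetical, and only node_11:2181 comes from the report above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Accumulo or ZooKeeper.
public class QuorumParser {

    /** Splits a comma-separated host list into trimmed host:port entries. */
    public static List<String> parse(String connectString) {
        List<String> servers = new ArrayList<>();
        for (String entry : connectString.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty entries such as a trailing comma
            }
            // Assumption: fall back to 2181, ZooKeeper's default client port,
            // when an entry does not name a port.
            servers.add(trimmed.contains(":") ? trimmed : trimmed + ":2181");
        }
        return servers;
    }

    public static void main(String[] args) {
        // Every server in the list is a candidate, not just the first one.
        System.out.println(parse("node_11:2181, node_12:2181, node_13"));
    }
}
```

A client that receives the full list can try the remaining servers when one of them, such as node_11, is down.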
