Posted to user@zookeeper.apache.org by Guy Laden <gu...@gmail.com> on 2016/08/25 15:10:33 UTC

Running 3.4 branch ZooKeeper on Linux with iptables

Is anybody running 3.4 branch ZooKeeper on Linux with iptables?

We are running 3.4.6 and have run into conntrack silently expiring the
leader election connections after they have been idle for 5 days
(/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established).
We then see that leader election on some machines sometimes gets stuck for
15 minutes or so, until the TCP socket times out.
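
The "15 minutes or so" is consistent with Linux's TCP retransmission
backoff, although that is an assumption about the cause and not something
confirmed in this thread. A minimal Python sketch of the arithmetic,
assuming the kernel defaults of net.ipv4.tcp_retries2 = 15, a minimum
retransmission timeout (RTO) of 200 ms and a maximum RTO of 120 s:

```python
def retransmit_timeout(retries=15, rto_min=0.2, rto_max=120.0):
    """Total seconds before the kernel gives up on an unacknowledged segment.

    Assumes the RTO starts at rto_min and doubles after every
    retransmission, capped at rto_max (Linux defaults, assumed here).
    """
    total = 0.0
    rto = rto_min
    # 'retries' retransmissions plus the final wait = retries + 1 intervals.
    for _ in range(retries + 1):
        total += min(rto, rto_max)
        rto *= 2
    return total


if __name__ == "__main__":
    seconds = retransmit_timeout()
    print(f"{seconds:.1f} s = {seconds / 60:.1f} min")
```

With these assumed defaults the total comes to roughly 924.6 seconds, a
little over 15 minutes. A real connection starts from an RTO derived from
its measured round-trip time, so the actual figure will vary.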

This JIRA seems to fix this, but only in the 3.5 branch:
https://issues.apache.org/jira/browse/ZOOKEEPER-1748

Does 3.4.8 make a difference to this issue?

If not, this scenario does not seem rare, so perhaps it is something to
add to the wiki? (I will be happy to do it.)

Re: Running 3.4 branch ZooKeeper on Linux with iptables

Posted by Guy Laden <gu...@gmail.com>.
Forgot to mention one other option: increase the conntrack timeout for idle
established TCP connections.
Some issues with this approach: I think the maximum value for this comes
to a bit more than a year. Also, this is a global setting for the machine,
and the conntrack table is of limited size.
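
Before raising the timeout it may help to look at the current value and at
how full the conntrack table already is. A minimal Python sketch, assuming
a Linux host with the nf_conntrack module loaded; the helper names
read_int and seconds_to_days are made up for this example:

```python
from pathlib import Path

# Directory where the netfilter conntrack sysctls are exposed on Linux.
PROC = Path("/proc/sys/net/netfilter")


def read_int(name, base=PROC):
    """Return the integer stored in base/name, or None if it cannot be read."""
    try:
        return int((base / name).read_text().strip())
    except (OSError, ValueError):
        return None


def seconds_to_days(seconds):
    """Convert a timeout in seconds to days."""
    return seconds / 86400


if __name__ == "__main__":
    timeout = read_int("nf_conntrack_tcp_timeout_established")
    count = read_int("nf_conntrack_count")
    maximum = read_int("nf_conntrack_max")
    if timeout is not None:
        days = seconds_to_days(timeout)
        print(f"established timeout: {timeout} s ({days:.1f} days)")
    if count is not None and maximum:
        print(f"conntrack table: {count}/{maximum} entries in use")
```

A timeout of 432000 seconds corresponds to the 5 days mentioned earlier in
the thread. A longer timeout keeps idle entries in the table for longer,
which is why the table size matters.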


On Wed, Aug 31, 2016 at 9:27 PM, Guy Laden <gu...@gmail.com> wrote:

> I may be misunderstanding something, but to the best of my knowledge the
> situation is that if you are running ZooKeeper on Linux with iptables then:
>
> - If you run 3.5 or later, be sure to enable the TCP keepalive flags.
>
> - If you run 3.4.* or earlier, BEWARE, as leader election packets will
>   eventually be dropped.
>      - Your options include:
>          - manually patching ZK to enable TCP keepalive on the leader
>            election connections
>          - running ZK with something like
>            https://github.com/flonatel/libdontdie (I have not tested this)
>          - any other suggestions?

Re: Running 3.4 branch ZooKeeper on Linux with iptables

Posted by Guy Laden <gu...@gmail.com>.
I may be misunderstanding something, but to the best of my knowledge the
situation is that if you are running ZooKeeper on Linux with iptables then:

- If you run 3.5 or later, be sure to enable the TCP keepalive flags.

- If you run 3.4.* or earlier, BEWARE, as leader election packets will
  eventually be dropped.
     - Your options include:
         - manually patching ZK to enable TCP keepalive on the leader
           election connections
         - running ZK with something like
           https://github.com/flonatel/libdontdie (I have not tested this)
         - any other suggestions?
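
To illustrate what "enable TCP keepalive" means at the socket level, here
is a minimal Python sketch. It is not the ZooKeeper patch (ZooKeeper is
Java, where the corresponding call is Socket.setKeepAlive(true)); the
function name enable_keepalive and the idle, interval and probe values are
hypothetical and chosen only for illustration.

```python
import socket


def enable_keepalive(sock, idle_s=600, interval_s=60, probes=5):
    """Turn on TCP keepalive so an idle connection still carries traffic.

    idle_s, interval_s and probes are illustrative values, not
    recommendations from the thread; the idle time only needs to be
    well below the conntrack established timeout.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options are platform-specific, so guard them.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        enable_keepalive(s)
        # A non-zero value means keepalive is enabled on this socket.
        print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
```

Without per-socket tuning, Linux sends the first keepalive probe after
net.ipv4.tcp_keepalive_time seconds (7200 by default), which is still far
below a 5-day conntrack timeout, so enabling the flag alone should keep the
conntrack entry from going idle.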




On Sat, Aug 27, 2016 at 4:22 AM, Patrick Hunt <ph...@apache.org> wrote:

> I've not seen this, but I remember Kishore mentioning they had run with
> iptables-based testing at some point. Kishore, any insight?
>
> Patrick

Re: Running 3.4 branch ZooKeeper on Linux with iptables

Posted by Patrick Hunt <ph...@apache.org>.
I've not seen this, but I remember Kishore mentioning they had run with
iptables-based testing at some point. Kishore, any insight?

Patrick
