Posted to dev@zookeeper.apache.org by Eron Wright <er...@gmail.com> on 2018/02/20 00:14:11 UTC

[Discuss] ZOOKEEPER-2982 DNS negative caching in 3.5

Hello,

I attempted to run ZK 3.5.3-beta in a Kubernetes cluster, using the typical
approach of a StatefulSet plus a pair of Services. I observed that some
of my ZK servers would indefinitely fail to resolve the DNS addresses of
their peers. It is normal that addresses cannot be resolved immediately
at startup because the records are created asynchronously by Kubernetes.
One would expect ZK to keep trying and eventually succeed. Note that
this issue affects 3.5 only; 3.4 seems to work fine.
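
To illustrate the "keep trying" behavior described above, here is a minimal
sketch, not the actual ZooKeeper code or the ZOOKEEPER-2982 patch. It relies
on the fact that a java.net.InetSocketAddress built from a hostname attempts
resolution once at construction and stays unresolved if that lookup fails, so
a retry has to build a new instance. The class name PeerAddressRetry, the
method resolveWithRetry, and the injectable lookup parameter are hypothetical
names introduced only for this example.

```java
import java.net.InetSocketAddress;
import java.util.function.BiFunction;

public class PeerAddressRetry {

    /** Default lookup: each new InetSocketAddress(host, port) attempts resolution again. */
    static final BiFunction<String, Integer, InetSocketAddress> JDK_LOOKUP =
            (host, port) -> new InetSocketAddress(host, port);

    /**
     * Retries the lookup until the address resolves or maxAttempts is reached.
     * Returns the last result, which is still unresolved if every attempt failed.
     * The lookup is injectable (an assumption of this sketch) so the retry
     * logic can be exercised without a real DNS server.
     */
    static InetSocketAddress resolveWithRetry(
            String host, int port, int maxAttempts, long delayMillis,
            BiFunction<String, Integer, InetSocketAddress> lookup)
            throws InterruptedException {
        InetSocketAddress addr = InetSocketAddress.createUnresolved(host, port);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Build a fresh address object instead of reusing the stale one.
            addr = lookup.apply(host, port);
            if (!addr.isUnresolved()) {
                return addr;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(delayMillis);
            }
        }
        return addr;
    }

    public static void main(String[] args) throws InterruptedException {
        InetSocketAddress addr = resolveWithRetry("localhost", 2888, 3, 100L, JDK_LOOKUP);
        System.out.println(addr + " unresolved=" + addr.isUnresolved());
    }
}
```

Note that the JVM also caches failed lookups for a short time, controlled by
the networkaddress.cache.negative.ttl security property, so retries spaced
more closely than that interval may simply return the cached failure.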

I tracked the root cause down to a regression in 3.5. ZOOKEEPER-1506 made
an improvement in 3.4 that wasn't ported to 3.5. I opened ZOOKEEPER-2982 to
track this, and have a PR ready. Could we aim to get the fix into 3.5.4?

Thanks,
Eron Wright

Re: [Discuss] ZOOKEEPER-2982 DNS negative caching in 3.5

Posted by Eron Wright <er...@gmail.com>.
Thanks, Flavio. The PR for the 3.5 branch has been reviewed and approved
by Andor Molnár. Could someone please merge it?

Meanwhile, I will prepare another PR for the master branch.

Thanks

On Tue, Feb 20, 2018 at 12:18 AM, Flavio Junqueira <fp...@apache.org> wrote:

> Thanks for catching this, Eron. It looks like the port to 3.5 is missing
> some changes, as you correctly pointed out:
>
>        https://github.com/apache/zookeeper/commit/d2a49163b7bc7c9589140dbba7f60e591028f908
>
> In particular, changes in Learner.java. I would say this should definitely
> be in 3.5.4.
>
> -Flavio

Re: [Discuss] ZOOKEEPER-2982 DNS negative caching in 3.5

Posted by Flavio Junqueira <fp...@apache.org>.
Thanks for catching this, Eron. It looks like the port to 3.5 is missing some changes, as you correctly pointed out:

       https://github.com/apache/zookeeper/commit/d2a49163b7bc7c9589140dbba7f60e591028f908

In particular, the changes in Learner.java are missing. I would say this should definitely be in 3.5.4.

-Flavio