You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by Andor Molnar <an...@cloudera.com> on 2018/03/06 16:12:47 UTC

Re: Name resolution in StaticHostProvider

Hi Abe,

Unfortunately we haven't got any feedback yet. What do you think of
implementing Option #3?

Regards,
Andor


On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com> wrote:

> Did anybody happen to take a quick look by any chance?
>
> I don't want to push this too hard, because I know it's a time consuming
> topic to think about, but this is a blocker in 3.5 which has been hanging
> around for a while and any feedback would be extremely helpful to close it
> quickly.
>
> Thanks,
> Andor
>
>
>
> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com> wrote:
>
>> Hi all,
>>
>> We need more eyes and brains on the following PR:
>>
>> https://github.com/apache/zookeeper/pull/451
>>
>> I added a comment few days ago about the way we currently do DNS name
>> resolution in this class and a suggestion on how we could simplify things a
>> little bit. We talked about it with Abe Fine, but we're a little bit unsure
>> and cannot get a conclusion. It would be extremely handy to get more
>> feedback from you.
>>
>> To add some colour to it, let me elaborate on the situation here:
>>
>> In general, the task that StaticHostProvider does is to get a list of
>> potentially unresolved InetSocketAddress objects, resolve them and iterate
>> over the resolved objects by calling next() method.
>>
>> *Option #1 (current logic)*
>> - Resolve addresses with getAllByName() which returns a list of IP
>> addresses associated with the address.
>> - Cache all these IP's, shuffle them and iterate over.
>> - If client is unable to connect to an IP, remove all IPs from the list
>> which the original servername was resolved to and re-resolve it.
>>
>> *Option #2 (getByName())*
>> - Resolve address with getByName() instead which returns only the first
>> IP address of the name,
>> - Do not cache IPs,
>> - Shuffle the *names* and resolve with getByName() *every time* when
>> next() is called,
>> - JDK's built-in caching will prevent name servers from being flooded and
>> will do the re-resolution automatically when cache expires,
>> - Names with multiple IPs will be handled by DNS servers which (if
>> configured properly) return IPs in different order - this is called DNS
>> Round Robin -, so getByName() will return different IP on each call.
>>
>> *Options #3*
>> - There's a small problem with option#2: if DNS server is not configured
>> properly and handles the round-robin case in a way that it always return
>> the IP list in the same order, getByName() will never return the next ip,
>> - In order to overcome that, use getAllByName() instead, shuffle the list
>> and return the first IP.
>>
>> All feedback if much appreciated.
>>
>> Thanks,
>> Andor
>>
>>
>>
>>
>

Re: Name resolution in StaticHostProvider

Posted by Andor Molnar <an...@cloudera.com>.
Let's put it this way:

1- The interpretation of next() is "give me the next server's IP address
that I should connect to and do whatever preparation (resolution) is
necessary to do so".
The answer could be:
a) I'm done, here it is,
b) there's a problem with the next item in the list, for instance it's
unreasolvable (or else), please deal with it

When it returns, the caller can decide if it wants to advance to the next
item (call next() again) or exit with a fatal error (list items must always
be resolvable) or exit because it doesn't need to connect anymore (client
closed the socket). (...and there could be a whole bunch of other possible
cases...)

So, to wrap up, if you take a look at the retry logic already built in
ClientCnxn, the while-try-catch block is 100+ lines long. First, it already
has the retry logic, second we don't want to go down the road of copying
part of the logic to the HostProvider's level. Reliability, testability,
corner-cases, etc, etc.

What do you think?






On Tue, May 8, 2018 at 1:55 PM, Flavio Junqueira <fp...@apache.org> wrote:

> The refactoring did not seem justifiable at first, so my reaction to it.
> You have clarified the reason for including the changes, and I actually
> like it.
>
> About the exception, there are two points for me:
>
> 1- You don't really need to throw to execute the plan you described.
> 2- In the case we do throw, which is not entirely unreasonable, I'd think
> about the expected behaviour of a method called "next()". In general, I'd
> expect it to either return me the next item or error saying that it cannot
> return an item. The "UnknownHostException" is not doing the latter, though.
> It is indicating that one of the elements of the host provider set is
> "broken" (not resolvable). That's one interpretation. Another
> interpretation is that the logic in "next" needs to indicate to the caller
> that there is a problem with an item in the set of elements of the host
> provider.
>
> For (2) let's converge on what semantics we are trying to provide first,
> please.
>
> -Flavio
>
> > On 8 May 2018, at 21:20, Andor Molnar <an...@cloudera.com> wrote:
> >
> > Sorry, I thought you were against the whole refactoring.
> >
> > "2- That we discuss separately whether we want to change the behaviour of
> > the next()in the HostProvider interface."
> >
> > From this it seemed to me, it's not just a polishing issue, but maybe
> I've
> > gotten you wrong.
> > Anyway, there're 2 contention points we've encountered so far:
> >
> > 1. Do we need to refactor at all?
> > 2. If we do so, shall next() throw UnknownHostException or deal with
> error
> > inside.
> >
> > And I'd still go with:
> > 1. Yes
> > 2. Yes, throw
> >
> > So, I would leave the PR as it is now.
> >
> > Andor
> >
> >
> >
> >
> >
> > On Tue, May 8, 2018 at 12:11 PM, Flavio Junqueira <fp...@apache.org>
> wrote:
> >
> >> Can you list what the contention points are according to you? Feel free
> to
> >> do that in the PR as well, I want to make sure we have the same
> >> understanding of the points that still need to be resolved. From where I
> >> stand, there is no major issue pending other than one polishing issue
> that
> >> I brought upon in the PR.
> >>
> >> -Flavio
> >>
> >>> On 8 May 2018, at 21:08, Andor Molnar <an...@cloudera.com> wrote:
> >>>
> >>> I'm happy to do that once we have an agreement.
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, May 8, 2018 at 8:34 AM, Flavio Junqueira <fp...@apache.org>
> wrote:
> >>>
> >>>> It might be a good idea to document whatever we end up doing.
> >>>>
> >>>> -Flavio
> >>>>
> >>>>> On 8 May 2018, at 17:22, Andor Molnar <an...@cloudera.com> wrote:
> >>>>>
> >>>>> "If refactoring is necessary to address the issue, then it should be
> >> part
> >>>>> of the PR."
> >>>>>
> >>>>> Agreed. I think it is and it also makes things a lot more simpler, so
> >>>> it's
> >>>>> easier to review.
> >>>>>
> >>>>> "I'm not sure what kind of confirmation you are after here. Could you
> >>>>> elaborate, please?"
> >>>>>
> >>>>> I'm just wondering what could have been the reason for caching host
> >>>>> addresses in StaticHostProvider in the first place.
> >>>>>
> >>>>> "The other solution, if I remember enough of it, tried to avoid
> >> resolving
> >>>>> unnecessarily, but perhaps the DNS client cache is enough..."
> >>>>>
> >>>>> That's exactly my point: what JDK is doing internally is perfectly
> fine
> >>>> for
> >>>>> us and we don't need extra logic here.
> >>>>>
> >>>>> "Could you elaborate on this point, please? What exactly do you mean
> >> with
> >>>>> this statement?"
> >>>>>
> >>>>> See my previous point. Caching is already done in all DNS servers in
> >> the
> >>>>> chain and also there's also caching in JDK already, which means by
> >>>> calling
> >>>>> getByName() you don't necessarily fire a DNS request, only when the
> TTL
> >>>> is
> >>>>> expired.
> >>>>>
> >>>>> Andor
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <fp...@apache.org>
> >> wrote:
> >>>>>
> >>>>>> Hi Andor,
> >>>>>>
> >>>>>> Thanks for your work on addressing the issue.
> >>>>>>
> >>>>>>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> Updating this thread, because the PR is still being review on
> GitHub.
> >>>>>>>
> >>>>>>> So, the reason why I refactored the original behaviour of
> >>>>>>> StaticHostProvider is that I believe that it's trying to do
> something
> >>>>>> which
> >>>>>>> is not its responsibility.
> >>>>>>
> >>>>>> If refactoring is necessary to address the issue, then it should be
> >> part
> >>>>>> of the PR. Otherwise, it is better to refactor in a separate PR.
> >>>>>>
> >>>>>>
> >>>>>>> Please tell me if there's a good historical
> >>>>>>> reason for that.
> >>>>>>
> >>>>>> I'm not sure what kind of confirmation you are after here. Could you
> >>>>>> elaborate, please?
> >>>>>>
> >>>>>>>
> >>>>>>> My approach is giving the user the following to options:
> >>>>>>> 1- Use static IP addresses, if you don't want to deal with DNS
> >>>> resolution
> >>>>>>> at all - we guarantee that no DNS logic will involved in this case
> at
> >>>>>> all.
> >>>>>>
> >>>>>> Sounds fine.
> >>>>>>
> >>>>>>> 2- Use DNS hostnames if you have a reliable DNS service for
> >> resolution
> >>>>>>> (with HA, secondary servers, backups, etc.) - we must use DNS in
> the
> >>>>>> right
> >>>>>>> way in this case e.g. do NOT cache IP address for a longer period
> >> that
> >>>>>> DNS
> >>>>>>> server allows to and re-resolve after TTL expries, because it's
> >>>> mandatory
> >>>>>>> by protocol.
> >>>>>>
> >>>>>> Sounds fine.
> >>>>>>
> >>>>>>>
> >>>>>>> My 2 cents here:
> >>>>>>> - the fix which was originally posted for re-resolution is a
> >> workaround
> >>>>>> and
> >>>>>>> doesn't satisfy the requirement for #2,
> >>>>>>
> >>>>>> The other solution, if I remember enough of it, tried to avoid
> >> resolving
> >>>>>> unnecessarily, but perhaps the DNS client cache is enough to avoid
> the
> >>>>>> unnecessary round-trips. This observation actually brings me to the
> >> next
> >>>>>> point:
> >>>>>>
> >>>>>>> - the solution is already built-in in JDK and DNS clients in the
> >> right
> >>>>>> way
> >>>>>>
> >>>>>> Could you elaborate on this point, please? What exactly do you mean
> >> with
> >>>>>> this statement?
> >>>>>>
> >>>>>>> - can't see a reason why we shouldn't use that
> >>>>>>>
> >>>>>>> I checked this in some other projects as well and found very
> similar
> >>>>>>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
> >>>> plugins
> >>>>>>> for that:
> >>>>>>> - Standard resolver uses java's built-in getByName().
> >>>>>>> - Qualified resolver still uses getByName(), but adds some logic to
> >>>> avoid
> >>>>>>> incorrect re-resolutions and reverse IP lookups.
> >>>>>>>
> >>>>>>> Please let me know your thoughts.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Andor
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com>
> >>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Abe,
> >>>>>>>>
> >>>>>>>> Unfortunately we haven't got any feedback yet. What do you think
> of
> >>>>>>>> implementing Option #3?
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Andor
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <andor@cloudera.com
> >
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Did anybody happen to take a quick look by any chance?
> >>>>>>>>>
> >>>>>>>>> I don't want to push this too hard, because I know it's a time
> >>>>>> consuming
> >>>>>>>>> topic to think about, but this is a blocker in 3.5 which has been
> >>>>>> hanging
> >>>>>>>>> around for a while and any feedback would be extremely helpful to
> >>>>>> close it
> >>>>>>>>> quickly.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Andor
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <
> andor@cloudera.com
> >>>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> We need more eyes and brains on the following PR:
> >>>>>>>>>>
> >>>>>>>>>> https://github.com/apache/zookeeper/pull/451
> >>>>>>>>>>
> >>>>>>>>>> I added a comment few days ago about the way we currently do DNS
> >>>> name
> >>>>>>>>>> resolution in this class and a suggestion on how we could
> simplify
> >>>>>> things a
> >>>>>>>>>> little bit. We talked about it with Abe Fine, but we're a little
> >> bit
> >>>>>> unsure
> >>>>>>>>>> and cannot get a conclusion. It would be extremely handy to get
> >> more
> >>>>>>>>>> feedback from you.
> >>>>>>>>>>
> >>>>>>>>>> To add some colour to it, let me elaborate on the situation
> here:
> >>>>>>>>>>
> >>>>>>>>>> In general, the task that StaticHostProvider does is to get a
> list
> >>>> of
> >>>>>>>>>> potentially unresolved InetSocketAddress objects, resolve them
> and
> >>>>>> iterate
> >>>>>>>>>> over the resolved objects by calling next() method.
> >>>>>>>>>>
> >>>>>>>>>> *Option #1 (current logic)*
> >>>>>>>>>> - Resolve addresses with getAllByName() which returns a list of
> IP
> >>>>>>>>>> addresses associated with the address.
> >>>>>>>>>> - Cache all these IP's, shuffle them and iterate over.
> >>>>>>>>>> - If client is unable to connect to an IP, remove all IPs from
> the
> >>>>>> list
> >>>>>>>>>> which the original servername was resolved to and re-resolve it.
> >>>>>>>>>>
> >>>>>>>>>> *Option #2 (getByName())*
> >>>>>>>>>> - Resolve address with getByName() instead which returns only
> the
> >>>>>> first
> >>>>>>>>>> IP address of the name,
> >>>>>>>>>> - Do not cache IPs,
> >>>>>>>>>> - Shuffle the *names* and resolve with getByName() *every time*
> >> when
> >>>>>>>>>> next() is called,
> >>>>>>>>>> - JDK's built-in caching will prevent name servers from being
> >>>> flooded
> >>>>>>>>>> and will do the re-resolution automatically when cache expires,
> >>>>>>>>>> - Names with multiple IPs will be handled by DNS servers which
> (if
> >>>>>>>>>> configured properly) return IPs in different order - this is
> >> called
> >>>>>> DNS
> >>>>>>>>>> Round Robin -, so getByName() will return different IP on each
> >> call.
> >>>>>>>>>>
> >>>>>>>>>> *Options #3*
> >>>>>>>>>> - There's a small problem with option#2: if DNS server is not
> >>>>>> configured
> >>>>>>>>>> properly and handles the round-robin case in a way that it
> always
> >>>>>> return
> >>>>>>>>>> the IP list in the same order, getByName() will never return the
> >>>> next
> >>>>>> ip,
> >>>>>>>>>> - In order to overcome that, use getAllByName() instead, shuffle
> >> the
> >>>>>>>>>> list and return the first IP.
> >>>>>>>>>>
> >>>>>>>>>> All feedback if much appreciated.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Andor
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>

Re: Name resolution in StaticHostProvider

Posted by Flavio Junqueira <fp...@apache.org>.
The refactoring did not seem justifiable at first, so my reaction to it. You have clarified the reason for including the changes, and I actually like it.

About the exception, there are two points for me:

1- You don't really need to throw to execute the plan you described.
2- In the case we do throw, which is not entirely unreasonable, I'd think about the expected behaviour of a method called "next()". In general, I'd expect it to either return me the next item or error saying that it cannot return an item. The "UnknownHostException" is not doing the latter, though. It is indicating that one of the elements of the host provider set is "broken" (not resolvable). That's one interpretation. Another interpretation is that the logic in "next" needs to indicate to the caller that there is a problem with an item in the set of elements of the host provider. 

For (2) let's converge on what semantics we are trying to provide first, please.

-Flavio

> On 8 May 2018, at 21:20, Andor Molnar <an...@cloudera.com> wrote:
> 
> Sorry, I thought you were against the whole refactoring.
> 
> "2- That we discuss separately whether we want to change the behaviour of
> the next()in the HostProvider interface."
> 
> From this it seemed to me, it's not just a polishing issue, but maybe I've
> gotten you wrong.
> Anyway, there're 2 contention points we've encountered so far:
> 
> 1. Do we need to refactor at all?
> 2. If we do so, shall next() throw UnknownHostException or deal with error
> inside.
> 
> And I'd still go with:
> 1. Yes
> 2. Yes, throw
> 
> So, I would leave the PR as it is now.
> 
> Andor
> 
> 
> 
> 
> 
> On Tue, May 8, 2018 at 12:11 PM, Flavio Junqueira <fp...@apache.org> wrote:
> 
>> Can you list what the contention points are according to you? Feel free to
>> do that in the PR as well, I want to make sure we have the same
>> understanding of the points that still need to be resolved. From where I
>> stand, there is no major issue pending other than one polishing issue that
>> I brought upon in the PR.
>> 
>> -Flavio
>> 
>>> On 8 May 2018, at 21:08, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>> I'm happy to do that once we have an agreement.
>>> 
>>> 
>>> 
>>> 
>>> On Tue, May 8, 2018 at 8:34 AM, Flavio Junqueira <fp...@apache.org> wrote:
>>> 
>>>> It might be a good idea to document whatever we end up doing.
>>>> 
>>>> -Flavio
>>>> 
>>>>> On 8 May 2018, at 17:22, Andor Molnar <an...@cloudera.com> wrote:
>>>>> 
>>>>> "If refactoring is necessary to address the issue, then it should be
>> part
>>>>> of the PR."
>>>>> 
>>>>> Agreed. I think it is and it also makes things a lot more simpler, so
>>>> it's
>>>>> easier to review.
>>>>> 
>>>>> "I'm not sure what kind of confirmation you are after here. Could you
>>>>> elaborate, please?"
>>>>> 
>>>>> I'm just wondering what could have been the reason for caching host
>>>>> addresses in StaticHostProvider in the first place.
>>>>> 
>>>>> "The other solution, if I remember enough of it, tried to avoid
>> resolving
>>>>> unnecessarily, but perhaps the DNS client cache is enough..."
>>>>> 
>>>>> That's exactly my point: what JDK is doing internally is perfectly fine
>>>> for
>>>>> us and we don't need extra logic here.
>>>>> 
>>>>> "Could you elaborate on this point, please? What exactly do you mean
>> with
>>>>> this statement?"
>>>>> 
>>>>> See my previous point. Caching is already done in all DNS servers in
>> the
>>>>> chain and also there's also caching in JDK already, which means by
>>>> calling
>>>>> getByName() you don't necessarily fire a DNS request, only when the TTL
>>>> is
>>>>> expired.
>>>>> 
>>>>> Andor
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <fp...@apache.org>
>> wrote:
>>>>> 
>>>>>> Hi Andor,
>>>>>> 
>>>>>> Thanks for your work on addressing the issue.
>>>>>> 
>>>>>>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Updating this thread, because the PR is still being review on GitHub.
>>>>>>> 
>>>>>>> So, the reason why I refactored the original behaviour of
>>>>>>> StaticHostProvider is that I believe that it's trying to do something
>>>>>> which
>>>>>>> is not its responsibility.
>>>>>> 
>>>>>> If refactoring is necessary to address the issue, then it should be
>> part
>>>>>> of the PR. Otherwise, it is better to refactor in a separate PR.
>>>>>> 
>>>>>> 
>>>>>>> Please tell me if there's a good historical
>>>>>>> reason for that.
>>>>>> 
>>>>>> I'm not sure what kind of confirmation you are after here. Could you
>>>>>> elaborate, please?
>>>>>> 
>>>>>>> 
>>>>>>> My approach is giving the user the following to options:
>>>>>>> 1- Use static IP addresses, if you don't want to deal with DNS
>>>> resolution
>>>>>>> at all - we guarantee that no DNS logic will involved in this case at
>>>>>> all.
>>>>>> 
>>>>>> Sounds fine.
>>>>>> 
>>>>>>> 2- Use DNS hostnames if you have a reliable DNS service for
>> resolution
>>>>>>> (with HA, secondary servers, backups, etc.) - we must use DNS in the
>>>>>> right
>>>>>>> way in this case e.g. do NOT cache IP address for a longer period
>> that
>>>>>> DNS
>>>>>>> server allows to and re-resolve after TTL expries, because it's
>>>> mandatory
>>>>>>> by protocol.
>>>>>> 
>>>>>> Sounds fine.
>>>>>> 
>>>>>>> 
>>>>>>> My 2 cents here:
>>>>>>> - the fix which was originally posted for re-resolution is a
>> workaround
>>>>>> and
>>>>>>> doesn't satisfy the requirement for #2,
>>>>>> 
>>>>>> The other solution, if I remember enough of it, tried to avoid
>> resolving
>>>>>> unnecessarily, but perhaps the DNS client cache is enough to avoid the
>>>>>> unnecessary round-trips. This observation actually brings me to the
>> next
>>>>>> point:
>>>>>> 
>>>>>>> - the solution is already built-in in JDK and DNS clients in the
>> right
>>>>>> way
>>>>>> 
>>>>>> Could you elaborate on this point, please? What exactly do you mean
>> with
>>>>>> this statement?
>>>>>> 
>>>>>>> - can't see a reason why we shouldn't use that
>>>>>>> 
>>>>>>> I checked this in some other projects as well and found very similar
>>>>>>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
>>>> plugins
>>>>>>> for that:
>>>>>>> - Standard resolver uses java's built-in getByName().
>>>>>>> - Qualified resolver still uses getByName(), but adds some logic to
>>>> avoid
>>>>>>> incorrect re-resolutions and reverse IP lookups.
>>>>>>> 
>>>>>>> Please let me know your thoughts.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Andor
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com>
>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Abe,
>>>>>>>> 
>>>>>>>> Unfortunately we haven't got any feedback yet. What do you think of
>>>>>>>> implementing Option #3?
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Andor
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com>
>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Did anybody happen to take a quick look by any chance?
>>>>>>>>> 
>>>>>>>>> I don't want to push this too hard, because I know it's a time
>>>>>> consuming
>>>>>>>>> topic to think about, but this is a blocker in 3.5 which has been
>>>>>> hanging
>>>>>>>>> around for a while and any feedback would be extremely helpful to
>>>>>> close it
>>>>>>>>> quickly.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Andor
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <andor@cloudera.com
>>> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi all,
>>>>>>>>>> 
>>>>>>>>>> We need more eyes and brains on the following PR:
>>>>>>>>>> 
>>>>>>>>>> https://github.com/apache/zookeeper/pull/451
>>>>>>>>>> 
>>>>>>>>>> I added a comment few days ago about the way we currently do DNS
>>>> name
>>>>>>>>>> resolution in this class and a suggestion on how we could simplify
>>>>>> things a
>>>>>>>>>> little bit. We talked about it with Abe Fine, but we're a little
>> bit
>>>>>> unsure
>>>>>>>>>> and cannot get a conclusion. It would be extremely handy to get
>> more
>>>>>>>>>> feedback from you.
>>>>>>>>>> 
>>>>>>>>>> To add some colour to it, let me elaborate on the situation here:
>>>>>>>>>> 
>>>>>>>>>> In general, the task that StaticHostProvider does is to get a list
>>>> of
>>>>>>>>>> potentially unresolved InetSocketAddress objects, resolve them and
>>>>>> iterate
>>>>>>>>>> over the resolved objects by calling next() method.
>>>>>>>>>> 
>>>>>>>>>> *Option #1 (current logic)*
>>>>>>>>>> - Resolve addresses with getAllByName() which returns a list of IP
>>>>>>>>>> addresses associated with the address.
>>>>>>>>>> - Cache all these IP's, shuffle them and iterate over.
>>>>>>>>>> - If client is unable to connect to an IP, remove all IPs from the
>>>>>> list
>>>>>>>>>> which the original servername was resolved to and re-resolve it.
>>>>>>>>>> 
>>>>>>>>>> *Option #2 (getByName())*
>>>>>>>>>> - Resolve address with getByName() instead which returns only the
>>>>>> first
>>>>>>>>>> IP address of the name,
>>>>>>>>>> - Do not cache IPs,
>>>>>>>>>> - Shuffle the *names* and resolve with getByName() *every time*
>> when
>>>>>>>>>> next() is called,
>>>>>>>>>> - JDK's built-in caching will prevent name servers from being
>>>> flooded
>>>>>>>>>> and will do the re-resolution automatically when cache expires,
>>>>>>>>>> - Names with multiple IPs will be handled by DNS servers which (if
>>>>>>>>>> configured properly) return IPs in different order - this is
>> called
>>>>>> DNS
>>>>>>>>>> Round Robin -, so getByName() will return different IP on each
>> call.
>>>>>>>>>> 
>>>>>>>>>> *Options #3*
>>>>>>>>>> - There's a small problem with option#2: if DNS server is not
>>>>>> configured
>>>>>>>>>> properly and handles the round-robin case in a way that it always
>>>>>> return
>>>>>>>>>> the IP list in the same order, getByName() will never return the
>>>> next
>>>>>> ip,
>>>>>>>>>> - In order to overcome that, use getAllByName() instead, shuffle
>> the
>>>>>>>>>> list and return the first IP.
>>>>>>>>>> 
>>>>>>>>>> All feedback if much appreciated.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Andor
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: Name resolution in StaticHostProvider

Posted by Andor Molnar <an...@cloudera.com>.
Sorry, I thought you were against the whole refactoring.

"2- That we discuss separately whether we want to change the behaviour of
the next()in the HostProvider interface."

From this it seemed to me, it's not just a polishing issue, but maybe I've
gotten you wrong.
Anyway, there're 2 contention points we've encountered so far:

1. Do we need to refactor at all?
2. If we do so, shall next() throw UnknownHostException or deal with error
inside.

And I'd still go with:
1. Yes
2. Yes, throw

So, I would leave the PR as it is now.

Andor





On Tue, May 8, 2018 at 12:11 PM, Flavio Junqueira <fp...@apache.org> wrote:

> Can you list what the contention points are according to you? Feel free to
> do that in the PR as well, I want to make sure we have the same
> understanding of the points that still need to be resolved. From where I
> stand, there is no major issue pending other than one polishing issue that
> I brought upon in the PR.
>
> -Flavio
>
> > On 8 May 2018, at 21:08, Andor Molnar <an...@cloudera.com> wrote:
> >
> > I'm happy to do that once we have an agreement.
> >
> >
> >
> >
> > On Tue, May 8, 2018 at 8:34 AM, Flavio Junqueira <fp...@apache.org> wrote:
> >
> >> It might be a good idea to document whatever we end up doing.
> >>
> >> -Flavio
> >>
> >>> On 8 May 2018, at 17:22, Andor Molnar <an...@cloudera.com> wrote:
> >>>
> >>> "If refactoring is necessary to address the issue, then it should be
> part
> >>> of the PR."
> >>>
> >>> Agreed. I think it is and it also makes things a lot more simpler, so
> >> it's
> >>> easier to review.
> >>>
> >>> "I'm not sure what kind of confirmation you are after here. Could you
> >>> elaborate, please?"
> >>>
> >>> I'm just wondering what could have been the reason for caching host
> >>> addresses in StaticHostProvider in the first place.
> >>>
> >>> "The other solution, if I remember enough of it, tried to avoid
> resolving
> >>> unnecessarily, but perhaps the DNS client cache is enough..."
> >>>
> >>> That's exactly my point: what JDK is doing internally is perfectly fine
> >> for
> >>> us and we don't need extra logic here.
> >>>
> >>> "Could you elaborate on this point, please? What exactly do you mean
> with
> >>> this statement?"
> >>>
> >>> See my previous point. Caching is already done in all DNS servers in
> the
> >>> chain and also there's also caching in JDK already, which means by
> >> calling
> >>> getByName() you don't necessarily fire a DNS request, only when the TTL
> >> is
> >>> expired.
> >>>
> >>> Andor
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <fp...@apache.org>
> wrote:
> >>>
> >>>> Hi Andor,
> >>>>
> >>>> Thanks for your work on addressing the issue.
> >>>>
> >>>>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> Updating this thread, because the PR is still being review on GitHub.
> >>>>>
> >>>>> So, the reason why I refactored the original behaviour of
> >>>>> StaticHostProvider is that I believe that it's trying to do something
> >>>> which
> >>>>> is not its responsibility.
> >>>>
> >>>> If refactoring is necessary to address the issue, then it should be
> part
> >>>> of the PR. Otherwise, it is better to refactor in a separate PR.
> >>>>
> >>>>
> >>>>> Please tell me if there's a good historical
> >>>>> reason for that.
> >>>>
> >>>> I'm not sure what kind of confirmation you are after here. Could you
> >>>> elaborate, please?
> >>>>
> >>>>>
> >>>>> My approach is giving the user the following to options:
> >>>>> 1- Use static IP addresses, if you don't want to deal with DNS
> >> resolution
> >>>>> at all - we guarantee that no DNS logic will involved in this case at
> >>>> all.
> >>>>
> >>>> Sounds fine.
> >>>>
> >>>>> 2- Use DNS hostnames if you have a reliable DNS service for
> resolution
> >>>>> (with HA, secondary servers, backups, etc.) - we must use DNS in the
> >>>> right
> >>>>> way in this case e.g. do NOT cache IP address for a longer period
> that
> >>>> DNS
> >>>>> server allows to and re-resolve after TTL expries, because it's
> >> mandatory
> >>>>> by protocol.
> >>>>
> >>>> Sounds fine.
> >>>>
> >>>>>
> >>>>> My 2 cents here:
> >>>>> - the fix which was originally posted for re-resolution is a
> workaround
> >>>> and
> >>>>> doesn't satisfy the requirement for #2,
> >>>>
> >>>> The other solution, if I remember enough of it, tried to avoid
> resolving
> >>>> unnecessarily, but perhaps the DNS client cache is enough to avoid the
> >>>> unnecessary round-trips. This observation actually brings me to the
> next
> >>>> point:
> >>>>
> >>>>> - the solution is already built-in in JDK and DNS clients in the
> right
> >>>> way
> >>>>
> >>>> Could you elaborate on this point, please? What exactly do you mean
> with
> >>>> this statement?
> >>>>
> >>>>> - can't see a reason why we shouldn't use that
> >>>>>
> >>>>> I checked this in some other projects as well and found very similar
> >>>>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
> >> plugins
> >>>>> for that:
> >>>>> - Standard resolver uses java's built-in getByName().
> >>>>> - Qualified resolver still uses getByName(), but adds some logic to
> >> avoid
> >>>>> incorrect re-resolutions and reverse IP lookups.
> >>>>>
> >>>>> Please let me know your thoughts.
> >>>>>
> >>>>> Regards,
> >>>>> Andor
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com>
> >> wrote:
> >>>>>
> >>>>>> Hi Abe,
> >>>>>>
> >>>>>> Unfortunately we haven't got any feedback yet. What do you think of
> >>>>>> implementing Option #3?
> >>>>>>
> >>>>>> Regards,
> >>>>>> Andor
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com>
> >>>> wrote:
> >>>>>>
> >>>>>>> Did anybody happen to take a quick look by any chance?
> >>>>>>>
> >>>>>>> I don't want to push this too hard, because I know it's a time
> >>>> consuming
> >>>>>>> topic to think about, but this is a blocker in 3.5 which has been
> >>>> hanging
> >>>>>>> around for a while and any feedback would be extremely helpful to
> >>>> close it
> >>>>>>> quickly.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Andor
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <andor@cloudera.com
> >
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> We need more eyes and brains on the following PR:
> >>>>>>>>
> >>>>>>>> https://github.com/apache/zookeeper/pull/451
> >>>>>>>>
> >>>>>>>> I added a comment few days ago about the way we currently do DNS
> >> name
> >>>>>>>> resolution in this class and a suggestion on how we could simplify
> >>>> things a
> >>>>>>>> little bit. We talked about it with Abe Fine, but we're a little
> bit
> >>>> unsure
> >>>>>>>> and cannot get a conclusion. It would be extremely handy to get
> more
> >>>>>>>> feedback from you.
> >>>>>>>>
> >>>>>>>> To add some colour to it, let me elaborate on the situation here:
> >>>>>>>>
> >>>>>>>> In general, the task that StaticHostProvider does is to get a list
> >> of
> >>>>>>>> potentially unresolved InetSocketAddress objects, resolve them and
> >>>> iterate
> >>>>>>>> over the resolved objects by calling next() method.
> >>>>>>>>
> >>>>>>>> *Option #1 (current logic)*
> >>>>>>>> - Resolve addresses with getAllByName() which returns a list of IP
> >>>>>>>> addresses associated with the address.
> >>>>>>>> - Cache all these IP's, shuffle them and iterate over.
> >>>>>>>> - If client is unable to connect to an IP, remove all IPs from the
> >>>> list
> >>>>>>>> which the original servername was resolved to and re-resolve it.
> >>>>>>>>
> >>>>>>>> *Option #2 (getByName())*
> >>>>>>>> - Resolve address with getByName() instead which returns only the
> >>>> first
> >>>>>>>> IP address of the name,
> >>>>>>>> - Do not cache IPs,
> >>>>>>>> - Shuffle the *names* and resolve with getByName() *every time*
> when
> >>>>>>>> next() is called,
> >>>>>>>> - JDK's built-in caching will prevent name servers from being
> >> flooded
> >>>>>>>> and will do the re-resolution automatically when cache expires,
> >>>>>>>> - Names with multiple IPs will be handled by DNS servers which (if
> >>>>>>>> configured properly) return IPs in different order - this is
> called
> >>>> DNS
> >>>>>>>> Round Robin -, so getByName() will return different IP on each
> call.
> >>>>>>>>
> >>>>>>>> *Options #3*
> >>>>>>>> - There's a small problem with option#2: if DNS server is not
> >>>> configured
> >>>>>>>> properly and handles the round-robin case in a way that it always
> >>>> return
> >>>>>>>> the IP list in the same order, getByName() will never return the
> >> next
> >>>> ip,
> >>>>>>>> - In order to overcome that, use getAllByName() instead, shuffle
> the
> >>>>>>>> list and return the first IP.
> >>>>>>>>
> >>>>>>>> All feedback if much appreciated.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Andor
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>

Re: Name resolution in StaticHostProvider

Posted by Flavio Junqueira <fp...@apache.org>.
Can you list what the contention points are according to you? Feel free to do that in the PR as well, I want to make sure we have the same understanding of the points that still need to be resolved. From where I stand, there is no major issue pending other than one polishing issue that I brought upon in the PR.

-Flavio

> On 8 May 2018, at 21:08, Andor Molnar <an...@cloudera.com> wrote:
> 
> I'm happy to do that once we have an agreement.
> 
> 
> 
> 
> On Tue, May 8, 2018 at 8:34 AM, Flavio Junqueira <fp...@apache.org> wrote:
> 
>> It might be a good idea to document whatever we end up doing.
>> 
>> -Flavio
>> 
>>> On 8 May 2018, at 17:22, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>> "If refactoring is necessary to address the issue, then it should be part
>>> of the PR."
>>> 
>>> Agreed. I think it is and it also makes things a lot more simpler, so
>> it's
>>> easier to review.
>>> 
>>> "I'm not sure what kind of confirmation you are after here. Could you
>>> elaborate, please?"
>>> 
>>> I'm just wondering what could have been the reason for caching host
>>> addresses in StaticHostProvider in the first place.
>>> 
>>> "The other solution, if I remember enough of it, tried to avoid resolving
>>> unnecessarily, but perhaps the DNS client cache is enough..."
>>> 
>>> That's exactly my point: what JDK is doing internally is perfectly fine
>> for
>>> us and we don't need extra logic here.
>>> 
>>> "Could you elaborate on this point, please? What exactly do you mean with
>>> this statement?"
>>> 
>>> See my previous point. Caching is already done in all DNS servers in the
>>> chain and also there's also caching in JDK already, which means by
>> calling
>>> getByName() you don't necessarily fire a DNS request, only when the TTL
>> is
>>> expired.
>>> 
>>> Andor
>>> 
>>> 
>>> 
>>> 
>>> On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <fp...@apache.org> wrote:
>>> 
>>>> Hi Andor,
>>>> 
>>>> Thanks for your work on addressing the issue.
>>>> 
>>>>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> Updating this thread, because the PR is still being review on GitHub.
>>>>> 
>>>>> So, the reason why I refactored the original behaviour of
>>>>> StaticHostProvider is that I believe that it's trying to do something
>>>> which
>>>>> is not its responsibility.
>>>> 
>>>> If refactoring is necessary to address the issue, then it should be part
>>>> of the PR. Otherwise, it is better to refactor in a separate PR.
>>>> 
>>>> 
>>>>> Please tell me if there's a good historical
>>>>> reason for that.
>>>> 
>>>> I'm not sure what kind of confirmation you are after here. Could you
>>>> elaborate, please?
>>>> 
>>>>> 
>>>>> My approach is giving the user the following to options:
>>>>> 1- Use static IP addresses, if you don't want to deal with DNS
>> resolution
>>>>> at all - we guarantee that no DNS logic will involved in this case at
>>>> all.
>>>> 
>>>> Sounds fine.
>>>> 
>>>>> 2- Use DNS hostnames if you have a reliable DNS service for resolution
>>>>> (with HA, secondary servers, backups, etc.) - we must use DNS in the
>>>> right
>>>>> way in this case e.g. do NOT cache IP address for a longer period that
>>>> DNS
>>>>> server allows to and re-resolve after TTL expries, because it's
>> mandatory
>>>>> by protocol.
>>>> 
>>>> Sounds fine.
>>>> 
>>>>> 
>>>>> My 2 cents here:
>>>>> - the fix which was originally posted for re-resolution is a workaround
>>>> and
>>>>> doesn't satisfy the requirement for #2,
>>>> 
>>>> The other solution, if I remember enough of it, tried to avoid resolving
>>>> unnecessarily, but perhaps the DNS client cache is enough to avoid the
>>>> unnecessary round-trips. This observation actually brings me to the next
>>>> point:
>>>> 
>>>>> - the solution is already built-in in JDK and DNS clients in the right
>>>> way
>>>> 
>>>> Could you elaborate on this point, please? What exactly do you mean with
>>>> this statement?
>>>> 
>>>>> - can't see a reason why we shouldn't use that
>>>>> 
>>>>> I checked this in some other projects as well and found very similar
>>>>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
>> plugins
>>>>> for that:
>>>>> - Standard resolver uses java's built-in getByName().
>>>>> - Qualified resolver still uses getByName(), but adds some logic to
>> avoid
>>>>> incorrect re-resolutions and reverse IP lookups.
>>>>> 
>>>>> Please let me know your thoughts.
>>>>> 
>>>>> Regards,
>>>>> Andor
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com>
>> wrote:
>>>>> 
>>>>>> Hi Abe,
>>>>>> 
>>>>>> Unfortunately we haven't got any feedback yet. What do you think of
>>>>>> implementing Option #3?
>>>>>> 
>>>>>> Regards,
>>>>>> Andor
>>>>>> 
>>>>>> 
>>>>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com>
>>>> wrote:
>>>>>> 
>>>>>>> Did anybody happen to take a quick look by any chance?
>>>>>>> 
>>>>>>> I don't want to push this too hard, because I know it's a time
>>>> consuming
>>>>>>> topic to think about, but this is a blocker in 3.5 which has been
>>>> hanging
>>>>>>> around for a while and any feedback would be extremely helpful to
>>>> close it
>>>>>>> quickly.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Andor
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi all,
>>>>>>>> 
>>>>>>>> We need more eyes and brains on the following PR:
>>>>>>>> 
>>>>>>>> https://github.com/apache/zookeeper/pull/451
>>>>>>>> 
>>>>>>>> I added a comment few days ago about the way we currently do DNS
>> name
>>>>>>>> resolution in this class and a suggestion on how we could simplify
>>>> things a
>>>>>>>> little bit. We talked about it with Abe Fine, but we're a little bit
>>>> unsure
>>>>>>>> and cannot get a conclusion. It would be extremely handy to get more
>>>>>>>> feedback from you.
>>>>>>>> 
>>>>>>>> To add some colour to it, let me elaborate on the situation here:
>>>>>>>> 
>>>>>>>> In general, the task that StaticHostProvider does is to get a list
>> of
>>>>>>>> potentially unresolved InetSocketAddress objects, resolve them and
>>>> iterate
>>>>>>>> over the resolved objects by calling next() method.
>>>>>>>> 
>>>>>>>> *Option #1 (current logic)*
>>>>>>>> - Resolve addresses with getAllByName() which returns a list of IP
>>>>>>>> addresses associated with the address.
>>>>>>>> - Cache all these IP's, shuffle them and iterate over.
>>>>>>>> - If client is unable to connect to an IP, remove all IPs from the
>>>> list
>>>>>>>> which the original servername was resolved to and re-resolve it.
>>>>>>>> 
>>>>>>>> *Option #2 (getByName())*
>>>>>>>> - Resolve address with getByName() instead which returns only the
>>>> first
>>>>>>>> IP address of the name,
>>>>>>>> - Do not cache IPs,
>>>>>>>> - Shuffle the *names* and resolve with getByName() *every time* when
>>>>>>>> next() is called,
>>>>>>>> - JDK's built-in caching will prevent name servers from being
>> flooded
>>>>>>>> and will do the re-resolution automatically when cache expires,
>>>>>>>> - Names with multiple IPs will be handled by DNS servers which (if
>>>>>>>> configured properly) return IPs in different order - this is called
>>>> DNS
>>>>>>>> Round Robin -, so getByName() will return different IP on each call.
>>>>>>>> 
>>>>>>>> *Options #3*
>>>>>>>> - There's a small problem with option#2: if DNS server is not
>>>> configured
>>>>>>>> properly and handles the round-robin case in a way that it always
>>>> return
>>>>>>>> the IP list in the same order, getByName() will never return the
>> next
>>>> ip,
>>>>>>>> - In order to overcome that, use getAllByName() instead, shuffle the
>>>>>>>> list and return the first IP.
>>>>>>>> 
>>>>>>>> All feedback if much appreciated.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Andor
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: Name resolution in StaticHostProvider

Posted by Andor Molnar <an...@cloudera.com>.
I'm happy to do that once we have an agreement.




On Tue, May 8, 2018 at 8:34 AM, Flavio Junqueira <fp...@apache.org> wrote:

> It might be a good idea to document whatever we end up doing.
>
> -Flavio
>
> > On 8 May 2018, at 17:22, Andor Molnar <an...@cloudera.com> wrote:
> >
> > "If refactoring is necessary to address the issue, then it should be part
> > of the PR."
> >
> > Agreed. I think it is and it also makes things a lot more simpler, so
> it's
> > easier to review.
> >
> > "I'm not sure what kind of confirmation you are after here. Could you
> > elaborate, please?"
> >
> > I'm just wondering what could have been the reason for caching host
> > addresses in StaticHostProvider in the first place.
> >
> > "The other solution, if I remember enough of it, tried to avoid resolving
> > unnecessarily, but perhaps the DNS client cache is enough..."
> >
> > That's exactly my point: what JDK is doing internally is perfectly fine
> for
> > us and we don't need extra logic here.
> >
> > "Could you elaborate on this point, please? What exactly do you mean with
> > this statement?"
> >
> > See my previous point. Caching is already done in all DNS servers in the
> > chain and also there's also caching in JDK already, which means by
> calling
> > getByName() you don't necessarily fire a DNS request, only when the TTL
> is
> > expired.
> >
> > Andor
> >
> >
> >
> >
> > On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <fp...@apache.org> wrote:
> >
> >> Hi Andor,
> >>
> >> Thanks for your work on addressing the issue.
> >>
> >>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Updating this thread, because the PR is still being review on GitHub.
> >>>
> >>> So, the reason why I refactored the original behaviour of
> >>> StaticHostProvider is that I believe that it's trying to do something
> >> which
> >>> is not its responsibility.
> >>
> >> If refactoring is necessary to address the issue, then it should be part
> >> of the PR. Otherwise, it is better to refactor in a separate PR.
> >>
> >>
> >>> Please tell me if there's a good historical
> >>> reason for that.
> >>
> >> I'm not sure what kind of confirmation you are after here. Could you
> >> elaborate, please?
> >>
> >>>
> >>> My approach is giving the user the following to options:
> >>> 1- Use static IP addresses, if you don't want to deal with DNS
> resolution
> >>> at all - we guarantee that no DNS logic will involved in this case at
> >> all.
> >>
> >> Sounds fine.
> >>
> >>> 2- Use DNS hostnames if you have a reliable DNS service for resolution
> >>> (with HA, secondary servers, backups, etc.) - we must use DNS in the
> >> right
> >>> way in this case e.g. do NOT cache IP address for a longer period that
> >> DNS
> >>> server allows to and re-resolve after TTL expries, because it's
> mandatory
> >>> by protocol.
> >>
> >> Sounds fine.
> >>
> >>>
> >>> My 2 cents here:
> >>> - the fix which was originally posted for re-resolution is a workaround
> >> and
> >>> doesn't satisfy the requirement for #2,
> >>
> >> The other solution, if I remember enough of it, tried to avoid resolving
> >> unnecessarily, but perhaps the DNS client cache is enough to avoid the
> >> unnecessary round-trips. This observation actually brings me to the next
> >> point:
> >>
> >>> - the solution is already built-in in JDK and DNS clients in the right
> >> way
> >>
> >> Could you elaborate on this point, please? What exactly do you mean with
> >> this statement?
> >>
> >>> - can't see a reason why we shouldn't use that
> >>>
> >>> I checked this in some other projects as well and found very similar
> >>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
> plugins
> >>> for that:
> >>> - Standard resolver uses java's built-in getByName().
> >>> - Qualified resolver still uses getByName(), but adds some logic to
> avoid
> >>> incorrect re-resolutions and reverse IP lookups.
> >>>
> >>> Please let me know your thoughts.
> >>>
> >>> Regards,
> >>> Andor
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com>
> wrote:
> >>>
> >>>> Hi Abe,
> >>>>
> >>>> Unfortunately we haven't got any feedback yet. What do you think of
> >>>> implementing Option #3?
> >>>>
> >>>> Regards,
> >>>> Andor
> >>>>
> >>>>
> >>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com>
> >> wrote:
> >>>>
> >>>>> Did anybody happen to take a quick look by any chance?
> >>>>>
> >>>>> I don't want to push this too hard, because I know it's a time
> >> consuming
> >>>>> topic to think about, but this is a blocker in 3.5 which has been
> >> hanging
> >>>>> around for a while and any feedback would be extremely helpful to
> >> close it
> >>>>> quickly.
> >>>>>
> >>>>> Thanks,
> >>>>> Andor
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> We need more eyes and brains on the following PR:
> >>>>>>
> >>>>>> https://github.com/apache/zookeeper/pull/451
> >>>>>>
> >>>>>> I added a comment few days ago about the way we currently do DNS
> name
> >>>>>> resolution in this class and a suggestion on how we could simplify
> >> things a
> >>>>>> little bit. We talked about it with Abe Fine, but we're a little bit
> >> unsure
> >>>>>> and cannot get a conclusion. It would be extremely handy to get more
> >>>>>> feedback from you.
> >>>>>>
> >>>>>> To add some colour to it, let me elaborate on the situation here:
> >>>>>>
> >>>>>> In general, the task that StaticHostProvider does is to get a list
> of
> >>>>>> potentially unresolved InetSocketAddress objects, resolve them and
> >> iterate
> >>>>>> over the resolved objects by calling next() method.
> >>>>>>
> >>>>>> *Option #1 (current logic)*
> >>>>>> - Resolve addresses with getAllByName() which returns a list of IP
> >>>>>> addresses associated with the address.
> >>>>>> - Cache all these IP's, shuffle them and iterate over.
> >>>>>> - If client is unable to connect to an IP, remove all IPs from the
> >> list
> >>>>>> which the original servername was resolved to and re-resolve it.
> >>>>>>
> >>>>>> *Option #2 (getByName())*
> >>>>>> - Resolve address with getByName() instead which returns only the
> >> first
> >>>>>> IP address of the name,
> >>>>>> - Do not cache IPs,
> >>>>>> - Shuffle the *names* and resolve with getByName() *every time* when
> >>>>>> next() is called,
> >>>>>> - JDK's built-in caching will prevent name servers from being
> flooded
> >>>>>> and will do the re-resolution automatically when cache expires,
> >>>>>> - Names with multiple IPs will be handled by DNS servers which (if
> >>>>>> configured properly) return IPs in different order - this is called
> >> DNS
> >>>>>> Round Robin -, so getByName() will return different IP on each call.
> >>>>>>
> >>>>>> *Options #3*
> >>>>>> - There's a small problem with option#2: if DNS server is not
> >> configured
> >>>>>> properly and handles the round-robin case in a way that it always
> >> return
> >>>>>> the IP list in the same order, getByName() will never return the
> next
> >> ip,
> >>>>>> - In order to overcome that, use getAllByName() instead, shuffle the
> >>>>>> list and return the first IP.
> >>>>>>
> >>>>>> All feedback if much appreciated.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Andor
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
> >>
>
>

Re: Name resolution in StaticHostProvider

Posted by Flavio Junqueira <fp...@apache.org>.
It might be a good idea to document whatever we end up doing.

-Flavio

> On 8 May 2018, at 17:22, Andor Molnar <an...@cloudera.com> wrote:
> 
> "If refactoring is necessary to address the issue, then it should be part
> of the PR."
> 
> Agreed. I think it is and it also makes things a lot more simpler, so it's
> easier to review.
> 
> "I'm not sure what kind of confirmation you are after here. Could you
> elaborate, please?"
> 
> I'm just wondering what could have been the reason for caching host
> addresses in StaticHostProvider in the first place.
> 
> "The other solution, if I remember enough of it, tried to avoid resolving
> unnecessarily, but perhaps the DNS client cache is enough..."
> 
> That's exactly my point: what JDK is doing internally is perfectly fine for
> us and we don't need extra logic here.
> 
> "Could you elaborate on this point, please? What exactly do you mean with
> this statement?"
> 
> See my previous point. Caching is already done in all DNS servers in the
> chain and also there's also caching in JDK already, which means by calling
> getByName() you don't necessarily fire a DNS request, only when the TTL is
> expired.
> 
> Andor
> 
> 
> 
> 
> On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <fp...@apache.org> wrote:
> 
>> Hi Andor,
>> 
>> Thanks for your work on addressing the issue.
>> 
>>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Updating this thread, because the PR is still being review on GitHub.
>>> 
>>> So, the reason why I refactored the original behaviour of
>>> StaticHostProvider is that I believe that it's trying to do something
>> which
>>> is not its responsibility.
>> 
>> If refactoring is necessary to address the issue, then it should be part
>> of the PR. Otherwise, it is better to refactor in a separate PR.
>> 
>> 
>>> Please tell me if there's a good historical
>>> reason for that.
>> 
>> I'm not sure what kind of confirmation you are after here. Could you
>> elaborate, please?
>> 
>>> 
>>> My approach is giving the user the following to options:
>>> 1- Use static IP addresses, if you don't want to deal with DNS resolution
>>> at all - we guarantee that no DNS logic will involved in this case at
>> all.
>> 
>> Sounds fine.
>> 
>>> 2- Use DNS hostnames if you have a reliable DNS service for resolution
>>> (with HA, secondary servers, backups, etc.) - we must use DNS in the
>> right
>>> way in this case e.g. do NOT cache IP address for a longer period that
>> DNS
>>> server allows to and re-resolve after TTL expries, because it's mandatory
>>> by protocol.
>> 
>> Sounds fine.
>> 
>>> 
>>> My 2 cents here:
>>> - the fix which was originally posted for re-resolution is a workaround
>> and
>>> doesn't satisfy the requirement for #2,
>> 
>> The other solution, if I remember enough of it, tried to avoid resolving
>> unnecessarily, but perhaps the DNS client cache is enough to avoid the
>> unnecessary round-trips. This observation actually brings me to the next
>> point:
>> 
>>> - the solution is already built-in in JDK and DNS clients in the right
>> way
>> 
>> Could you elaborate on this point, please? What exactly do you mean with
>> this statement?
>> 
>>> - can't see a reason why we shouldn't use that
>>> 
>>> I checked this in some other projects as well and found very similar
>>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
>>> for that:
>>> - Standard resolver uses java's built-in getByName().
>>> - Qualified resolver still uses getByName(), but adds some logic to avoid
>>> incorrect re-resolutions and reverse IP lookups.
>>> 
>>> Please let me know your thoughts.
>>> 
>>> Regards,
>>> Andor
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>>> Hi Abe,
>>>> 
>>>> Unfortunately we haven't got any feedback yet. What do you think of
>>>> implementing Option #3?
>>>> 
>>>> Regards,
>>>> Andor
>>>> 
>>>> 
>>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com>
>> wrote:
>>>> 
>>>>> Did anybody happen to take a quick look by any chance?
>>>>> 
>>>>> I don't want to push this too hard, because I know it's a time
>> consuming
>>>>> topic to think about, but this is a blocker in 3.5 which has been
>> hanging
>>>>> around for a while and any feedback would be extremely helpful to
>> close it
>>>>> quickly.
>>>>> 
>>>>> Thanks,
>>>>> Andor
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> We need more eyes and brains on the following PR:
>>>>>> 
>>>>>> https://github.com/apache/zookeeper/pull/451
>>>>>> 
>>>>>> I added a comment few days ago about the way we currently do DNS name
>>>>>> resolution in this class and a suggestion on how we could simplify
>> things a
>>>>>> little bit. We talked about it with Abe Fine, but we're a little bit
>> unsure
>>>>>> and cannot get a conclusion. It would be extremely handy to get more
>>>>>> feedback from you.
>>>>>> 
>>>>>> To add some colour to it, let me elaborate on the situation here:
>>>>>> 
>>>>>> In general, the task that StaticHostProvider does is to get a list of
>>>>>> potentially unresolved InetSocketAddress objects, resolve them and
>> iterate
>>>>>> over the resolved objects by calling next() method.
>>>>>> 
>>>>>> *Option #1 (current logic)*
>>>>>> - Resolve addresses with getAllByName() which returns a list of IP
>>>>>> addresses associated with the address.
>>>>>> - Cache all these IP's, shuffle them and iterate over.
>>>>>> - If client is unable to connect to an IP, remove all IPs from the
>> list
>>>>>> which the original servername was resolved to and re-resolve it.
>>>>>> 
>>>>>> *Option #2 (getByName())*
>>>>>> - Resolve address with getByName() instead which returns only the
>> first
>>>>>> IP address of the name,
>>>>>> - Do not cache IPs,
>>>>>> - Shuffle the *names* and resolve with getByName() *every time* when
>>>>>> next() is called,
>>>>>> - JDK's built-in caching will prevent name servers from being flooded
>>>>>> and will do the re-resolution automatically when cache expires,
>>>>>> - Names with multiple IPs will be handled by DNS servers which (if
>>>>>> configured properly) return IPs in different order - this is called
>> DNS
>>>>>> Round Robin -, so getByName() will return different IP on each call.
>>>>>> 
>>>>>> *Options #3*
>>>>>> - There's a small problem with option#2: if DNS server is not
>> configured
>>>>>> properly and handles the round-robin case in a way that it always
>> return
>>>>>> the IP list in the same order, getByName() will never return the next
>> ip,
>>>>>> - In order to overcome that, use getAllByName() instead, shuffle the
>>>>>> list and return the first IP.
>>>>>> 
>>>>>> All feedback if much appreciated.
>>>>>> 
>>>>>> Thanks,
>>>>>> Andor
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> 
>> 


Re: Name resolution in StaticHostProvider

Posted by Andor Molnar <an...@cloudera.com>.
"If refactoring is necessary to address the issue, then it should be part
of the PR."

Agreed. I think it is and it also makes things a lot more simpler, so it's
easier to review.

"I'm not sure what kind of confirmation you are after here. Could you
elaborate, please?"

I'm just wondering what could have been the reason for caching host
addresses in StaticHostProvider in the first place.

"The other solution, if I remember enough of it, tried to avoid resolving
unnecessarily, but perhaps the DNS client cache is enough..."

That's exactly my point: what JDK is doing internally is perfectly fine for
us and we don't need extra logic here.

"Could you elaborate on this point, please? What exactly do you mean with
this statement?"

See my previous point. Caching is already done in all DNS servers in the
chain and also there's also caching in JDK already, which means by calling
getByName() you don't necessarily fire a DNS request, only when the TTL is
expired.

Andor




On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <fp...@apache.org> wrote:

> Hi Andor,
>
> Thanks for your work on addressing the issue.
>
> > On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
> >
> > Hi,
> >
> > Updating this thread, because the PR is still being review on GitHub.
> >
> > So, the reason why I refactored the original behaviour of
> > StaticHostProvider is that I believe that it's trying to do something
> which
> > is not its responsibility.
>
> If refactoring is necessary to address the issue, then it should be part
> of the PR. Otherwise, it is better to refactor in a separate PR.
>
>
> > Please tell me if there's a good historical
> > reason for that.
>
> I'm not sure what kind of confirmation you are after here. Could you
> elaborate, please?
>
> >
> > My approach is giving the user the following to options:
> > 1- Use static IP addresses, if you don't want to deal with DNS resolution
> > at all - we guarantee that no DNS logic will involved in this case at
> all.
>
> Sounds fine.
>
> > 2- Use DNS hostnames if you have a reliable DNS service for resolution
> > (with HA, secondary servers, backups, etc.) - we must use DNS in the
> right
> > way in this case e.g. do NOT cache IP address for a longer period that
> DNS
> > server allows to and re-resolve after TTL expries, because it's mandatory
> > by protocol.
>
> Sounds fine.
>
> >
> > My 2 cents here:
> > - the fix which was originally posted for re-resolution is a workaround
> and
> > doesn't satisfy the requirement for #2,
>
> The other solution, if I remember enough of it, tried to avoid resolving
> unnecessarily, but perhaps the DNS client cache is enough to avoid the
> unnecessary round-trips. This observation actually brings me to the next
> point:
>
> > - the solution is already built-in in JDK and DNS clients in the right
> way
>
> Could you elaborate on this point, please? What exactly do you mean with
> this statement?
>
> > - can't see a reason why we shouldn't use that
> >
> > I checked this in some other projects as well and found very similar
> > approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
> > for that:
> > - Standard resolver uses java's built-in getByName().
> > - Qualified resolver still uses getByName(), but adds some logic to avoid
> > incorrect re-resolutions and reverse IP lookups.
> >
> > Please let me know your thoughts.
> >
> > Regards,
> > Andor
> >
> >
> >
> >
> >
> >
> > On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com> wrote:
> >
> >> Hi Abe,
> >>
> >> Unfortunately we haven't got any feedback yet. What do you think of
> >> implementing Option #3?
> >>
> >> Regards,
> >> Andor
> >>
> >>
> >> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com>
> wrote:
> >>
> >>> Did anybody happen to take a quick look by any chance?
> >>>
> >>> I don't want to push this too hard, because I know it's a time
> consuming
> >>> topic to think about, but this is a blocker in 3.5 which has been
> hanging
> >>> around for a while and any feedback would be extremely helpful to
> close it
> >>> quickly.
> >>>
> >>> Thanks,
> >>> Andor
> >>>
> >>>
> >>>
> >>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
> >>> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> We need more eyes and brains on the following PR:
> >>>>
> >>>> https://github.com/apache/zookeeper/pull/451
> >>>>
> >>>> I added a comment few days ago about the way we currently do DNS name
> >>>> resolution in this class and a suggestion on how we could simplify
> things a
> >>>> little bit. We talked about it with Abe Fine, but we're a little bit
> unsure
> >>>> and cannot get a conclusion. It would be extremely handy to get more
> >>>> feedback from you.
> >>>>
> >>>> To add some colour to it, let me elaborate on the situation here:
> >>>>
> >>>> In general, the task that StaticHostProvider does is to get a list of
> >>>> potentially unresolved InetSocketAddress objects, resolve them and
> iterate
> >>>> over the resolved objects by calling next() method.
> >>>>
> >>>> *Option #1 (current logic)*
> >>>> - Resolve addresses with getAllByName() which returns a list of IP
> >>>> addresses associated with the address.
> >>>> - Cache all these IP's, shuffle them and iterate over.
> >>>> - If client is unable to connect to an IP, remove all IPs from the
> list
> >>>> which the original servername was resolved to and re-resolve it.
> >>>>
> >>>> *Option #2 (getByName())*
> >>>> - Resolve address with getByName() instead which returns only the
> first
> >>>> IP address of the name,
> >>>> - Do not cache IPs,
> >>>> - Shuffle the *names* and resolve with getByName() *every time* when
> >>>> next() is called,
> >>>> - JDK's built-in caching will prevent name servers from being flooded
> >>>> and will do the re-resolution automatically when cache expires,
> >>>> - Names with multiple IPs will be handled by DNS servers which (if
> >>>> configured properly) return IPs in different order - this is called
> DNS
> >>>> Round Robin -, so getByName() will return different IP on each call.
> >>>>
> >>>> *Options #3*
> >>>> - There's a small problem with option#2: if DNS server is not
> configured
> >>>> properly and handles the round-robin case in a way that it always
> return
> >>>> the IP list in the same order, getByName() will never return the next
> ip,
> >>>> - In order to overcome that, use getAllByName() instead, shuffle the
> >>>> list and return the first IP.
> >>>>
> >>>> All feedback if much appreciated.
> >>>>
> >>>> Thanks,
> >>>> Andor
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
>
>

Re: Name resolution in StaticHostProvider

Posted by Flavio Junqueira <fp...@apache.org>.
Hi Andor,

Thanks for your work on addressing the issue.

> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
> 
> Hi,
> 
> Updating this thread, because the PR is still being review on GitHub.
> 
> So, the reason why I refactored the original behaviour of
> StaticHostProvider is that I believe that it's trying to do something which
> is not its responsibility.

If refactoring is necessary to address the issue, then it should be part of the PR. Otherwise, it is better to refactor in a separate PR.


> Please tell me if there's a good historical
> reason for that.

I'm not sure what kind of confirmation you are after here. Could you elaborate, please?

> 
> My approach is giving the user the following to options:
> 1- Use static IP addresses, if you don't want to deal with DNS resolution
> at all - we guarantee that no DNS logic will involved in this case at all.

Sounds fine.

> 2- Use DNS hostnames if you have a reliable DNS service for resolution
> (with HA, secondary servers, backups, etc.) - we must use DNS in the right
> way in this case e.g. do NOT cache IP address for a longer period that DNS
> server allows to and re-resolve after TTL expries, because it's mandatory
> by protocol.

Sounds fine.

> 
> My 2 cents here:
> - the fix which was originally posted for re-resolution is a workaround and
> doesn't satisfy the requirement for #2,

The other solution, if I remember enough of it, tried to avoid resolving unnecessarily, but perhaps the DNS client cache is enough to avoid the unnecessary round-trips. This observation actually brings me to the next point:

> - the solution is already built-in in JDK and DNS clients in the right way

Could you elaborate on this point, please? What exactly do you mean with this statement?

> - can't see a reason why we shouldn't use that
> 
> I checked this in some other projects as well and found very similar
> approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
> for that:
> - Standard resolver uses java's built-in getByName().
> - Qualified resolver still uses getByName(), but adds some logic to avoid
> incorrect re-resolutions and reverse IP lookups.
> 
> Please let me know your thoughts.
> 
> Regards,
> Andor
> 
> 
> 
> 
> 
> 
> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com> wrote:
> 
>> Hi Abe,
>> 
>> Unfortunately we haven't got any feedback yet. What do you think of
>> implementing Option #3?
>> 
>> Regards,
>> Andor
>> 
>> 
>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com> wrote:
>> 
>>> Did anybody happen to take a quick look by any chance?
>>> 
>>> I don't want to push this too hard, because I know it's a time consuming
>>> topic to think about, but this is a blocker in 3.5 which has been hanging
>>> around for a while and any feedback would be extremely helpful to close it
>>> quickly.
>>> 
>>> Thanks,
>>> Andor
>>> 
>>> 
>>> 
>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
>>> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> We need more eyes and brains on the following PR:
>>>> 
>>>> https://github.com/apache/zookeeper/pull/451
>>>> 
>>>> I added a comment few days ago about the way we currently do DNS name
>>>> resolution in this class and a suggestion on how we could simplify things a
>>>> little bit. We talked about it with Abe Fine, but we're a little bit unsure
>>>> and cannot get a conclusion. It would be extremely handy to get more
>>>> feedback from you.
>>>> 
>>>> To add some colour to it, let me elaborate on the situation here:
>>>> 
>>>> In general, the task that StaticHostProvider does is to get a list of
>>>> potentially unresolved InetSocketAddress objects, resolve them and iterate
>>>> over the resolved objects by calling next() method.
>>>> 
>>>> *Option #1 (current logic)*
>>>> - Resolve addresses with getAllByName() which returns a list of IP
>>>> addresses associated with the address.
>>>> - Cache all these IP's, shuffle them and iterate over.
>>>> - If client is unable to connect to an IP, remove all IPs from the list
>>>> which the original servername was resolved to and re-resolve it.
>>>> 
>>>> *Option #2 (getByName())*
>>>> - Resolve address with getByName() instead which returns only the first
>>>> IP address of the name,
>>>> - Do not cache IPs,
>>>> - Shuffle the *names* and resolve with getByName() *every time* when
>>>> next() is called,
>>>> - JDK's built-in caching will prevent name servers from being flooded
>>>> and will do the re-resolution automatically when cache expires,
>>>> - Names with multiple IPs will be handled by DNS servers which (if
>>>> configured properly) return IPs in different order - this is called DNS
>>>> Round Robin -, so getByName() will return different IP on each call.
>>>> 
>>>> *Options #3*
>>>> - There's a small problem with option#2: if DNS server is not configured
>>>> properly and handles the round-robin case in a way that it always return
>>>> the IP list in the same order, getByName() will never return the next ip,
>>>> - In order to overcome that, use getAllByName() instead, shuffle the
>>>> list and return the first IP.
>>>> 
>>>> All feedback if much appreciated.
>>>> 
>>>> Thanks,
>>>> Andor
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>> 


Re: Name resolution in StaticHostProvider

Posted by Andor Molnar <an...@cloudera.com>.
Flavio, did you have a chance to think about whether we should use
unchecked exception here?

Andor



On Sat, May 12, 2018 at 1:02 PM, Norbert Kalmar <nk...@cloudera.com>
wrote:

> Hi,
>
> In my opinion, when the client calls next(), and sees that this call could
> return with an UnknownHostException, it will assume next() is giving back
> the next available address in the list, whether it's a valid address or
> not.
> If next() would throw something like "NoValidAddressFoundException", then
> it would now it iterated the whole list, and there is no chance there is
> any valid address left.
>
> So the way I see it using checked exception helps here to understand next,
> and to signal it to the caller that it should prepare that there was a
> failed lookup, and do something about it.
>
> I would go with checked exception, but I don't know all that well the
> callers or the mechanic here.
>
> The rules of thumb I follow with checked vs unchecked exception is that "Is
> it reasonable for the caller to recover from this?" If yes, throw a checked
> exception. In this case, if next() only looks one element from the list,
> but there's an error with that address. then I think it is reasonable to
> expect the caller to recover.
>
> But I'm also open to pros and cons :)
>
> Regards,
> Norbert
>
>
>
> On Sat, May 12, 2018 at 4:09 PM Andor Molnar <an...@cloudera.com> wrote:
>
> > Hi team,
> >
> > Anyone else has thoughts about whether we should use checked or unchecked
> > exception here?
> >
> > Thanks,
> > Andor
> >
> >
> >
> >
> > On Thu, May 10, 2018 at 8:19 AM, Andor Molnar <an...@cloudera.com>
> wrote:
> >
> > > Interesting idea. What difference you think it could make comparing to
> > > checked?
> > >
> > >
> > >
> > > On Wed, May 9, 2018 at 8:01 AM, Flavio Junqueira <fp...@apache.org>
> wrote:
> > >
> > >> I'm actually now wondering whether we should be using an unchecked
> > >> exception instead. A lot of things have changed with exception
> handling
> > >> since we wrote this code base initially. An unchecked exception would
> > >> actually match better my current mental model of what that signature
> > should
> > >> look like.
> > >>
> > >> -Flavio
> > >>
> > >> > On 9 May 2018, at 16:44, Flavio Junqueira <fp...@apache.org> wrote:
> > >> >
> > >> > I like the idea of indicating to the application that there is
> > >> something wrong with the list of servers so that it has a chance to
> look
> > >> into it. With the current code in `ClientCnxn`, we will log at warn
> > level
> > >> and hope that someone sees it, but we are not really stopping the
> > client.
> > >> Throwing might actually be an improvement as it will output a log
> > message,
> > >> but I'm now wondering if we should propagate it all the way to the
> > >> application. Responding to myself, one reason for not doing it is that
> > it
> > >> is not a fatal error unless no server can be resolved.
> > >> >
> > >> > -Flavio
> > >> >
> > >> >> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
> > >> >>
> > >> >> Hi,
> > >> >>
> > >> >> Updating this thread, because the PR is still being review on
> GitHub.
> > >> >>
> > >> >> So, the reason why I refactored the original behaviour of
> > >> >> StaticHostProvider is that I believe that it's trying to do
> something
> > >> which
> > >> >> is not its responsibility. Please tell me if there's a good
> > historical
> > >> >> reason for that.
> > >> >>
> > >> >> My approach is giving the user the following to options:
> > >> >> 1- Use static IP addresses, if you don't want to deal with DNS
> > >> resolution
> > >> >> at all - we guarantee that no DNS logic will involved in this case
> at
> > >> all.
> > >> >> 2- Use DNS hostnames if you have a reliable DNS service for
> > resolution
> > >> >> (with HA, secondary servers, backups, etc.) - we must use DNS in
> the
> > >> right
> > >> >> way in this case e.g. do NOT cache IP address for a longer period
> > that
> > >> DNS
> > >> >> server allows to and re-resolve after TTL expries, because it's
> > >> mandatory
> > >> >> by protocol.
> > >> >>
> > >> >> My 2 cents here:
> > >> >> - the fix which was originally posted for re-resolution is a
> > >> workaround and
> > >> >> doesn't satisfy the requirement for #2,
> > >> >> - the solution is already built-in in JDK and DNS clients in the
> > right
> > >> way
> > >> >> - can't see a reason why we shouldn't use that
> > >> >>
> > >> >> I checked this in some other projects as well and found very
> similar
> > >> >> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
> > >> plugins
> > >> >> for that:
> > >> >> - Standard resolver uses java's built-in getByName().
> > >> >> - Qualified resolver still uses getByName(), but adds some logic to
> > >> avoid
> > >> >> incorrect re-resolutions and reverse IP lookups.
> > >> >>
> > >> >> Please let me know your thoughts.
> > >> >>
> > >> >> Regards,
> > >> >> Andor
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com>
> > >> wrote:
> > >> >>
> > >> >>> Hi Abe,
> > >> >>>
> > >> >>> Unfortunately we haven't got any feedback yet. What do you think
> of
> > >> >>> implementing Option #3?
> > >> >>>
> > >> >>> Regards,
> > >> >>> Andor
> > >> >>>
> > >> >>>
> > >> >>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <andor@cloudera.com
> >
> > >> wrote:
> > >> >>>
> > >> >>>> Did anybody happen to take a quick look by any chance?
> > >> >>>>
> > >> >>>> I don't want to push this too hard, because I know it's a time
> > >> consuming
> > >> >>>> topic to think about, but this is a blocker in 3.5 which has been
> > >> hanging
> > >> >>>> around for a while and any feedback would be extremely helpful to
> > >> close it
> > >> >>>> quickly.
> > >> >>>>
> > >> >>>> Thanks,
> > >> >>>> Andor
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <
> andor@cloudera.com
> > >
> > >> >>>> wrote:
> > >> >>>>
> > >> >>>>> Hi all,
> > >> >>>>>
> > >> >>>>> We need more eyes and brains on the following PR:
> > >> >>>>>
> > >> >>>>> https://github.com/apache/zookeeper/pull/451
> > >> >>>>>
> > >> >>>>> I added a comment few days ago about the way we currently do DNS
> > >> name
> > >> >>>>> resolution in this class and a suggestion on how we could
> simplify
> > >> things a
> > >> >>>>> little bit. We talked about it with Abe Fine, but we're a little
> > >> bit unsure
> > >> >>>>> and cannot get a conclusion. It would be extremely handy to get
> > more
> > >> >>>>> feedback from you.
> > >> >>>>>
> > >> >>>>> To add some colour to it, let me elaborate on the situation
> here:
> > >> >>>>>
> > >> >>>>> In general, the task that StaticHostProvider does is to get a
> list
> > >> of
> > >> >>>>> potentially unresolved InetSocketAddress objects, resolve them
> and
> > >> iterate
> > >> >>>>> over the resolved objects by calling next() method.
> > >> >>>>>
> > >> >>>>> *Option #1 (current logic)*
> > >> >>>>> - Resolve addresses with getAllByName() which returns a list of
> IP
> > >> >>>>> addresses associated with the address.
> > >> >>>>> - Cache all these IP's, shuffle them and iterate over.
> > >> >>>>> - If client is unable to connect to an IP, remove all IPs from
> the
> > >> list
> > >> >>>>> which the original servername was resolved to and re-resolve it.
> > >> >>>>>
> > >> >>>>> *Option #2 (getByName())*
> > >> >>>>> - Resolve address with getByName() instead which returns only
> the
> > >> first
> > >> >>>>> IP address of the name,
> > >> >>>>> - Do not cache IPs,
> > >> >>>>> - Shuffle the *names* and resolve with getByName() *every time*
> > when
> > >> >>>>> next() is called,
> > >> >>>>> - JDK's built-in caching will prevent name servers from being
> > >> flooded
> > >> >>>>> and will do the re-resolution automatically when cache expires,
> > >> >>>>> - Names with multiple IPs will be handled by DNS servers which
> (if
> > >> >>>>> configured properly) return IPs in different order - this is
> > called
> > >> DNS
> > >> >>>>> Round Robin -, so getByName() will return different IP on each
> > call.
> > >> >>>>>
> > >> >>>>> *Options #3*
> > >> >>>>> - There's a small problem with option#2: if DNS server is not
> > >> configured
> > >> >>>>> properly and handles the round-robin case in a way that it
> always
> > >> return
> > >> >>>>> the IP list in the same order, getByName() will never return the
> > >> next ip,
> > >> >>>>> - In order to overcome that, use getAllByName() instead, shuffle
> > the
> > >> >>>>> list and return the first IP.
> > >> >>>>>
> > >> >>>>> All feedback if much appreciated.
> > >> >>>>>
> > >> >>>>> Thanks,
> > >> >>>>> Andor
> > >> >>>>>
> > >> >>>>>
> > >> >>>>>
> > >> >>>>>
> > >> >>>>
> > >> >>>
> > >> >
> > >>
> > >>
> > >
> >
>

Re: Name resolution in StaticHostProvider

Posted by Norbert Kalmar <nk...@cloudera.com>.
Hi,

In my opinion, when the client calls next(), and sees that this call could
return with an UnknownHostException, it will assume next() is giving back
the next available address in the list, whether it's a valid address or not.
If next() would throw something like "NoValidAddressFoundException", then
it would now it iterated the whole list, and there is no chance there is
any valid address left.

So the way I see it using checked exception helps here to understand next,
and to signal it to the caller that it should prepare that there was a
failed lookup, and do something about it.

I would go with checked exception, but I don't know all that well the
callers or the mechanic here.

The rules of thumb I follow with checked vs unchecked exception is that "Is
it reasonable for the caller to recover from this?" If yes, throw a checked
exception. In this case, if next() only looks one element from the list,
but there's an error with that address. then I think it is reasonable to
expect the caller to recover.

But I'm also open to pros and cons :)

Regards,
Norbert



On Sat, May 12, 2018 at 4:09 PM Andor Molnar <an...@cloudera.com> wrote:

> Hi team,
>
> Anyone else has thoughts about whether we should use checked or unchecked
> exception here?
>
> Thanks,
> Andor
>
>
>
>
> On Thu, May 10, 2018 at 8:19 AM, Andor Molnar <an...@cloudera.com> wrote:
>
> > Interesting idea. What difference you think it could make comparing to
> > checked?
> >
> >
> >
> > On Wed, May 9, 2018 at 8:01 AM, Flavio Junqueira <fp...@apache.org> wrote:
> >
> >> I'm actually now wondering whether we should be using an unchecked
> >> exception instead. A lot of things have changed with exception handling
> >> since we wrote this code base initially. An unchecked exception would
> >> actually match better my current mental model of what that signature
> should
> >> look like.
> >>
> >> -Flavio
> >>
> >> > On 9 May 2018, at 16:44, Flavio Junqueira <fp...@apache.org> wrote:
> >> >
> >> > I like the idea of indicating to the application that there is
> >> something wrong with the list of servers so that it has a chance to look
> >> into it. With the current code in `ClientCnxn`, we will log at warn
> level
> >> and hope that someone sees it, but we are not really stopping the
> client.
> >> Throwing might actually be an improvement as it will output a log
> message,
> >> but I'm now wondering if we should propagate it all the way to the
> >> application. Responding to myself, one reason for not doing it is that
> it
> >> is not a fatal error unless no server can be resolved.
> >> >
> >> > -Flavio
> >> >
> >> >> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> Updating this thread, because the PR is still being review on GitHub.
> >> >>
> >> >> So, the reason why I refactored the original behaviour of
> >> >> StaticHostProvider is that I believe that it's trying to do something
> >> which
> >> >> is not its responsibility. Please tell me if there's a good
> historical
> >> >> reason for that.
> >> >>
> >> >> My approach is giving the user the following to options:
> >> >> 1- Use static IP addresses, if you don't want to deal with DNS
> >> resolution
> >> >> at all - we guarantee that no DNS logic will involved in this case at
> >> all.
> >> >> 2- Use DNS hostnames if you have a reliable DNS service for
> resolution
> >> >> (with HA, secondary servers, backups, etc.) - we must use DNS in the
> >> right
> >> >> way in this case e.g. do NOT cache IP address for a longer period
> that
> >> DNS
> >> >> server allows to and re-resolve after TTL expries, because it's
> >> mandatory
> >> >> by protocol.
> >> >>
> >> >> My 2 cents here:
> >> >> - the fix which was originally posted for re-resolution is a
> >> workaround and
> >> >> doesn't satisfy the requirement for #2,
> >> >> - the solution is already built-in in JDK and DNS clients in the
> right
> >> way
> >> >> - can't see a reason why we shouldn't use that
> >> >>
> >> >> I checked this in some other projects as well and found very similar
> >> >> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
> >> plugins
> >> >> for that:
> >> >> - Standard resolver uses java's built-in getByName().
> >> >> - Qualified resolver still uses getByName(), but adds some logic to
> >> avoid
> >> >> incorrect re-resolutions and reverse IP lookups.
> >> >>
> >> >> Please let me know your thoughts.
> >> >>
> >> >> Regards,
> >> >> Andor
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com>
> >> wrote:
> >> >>
> >> >>> Hi Abe,
> >> >>>
> >> >>> Unfortunately we haven't got any feedback yet. What do you think of
> >> >>> implementing Option #3?
> >> >>>
> >> >>> Regards,
> >> >>> Andor
> >> >>>
> >> >>>
> >> >>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com>
> >> wrote:
> >> >>>
> >> >>>> Did anybody happen to take a quick look by any chance?
> >> >>>>
> >> >>>> I don't want to push this too hard, because I know it's a time
> >> consuming
> >> >>>> topic to think about, but this is a blocker in 3.5 which has been
> >> hanging
> >> >>>> around for a while and any feedback would be extremely helpful to
> >> close it
> >> >>>> quickly.
> >> >>>>
> >> >>>> Thanks,
> >> >>>> Andor
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <andor@cloudera.com
> >
> >> >>>> wrote:
> >> >>>>
> >> >>>>> Hi all,
> >> >>>>>
> >> >>>>> We need more eyes and brains on the following PR:
> >> >>>>>
> >> >>>>> https://github.com/apache/zookeeper/pull/451
> >> >>>>>
> >> >>>>> I added a comment few days ago about the way we currently do DNS
> >> name
> >> >>>>> resolution in this class and a suggestion on how we could simplify
> >> things a
> >> >>>>> little bit. We talked about it with Abe Fine, but we're a little
> >> bit unsure
> >> >>>>> and cannot get a conclusion. It would be extremely handy to get
> more
> >> >>>>> feedback from you.
> >> >>>>>
> >> >>>>> To add some colour to it, let me elaborate on the situation here:
> >> >>>>>
> >> >>>>> In general, the task that StaticHostProvider does is to get a list
> >> of
> >> >>>>> potentially unresolved InetSocketAddress objects, resolve them and
> >> iterate
> >> >>>>> over the resolved objects by calling next() method.
> >> >>>>>
> >> >>>>> *Option #1 (current logic)*
> >> >>>>> - Resolve addresses with getAllByName() which returns a list of IP
> >> >>>>> addresses associated with the address.
> >> >>>>> - Cache all these IP's, shuffle them and iterate over.
> >> >>>>> - If client is unable to connect to an IP, remove all IPs from the
> >> list
> >> >>>>> which the original servername was resolved to and re-resolve it.
> >> >>>>>
> >> >>>>> *Option #2 (getByName())*
> >> >>>>> - Resolve address with getByName() instead which returns only the
> >> first
> >> >>>>> IP address of the name,
> >> >>>>> - Do not cache IPs,
> >> >>>>> - Shuffle the *names* and resolve with getByName() *every time*
> when
> >> >>>>> next() is called,
> >> >>>>> - JDK's built-in caching will prevent name servers from being
> >> flooded
> >> >>>>> and will do the re-resolution automatically when cache expires,
> >> >>>>> - Names with multiple IPs will be handled by DNS servers which (if
> >> >>>>> configured properly) return IPs in different order - this is
> called
> >> DNS
> >> >>>>> Round Robin -, so getByName() will return different IP on each
> call.
> >> >>>>>
> >> >>>>> *Options #3*
> >> >>>>> - There's a small problem with option#2: if DNS server is not
> >> configured
> >> >>>>> properly and handles the round-robin case in a way that it always
> >> return
> >> >>>>> the IP list in the same order, getByName() will never return the
> >> next ip,
> >> >>>>> - In order to overcome that, use getAllByName() instead, shuffle
> the
> >> >>>>> list and return the first IP.
> >> >>>>>
> >> >>>>> All feedback if much appreciated.
> >> >>>>>
> >> >>>>> Thanks,
> >> >>>>> Andor
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >
> >>
> >>
> >
>

Re: Name resolution in StaticHostProvider

Posted by Andor Molnar <an...@cloudera.com>.
Hi team,

Anyone else has thoughts about whether we should use checked or unchecked
exception here?

Thanks,
Andor




On Thu, May 10, 2018 at 8:19 AM, Andor Molnar <an...@cloudera.com> wrote:

> Interesting idea. What difference you think it could make comparing to
> checked?
>
>
>
> On Wed, May 9, 2018 at 8:01 AM, Flavio Junqueira <fp...@apache.org> wrote:
>
>> I'm actually now wondering whether we should be using an unchecked
>> exception instead. A lot of things have changed with exception handling
>> since we wrote this code base initially. An unchecked exception would
>> actually match better my current mental model of what that signature should
>> look like.
>>
>> -Flavio
>>
>> > On 9 May 2018, at 16:44, Flavio Junqueira <fp...@apache.org> wrote:
>> >
>> > I like the idea of indicating to the application that there is
>> something wrong with the list of servers so that it has a chance to look
>> into it. With the current code in `ClientCnxn`, we will log at warn level
>> and hope that someone sees it, but we are not really stopping the client.
>> Throwing might actually be an improvement as it will output a log message,
>> but I'm now wondering if we should propagate it all the way to the
>> application. Responding to myself, one reason for not doing it is that it
>> is not a fatal error unless no server can be resolved.
>> >
>> > -Flavio
>> >
>> >> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> Updating this thread, because the PR is still being review on GitHub.
>> >>
>> >> So, the reason why I refactored the original behaviour of
>> >> StaticHostProvider is that I believe that it's trying to do something
>> which
>> >> is not its responsibility. Please tell me if there's a good historical
>> >> reason for that.
>> >>
>> >> My approach is giving the user the following to options:
>> >> 1- Use static IP addresses, if you don't want to deal with DNS
>> resolution
>> >> at all - we guarantee that no DNS logic will involved in this case at
>> all.
>> >> 2- Use DNS hostnames if you have a reliable DNS service for resolution
>> >> (with HA, secondary servers, backups, etc.) - we must use DNS in the
>> right
>> >> way in this case e.g. do NOT cache IP address for a longer period that
>> DNS
>> >> server allows to and re-resolve after TTL expries, because it's
>> mandatory
>> >> by protocol.
>> >>
>> >> My 2 cents here:
>> >> - the fix which was originally posted for re-resolution is a
>> workaround and
>> >> doesn't satisfy the requirement for #2,
>> >> - the solution is already built-in in JDK and DNS clients in the right
>> way
>> >> - can't see a reason why we shouldn't use that
>> >>
>> >> I checked this in some other projects as well and found very similar
>> >> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
>> plugins
>> >> for that:
>> >> - Standard resolver uses java's built-in getByName().
>> >> - Qualified resolver still uses getByName(), but adds some logic to
>> avoid
>> >> incorrect re-resolutions and reverse IP lookups.
>> >>
>> >> Please let me know your thoughts.
>> >>
>> >> Regards,
>> >> Andor
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com>
>> wrote:
>> >>
>> >>> Hi Abe,
>> >>>
>> >>> Unfortunately we haven't got any feedback yet. What do you think of
>> >>> implementing Option #3?
>> >>>
>> >>> Regards,
>> >>> Andor
>> >>>
>> >>>
>> >>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com>
>> wrote:
>> >>>
>> >>>> Did anybody happen to take a quick look by any chance?
>> >>>>
>> >>>> I don't want to push this too hard, because I know it's a time
>> consuming
>> >>>> topic to think about, but this is a blocker in 3.5 which has been
>> hanging
>> >>>> around for a while and any feedback would be extremely helpful to
>> close it
>> >>>> quickly.
>> >>>>
>> >>>> Thanks,
>> >>>> Andor
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
>> >>>> wrote:
>> >>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> We need more eyes and brains on the following PR:
>> >>>>>
>> >>>>> https://github.com/apache/zookeeper/pull/451
>> >>>>>
>> >>>>> I added a comment few days ago about the way we currently do DNS
>> name
>> >>>>> resolution in this class and a suggestion on how we could simplify
>> things a
>> >>>>> little bit. We talked about it with Abe Fine, but we're a little
>> bit unsure
>> >>>>> and cannot get a conclusion. It would be extremely handy to get more
>> >>>>> feedback from you.
>> >>>>>
>> >>>>> To add some colour to it, let me elaborate on the situation here:
>> >>>>>
>> >>>>> In general, the task that StaticHostProvider does is to get a list
>> of
>> >>>>> potentially unresolved InetSocketAddress objects, resolve them and
>> iterate
>> >>>>> over the resolved objects by calling next() method.
>> >>>>>
>> >>>>> *Option #1 (current logic)*
>> >>>>> - Resolve addresses with getAllByName() which returns a list of IP
>> >>>>> addresses associated with the address.
>> >>>>> - Cache all these IP's, shuffle them and iterate over.
>> >>>>> - If client is unable to connect to an IP, remove all IPs from the
>> list
>> >>>>> which the original servername was resolved to and re-resolve it.
>> >>>>>
>> >>>>> *Option #2 (getByName())*
>> >>>>> - Resolve address with getByName() instead which returns only the
>> first
>> >>>>> IP address of the name,
>> >>>>> - Do not cache IPs,
>> >>>>> - Shuffle the *names* and resolve with getByName() *every time* when
>> >>>>> next() is called,
>> >>>>> - JDK's built-in caching will prevent name servers from being
>> flooded
>> >>>>> and will do the re-resolution automatically when cache expires,
>> >>>>> - Names with multiple IPs will be handled by DNS servers which (if
>> >>>>> configured properly) return IPs in different order - this is called
>> DNS
>> >>>>> Round Robin -, so getByName() will return different IP on each call.
>> >>>>>
>> >>>>> *Options #3*
>> >>>>> - There's a small problem with option#2: if DNS server is not
>> configured
>> >>>>> properly and handles the round-robin case in a way that it always
>> return
>> >>>>> the IP list in the same order, getByName() will never return the
>> next ip,
>> >>>>> - In order to overcome that, use getAllByName() instead, shuffle the
>> >>>>> list and return the first IP.
>> >>>>>
>> >>>>> All feedback if much appreciated.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Andor
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>
>> >
>>
>>
>

Re: Name resolution in StaticHostProvider

Posted by Andor Molnar <an...@cloudera.com>.
Interesting idea. What difference you think it could make comparing to
checked?



On Wed, May 9, 2018 at 8:01 AM, Flavio Junqueira <fp...@apache.org> wrote:

> I'm actually now wondering whether we should be using an unchecked
> exception instead. A lot of things have changed with exception handling
> since we wrote this code base initially. An unchecked exception would
> actually match better my current mental model of what that signature should
> look like.
>
> -Flavio
>
> > On 9 May 2018, at 16:44, Flavio Junqueira <fp...@apache.org> wrote:
> >
> > I like the idea of indicating to the application that there is something
> wrong with the list of servers so that it has a chance to look into it.
> With the current code in `ClientCnxn`, we will log at warn level and hope
> that someone sees it, but we are not really stopping the client. Throwing
> might actually be an improvement as it will output a log message, but I'm
> now wondering if we should propagate it all the way to the application.
> Responding to myself, one reason for not doing it is that it is not a fatal
> error unless no server can be resolved.
> >
> > -Flavio
> >
> >> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
> >>
> >> Hi,
> >>
> >> Updating this thread, because the PR is still being review on GitHub.
> >>
> >> So, the reason why I refactored the original behaviour of
> >> StaticHostProvider is that I believe that it's trying to do something
> which
> >> is not its responsibility. Please tell me if there's a good historical
> >> reason for that.
> >>
> >> My approach is giving the user the following to options:
> >> 1- Use static IP addresses, if you don't want to deal with DNS
> resolution
> >> at all - we guarantee that no DNS logic will involved in this case at
> all.
> >> 2- Use DNS hostnames if you have a reliable DNS service for resolution
> >> (with HA, secondary servers, backups, etc.) - we must use DNS in the
> right
> >> way in this case e.g. do NOT cache IP address for a longer period that
> DNS
> >> server allows to and re-resolve after TTL expries, because it's
> mandatory
> >> by protocol.
> >>
> >> My 2 cents here:
> >> - the fix which was originally posted for re-resolution is a workaround
> and
> >> doesn't satisfy the requirement for #2,
> >> - the solution is already built-in in JDK and DNS clients in the right
> way
> >> - can't see a reason why we shouldn't use that
> >>
> >> I checked this in some other projects as well and found very similar
> >> approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
> >> for that:
> >> - Standard resolver uses java's built-in getByName().
> >> - Qualified resolver still uses getByName(), but adds some logic to
> avoid
> >> incorrect re-resolutions and reverse IP lookups.
> >>
> >> Please let me know your thoughts.
> >>
> >> Regards,
> >> Andor
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com>
> wrote:
> >>
> >>> Hi Abe,
> >>>
> >>> Unfortunately we haven't got any feedback yet. What do you think of
> >>> implementing Option #3?
> >>>
> >>> Regards,
> >>> Andor
> >>>
> >>>
> >>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com>
> wrote:
> >>>
> >>>> Did anybody happen to take a quick look by any chance?
> >>>>
> >>>> I don't want to push this too hard, because I know it's a time
> consuming
> >>>> topic to think about, but this is a blocker in 3.5 which has been
> hanging
> >>>> around for a while and any feedback would be extremely helpful to
> close it
> >>>> quickly.
> >>>>
> >>>> Thanks,
> >>>> Andor
> >>>>
> >>>>
> >>>>
> >>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
> >>>> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> We need more eyes and brains on the following PR:
> >>>>>
> >>>>> https://github.com/apache/zookeeper/pull/451
> >>>>>
> >>>>> I added a comment few days ago about the way we currently do DNS name
> >>>>> resolution in this class and a suggestion on how we could simplify
> things a
> >>>>> little bit. We talked about it with Abe Fine, but we're a little bit
> unsure
> >>>>> and cannot get a conclusion. It would be extremely handy to get more
> >>>>> feedback from you.
> >>>>>
> >>>>> To add some colour to it, let me elaborate on the situation here:
> >>>>>
> >>>>> In general, the task that StaticHostProvider does is to get a list of
> >>>>> potentially unresolved InetSocketAddress objects, resolve them and
> iterate
> >>>>> over the resolved objects by calling next() method.
> >>>>>
> >>>>> *Option #1 (current logic)*
> >>>>> - Resolve addresses with getAllByName() which returns a list of IP
> >>>>> addresses associated with the address.
> >>>>> - Cache all these IP's, shuffle them and iterate over.
> >>>>> - If client is unable to connect to an IP, remove all IPs from the
> list
> >>>>> which the original servername was resolved to and re-resolve it.
> >>>>>
> >>>>> *Option #2 (getByName())*
> >>>>> - Resolve address with getByName() instead which returns only the
> first
> >>>>> IP address of the name,
> >>>>> - Do not cache IPs,
> >>>>> - Shuffle the *names* and resolve with getByName() *every time* when
> >>>>> next() is called,
> >>>>> - JDK's built-in caching will prevent name servers from being flooded
> >>>>> and will do the re-resolution automatically when cache expires,
> >>>>> - Names with multiple IPs will be handled by DNS servers which (if
> >>>>> configured properly) return IPs in different order - this is called
> DNS
> >>>>> Round Robin -, so getByName() will return different IP on each call.
> >>>>>
> >>>>> *Options #3*
> >>>>> - There's a small problem with option#2: if DNS server is not
> configured
> >>>>> properly and handles the round-robin case in a way that it always
> return
> >>>>> the IP list in the same order, getByName() will never return the
> next ip,
> >>>>> - In order to overcome that, use getAllByName() instead, shuffle the
> >>>>> list and return the first IP.
> >>>>>
> >>>>> All feedback if much appreciated.
> >>>>>
> >>>>> Thanks,
> >>>>> Andor
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >
>
>

Re: Name resolution in StaticHostProvider

Posted by Flavio Junqueira <fp...@apache.org>.
I'm actually now wondering whether we should be using an unchecked exception instead. A lot of things have changed with exception handling since we wrote this code base initially. An unchecked exception would actually match better my current mental model of what that signature should look like.

-Flavio

> On 9 May 2018, at 16:44, Flavio Junqueira <fp...@apache.org> wrote:
> 
> I like the idea of indicating to the application that there is something wrong with the list of servers so that it has a chance to look into it. With the current code in `ClientCnxn`, we will log at warn level and hope that someone sees it, but we are not really stopping the client. Throwing might actually be an improvement as it will output a log message, but I'm now wondering if we should propagate it all the way to the application. Responding to myself, one reason for not doing it is that it is not a fatal error unless no server can be resolved.
> 
> -Flavio
> 
>> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
>> 
>> Hi,
>> 
>> Updating this thread, because the PR is still being review on GitHub.
>> 
>> So, the reason why I refactored the original behaviour of
>> StaticHostProvider is that I believe that it's trying to do something which
>> is not its responsibility. Please tell me if there's a good historical
>> reason for that.
>> 
>> My approach is giving the user the following to options:
>> 1- Use static IP addresses, if you don't want to deal with DNS resolution
>> at all - we guarantee that no DNS logic will involved in this case at all.
>> 2- Use DNS hostnames if you have a reliable DNS service for resolution
>> (with HA, secondary servers, backups, etc.) - we must use DNS in the right
>> way in this case e.g. do NOT cache IP address for a longer period that DNS
>> server allows to and re-resolve after TTL expries, because it's mandatory
>> by protocol.
>> 
>> My 2 cents here:
>> - the fix which was originally posted for re-resolution is a workaround and
>> doesn't satisfy the requirement for #2,
>> - the solution is already built-in in JDK and DNS clients in the right way
>> - can't see a reason why we shouldn't use that
>> 
>> I checked this in some other projects as well and found very similar
>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
>> for that:
>> - Standard resolver uses java's built-in getByName().
>> - Qualified resolver still uses getByName(), but adds some logic to avoid
>> incorrect re-resolutions and reverse IP lookups.
>> 
>> Please let me know your thoughts.
>> 
>> Regards,
>> Andor
>> 
>> 
>> 
>> 
>> 
>> 
>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com> wrote:
>> 
>>> Hi Abe,
>>> 
>>> Unfortunately we haven't got any feedback yet. What do you think of
>>> implementing Option #3?
>>> 
>>> Regards,
>>> Andor
>>> 
>>> 
>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com> wrote:
>>> 
>>>> Did anybody happen to take a quick look by any chance?
>>>> 
>>>> I don't want to push this too hard, because I know it's a time consuming
>>>> topic to think about, but this is a blocker in 3.5 which has been hanging
>>>> around for a while and any feedback would be extremely helpful to close it
>>>> quickly.
>>>> 
>>>> Thanks,
>>>> Andor
>>>> 
>>>> 
>>>> 
>>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
>>>> wrote:
>>>> 
>>>>> Hi all,
>>>>> 
>>>>> We need more eyes and brains on the following PR:
>>>>> 
>>>>> https://github.com/apache/zookeeper/pull/451
>>>>> 
>>>>> I added a comment few days ago about the way we currently do DNS name
>>>>> resolution in this class and a suggestion on how we could simplify things a
>>>>> little bit. We talked about it with Abe Fine, but we're a little bit unsure
>>>>> and cannot get a conclusion. It would be extremely handy to get more
>>>>> feedback from you.
>>>>> 
>>>>> To add some colour to it, let me elaborate on the situation here:
>>>>> 
>>>>> In general, the task that StaticHostProvider does is to get a list of
>>>>> potentially unresolved InetSocketAddress objects, resolve them and iterate
>>>>> over the resolved objects by calling next() method.
>>>>> 
>>>>> *Option #1 (current logic)*
>>>>> - Resolve addresses with getAllByName() which returns a list of IP
>>>>> addresses associated with the address.
>>>>> - Cache all these IP's, shuffle them and iterate over.
>>>>> - If client is unable to connect to an IP, remove all IPs from the list
>>>>> which the original servername was resolved to and re-resolve it.
>>>>> 
>>>>> *Option #2 (getByName())*
>>>>> - Resolve address with getByName() instead which returns only the first
>>>>> IP address of the name,
>>>>> - Do not cache IPs,
>>>>> - Shuffle the *names* and resolve with getByName() *every time* when
>>>>> next() is called,
>>>>> - JDK's built-in caching will prevent name servers from being flooded
>>>>> and will do the re-resolution automatically when cache expires,
>>>>> - Names with multiple IPs will be handled by DNS servers which (if
>>>>> configured properly) return IPs in different order - this is called DNS
>>>>> Round Robin -, so getByName() will return different IP on each call.
>>>>> 
>>>>> *Options #3*
>>>>> - There's a small problem with option#2: if DNS server is not configured
>>>>> properly and handles the round-robin case in a way that it always return
>>>>> the IP list in the same order, getByName() will never return the next ip,
>>>>> - In order to overcome that, use getAllByName() instead, shuffle the
>>>>> list and return the first IP.
>>>>> 
>>>>> All feedback if much appreciated.
>>>>> 
>>>>> Thanks,
>>>>> Andor
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
> 


Re: Name resolution in StaticHostProvider

Posted by Flavio Junqueira <fp...@apache.org>.
I like the idea of indicating to the application that there is something wrong with the list of servers so that it has a chance to look into it. With the current code in `ClientCnxn`, we will log at warn level and hope that someone sees it, but we are not really stopping the client. Throwing might actually be an improvement as it will output a log message, but I'm now wondering if we should propagate it all the way to the application. Responding to myself, one reason for not doing it is that it is not a fatal error unless no server can be resolved.

-Flavio
 
> On 8 May 2018, at 16:06, Andor Molnar <an...@cloudera.com> wrote:
> 
> Hi,
> 
> Updating this thread, because the PR is still being review on GitHub.
> 
> So, the reason why I refactored the original behaviour of
> StaticHostProvider is that I believe that it's trying to do something which
> is not its responsibility. Please tell me if there's a good historical
> reason for that.
> 
> My approach is giving the user the following to options:
> 1- Use static IP addresses, if you don't want to deal with DNS resolution
> at all - we guarantee that no DNS logic will involved in this case at all.
> 2- Use DNS hostnames if you have a reliable DNS service for resolution
> (with HA, secondary servers, backups, etc.) - we must use DNS in the right
> way in this case e.g. do NOT cache IP address for a longer period that DNS
> server allows to and re-resolve after TTL expries, because it's mandatory
> by protocol.
> 
> My 2 cents here:
> - the fix which was originally posted for re-resolution is a workaround and
> doesn't satisfy the requirement for #2,
> - the solution is already built-in in JDK and DNS clients in the right way
> - can't see a reason why we shouldn't use that
> 
> I checked this in some other projects as well and found very similar
> approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
> for that:
> - Standard resolver uses java's built-in getByName().
> - Qualified resolver still uses getByName(), but adds some logic to avoid
> incorrect re-resolutions and reverse IP lookups.
> 
> Please let me know your thoughts.
> 
> Regards,
> Andor
> 
> 
> 
> 
> 
> 
> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com> wrote:
> 
>> Hi Abe,
>> 
>> Unfortunately we haven't got any feedback yet. What do you think of
>> implementing Option #3?
>> 
>> Regards,
>> Andor
>> 
>> 
>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com> wrote:
>> 
>>> Did anybody happen to take a quick look by any chance?
>>> 
>>> I don't want to push this too hard, because I know it's a time consuming
>>> topic to think about, but this is a blocker in 3.5 which has been hanging
>>> around for a while and any feedback would be extremely helpful to close it
>>> quickly.
>>> 
>>> Thanks,
>>> Andor
>>> 
>>> 
>>> 
>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
>>> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> We need more eyes and brains on the following PR:
>>>> 
>>>> https://github.com/apache/zookeeper/pull/451
>>>> 
>>>> I added a comment few days ago about the way we currently do DNS name
>>>> resolution in this class and a suggestion on how we could simplify things a
>>>> little bit. We talked about it with Abe Fine, but we're a little bit unsure
>>>> and cannot get a conclusion. It would be extremely handy to get more
>>>> feedback from you.
>>>> 
>>>> To add some colour to it, let me elaborate on the situation here:
>>>> 
>>>> In general, the task that StaticHostProvider does is to get a list of
>>>> potentially unresolved InetSocketAddress objects, resolve them and iterate
>>>> over the resolved objects by calling next() method.
>>>> 
>>>> *Option #1 (current logic)*
>>>> - Resolve addresses with getAllByName() which returns a list of IP
>>>> addresses associated with the address.
>>>> - Cache all these IP's, shuffle them and iterate over.
>>>> - If client is unable to connect to an IP, remove all IPs from the list
>>>> which the original servername was resolved to and re-resolve it.
>>>> 
>>>> *Option #2 (getByName())*
>>>> - Resolve address with getByName() instead which returns only the first
>>>> IP address of the name,
>>>> - Do not cache IPs,
>>>> - Shuffle the *names* and resolve with getByName() *every time* when
>>>> next() is called,
>>>> - JDK's built-in caching will prevent name servers from being flooded
>>>> and will do the re-resolution automatically when cache expires,
>>>> - Names with multiple IPs will be handled by DNS servers which (if
>>>> configured properly) return IPs in different order - this is called DNS
>>>> Round Robin -, so getByName() will return different IP on each call.
>>>> 
>>>> *Options #3*
>>>> - There's a small problem with option#2: if DNS server is not configured
>>>> properly and handles the round-robin case in a way that it always return
>>>> the IP list in the same order, getByName() will never return the next ip,
>>>> - In order to overcome that, use getAllByName() instead, shuffle the
>>>> list and return the first IP.
>>>> 
>>>> All feedback if much appreciated.
>>>> 
>>>> Thanks,
>>>> Andor
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>> 


Re: Name resolution in StaticHostProvider

Posted by Andor Molnar <an...@cloudera.com>.
Hi,

Updating this thread, because the PR is still being review on GitHub.

So, the reason why I refactored the original behaviour of
StaticHostProvider is that I believe that it's trying to do something which
is not its responsibility. Please tell me if there's a good historical
reason for that.

My approach is giving the user the following to options:
1- Use static IP addresses, if you don't want to deal with DNS resolution
at all - we guarantee that no DNS logic will involved in this case at all.
2- Use DNS hostnames if you have a reliable DNS service for resolution
(with HA, secondary servers, backups, etc.) - we must use DNS in the right
way in this case e.g. do NOT cache IP address for a longer period that DNS
server allows to and re-resolve after TTL expries, because it's mandatory
by protocol.

My 2 cents here:
- the fix which was originally posted for re-resolution is a workaround and
doesn't satisfy the requirement for #2,
- the solution is already built-in in JDK and DNS clients in the right way
- can't see a reason why we shouldn't use that

I checked this in some other projects as well and found very similar
approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
for that:
- Standard resolver uses java's built-in getByName().
- Qualified resolver still uses getByName(), but adds some logic to avoid
incorrect re-resolutions and reverse IP lookups.

Please let me know your thoughts.

Regards,
Andor






On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <an...@cloudera.com> wrote:

> Hi Abe,
>
> Unfortunately we haven't got any feedback yet. What do you think of
> implementing Option #3?
>
> Regards,
> Andor
>
>
> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <an...@cloudera.com> wrote:
>
>> Did anybody happen to take a quick look by any chance?
>>
>> I don't want to push this too hard, because I know it's a time consuming
>> topic to think about, but this is a blocker in 3.5 which has been hanging
>> around for a while and any feedback would be extremely helpful to close it
>> quickly.
>>
>> Thanks,
>> Andor
>>
>>
>>
>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <an...@cloudera.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> We need more eyes and brains on the following PR:
>>>
>>> https://github.com/apache/zookeeper/pull/451
>>>
>>> I added a comment few days ago about the way we currently do DNS name
>>> resolution in this class and a suggestion on how we could simplify things a
>>> little bit. We talked about it with Abe Fine, but we're a little bit unsure
>>> and cannot get a conclusion. It would be extremely handy to get more
>>> feedback from you.
>>>
>>> To add some colour to it, let me elaborate on the situation here:
>>>
>>> In general, the task that StaticHostProvider does is to get a list of
>>> potentially unresolved InetSocketAddress objects, resolve them and iterate
>>> over the resolved objects by calling next() method.
>>>
>>> *Option #1 (current logic)*
>>> - Resolve addresses with getAllByName() which returns a list of IP
>>> addresses associated with the address.
>>> - Cache all these IP's, shuffle them and iterate over.
>>> - If client is unable to connect to an IP, remove all IPs from the list
>>> which the original servername was resolved to and re-resolve it.
>>>
>>> *Option #2 (getByName())*
>>> - Resolve address with getByName() instead which returns only the first
>>> IP address of the name,
>>> - Do not cache IPs,
>>> - Shuffle the *names* and resolve with getByName() *every time* when
>>> next() is called,
>>> - JDK's built-in caching will prevent name servers from being flooded
>>> and will do the re-resolution automatically when cache expires,
>>> - Names with multiple IPs will be handled by DNS servers which (if
>>> configured properly) return IPs in different order - this is called DNS
>>> Round Robin -, so getByName() will return different IP on each call.
>>>
>>> *Options #3*
>>> - There's a small problem with option#2: if DNS server is not configured
>>> properly and handles the round-robin case in a way that it always return
>>> the IP list in the same order, getByName() will never return the next ip,
>>> - In order to overcome that, use getAllByName() instead, shuffle the
>>> list and return the first IP.
>>>
>>> All feedback if much appreciated.
>>>
>>> Thanks,
>>> Andor
>>>
>>>
>>>
>>>
>>
>