You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2020/11/05 10:34:03 UTC

[GitHub] [zookeeper] lvisterin opened a new pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

lvisterin opened a new pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524


   Setting `electionPortBindRetry` to 0 (as recommended by the documentation for the setup we are using) results in this code being called in a loop. 
   
   The desired behaviour is that in case of a bad DNS reply the server socket creation will fail and keep retrying until the hostname resolves. The current behaviour is that it keeps throwing unresolved address errors even though the DNS entry became correct.
   
   For more details and reproduction steps see the Jira issue at https://issues.apache.org/jira/browse/ZOOKEEPER-3991
   
   Please note that the contents of this PR are different from the patch I initially attached to the issue.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] symat commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
symat commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-723857220


   I saw you originally guarded the address recreation with the following condition:
   
   ``` 
   if (address.isUnresolved()) { ... }
   ```
   
   Why did you decide to remove this condition? Anyway, this code shouldn't run frequently, so I'm fine to recreate the address every time... but I'm curious: having the `if (address.isUnresolved())` didn't solve the problem?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] symat commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
symat commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-727803765


   thank you @lvisterin for your contribution!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] symat edited a comment on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
symat edited a comment on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-723986677


   > That said I think it is reasonable to say Zookeeper does not need to support this scenario and that my DNS just has to be fixed. Most users won't run into this. So let me know if you'd prefer for me to add it back
   
   no, I actually prefer it as it is (without the `isUnresolved` check). I was just curious :) Thanks for your explanation!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] eolivelli commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
eolivelli commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-722934617


   I have restarted the build
   https://ci-hadoop.apache.org/blue/organizations/jenkins/zookeeper-precommit-github-pr/detail/PR-1524/2/pipeline
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] lvisterin commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
lvisterin commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-722357182


   I have tried to reproduce the CI error:
   
   ```
   StandaloneDisabledTest.startSingleServerTest:112->startObservers:224->startServer:176 Error- Server started in Standalone Mode! ==> expected: <true> but was: <false>
   ```
   
   But on my machine on a clean working tree I do not get any test failures, using an identical test command. Is this a known issue or should I dig deeper?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] symat commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
symat commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-727802805


   I merged this to branch-3.6 and to master.
   The cherry-pick to branch-3.5 was not clean. I'm not really sure, but quickly checking the code I think this fix is not even needed on 3.5, we might already recreate the socket address there.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] lvisterin commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
lvisterin commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-725991120


   >however today I noticed the same issue appearing in Leader.java 's createServerSocket
   
   I have not been able to repro this one, but identified it is most likely caused by an issue on our DNS where the response flips between bad and good IP addresses (due to consistency issues). That way it was able to pass this fixed part, land in that other piece of code, and break again. The problem was that at that point things were so messed up that there are many other issues so a simple fix there does not help.
   
   Therefore I believe this PR is sufficient to make Zookeeper more resilient for users with a laggy DNS, and in cases where the DNS gives a random mix of good and bad replies it is better fixed on the user's side instead (which I did on my end).
   
   So for me this is good to be merged if you agree.
   
   Thanks for the review! 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] lvisterin commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
lvisterin commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-723870998


   > I saw you originally guarded the address recreation with the following condition:
   > 
   > ```
   > if (address.isUnresolved()) { ... }
   > ```
   > 
   > Why did you decide to remove this condition? Anyway, this code shouldn't run frequently, so I'm fine to recreate the address every time... but I'm curious: having the `if (address.isUnresolved())` didn't solve the problem?
   
   In the container environment I am running in, it is possible to get the old address of a certain Zookeeper instance when it is rescheduled. In such case the address will be "resolved", just to the wrong one, and it will again fail in a loop.
   
   That said I think it is reasonable to say Zookeeper does not need to support this scenario and that my DNS just has to be fixed. Most users won't run into this. So let me know if you'd prefer for me to add it back, because I do have other workarounds for this error case.
   
   > sure, go ahead, I'll approve the change and merge it. Let us know when you think you finished
   
   Have not gotten to committing this yet because I am dealing with some other strange things happening in our setup and need to make sure this patch is good. I'll leave a comment when I know more.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] lvisterin commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
lvisterin commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-722938335


   I have been running this code for a couple of days, however today I noticed the same issue appearing in `Leader.java` 's `createServerSocket`, so I am planning to fix that in this PR too if possible.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] symat commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
symat commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-726157835


   sure, let's merge it! 
   I think this is some kind of a bugfix, so we should merge it to all active branches, not only to master.
   (I'm a bit overloaded right now, but I can do it on Monday, if noone else is doing it before)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] asfgit closed pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] eolivelli commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
eolivelli commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-725994957


   @symat are you okay with this patch ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] symat commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
symat commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-723986677


   > That said I think it is reasonable to say Zookeeper does not need to support this scenario and that my DNS just has to be fixed. Most users won't run into this. So let me know if you'd prefer for me to add it back
   
   no, I actually prefer it as it is (without the `isUnresolved`) check. I was just curious :) Thanks for your explanation!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] symat commented on pull request #1524: ZOOKEEPER-3991: Refresh InetSocketAddress before binding quorum server socket

Posted by GitBox <gi...@apache.org>.
symat commented on pull request #1524:
URL: https://github.com/apache/zookeeper/pull/1524#issuecomment-723854767


   > I have been running this code for a couple of days, however today I noticed the same issue appearing in Leader.java 's createServerSocket, so I am planning to fix that in this PR too if possible.
   
   sure, go ahead, I'll approve the change and merge it. Let us know when you think you finished


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org