You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@geode.apache.org by "bschuchardt (GitHub)" <gi...@apache.org> on 2019/04/02 20:50:12 UTC

[GitHub] [geode] bschuchardt opened pull request #3397: GEODE-6423 availability checks sometimes immediately initiate removal

Do not loop in trying to form a tcp/ip connection to a suspect unless
the next step is to remove the suspect from membership.  In this case
there will be another invocation of the same method that will take the
removal step next.

Thank you for submitting a contribution to Apache Geode.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

- [ ] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?

- [ ] Is your initial contribution a single, squashed commit?

- [ ] Does `gradlew build` run cleanly?

- [ ] Have you written or updated unit tests to verify your changes?

- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?

### Note:
Please ensure that once the PR is submitted, check Concourse for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.


[ Full content available at: https://github.com/apache/geode/pull/3397 ]
This message was relayed via gitbox.apache.org for notifications@geode.apache.org

[GitHub] [geode] jhuynh1 commented on pull request #3397: GEODE-6423 availability checks sometimes immediately initiate removal

Posted by "jhuynh1 (GitHub)" <gi...@apache.org>.
nitpick-typo retryIfConnectFails

[ Full content available at: https://github.com/apache/geode/pull/3397 ]
This message was relayed via gitbox.apache.org for notifications@geode.apache.org

[GitHub] [geode] bschuchardt commented on issue #3397: GEODE-6423 availability checks sometimes immediately initiate removal

Posted by "bschuchardt (GitHub)" <gi...@apache.org>.
The dunit failure doesn't reproduce.  I ran the test 500 times with no problems.

[ Full content available at: https://github.com/apache/geode/pull/3397 ]
This message was relayed via gitbox.apache.org for notifications@geode.apache.org

[GitHub] [geode] bschuchardt commented on pull request #3397: GEODE-6423 availability checks sometimes immediately initiate removal

Posted by "bschuchardt (GitHub)" <gi...@apache.org>.
Good question Ryan.  These log messages used to be at debug/trace level or didn't even exist.  We changed them info-level in response to some membership-service issues we kept running into.  Those are largely solved now and in order to keep the amount of logging under control I've changed them back.  This log message that I deleted was only added in mid-January.

[ Full content available at: https://github.com/apache/geode/pull/3397 ]
This message was relayed via gitbox.apache.org for notifications@geode.apache.org

[GitHub] [geode] mcmellawatt commented on pull request #3397: GEODE-6423 availability checks sometimes immediately initiate removal

Posted by "mcmellawatt (GitHub)" <gi...@apache.org>.
>From the PR title it sounds like the issue was that we aren't pinging the suspect member enough times before giving up (immediately initiating removal).  Why doesn't the giveupTime handle that problem?  We should continue to retry for up to the configured member timeout.  Since we are &&'ing all these conditions together, won't it still break if the timeout is exceeded even if retryIfConnectFails is true?  Forgive me if these are amateur questions, not super familiar with this code.

[ Full content available at: https://github.com/apache/geode/pull/3397 ]
This message was relayed via gitbox.apache.org for notifications@geode.apache.org

[GitHub] [geode] mcmellawatt commented on pull request #3397: GEODE-6423 availability checks sometimes immediately initiate removal

Posted by "mcmellawatt (GitHub)" <gi...@apache.org>.
I'm not super familiar with this area of the code so I can't do a great review in terms of the functional changes you made, but why are we able to increase verbosity levels of many log messages in this PR, and drop this message all together?  Were they not providing any value?

[ Full content available at: https://github.com/apache/geode/pull/3397 ]
This message was relayed via gitbox.apache.org for notifications@geode.apache.org

[GitHub] [geode] bschuchardt closed pull request #3397: GEODE-6423 availability checks sometimes immediately initiate removal

Posted by "bschuchardt (GitHub)" <gi...@apache.org>.
[ pull request closed by bschuchardt ]

[ Full content available at: https://github.com/apache/geode/pull/3397 ]
This message was relayed via gitbox.apache.org for notifications@geode.apache.org

[GitHub] [geode] bschuchardt commented on pull request #3397: GEODE-6423 availability checks sometimes immediately initiate removal

Posted by "bschuchardt (GitHub)" <gi...@apache.org>.
I'll fix that

[ Full content available at: https://github.com/apache/geode/pull/3397 ]
This message was relayed via gitbox.apache.org for notifications@geode.apache.org