You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tinkerpop.apache.org by GitBox <gi...@apache.org> on 2021/03/11 15:59:53 UTC

[GitHub] [tinkerpop] radu-iviniciu opened a new pull request #1404: TINKERPOP-2531: Make sure connection pool is restored and brought back to health if server is temporarily unavailable.

radu-iviniciu opened a new pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404


   Link to Jira ticket: https://issues.apache.org/jira/browse/TINKERPOP-2531
   
   DESCITPTION:
       If the server becomes unavailable for a period of time the connection pool might not be able to recreate a connection right after it closes. Even with a few retries with a backoff to recrete the connections, it might not be enough and the connections end up being removed from the _deadConnections array and never end up as valid ones in the _connections array.
   
       This change tries to account for that, and, despite the caller getting a ServerUnavailableException, if the caller chooses to continue using the GremlinClient it already has, the ConnectionPool will try to restore itself.
   
       Alternatives considered:
       - Once the server is deemed unavailable we could leave it up to the caller to handle it and potentially recreate the GremlinClient. In that case this change would not be needed and the only improvement might be to better document that ServerUnavailableException means that the connection will most likely not be restored even if the server becomes available again.
   
       Chose this approach over the alternative since it seems like a feature that a ConnectionPool should have. If done on the caller side it might end having to implement a logic very similar to the one in the ConnectionPool itself just centered around GremlinClients rather than websocket connections. 
   
   TESTING: 
   - Tested in our project setup with a Gremlin Client aimed at an AWS Neptune instance. Manually interrupted the connection / restarted or modified the instance to observe reconnecting behavior. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] FlorianHockmann merged pull request #1404: TINKERPOP-2531: Make sure connection pool is restored and brought back to health if server is temporarily unavailable.

Posted by GitBox <gi...@apache.org>.
FlorianHockmann merged pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] spmallette commented on pull request #1404: TINKERPOP-2531: Make sure connection pool is restored and brought back to health if server is temporarily unavailable.

Posted by GitBox <gi...@apache.org>.
spmallette commented on pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404#issuecomment-800992406


   @FlorianHockmann please be sure to add a changelog entry on merge. 
   
   VOTE +1


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] FlorianHockmann commented on pull request #1404: TINKERPOP-2531: Make sure connection pool is restored and brought back to health if server is temporarily unavailable.

Posted by GitBox <gi...@apache.org>.
FlorianHockmann commented on pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404#issuecomment-800094696


   The tests look good and your latest modification to the pool also makes sense to me.
   
   VOTE +1


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] radu-iviniciu commented on pull request #1404: TINKERPOP-2531: Make sure connection pool is restored and brought back to health if server is temporarily unavailable.

Posted by GitBox <gi...@apache.org>.
radu-iviniciu commented on pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404#issuecomment-800228354


   Thanks for the review @FlorianHockmann. 
   Is there an estimated timeline for the next minor release into which this change might make it ? That'll be 3.4.11 right ? 
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] FlorianHockmann commented on pull request #1404: TINKERPOP-2531: Make sure connection pool is restored and brought back to health if server is temporarily unavailable.

Posted by GitBox <gi...@apache.org>.
FlorianHockmann commented on pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404#issuecomment-801046798


   Sure, I can add a changelog entry for this when I merge it.
   
   > Is there an estimated timeline for the next minor release into which this change might make it ? That'll be 3.4.11 right ?
   
   3.4.11 will be the next version and yes, this should be included in that release. Since 3.4.11 will probably be released together with 3.5.0 and @spmallette [just commented](https://issues.apache.org/jira/browse/TINKERPOP-2534?focusedCommentId=17302827&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17302827) on an expected release data for that version, I'm just adding his comment here again:
   
   > we had discussed the end of this month but I think it will dip into April at this point unfortunately.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] radu-iviniciu commented on pull request #1404: TINKERPOP-2531: Make sure connection pool is restored and brought back to health if server is temporarily unavailable.

Posted by GitBox <gi...@apache.org>.
radu-iviniciu commented on pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404#issuecomment-801985937


   Thank you. Looking forward to it. 
   And thanks again for all your work going into this. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] radu-iviniciu commented on pull request #1404: TINKERPOP-2531: Make sure connection pool is restored and brought back to health if server is temporarily unavailable.

Posted by GitBox <gi...@apache.org>.
radu-iviniciu commented on pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404#issuecomment-797787485


   I added the tests and slightly loosened the condition in which we try to refill the pool with new connections to account for rare cases where some connections are restored and some not to avoid having the connection pool running on fewer connections that it is configured to.   
   
   Thanks for the quick review and looking forward to see this in action :) 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] FlorianHockmann commented on pull request #1404: TINKERPOP-2531: Make sure connection pool is restored and brought back to health if server is temporarily unavailable.

Posted by GitBox <gi...@apache.org>.
FlorianHockmann commented on pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404#issuecomment-797394617


   Hey @radu-iviniciu and thanks for your contribution to TinkerPop!
   
   The functionality to replace closed connections automatically in the background was only added recently (see [TINKERPOP-2288](https://issues.apache.org/jira/browse/TINKERPOP-2288)). So, there are definitely cases that aren't covered yet by it. The problem you describe in TINKERPOP-2531 is of course a case where the current retry logic isn't enough when the server is offline longer than the retry period as we then don't start another repair attempt. We originally thought about adding a scheduled repair job that checks the pool every minute or so and tries to repair it if necessary but didn't implement it yet because it would add a lot of complexity to the driver that I would like to avoid having if we don't really need it.
   
   I really like your solution to this problem as it's much simpler than a scheduled repair but it still solves the problem. It ensures that we will try again to repair the pool if a new request should be sent but the previous attempt failed and it shouldn't add unnecessary costs as we already have a mechanism to only perform one repair operation in parallel.
   
   I see only one thing missing in this PR: Could you please add a test case for the problem this is trying to solve? We already have a few [`ConnectionPoolTests`](https://github.com/apache/tinkerpop/blob/3.4-dev/gremlin-dotnet/test/Gremlin.Net.UnitTest/Driver/ConnectionPoolTests.cs) that use mocks to create the scenarios they want to test. If you have problems with creating a test case for this, then I can also give it a try.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tinkerpop] radu-iviniciu commented on pull request #1404: TINKERPOP-2531: Make sure connection pool is restored and brought back to health if server is temporarily unavailable.

Posted by GitBox <gi...@apache.org>.
radu-iviniciu commented on pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404#issuecomment-797712845


   Glad to hear that. 
   Sure thing, I'll try to push an update with tests for this as soon as possible.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org