Posted to commits@tinkerpop.apache.org by GitBox <gi...@apache.org> on 2022/09/07 09:15:44 UTC

[GitHub] [tinkerpop] FlorianHockmann commented on pull request #1792: Implement Python Query/Connection Retry Logic

FlorianHockmann commented on PR #1792:
URL: https://github.com/apache/tinkerpop/pull/1792#issuecomment-1239127692

I agree that adding retry logic directly into the drivers isn't trivial. @kenhuuu mentions an important point that we have to consider if we want to add general retry logic:

> A possible improvement might be to test if the error should be retried. E.g. don't retry for non-recoverable errors like incorrect password.

Apart from non-recoverable errors, can't we also run into a situation where a traversal was already evaluated successfully on the server, but sending its results back to the driver failed due to some network problem? The driver has to treat that traversal as failed because it never received a successful response from the server, but simply sending it to the server again can be problematic if the traversal modified the graph. For example, we shouldn't resend an `addV()` step, as that would create duplicates.

[This article](https://devblogs.microsoft.com/azure-sql/configurable-retry-logic-for-microsoft-data-sqlclient/) explains some of the considerations that went into adding retry logic to Microsoft's .NET SQL client. That's also where I got the scenario with mutating traversals that I just described. They handle it by letting users configure which SQL commands should and shouldn't be retried, so that mutating commands can be excluded from the retry.

I think we can add retry logic to the drivers, but we should make it configurable for users. This means that users should be able to configure:
1. Whether they want to use our retry logic at all (so they can implement their own instead or not use any retry logic).
2. Which exceptions should be retried, e.g., transient network errors, but not failures from the server, or only specific failures from the server, but not a `FORBIDDEN` response, for example.
3. Which traversals should be retried, e.g., retry a `g.V().has()[...]` traversal, but not a mutating traversal.
4. The number of retries, the time to wait between retries, and so on (exponential backoff with or without random jitter could be added, but doesn't have to be, especially in the first version, in my opinion).
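To make the four options above more concrete, here is a minimal Python sketch of what such a configuration could look like. Everything in it is hypothetical: `RetryPolicy`, `submit_with_retry`, the field names, and the default values are made up for illustration and are not part of gremlinpython or of this PR. The `submit` callable stands in for whatever the driver uses to send a query.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_every_query(query: str) -> bool:
    """Default for option 3: treat every query as retryable."""
    return True


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical user-facing retry settings (not a gremlinpython API)."""
    # Option 1: switch the built-in retry logic on or off.
    enabled: bool = True
    # Option 2: only these exception types are retried (assumed defaults).
    retryable_exceptions: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError)
    # Option 3: user-supplied predicate deciding whether a query may be resent.
    should_retry_query: Callable[[str], bool] = retry_every_query
    # Option 4: number of retries and wait time between them.
    max_retries: int = 3
    base_delay_seconds: float = 0.1
    exponential: bool = False
    jitter: bool = False


def submit_with_retry(
    submit: Callable[[str], T],
    query: str,
    policy: RetryPolicy,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call submit(query), retrying according to the given policy."""
    attempt = 0
    while True:
        try:
            return submit(query)
        except policy.retryable_exceptions:
            # Any other exception type propagates without a retry.
            retries_left = attempt < policy.max_retries
            if not (policy.enabled and retries_left and policy.should_retry_query(query)):
                raise
            delay = policy.base_delay_seconds
            if policy.exponential:
                delay *= 2 ** attempt
            if policy.jitter:
                delay *= random.uniform(0.5, 1.5)
            sleep(delay)
            attempt += 1
```

A user who wants to opt out entirely would pass `RetryPolicy(enabled=False)`, and one who wants to protect mutating traversals would supply their own `should_retry_query`.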

Number 3 is probably a lot easier to implement than letting users specify whether they don't want to retry specific Gremlin steps or mutating traversals in general.
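As a rough illustration of what excluding specific steps or mutating traversals could mean in practice, the following sketch checks a traversal's step names against a block list. The names `MUTATING_STEPS` and `is_safe_to_retry` are hypothetical, the set of steps is an assumed, non-exhaustive list, and how the step names are obtained from a traversal (for example from its bytecode) is left out.

```python
from typing import FrozenSet, Iterable

# Assumed, non-exhaustive set of Gremlin steps that modify the graph.
MUTATING_STEPS: FrozenSet[str] = frozenset(
    {"addV", "addE", "property", "drop", "mergeV", "mergeE"}
)


def is_safe_to_retry(
    step_names: Iterable[str],
    blocked_steps: FrozenSet[str] = MUTATING_STEPS,
) -> bool:
    """Return True if none of the top-level steps is in the block list."""
    return not any(name in blocked_steps for name in step_names)


# A read-only traversal such as g.V().has(...).values(...) may be resent.
print(is_safe_to_retry(["V", "has", "values"]))
# A traversal containing addV() should not be resent.
print(is_safe_to_retry(["V", "addV", "property"]))
```

This only looks at top-level step names, so a mutating step nested inside a child traversal (e.g., within `sideEffect()`) would not be detected. That is one reason why such a check is harder to get right than it looks.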
