Posted to dev@tinkerpop.apache.org by "Stephen Mallette (Jira)" <ji...@apache.org> on 2022/12/15 23:48:00 UTC

[jira] [Closed] (TINKERPOP-2813) Improve driver usability for cases where NoHostAvailableException is currently thrown

     [ https://issues.apache.org/jira/browse/TINKERPOP-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Mallette closed TINKERPOP-2813.
---------------------------------------
    Fix Version/s: 3.7.0
                   3.6.2
                   3.5.5
       Resolution: Done

> Improve driver usability for cases where NoHostAvailableException is currently thrown
> -------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2813
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2813
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: driver
>    Affects Versions: 3.5.4
>            Reporter: Stephen Mallette
>            Assignee: Stephen Mallette
>            Priority: Blocker
>             Fix For: 3.7.0, 3.6.2, 3.5.5
>
>
> A {{NoHostAvailableException}} occurs in two cases:
> 1. where the {{Client}} is initialized and a failure occurs on all {{Host}} instances configured
> 2. when the {{Client}} attempts to {{chooseConnection()}} to send a request and all {{Host}} instances configured are marked unavailable.
> In the first case, you can get a cause for the failure, which is helpful, but the inadequacy is that you only get the failure of the first {{Host}} to cause a problem. The second case is a bit worse because there you get no cause in the exception and it's a "fast fail": the exception is thrown as soon as the request is sent, with no pause to see if the {{Host}} comes back online. Moreover, a {{Host}} can be marked for failure for the infraction of just a single {{Connection}} that may have just encountered an intermittent network issue, thus quite quickly killing the entire {{ConnectionPool}} and turning 100s of requests per second into 100s of {{NoHostAvailableException}} per second. Note that you can also get an infraction for the pool just being overloaded with requests, which may signal that either the pool or the server is not sized right for the current workload. In either case, the {{NoHostAvailableException}} is a bit of a harsh way to deal with that and in any event doesn't quite give the user clues as to how to deal with it.
> All in all, this situation makes {{NoHostAvailableException}} hard to debug. This ticket is meant to help smooth some of these problems. Initial thoughts for improvements include better logging, ensuring that {{NoHostAvailableException}} is not thrown without a cause, preferring more specific exceptions to {{NoHostAvailableException}} in the first place, getting rid of "fast fails" in favor of longer pauses to see if a host can recover, and taking a softer stance on when a {{Host}} is actually considered "unavailable".
> Expecting to implement this without breaking API changes, though exceptions may shift around a bit, but will try to keep those to a minimum.
>  
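
To illustrate two of the ideas in the description (pausing to see if a host recovers instead of failing fast, and never throwing without a cause), here is a minimal sketch. It is not the driver's implementation and uses none of the gremlin-driver classes: SketchHost, NoHostAvailableSketchException, chooseHost, and the maxWaitMillis and pollMillis parameters are all hypothetical names invented for this example.

```java
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT gremlin-driver classes.
final class HostAvailabilitySketch {

    /** Minimal model of a host that can be marked unavailable with a reason. */
    static final class SketchHost {
        private final String name;
        private volatile Throwable lastFailure; // null means the host is available

        SketchHost(String name) { this.name = name; }

        void markUnavailable(Throwable reason) { lastFailure = reason; }
        void markAvailable() { lastFailure = null; }
        boolean isAvailable() { return lastFailure == null; }
        Throwable lastFailure() { return lastFailure; }
        String name() { return name; }
    }

    /** Stand-in exception whose constructor always requires a cause. */
    static final class NoHostAvailableSketchException extends RuntimeException {
        NoHostAvailableSketchException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Returns the first available host, polling for up to maxWaitMillis
     * before giving up rather than failing on the first check.
     */
    static SketchHost chooseHost(List<SketchHost> hosts, long maxWaitMillis, long pollMillis) {
        final long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
        while (true) {
            for (SketchHost host : hosts) {
                if (host.isAvailable()) return host;
            }
            if (System.nanoTime() - deadline >= 0) break;
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
        }

        // Report every recorded failure: the first as the cause, the rest as suppressed.
        NoHostAvailableSketchException ex = null;
        for (SketchHost host : hosts) {
            Throwable failure = host.lastFailure();
            if (failure == null) continue;
            if (ex == null) {
                ex = new NoHostAvailableSketchException(
                        "no host available after waiting up to " + maxWaitMillis + " ms", failure);
            } else {
                ex.addSuppressed(failure);
            }
        }
        if (ex == null) {
            // No failure was recorded (for example, an empty host list); still supply a cause.
            ex = new NoHostAvailableSketchException(
                    "no host available", new IllegalStateException("no failure recorded"));
        }
        throw ex;
    }
}
```

With this shape, a caller that catches the exception can inspect getCause() and getSuppressed() to see why each host was considered unavailable, and a host that recovers within the wait window is used without any exception being raised.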



--
This message was sent by Atlassian Jira
(v8.20.10#820010)