Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/12/06 23:22:58 UTC

[jira] [Commented] (DRILL-5098) Improving fault tolerance for connection between client and foreman node.

    [ https://issues.apache.org/jira/browse/DRILL-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15727060#comment-15727060 ] 

ASF GitHub Bot commented on DRILL-5098:
---------------------------------------

GitHub user sohami opened a pull request:

    https://github.com/apache/drill/pull/679

    DRILL-5098: Improving fault tolerance for connection between client a…

    …nd foreman node.
    
             Note: Adding tries config option in connection string.
                   Improving fault tolerance in Drill client when trying to make first connection with foreman.
                   The client will try to connect to min(tries, num_drillbits) unique drillbits until a successful
                   connection is established.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sohami/drill DRILL-5098

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/679.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #679
    
----
commit d13c6cc91b72b91b481bd1d428ca77775490a8cf
Author: Sorabh Hamirwasia <sh...@maprtech.com>
Date:   2016-12-01T22:58:00Z

    DRILL-5098: Improving fault tolerance for connection between client and foreman node.
             Note: Adding tries config option in connection string.
                   Improving fault tolerance in Drill client when trying to make first connection with foreman.
                   The client will try to connect to min(tries, num_drillbits) unique drillbits until a successful
                   connection is established.

----


> Improving fault tolerance for connection between client and foreman node.
> -------------------------------------------------------------------------
>
>                 Key: DRILL-5098
>                 URL: https://issues.apache.org/jira/browse/DRILL-5098
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Client - JDBC
>            Reporter: Sorabh Hamirwasia
>            Assignee: Sorabh Hamirwasia
>              Labels: doc-impacting
>             Fix For: 1.10
>
>
> DRILL-5015 added support for specifying multiple Drillbits in the connection string and randomly choosing one of them. Over time, some of the Drillbits specified in the connection string may die, and the client can fail to connect to the Foreman node if the random selection happens to pick a dead Drillbit.
> Even if ZooKeeper is used to select a random Drillbit from the registered ones, there is a small window in which the client selects a Drillbit and that Drillbit then goes down. The client will fail to connect to this Drillbit and error out.
> If we instead try multiple Drillbits (with a tries count configurable through the connection string), the probability of hitting this error window is reduced in both cases, improving fault tolerance. Further investigation also found that an authentication failure is thrown as a generic RpcException. We need to improve that as well and capture this case explicitly, since on an authentication failure we don't want to try multiple Drillbits.
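
The retry behaviour described above can be illustrated with a minimal sketch. This is not the code from the pull request: the class, the Connector interface, and both exception types are hypothetical names introduced only for illustration, and the real Drill client reports failures through its own RPC exception types. The sketch shuffles the unique Drillbit endpoints, attempts at most min(tries, num_drillbits) of them, retries only on connection failures, and lets an authentication failure propagate immediately.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;

public class ConnectRetrySketch {

    /** Hypothetical stand-in for a transport-level failure (for example, a dead Drillbit). */
    public static class ConnectFailedException extends Exception {
        public ConnectFailedException(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for an authentication failure; it is never retried. */
    public static class AuthFailedException extends Exception {
        public AuthFailedException(String msg) { super(msg); }
    }

    /** Hypothetical connector: opens a connection to one endpoint or throws. */
    public interface Connector {
        void connect(String endpoint) throws ConnectFailedException, AuthFailedException;
    }

    /**
     * Tries up to min(tries, number of unique drillbits) endpoints in random order.
     * Returns the endpoint that accepted the connection.
     */
    public static String connectWithRetries(List<String> drillbits, int tries,
                                            Connector connector, Random rng)
            throws ConnectFailedException, AuthFailedException {
        // De-duplicate so that every attempt targets a different Drillbit.
        List<String> candidates = new ArrayList<>(new LinkedHashSet<>(drillbits));
        if (candidates.isEmpty()) {
            throw new ConnectFailedException("no drillbits to connect to");
        }
        Collections.shuffle(candidates, rng);

        // Assumption: a non-positive tries value is treated as a single attempt.
        int attempts = Math.min(Math.max(tries, 1), candidates.size());
        ConnectFailedException last = null;
        for (int i = 0; i < attempts; i++) {
            String endpoint = candidates.get(i);
            try {
                connector.connect(endpoint);
                return endpoint; // first successful connection wins
            } catch (ConnectFailedException e) {
                last = e; // this Drillbit is unreachable; move on to the next one
            }
            // AuthFailedException is deliberately not caught: trying another
            // Drillbit with the same credentials would not help.
        }
        throw new ConnectFailedException(
            "all " + attempts + " attempts failed; last error: " + last.getMessage());
    }
}
```

Keeping the authentication failure as a separate exception type is what lets the loop tell a bad-credentials error apart from a dead node, which a single generic exception would not allow.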


