
Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2011/06/10 01:00:59 UTC

[jira] [Created] (HBASE-3975) NoServerForRegionException stalls write pipeline

NoServerForRegionException stalls write pipeline
------------------------------------------------

                 Key: HBASE-3975
                 URL: https://issues.apache.org/jira/browse/HBASE-3975
             Project: HBase
          Issue Type: Bug
          Components: client
    Affects Versions: 0.90.3, 0.89.20100924, 0.92.0
            Reporter: Nicolas Spiegelberg
            Assignee: Nicolas Spiegelberg


When we process a batch of puts, the current algorithm basically goes like this:

1. Find all servers for the Put requests
2. Partition Puts by servers
3. Make requests
4. Collect success/error results

If we throw an IOException in step 1 or 2, we abort the whole batch operation.  In our case, this was a NoServerForRegionException due to region rebalancing.  However, the asynchronous put case normally has requests going to a wide variety of servers.  We should fail only the put requests that throw an IOException in step 1 and continue to try all the put requests that succeed at this stage.
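
The proposed behavior for steps 1 and 2 could be sketched roughly as follows. This is only an illustration, not the actual HBase client code: the ServerLocator interface, the Partition class, and the use of plain row-key strings in place of Put objects are all hypothetical names introduced for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartialBatchSketch {

    /** Hypothetical stand-in for the client's region/server lookup; not an HBase API. */
    public interface ServerLocator {
        String locate(String rowKey) throws IOException;
    }

    /** Outcome of steps 1 and 2: puts grouped by server, plus per-put lookup failures. */
    public static final class Partition {
        public final Map<String, List<String>> byServer = new LinkedHashMap<>();
        public final Map<String, IOException> failed = new LinkedHashMap<>();
    }

    /**
     * Looks up a server for every row key. A lookup failure is recorded
     * against that one row instead of aborting the whole batch.
     */
    public static Partition partition(List<String> rowKeys, ServerLocator locator) {
        Partition result = new Partition();
        for (String row : rowKeys) {
            try {
                String server = locator.locate(row);
                result.byServer.computeIfAbsent(server, s -> new ArrayList<>()).add(row);
            } catch (IOException e) {
                // Only this put fails; the rest of the batch keeps going.
                result.failed.put(row, e);
            }
        }
        return result;
    }
}
```

Steps 3 and 4 would then send the byServer groups as usual and merge the failed entries into the collected error results.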

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3975) NoServerForRegionException stalls write pipeline

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444266#comment-13444266 ] 

Lars Hofhansl commented on HBASE-3975:
--------------------------------------

@Nicolas: How do you envision the API here? Will the client get a list of successful Puts and then have to disentangle those from the unsuccessful ones and retry?
                
