Posted to issues@hbase.apache.org by "Gregory Chanan (JIRA)" <ji...@apache.org> on 2012/12/14 23:46:13 UTC
[jira] [Assigned] (HBASE-6752) On region server failure, serve writes and timeranged reads during the log split
[ https://issues.apache.org/jira/browse/HBASE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gregory Chanan reassigned HBASE-6752:
-------------------------------------
Assignee: (was: Gregory Chanan)
> On region server failure, serve writes and timeranged reads during the log split
> --------------------------------------------------------------------------------
>
> Key: HBASE-6752
> URL: https://issues.apache.org/jira/browse/HBASE-6752
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Priority: Minor
>
> Opening for write on failure would mean:
> - Assign the region to a new regionserver. It marks the region as recovering
> -- specific exception returned to the client when we cannot serve.
> -- allow them to know where they stand. The exception can include some time information (failure started on: ...)
> -- allow them to go immediately on the right regionserver, instead of retrying or calling the region holding meta to get the new address
> => save network calls, lower the load on meta.
> - Do the split as today. Priority is given to the region server holding the new regions
> -- help to share the load balancing code: the split is done by a region server considered available for new regions
> -- help locality (the recovered edits are available on the region server) => lower the network usage
> - When the split is finished, we're done, as today
> - While the split is progressing, the region server can
> -- serve writes
> --- that's useful for all applications that need to write but not read immediately:
> --- whatever logs events to analyze them later
> --- opentsdb is a perfect example.
> -- serve reads if they have a compatible time range. For heavily used tables, it could be a help, because:
> --- we can expect to have a few minutes of data only (as it's loaded)
> --- the heaviest queries often accept a delay of a few minutes or more.
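
The behaviour proposed in the list above (accept writes, serve only time-range-compatible reads, and otherwise return an exception carrying the failure time and the new location) could be sketched roughly as follows. This is only an illustration under stated assumptions, not HBase code: `RegionRecoveringException`, `RecoveringRegion`, and the `oldestUnrecoveredTs` bound are hypothetical names, and the sketch assumes that every edit still sitting in the unsplit log has a timestamp at or after that bound and that reads use a half-open range `[minTs, maxTs)`.

```java
/** Hypothetical exception (not an HBase class) returned while a region is recovering. */
class RegionRecoveringException extends RuntimeException {
    final long failureStartedMs; // when the region server failure was detected
    final String newServer;      // where the client should send its next request

    RegionRecoveringException(long failureStartedMs, String newServer) {
        super("region recovering since " + failureStartedMs + ", now on " + newServer);
        this.failureStartedMs = failureStartedMs;
        this.newServer = newServer;
    }
}

/** Hypothetical model of a region opened for service before its log split is done. */
class RecoveringRegion {
    private final String hostingServer;
    private final long failureStartedMs;
    // Assumption: no edit in the unsplit log has a timestamp below this value.
    private final long oldestUnrecoveredTs;
    private boolean recovering = true;

    RecoveringRegion(String hostingServer, long failureStartedMs, long oldestUnrecoveredTs) {
        this.hostingServer = hostingServer;
        this.failureStartedMs = failureStartedMs;
        this.oldestUnrecoveredTs = oldestUnrecoveredTs;
    }

    /** Writes do not depend on the unrecovered edits, so they are always accepted. */
    boolean acceptsWrites() {
        return true;
    }

    /**
     * Accepts a read over [minTs, maxTs) only if it cannot touch unrecovered edits;
     * otherwise tells the client when the failure started and where the region lives.
     */
    void checkRead(long minTs, long maxTs) {
        if (minTs > maxTs) {
            throw new IllegalArgumentException("minTs must not exceed maxTs");
        }
        if (recovering && maxTs > oldestUnrecoveredTs) {
            throw new RegionRecoveringException(failureStartedMs, hostingServer);
        }
    }

    /** Called once the log split has finished and the recovered edits are replayed. */
    void markRecovered() {
        recovering = false;
    }
}
```

A client catching the exception could use `newServer` to go straight to the right region server instead of asking meta, and `failureStartedMs` to decide whether to narrow its time range or wait.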
> Some "What if":
> 1) the split fails
> => Retry until it works. As today. The only difference is that we serve writes. We need to know (as today) that the region has not recovered if we fail again.
> 2) the regionserver fails during the split
> => As in 1, and as today.
> 3) the regionserver fails after the split but before the state change to fully available.
> => New assign. More logs to split (the ones already done and the new ones).
> 4) the assignment fails
> => Retry until it works. As today.
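
The "retry until it works" handling in the cases above amounts to keeping the region marked as recovering across failed split attempts and flipping it to fully available only after a successful one. A minimal sketch, assuming a hypothetical `RecoveryTracker` class and a split attempt modelled as a `BooleanSupplier` (neither is an HBase API); a real implementation would presumably add backoff and handle reassignment:

```java
import java.util.function.BooleanSupplier;

/** Hypothetical tracker for a region that serves writes while its log is split. */
final class RecoveryTracker {
    enum State { RECOVERING, AVAILABLE }

    private State state = State.RECOVERING;
    private int attempts = 0;

    State state() {
        return state;
    }

    int attempts() {
        return attempts;
    }

    /**
     * Runs the split until one attempt succeeds. A failed attempt (false result or
     * runtime exception) leaves the region RECOVERING, so a later failure still
     * knows the region has not recovered.
     */
    void recover(BooleanSupplier splitAttempt) {
        while (state == State.RECOVERING) {
            attempts++;
            boolean ok;
            try {
                ok = splitAttempt.getAsBoolean();
            } catch (RuntimeException e) {
                ok = false; // treat a crashed attempt like a failed one
            }
            if (ok) {
                state = State.AVAILABLE;
            }
        }
    }
}
```

The state change happens only after a successful split, which matches case 3: if the server dies before that change, the region is still recovering and gets a new assignment.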
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira