Posted to issues@hbase.apache.org by "Mikhail Antonov (JIRA)" <ji...@apache.org> on 2014/05/21 02:36:43 UTC
[jira] [Commented] (HBASE-10070) HBase read high-availability using timeline-consistent region replicas

    [ https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004159#comment-14004159 ] 

Mikhail Antonov commented on HBASE-10070:
-----------------------------------------

Guys, sorry for coming to this topic really late, but there are some considerations I'd like to bring up (I'm looking at what is currently committed in the branch).

I've been thinking about the client API changes needed for consistency (HBASE-10354) in light of the consensus-based effort, which aims at having multiple active replicas, and about how to make those two approaches work together smoothly so that multiple active replicas could be built on this solid foundation. Perhaps there is a more flexible alternative.

Instead of defining enum Consistency {strong, timeline}, hooking it into Get and Scan, and defining several possible internal strategies for how to send RPCs based on that ("primary timeout", "parallel", "parallel with delay"), maybe we can define a pluggable strategy for how to execute RPCs? Something similar to the HDFS FailoverProxyProvider, which can be defined in the client's config.

This way we could plug in:

 - "no-op" provider, to have behavior like what we have now in trunk
 - timeline provider, which would work as described here
 - provider which can work with multiple active replicas and round-robin between them if some of them fail.
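
To make the idea a bit more concrete, here is a rough sketch of what such a plug-in point could look like. Everything in it is hypothetical and for illustration only: the RpcExecutionStrategy interface, the three provider classes, the load() helper, and the 10 ms default are assumptions, not existing HBase APIs, and each replica is modeled as a plain Callable with index 0 standing in for the primary.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

public class RpcStrategySketch {

    /** Hypothetical plug-in point. calls.get(0) is the primary; the rest are replicas. */
    public interface RpcExecutionStrategy {
        <T> T execute(List<Callable<T>> calls) throws Exception;
    }

    /** "No-op" provider: only ever talks to the primary. */
    public static class PrimaryOnlyStrategy implements RpcExecutionStrategy {
        public <T> T execute(List<Callable<T>> calls) throws Exception {
            return calls.get(0).call();
        }
    }

    /** Timeline provider: give the primary a time budget, then fall back to the secondaries. */
    public static class TimelineStrategy implements RpcExecutionStrategy {
        private final long primaryTimeoutMs;

        public TimelineStrategy() { this(10); } // arbitrary default, assumption only
        public TimelineStrategy(long primaryTimeoutMs) { this.primaryTimeoutMs = primaryTimeoutMs; }

        public <T> T execute(List<Callable<T>> calls) throws Exception {
            ExecutorService pool = Executors.newCachedThreadPool();
            try {
                Future<T> primary = pool.submit(calls.get(0));
                try {
                    return primary.get(primaryTimeoutMs, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | ExecutionException e) {
                    if (calls.size() == 1) throw e;
                    // First secondary to answer successfully wins; the result may be stale.
                    return pool.invokeAny(calls.subList(1, calls.size()));
                }
            } finally {
                pool.shutdownNow(); // interrupts whatever is still running
            }
        }
    }

    /** Multi-active provider: rotate the starting replica, skip the ones that fail. */
    public static class RoundRobinStrategy implements RpcExecutionStrategy {
        private final AtomicInteger next = new AtomicInteger();

        public <T> T execute(List<Callable<T>> calls) throws Exception {
            int start = Math.floorMod(next.getAndIncrement(), calls.size());
            Exception last = null;
            for (int i = 0; i < calls.size(); i++) {
                try {
                    return calls.get((start + i) % calls.size()).call();
                } catch (Exception e) {
                    last = e; // remember the failure and try the next replica
                }
            }
            throw last; // every replica failed
        }
    }

    /** Instantiate a strategy by class name, as a client config value might specify it. */
    public static RpcExecutionStrategy load(String className) throws ReflectiveOperationException {
        return (RpcExecutionStrategy) Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> calls = List.of(
            () -> { Thread.sleep(500); return "primary"; }, // simulated slow primary
            () -> "secondary (possibly stale)");
        // In a real client the class name would come from configuration.
        RpcExecutionStrategy strategy = load(TimelineStrategy.class.getName());
        System.out.println(strategy.execute(calls));
    }
}
```

The point of the sketch is that Get and Scan would not need to know which policy is in effect: the caller hands over the per-replica calls, and the configured provider decides whom to ask and in what order. Note that this simplified timeline provider stops waiting for the primary once its budget expires, which is only one of the possible variants.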

Thoughts?

> HBase read high-availability using timeline-consistent region replicas
> ----------------------------------------------------------------------
>
>                 Key: HBASE-10070
>                 URL: https://issues.apache.org/jira/browse/HBASE-10070
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: HighAvailabilityDesignforreadsApachedoc.pdf
>
>
> In the present HBase architecture, it is hard, probably impossible, to satisfy constraints such as serving the 99th percentile of reads in under 10 ms. One of the major factors that affects this is the MTTR for regions. There are three phases in the MTTR process: detection, assignment, and recovery. Of these, detection is usually the longest and is presently on the order of 20-30 seconds. During this time, clients are not able to read the region data.
> However, some clients will be better served if regions are available for reads during recovery, so that they can do eventually consistent reads. This will help satisfy low-latency guarantees for the class of applications that can work with stale reads.
> To improve read availability, we propose a replicated read-only region serving design, also referred to as secondary regions or region shadows. Extending the current model of a region being opened for reads and writes in a single region server, the region will also be opened for reading in other region servers. The region server which hosts the region for reads and writes (as in the current case) will be declared PRIMARY, while 0 or more region servers might host the region as SECONDARY. There may be more than one secondary (replica count > 2).
> Will attach a design doc shortly which contains most of the details and some thoughts about development approaches. Reviews are more than welcome.
> We also have a proof-of-concept patch, which includes the master and region server side of the changes. Client-side changes will be coming soon as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)