Posted to issues@hbase.apache.org by "Ajay Jadhav (JIRA)" <ji...@apache.org> on 2017/09/01 00:13:02 UTC

[jira] [Comment Edited] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters

    [ https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142298#comment-16142298 ] 

Ajay Jadhav edited comment on HBASE-18477 at 9/1/17 12:12 AM:
--------------------------------------------------------------

[~ashish singhi]:
1. The main intention of the read-replica feature was to provide HA in case one of the AZs goes down. For now, it is up to the
client to decide which cluster to connect to. A load balancer in front of the active clusters that tracks the health of each one
is something we can consider in future work.

2 & 3. To make a cluster read-only, we added the config flag hbase.global.readonly.enabled=true, which can be set through hbase-site.xml.
This is how a user can determine whether the cluster is read-only.
Apart from that, doing this programmatically involves catching the read-only exception.
IMO, since the client always knows which cluster it is talking to, providing a programmatic way seems unnecessary.
Let me know what your concern is here.
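
As a rough illustration of the config-based check described in 2 & 3, here is a minimal sketch that reads the flag from hbase-site.xml-style content using only the JDK. The class ReadOnlyFlagSketch and its method are hypothetical names made up for this example and are not part of HBase; the sketch also assumes that an unset flag means the cluster is read-write.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not an HBase class): looks up one boolean property
// in hbase-site.xml-style content.
class ReadOnlyFlagSketch {
    // Property name as given in the comment above.
    static final String READ_ONLY_KEY = "hbase.global.readonly.enabled";

    // Returns true only if the property is present and its value is "true".
    static boolean isReadOnly(String siteXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            siteXml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                if (READ_ONLY_KEY.equals(childText(prop, "name"))) {
                    return Boolean.parseBoolean(childText(prop, "value"));
                }
            }
            return false; // assumption: flag not set means read-write
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse site XML", e);
        }
    }

    // Text of the first child element with the given tag, or null if absent.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0
                ? null
                : nodes.item(0).getTextContent().trim();
    }
}
```

A caller would pass the contents of its own hbase-site.xml to ReadOnlyFlagSketch.isReadOnly(...). This only inspects the local config file; it does not query the cluster itself.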



was (Author: ajayjadhav):
[~ashish singhi]:
1. The main intention of the read-replica feature was to provide HA in case one of the AZs goes down. For now, it is up to the
client to decide which cluster to connect to. A load balancer in front of the active clusters that tracks the health of each one
is something we can consider in future work.

2 & 3. The hbase-site.xml for a read-replica will have this key set: "hbase.emr.readreplica.enabled":"true"
Apart from that, doing this programmatically involves catching the read-only exception.
IMO, since the client always knows which cluster it is talking to, providing a programmatic way seems unnecessary.
Let me know what your concern is here.

Sure, I'll update the documentation.



> Umbrella JIRA for HBase Read Replica clusters
> ---------------------------------------------
>
>                 Key: HBASE-18477
>                 URL: https://issues.apache.org/jira/browse/HBASE-18477
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Zach York
>            Assignee: Zach York
>         Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope doc_v2.docx
>
>
> Recently, changes (such as HBASE-17437) have made it possible to run HBase with a root directory external to the cluster (such as in Amazon S3). This means that the data is stored outside of the cluster and remains accessible after the cluster has been terminated. One use case that is often asked about is pointing multiple clusters at one root directory (sharing the data) to provide read resiliency in the case of a cluster failure.
>  
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a read-replica HBase cluster that is pointed at the same root directory.
>  
> This requires:
> 1. Making the read-replica cluster read-only (no metadata operations or data operations).
> 2. Separating the hbase:meta table for each cluster (otherwise HBase gets confused when multiple clusters try to update the meta table with their IP addresses).
> 3. Adding refresh functionality for the meta table to ensure new metadata is picked up on the read-replica cluster.
> 4. Adding refresh functionality for HFiles for a given table to ensure new data is picked up on the read-replica cluster.
>  
> This can be used with any existing cluster that is backed by an external filesystem.
>  
> Please note that this feature is still quite manual (with the potential for automation later).
>  
> More information on this particular feature can be found here: https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)