Posted to user@hbase.apache.org by Austin Heyne <ah...@ccri.com> on 2018/08/30 15:30:17 UTC

HA master on EMR

HBase on EMR is fairly reliable but is still subject to hardware
failures (which have happened to me before). Is there a best practice for
adding backup masters to an EMR cluster?

I know this isn't technically a supported feature from AWS, but we're
already heavily invested in HBase on EMR and would like to investigate
options for mitigating the risk of a master failure. In EMR, if the master
dies the entire cluster is terminated, so we need failover for HBase,
Hadoop/HDFS and ZooKeeper. The one idea that I've had is to create a
second (or third) EMR cluster with its HBase, ZooKeeper and Hadoop/HDFS
configuration pointed to the primary cluster. This would in effect add
the RegionServers and DataNodes to the primary cluster. I know that
losing 1/3 to 1/2 of your DataNodes would most likely mean you would
lose some WALs, but re-ingesting the last day's worth of data is an
acceptable trade-off for us in exchange for not having downtime.

I realize this is a slightly crazy idea and that using something like
Kubernetes is the 'correct' solution, but I have to work with what we
have and mitigate possible issues. My question is: are there any big
issues that anyone would foresee us having with this idea?

Thanks for the feedback,
Austin

Re: HA master on EMR

Posted by Austin Heyne <ah...@ccri.com>.

I have played around with Read Replicas a fair bit, and that might be a
good enough stopgap should something go wrong. Ideally we wouldn't lose
the primary cluster, but that may not be reasonable given our
configuration.

Thanks,
Austin

On 08/31/2018 03:50 PM, Zach York wrote:
> Hey Austin,
>
> It sounds like you are asking about read availability in the case where a
> primary cluster becomes unhealthy?
>
> In that case, you should look at the HBase on S3 Read Replica clusters
> feature[1][2]. This allows for highly available reads if the primary
> cluster becomes unhealthy.
>
> Let me know if I misinterpreted your ask!
>
> Thanks,
> Zach
>
> [1]
> https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hbase-s3.html#emr-hbase-s3-read-replica
> [2]
> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
>

-- 
Austin L. Heyne

Re: HA master on EMR

Posted by Zach York <zy...@gmail.com>.

Hey Austin,

It sounds like you are asking about read availability in the case where a
primary cluster becomes unhealthy?

In that case, you should look at the HBase on S3 Read Replica clusters
feature[1][2]. This allows for highly available reads if the primary
cluster becomes unhealthy.

Let me know if I misinterpreted your ask!

Thanks,
Zach

[1]
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hbase-s3.html#emr-hbase-s3-read-replica
[2]
https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
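
For illustration only, here is a minimal Python sketch of the configuration
classifications a read-replica cluster might be launched with. It is not part
of the original reply. The property names follow my reading of the EMR release
guide linked in [1] and should be checked against it; the bucket path and the
function name are hypothetical.

```python
import json


def read_replica_configurations(root_dir: str) -> list:
    """Build EMR configuration classifications for an HBase read-replica cluster.

    root_dir is assumed to be the same S3 root directory that the primary
    cluster uses for HBase on S3.
    """
    if not root_dir.startswith("s3://"):
        raise ValueError("root_dir must be an s3:// URI shared with the primary cluster")
    return [
        {
            # Cluster-level HBase settings specific to EMR.
            "Classification": "hbase",
            "Properties": {
                "hbase.emr.storageMode": "s3",
                "hbase.emr.readreplica.enabled": "true",
            },
        },
        {
            # Point the replica at the primary cluster's root directory.
            "Classification": "hbase-site",
            "Properties": {"hbase.rootdir": root_dir},
        },
    ]


if __name__ == "__main__":
    # Hypothetical bucket and prefix; substitute the primary cluster's rootdir.
    configs = read_replica_configurations("s3://my-bucket/my-hbase-rootdir")
    print(json.dumps(configs, indent=2))
```

The printed JSON could be saved to a file and supplied as the cluster's
configurations at launch. Note that this only covers read availability; it
does not make the primary cluster's master highly available.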

