Posted to user@hama.apache.org by Anbarasan Murthy <an...@hcl.com> on 2013/02/15 08:15:37 UTC

High Availability & Fault Tolerance

Does HAMA support the following?

* Fault tolerance
* High Availability

What happens when a groom server goes down?

What happens when a BSPMaster goes down?


Thanks,
Anbu.




::DISCLAIMER::
----------------------------------------------------------------------------------------------------------------------------------------------------

The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only.
E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted,
lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents
(with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates.
Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the
views or opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification,
distribution and / or publication of this message without the prior written consent of authorized representative of
HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately.
Before opening any email and/or attachments, please check them for viruses and other defects.

----------------------------------------------------------------------------------------------------------------------------------------------------

Re: High Availability & Fault Tolerance

Posted by Suraj Menon <su...@apache.org>.
Hama's fault tolerance capability is immature at the moment.
If checkpointing is enabled, Hama writes the messages exchanged between
tasks to HDFS so that it can recover from them when a task fails. This is
supported only when using the Superstep API that we have introduced.
The BSPMaster is still a single point of failure. A GroomServer failure can
be recovered from in most scenarios, but not all.
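
To illustrate the idea of checkpointing messages at a superstep boundary,
here is a minimal toy sketch in plain Java. It does not use Hama's API: the
class CheckpointSketch and its methods are hypothetical names made up for
this example, and an in-memory map stands in for HDFS. It only shows the
general pattern (save the incoming messages, then re-read them if the task
has to be re-run), not how Hama implements it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of per-superstep message checkpointing. Hypothetical code, not Hama's.
class CheckpointSketch {
    // Stand-in for HDFS: superstep number -> messages saved for that superstep.
    private final Map<Integer, List<Integer>> checkpointStore = new HashMap<>();

    // Persist a copy of the messages a task is about to consume.
    void checkpoint(int superstep, List<Integer> messages) {
        checkpointStore.put(superstep, new ArrayList<>(messages));
    }

    // Reload the messages saved for a superstep, as a restarted task would.
    List<Integer> restore(int superstep) {
        List<Integer> saved = checkpointStore.get(superstep);
        if (saved == null) {
            throw new IllegalStateException("no checkpoint for superstep " + superstep);
        }
        return new ArrayList<>(saved);
    }

    // The work done in one superstep; here it simply sums the messages.
    static int compute(List<Integer> messages) {
        int sum = 0;
        for (int m : messages) {
            sum += m;
        }
        return sum;
    }

    // Run one superstep. If failFirstAttempt is true, the first attempt is
    // treated as lost and the result is computed from the checkpoint instead.
    int runSuperstep(int superstep, List<Integer> incoming, boolean failFirstAttempt) {
        checkpoint(superstep, incoming);
        if (failFirstAttempt) {
            // Simulated task failure: ignore the in-memory messages and
            // recover them from the checkpoint store.
            List<Integer> recovered = restore(superstep);
            return compute(recovered);
        }
        return compute(incoming);
    }

    public static void main(String[] args) {
        CheckpointSketch sketch = new CheckpointSketch();
        List<Integer> messages = Arrays.asList(1, 2, 3);
        System.out.println(sketch.runSuperstep(0, messages, false)); // prints 6
        System.out.println(sketch.runSuperstep(1, messages, true));  // prints 6
    }
}
```

In this sketch the recovered run gives the same result as the normal run
because the saved messages are all the task needs. A real system would also
have to handle task state, and a failed master is a separate problem.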

I would be glad to help you if you intend to work on this. Our focus is
currently on messaging scalability, then on YARN, before we get back to
fault tolerance.

Thanks,
Suraj

On Fri, Feb 15, 2013 at 2:15 AM, Anbarasan Murthy <an...@hcl.com> wrote:

>
> Does HAMA support the following?
>
> * Fault tolerance
> * High Availability
>
> What happens when a groom server goes down?
>
> What happens when a BSPMaster goes down?
>
>
> Thanks,
> Anbu.