Posted to hdfs-user@hadoop.apache.org by Thanh Do <th...@cs.wisc.edu> on 2011/05/05 18:28:28 UTC

experience with Backup Node

Hi all,

Has anybody deployed the Backup Node in their system?
I am curious about the impact of the Backup Node
on NameNode throughput.

To my understanding, the NameNode streams each edit
log operation to the BackupNode (by an RPC call),
and only returns once that operation has been applied
to the in-memory state of the Backup Node.

Will this RPC call slow down the NameNode a little bit?

Thanks,
Thanh

Re: experience with Backup Node

Posted by Thanh Do <th...@cs.wisc.edu>.
I see ...

Thanks for the useful feedback, Todd!

On Fri, May 6, 2011 at 7:34 AM, Todd Lipcon <to...@cloudera.com> wrote:

> Hi Thanh,
>
> No, I doubt that anybody is running BackupNode in production, since it's
> only part of 0.21, and in my opinion an incomplete implementation. A few of
> the deficiencies I'm aware of:
>
> - As you said, edits are transferred by synchronous RPC from the NN. As
> far as I know, there are no timeouts enabled on these RPCs, so if the
> BackupNode hangs, so will the primary. In the case of a BN crash, the
> primary will hang for many minutes before noticing.
> - The BN doesn't provide hot standby since it doesn't yet receive block
> reports.
>
> The fact that the RPCs are synchronous seems unavoidable if you want to be
> able to do a failover without any lost edits. But without timeouts, it's a
> bit scary.
>
> Some work will be going on in trunk to address high availability over the
> next several months - we'd definitely appreciate your expertise in failure
> injection, etc., being applied to the new code as it goes in!
>
> -Todd

Re: experience with Backup Node

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Thanh,

No, I doubt that anybody is running BackupNode in production, since it's
only part of 0.21, and in my opinion an incomplete implementation. A few of
the deficiencies I'm aware of:

- As you said, edits are transferred by synchronous RPC from the NN. As
far as I know, there are no timeouts enabled on these RPCs, so if the
BackupNode hangs, so will the primary. In the case of a BN crash, the
primary will hang for many minutes before noticing.
- The BN doesn't provide hot standby since it doesn't yet receive block
reports.

The fact that the RPCs are synchronous seems unavoidable if you want to be
able to do a failover without any lost edits. But without timeouts, it's a
bit scary.
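
To illustrate the point about timeouts, here is a minimal, generic Java sketch
of a synchronous call that is bounded by a timeout. It is not Hadoop's RPC
layer and not the BackupNode code: the class name BoundedSyncCall, the method
sendWithTimeout, and the Result enum are all hypothetical, and the "edit" is
just a Callable standing in for the RPC. The caller still waits for the
acknowledgement, so no edit is considered durable before the peer has applied
it, but a hung peer can only block the caller for a bounded time.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedSyncCall {
    /** Outcome of one attempt to push an edit to the backup (hypothetical). */
    public enum Result { ACKED, TIMED_OUT, FAILED }

    /**
     * Runs sendEdit on a worker thread and waits at most timeoutMillis for it.
     * The caller blocks until the call is acknowledged (synchronous semantics),
     * but gives up once the timeout expires instead of hanging indefinitely.
     */
    public static Result sendWithTimeout(ExecutorService pool,
                                         Callable<Void> sendEdit,
                                         long timeoutMillis)
            throws InterruptedException {
        Future<Void> pending = pool.submit(sendEdit);
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Result.ACKED;
        } catch (TimeoutException e) {
            // Peer did not answer in time: interrupt the stuck call and let
            // the caller decide what to do (for example, drop the backup).
            pending.cancel(true);
            return Result.TIMED_OUT;
        } catch (ExecutionException e) {
            // The call itself threw, e.g. a connection error.
            return Result.FAILED;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // Stand-ins for the RPC: one returns at once, one simulates a hang.
            Callable<Void> healthy = () -> null;
            Callable<Void> hung = () -> { Thread.sleep(60_000); return null; };

            System.out.println(sendWithTimeout(pool, healthy, 1000)); // ACKED
            System.out.println(sendWithTimeout(pool, hung, 200));     // TIMED_OUT
        } finally {
            pool.shutdownNow();
        }
    }
}
```

What the caller should do after TIMED_OUT (retry, or stop streaming to the
backup and continue alone) is a policy decision that this sketch leaves open.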

Some work will be going on in trunk to address high availability over the
next several months - we'd definitely appreciate your expertise in failure
injection, etc., being applied to the new code as it goes in!

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera