Posted to hdfs-dev@hadoop.apache.org by Raju Bairishetti <ra...@apache.org> on 2015/08/20 16:03:31 UTC

How to bring standby namenode to active mode in case of active host node down?

Hello devs,

We are hitting an issue in which neither of the namenodes is in the active
state.

Our Setup:
---------------
   We are using *automatic failover* for Hadoop Namenode HA (two nodes), with
*SshFencing* as the fencing mechanism.


*1) What happens if the active namenode host goes down?*

When this happens, neither namenode is active. The standby namenode tries to
fence the old active node, but SshFence fails because it cannot even connect
to the host. The standby node rejoins the election after a sleep interval,
tries to fence again, and fails again. It keeps retrying until fencing
succeeds.

*What is the best way to bring the standby node to active mode when the
active namenode host is down?*

Re: How to bring standby namenode to active mode in case of active host node down?

Posted by Raju Bairishetti <ra...@apache.org>.
It seems a fencing mechanism is still required on the Apache Hadoop 2.6 branch
to avoid the split-brain case. We are already using a 3-node journal node
quorum in production, but we are not able to bring the standby to active when
the active namenode host is down.

What is the best way to bring the standby namenode to active when the
active namenode host is down (with Hadoop 2.6)?
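
One option described in the HDFS HA documentation for QJM setups, sketched
here under assumptions and not verified on this cluster: keep sshfence as the
first fencing method and add a method that always succeeds, such as
shell(/bin/true), as the last entry. Fencing methods are tried in order, so
when SSH cannot reach the dead host the fallback lets the standby finish the
transition. This relies on the journal node quorum to allow only one writer.
The timeout value shown is illustrative.

```xml
<!-- hdfs-site.xml fragment (sketch). Methods are newline-separated and
     attempted in order until one reports success. -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence
shell(/bin/true)</value>
</property>

<!-- How long sshfence waits for the SSH connection, in milliseconds.
     30000 is illustrative; tune it to how quickly failover should proceed. -->
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>
```

With a fallback like this, the old active is not guaranteed to be stopped if
it is merely unreachable, so it may briefly serve stale reads until it
notices it can no longer write to the journal nodes.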




Re: How to bring standby namenode to active mode in case of active host node down?

Posted by Raju Bairishetti <ra...@gmail.com>.
Thanks @vinayakumarb for the reply.

We are using QJM (3 journal nodes) from the Apache Hadoop 2.6 branch:

        <property>
                <name>dfs.namenode.shared.edits.dir</name>
                <value>qjournal://<journalnode1>:<port>;<journalnode2>:<port>;<journalnode3>:<port>/<journalId></value>
        </property>




-- 
Thanks
Raju Bairishetti,

www.inmobi.com





Re: How to bring standby namenode to active mode in case of active host node down?

Posted by Vinayakumar B <vi...@apache.org>.
Actually, SSH fencing was introduced before QuorumJournal existed, to
ensure that there is only one active writer at all times.

If you are using QuorumJournal for shared edits, you don't need to
configure SSH fencing. QuorumJournal itself ensures that only one
writer is allowed at a time.

-Vinay

Re: How to bring standby namenode to active mode in case of active host node down?

Posted by Raju Bairishetti <ra...@gmail.com>.
Thanks in advance :).

We are using the Apache Hadoop 2.6 branch.


