Posted to mapreduce-user@hadoop.apache.org by Krishna Kishore Bonagiri <wr...@gmail.com> on 2013/05/06 10:57:50 UTC

Namenode going to safe mode on YARN

Hi,

  I have been running applications on my YARN cluster for around 20 days,
at about 5000 applications a day. I am getting the following error today.
Please let me know how I can avoid this. Is it happening because of a bug?

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
Cannot create file/1066/AppMaster.jar. Name node is in safe mode.
The reported blocks 4775 needs additional 880 blocks to reach the threshold
0.9990 of total blocks 5660. Safe mode will be turned off automatically.
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1786)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1737)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1719)
        at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:429)
        at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:271)
        at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40732)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)


Thanks,
Kishore
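
The arithmetic in that message is ordinary safe mode accounting rather
than a sign of corruption: with the 0.9990 threshold (the
dfs.namenode.safemode.threshold-pct setting in Hadoop 2.x), the namenode
waits until about 0.9990 * 5660 = 5655 of the 5660 known blocks have been
reported by datanodes, and 5655 - 4775 = 880 is the "additional 880
blocks" it asks for. Safe mode lifts on its own once the datanodes finish
reporting.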

Re: Namenode going to safe mode on YARN

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Nitin & Ted,

  Thanks for the replies.

 I don't know what my replication factor is; I don't seem to have set
anything in my configuration files. I run on a single-node cluster. My
data node has gone down and come back, and I didn't delete any of the
HDFS blocks.

 I know that the name node enters safe mode when HDFS is restarted, and
will leave it soon. Is it safe to execute the command to leave safe mode?
I mean, can something go wrong if we do it ourselves, given that it
wouldn't have collected the needed data and could not leave safe mode by
itself?

  And does the error I gave above offer any clue as to what I could do
better?

Thanks,
Kishore



On Mon, May 6, 2013 at 2:56 PM, Ted Xu <tx...@gopivotal.com> wrote:

> Hi Kishore,
>
> It should not be a bug. After restarting HDFS, namenode will enter safe
> mode until all needed data is collected. During safe mode, all update
> operations will fail.
>
> In some cases, as Nitin mentioned, namenode will never leave safe mode
> because it can't get enough data. In that case you may need to force name
> node leave safe mode.
>
> For more information, see
> http://hadoop.apache.org/docs/r2.0.4-alpha/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Safemode
> .
>
>
> On Mon, May 6, 2013 at 5:00 PM, Nitin Pawar <ni...@gmail.com>wrote:
>
>> What is your replication factor on hdfs?
>> Did any of your datanode go down recently and is not back in rotation?
>> Did you delete any hdfs blocks directly from datanodes?
>> On May 6, 2013 2:28 PM, "Krishna Kishore Bonagiri" <
>> write2kishore@gmail.com> wrote:
>>
>>> Hi,
>>>
>>>   I have been running application on my YARN cluster since around 20
>>> days, about 5000 applications a day. I am getting the following error
>>> today. Please let me know how can I avoid this, is this happening because
>>> of a bug?
>>>
>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
>>> Cannot create file/1066/AppMaster.jar. Name node is in safe mode.
>>> The reported blocks 4775 needs additional 880 blocks to reach the
>>> threshold 0.9990 of total blocks 5660. Safe mode will be turned off
>>> automatically.
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1786)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1737)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1719)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:429)
>>>          at
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:271)
>>>         at
>>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40732)
>>>         at
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>         at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
>>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
>>>
>>>
>>> Thanks,
>>> Kishore
>>>
>>
>
>
> --
> Regards,
> Ted Xu
>
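
One way to make the "can something go wrong" question concrete is to
query the safe mode state before forcing anything. Below is a minimal,
read-only client sketch against the Hadoop 2.x API; the class name is
made up, and it assumes core-site.xml/hdfs-site.xml are on the classpath
so that fs.defaultFS resolves to this cluster's namenode:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.hdfs.DistributedFileSystem;
  import org.apache.hadoop.hdfs.protocol.HdfsConstants.SafeModeAction;

  /** Asks the namenode whether it is in safe mode; changes nothing. */
  public class SafeModeStatus {
    public static void main(String[] args) throws Exception {
      // Loads core-site.xml/hdfs-site.xml from the classpath.
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);
      // Safe mode is an HDFS notion, so the FileSystem must be HDFS-backed.
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      // SAFEMODE_GET only reads state; SAFEMODE_LEAVE would force an exit.
      boolean inSafeMode = dfs.setSafeMode(SafeModeAction.SAFEMODE_GET);
      System.out.println("Namenode in safe mode: " + inSafeMode);
    }
  }

SAFEMODE_GET is read-only, so running it while the cluster is still
catching up is harmless; forcing an exit is the separate SAFEMODE_LEAVE
action that Nitin and Ted discuss elsewhere in this thread.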

Re: Namenode going to safe mode on YARN

Posted by Nitin Pawar <ni...@gmail.com>.
If you are running a single node and you have not changed the HDFS
configuration for replication, then the default is set to 3 and you will
just have one replica available.

As long as this is not a production environment, you can force the
namenode to leave safe mode. The only issue would be if any blocks are
missing from HDFS, but that will only be realised when you run fsck,
which should report the missing and under-replicated blocks.
On May 6, 2013 5:28 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com>
wrote:

> Hi Nithin & Ted,
>
>   Thanks for the replies.
>
>  I don't know what my replication factor is, I don't seem to have set
> anything in my configuration files. I run on a single node cluster. My data
> node has gone down and came back, and also I didn't delete any of the hdfs
> blocks.
>
>  I know that name node enter safe mode when HDFS is restarted, and will
> leave soon. Is it safe to execute command to leave safe mode? I mean, can
> something wrong happen if we do it ourselves? because it wouldn't have
> collected all the needed data and could not leave the safe mode by itself?
>
>   And, does the error I gave above indicate some clue as to what I could
> do better?
>
> Thanks,
> Kishore
>
>
>
> On Mon, May 6, 2013 at 2:56 PM, Ted Xu <tx...@gopivotal.com> wrote:
>
>> Hi Kishore,
>>
>> It should not be a bug. After restarting HDFS, namenode will enter safe
>> mode until all needed data is collected. During safe mode, all update
>> operations will fail.
>>
>> In some cases, as Nitin mentioned, namenode will never leave safe mode
>> because it can't get enough data. In that case you may need to force name
>> node leave safe mode.
>>
>> For more information, see
>> http://hadoop.apache.org/docs/r2.0.4-alpha/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Safemode
>> .
>>
>>
>> On Mon, May 6, 2013 at 5:00 PM, Nitin Pawar <ni...@gmail.com>wrote:
>>
>>> What is your replication factor on hdfs?
>>> Did any of your datanode go down recently and is not back in rotation?
>>> Did you delete any hdfs blocks directly from datanodes?
>>> On May 6, 2013 2:28 PM, "Krishna Kishore Bonagiri" <
>>> write2kishore@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>>   I have been running application on my YARN cluster since around 20
>>>> days, about 5000 applications a day. I am getting the following error
>>>> today. Please let me know how can I avoid this, is this happening because
>>>> of a bug?
>>>>
>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
>>>> Cannot create file/1066/AppMaster.jar. Name node is in safe mode.
>>>> The reported blocks 4775 needs additional 880 blocks to reach the
>>>> threshold 0.9990 of total blocks 5660. Safe mode will be turned off
>>>> automatically.
>>>>         at
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1786)
>>>>         at
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1737)
>>>>         at
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1719)
>>>>         at
>>>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:429)
>>>>          at
>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:271)
>>>>         at
>>>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40732)
>>>>         at
>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
>>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
>>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
>>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>         at
>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
>>>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
>>>>
>>>>
>>>> Thanks,
>>>> Kishore
>>>>
>>>
>>
>>
>> --
>> Regards,
>> Ted Xu
>>
>
>
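
To check for the damage Nitin describes, fsck can be run against the
namespace root before or after forcing the namenode out of safe mode.
Assuming a Hadoop 2.x install with the hdfs launcher on the PATH, for
example:

  hdfs fsck /                 # summary of missing/corrupt/under-replicated blocks
  hdfs fsck / -files -blocks  # per-file block detail if the summary shows problems

On a single-node cluster it can also help to set dfs.replication to 1 in
hdfs-site.xml: the default of 3 can never be satisfied by one datanode,
so every block stays permanently under-replicated.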

Re: Namenode going to safe mode on YARN

Posted by Ted Xu <tx...@gopivotal.com>.
Hi Kishore,

It should not be a bug. After HDFS is restarted, the namenode enters safe
mode until it has collected enough block reports from the datanodes.
During safe mode, all update operations will fail.

In some cases, as Nitin mentioned, the namenode will never leave safe
mode because it can't collect enough block reports. In that case you may
need to force the namenode to leave safe mode.

For more information, see
http://hadoop.apache.org/docs/r2.0.4-alpha/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Safemode
.


On Mon, May 6, 2013 at 5:00 PM, Nitin Pawar <ni...@gmail.com> wrote:

> What is your replication factor on hdfs?
> Did any of your datanode go down recently and is not back in rotation?
> Did you delete any hdfs blocks directly from datanodes?
> On May 6, 2013 2:28 PM, "Krishna Kishore Bonagiri" <
> write2kishore@gmail.com> wrote:
>
>> Hi,
>>
>>   I have been running application on my YARN cluster since around 20
>> days, about 5000 applications a day. I am getting the following error
>> today. Please let me know how can I avoid this, is this happening because
>> of a bug?
>>
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
>> Cannot create file/1066/AppMaster.jar. Name node is in safe mode.
>> The reported blocks 4775 needs additional 880 blocks to reach the
>> threshold 0.9990 of total blocks 5660. Safe mode will be turned off
>> automatically.
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1786)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1737)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1719)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:429)
>>          at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:271)
>>         at
>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40732)
>>         at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
>>
>>
>> Thanks,
>> Kishore
>>
>


-- 
Regards,
Ted Xu
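
The forced exit Ted mentions is a dfsadmin subcommand. Assuming a Hadoop
2.x install with the hdfs launcher on the PATH, the relevant invocations
are:

  hdfs dfsadmin -safemode get    # report whether the namenode is in safe mode
  hdfs dfsadmin -safemode wait   # block until safe mode turns off by itself
  hdfs dfsadmin -safemode leave  # force the namenode out of safe mode

Forcing an exit should only be needed when, as Ted says, the namenode
cannot collect enough block reports to reach its threshold on its own.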

Re: Namenode going to safe mode on YARN

Posted by Ted Xu <tx...@gopivotal.com>.
Hi Kishore,

It should not be a bug. After restarting HDFS, namenode will enter safe
mode until all needed data is collected. During safe mode, all update
operations will fail.

In some cases, as Nitin mentioned, namenode will never leave safe mode
because it can't get enough data. In that case you may need to force name
node leave safe mode.

For more information, see
http://hadoop.apache.org/docs/r2.0.4-alpha/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Safemode
.


On Mon, May 6, 2013 at 5:00 PM, Nitin Pawar <ni...@gmail.com> wrote:

> What is your replication factor on hdfs?
> Did any of your datanode go down recently and is not back in rotation?
> Did you delete any hdfs blocks directly from datanodes?
> On May 6, 2013 2:28 PM, "Krishna Kishore Bonagiri" <
> write2kishore@gmail.com> wrote:
>
>> Hi,
>>
>>   I have been running application on my YARN cluster since around 20
>> days, about 5000 applications a day. I am getting the following error
>> today. Please let me know how can I avoid this, is this happening because
>> of a bug?
>>
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
>> Cannot create file/1066/AppMaster.jar. Name node is in safe mode.
>> The reported blocks 4775 needs additional 880 blocks to reach the
>> threshold 0.9990 of total blocks 5660. Safe mode will be turned off
>> automatically.
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1786)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1737)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1719)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:429)
>>          at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:271)
>>         at
>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40732)
>>         at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
>>
>>
>> Thanks,
>> Kishore
>>
>


-- 
Regards,
Ted Xu

Re: Namenode going to safe mode on YARN

Posted by Nitin Pawar <ni...@gmail.com>.
What is your replication factor on HDFS?
Did any of your datanodes go down recently without coming back into rotation?
Did you delete any HDFS blocks directly from the datanodes?
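
For reference, each of these can be checked from the shell; a quick sketch,
assuming a standard Hadoop 2.x client configuration:

  # Effective replication factor (dfs.replication; the stock default is 3)
  hdfs getconf -confKey dfs.replication

  # Live and dead datanodes, plus per-node capacity and usage
  hdfs dfsadmin -report

  # Missing, corrupt, and under-replicated blocks across the namespace
  hdfs fsck /
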
On May 6, 2013 2:28 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com>
wrote:

> Hi,
>
>   I have been running applications on my YARN cluster for around 20 days,
> at about 5000 applications a day. Today I am getting the following error.
> Please let me know how I can avoid this; is it happening because of a bug?
>
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
> Cannot create file/1066/AppMaster.jar. Name node is in safe mode.
> The reported blocks 4775 needs additional 880 blocks to reach the
> threshold 0.9990 of total blocks 5660. Safe mode will be turned off
> automatically.
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1786)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1737)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1719)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:429)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:271)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40732)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
>
>
> Thanks,
> Kishore
>
