Posted to common-user@hadoop.apache.org by springring <sp...@126.com> on 2011/04/08 08:20:48 UTC

Fw: start-up with safe mode?


>> Hi,
>> 
>>   When I start up Hadoop, the namenode log shows "STATE* Safe mode ON" (see below). How can I turn it off?
>>   I can turn it off with the command "hadoop dfsadmin -safemode leave" after startup, but how can I start HDFS
>>   already out of safe mode?
>>   Thanks.
>> 
>> Ring
>> 
>> the startup log________________________________________________________________
>> 
>> 2011-04-08 11:58:20,655 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
>> 2011-04-08 11:58:20,657 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>> 2011-04-08 11:58:20,678 INFO org.apache.hadoop.hdfs.util.GSet: VM type       = 32-bit
>> 2011-04-08 11:58:20,678 INFO org.apache.hadoop.hdfs.util.GSet: 2% max memory = 17.77875 MB
>> 2011-04-08 11:58:20,678 INFO org.apache.hadoop.hdfs.util.GSet: capacity      = 2^22 = 4194304 entries
>> 2011-04-08 11:58:20,678 INFO org.apache.hadoop.hdfs.util.GSet: recommended=4194304, actual=4194304
>> 2011-04-08 11:58:20,697 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hdfs
>> 2011-04-08 11:58:20,697 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
>> 2011-04-08 11:58:20,697 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
>> 2011-04-08 11:58:20,701 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=1000
>> 2011-04-08 11:58:20,701 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
>> 2011-04-08 11:58:20,976 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
>> 2011-04-08 11:58:21,001 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 17
>> 2011-04-08 11:58:21,007 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
>> 2011-04-08 11:58:21,007 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 1529 loaded in 0 seconds.
>> 2011-04-08 11:58:21,007 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /tmp/hadoop-hdfs/dfs/name/current/edits of size 4 edits # 0 loaded in 0 seconds.
>> 2011-04-08 11:58:21,009 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 1529 saved in 0 seconds.
>> 2011-04-08 11:58:21,022 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 1529 saved in 0 seconds.
>> 2011-04-08 11:58:21,032 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 339 msecs
>> 2011-04-08 11:58:21,036 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode ON.
>> The reported blocks 0 needs additional 2 blocks to reach the threshold 0.9990 of total blocks 3. Safe mode will be turned off automatically.
>>

Re: start-up with safe mode?

Posted by Matthew Foley <ma...@yahoo-inc.com>.
Hi Ring,

The purpose of starting up in safe mode is to prevent replication thrashing before and during the Initial Block Reports from all the datanodes.  Consider this thought experiment:
- a cluster with 100 datanodes and replication 3
- so any pair of datanodes has only approximately 2% overlap in block content
- the datanodes don't all start up exactly simultaneously
- When the first two datanodes start up, if the cluster weren't in safe mode, 98% of their blocks would be declared under-replicated, and they would immediately be asked to replicate them ALL to each other!
- When the third datanode starts up, it gets even worse, since it also has only a 2% overlap with each of the other two.
- It keeps getting worse until many of the datanodes have registered; only then does the rate at which new blocks appear with just one known replica slow down.
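
The 2% and 98% figures in the thought experiment can be checked with a little arithmetic. The sketch below assumes replicas are placed uniformly at random on distinct datanodes, which is a simplification (real HDFS placement is rack-aware); the function name is hypothetical and only for illustration.

```python
from fractions import Fraction


def pairwise_overlap(num_datanodes: int, replication: int) -> Fraction:
    """Fraction of one datanode's blocks that also have a replica on one
    specific other datanode.

    Assumption: replicas land on distinct datanodes chosen uniformly at
    random. Given a block on node A, its other (replication - 1) replicas
    are spread over the remaining (num_datanodes - 1) nodes.
    """
    return Fraction(replication - 1, num_datanodes - 1)


overlap = pairwise_overlap(num_datanodes=100, replication=3)
print(f"overlap between two datanodes: {float(overlap):.1%}")      # 2.0%
print(f"blocks seen with one replica:  {1 - float(overlap):.1%}")  # 98.0%
```

With 100 datanodes and replication 3 the overlap is 2/99, so roughly 98% of the blocks on the first two datanodes would look under-replicated if the namenode acted on them right away.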

While safe mode is on, the namenode doesn't attempt to change anything in the namespace or blockspace, including which datanodes have replicas of which blocks, although it does accept Block Reports from the datanodes telling it which blocks they have.  So the replication storm described above doesn't happen during safe mode.  All (or almost all) of the datanodes are allowed to register and give their Block Reports.  THEN the namenode scans for blocks that truly are under-replicated.  It gives a 30-second warning, then leaves safe mode and generates appropriate replication requests to fix any blocks now known to be under-replicated.

That said, you can modify this behavior with the configuration parameters
dfs.namenode.safemode.threshold-pct
and dfs.namenode.safemode.extension
These default to 0.999 (100% minus delta) and 30000 ms (30 sec), respectively (defined in DFSConfigKeys).
Search for them in the docs for details.
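
As a rough illustration of how the threshold relates to the message in the startup log ("needs additional 2 blocks to reach the threshold 0.9990 of total blocks 3"), here is a minimal sketch. It assumes the threshold is truncated to a whole number of blocks; that assumption is consistent with the numbers in the log, but the function itself is hypothetical and not the namenode's actual code.

```python
def blocks_still_needed(reported: int, total: int,
                        threshold_pct: float = 0.999) -> int:
    """How many more blocks must be reported before the threshold is met.

    Assumption: the block threshold is total * threshold_pct truncated to
    an integer, which matches the figures in the startup log above.
    """
    block_threshold = int(total * threshold_pct)
    return max(block_threshold - reported, 0)


# Values from the startup log: 0 of 3 blocks reported, threshold 0.999.
print(blocks_still_needed(reported=0, total=3))  # 2

# With the threshold set to 0, nothing needs to be reported at all.
print(blocks_still_needed(reported=0, total=3, threshold_pct=0.0))  # 0
```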

If you want non-default values, you'd typically set them in hdfs-site.xml in your namenode's config directory.
Setting them both to 0 will give you what you are asking for, but it probably isn't what you want :-)
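
For illustration, the sketch below prints an hdfs-site.xml fragment carrying the two properties named above, both set to 0 as in the case just described. The helper name is hypothetical, the layout follows the usual Hadoop `<configuration>`/`<property>` convention, and whether these exact property names apply depends on your Hadoop version, so treat it as a starting point rather than a recommended setting.

```python
from xml.sax.saxutils import escape


def hdfs_site_fragment(properties: dict) -> str:
    """Render name/value pairs in the Hadoop <configuration> XML layout."""
    lines = ["<configuration>"]
    for name, value in properties.items():
        lines += [
            "  <property>",
            f"    <name>{escape(name)}</name>",
            f"    <value>{escape(str(value))}</value>",
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines)


# Both set to 0: the namenode would not wait for block reports at startup.
# As noted above, this is probably not what you want on a real cluster.
xml_text = hdfs_site_fragment({
    "dfs.namenode.safemode.threshold-pct": 0,
    "dfs.namenode.safemode.extension": 0,
})
print(xml_text)
```

The printed fragment would be merged into the hdfs-site.xml in the namenode's config directory, followed by a namenode restart.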

--Matt
