Posted to common-user@hadoop.apache.org by Manish N <m1...@gmail.com> on 2010/04/07 07:16:44 UTC

Cluster in Safe Mode

Hey all,

I have a 2-node cluster which is now stuck in Safe Mode. It's been 15-16 hrs
now and it has yet to come out of Safe Mode. Does it normally take that long?
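
For what it's worth, the safe-mode status can also be polled from the shell;
assuming I'm reading the dfsadmin usage right, these are the relevant commands:

./bin/hadoop dfsadmin -safemode get
./bin/hadoop dfsadmin -report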

The DataNode log on the node running the NameNode shows the following, and
the slave node (running only a DataNode) shows similar output as well.

2010-04-07 10:03:10,687 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-310922324774702076_996024
2010-04-07 10:03:10,705 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_3302288729849061244_813694
2010-04-07 10:03:10,730 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-7252548330326272479_1259723
2010-04-07 10:03:10,745 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-5909954202848831867_1075933
2010-04-07 10:03:10,886 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-3213723859645738103_1075939
2010-04-07 10:03:10,910 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-2209269106581706132_676390
2010-04-07 10:03:10,923 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-6007998488187910667_676379
2010-04-07 10:03:11,086 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-1024215056075897357_676383
2010-04-07 10:03:11,127 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_3780597313184168671_1270304
2010-04-07 10:03:11,160 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_8891623760013835158_676336

One thing I wanted to point out is that some time back I had to run setrep on
the entire cluster; are these verification messages related to that?
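
(For context, the setrep I ran was roughly the following; the -R/-w flags and
the factor of 2 are from memory, so treat this as approximate:)

./bin/hadoop dfs -setrep -R -w 2 /data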

Also, while going through the NameNode logs I encountered the following.

2010-04-05 21:01:31,383 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.21:50010
2010-04-05 21:01:49,240 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.21:50010
2010-04-05 21:01:49,243 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.2:50010
2010-04-05 21:02:01,791 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.2:50010

Then again at:

2010-04-06 06:41:56,290 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.21:50010
2010-04-06 06:41:56,290 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.21:50010
2010-04-06 06:41:56,290 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.2:50010
2010-04-06 06:41:56,290 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.2:50010

I had to restart the cluster, after which I got both nodes back.

2010-04-06 10:11:24,325 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.registerDatanode: node registration from 192.168.100.21:50010
storage DS-455083797-192.168.100.21-50010-1268220157729
2010-04-06 10:11:24,328 INFO org.apache.hadoop.net.NetworkTopology: Adding a
new node: /default-rack/192.168.100.21:50010
2010-04-06 10:11:25,245 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.allocateBlock:
/data/listing/image/5/84025/35924c87e664a43893904effbd2be601_list.jpg.
blk_-1845977707636580795_1665561
2010-04-06 10:11:25,342 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.addStoredBlock: blockMap updated: 192.168.100.21:50010 is added
to blk_-1845977707636580795_1665561 size 72753
2010-04-06 10:11:44,257 INFO org.apache.hadoop.fs.FSNamesystem: Number of
transactions: 64 Total time for transactions(ms): 4 Number of syncs: 45
SyncTimes(ms): 387
2010-04-06 10:11:51,485 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.registerDatanode: node registration from 192.168.100.2:50010
storage DS-1237294752-192.168.100.2-50010-1252010614375
2010-04-06 10:11:51,488 INFO org.apache.hadoop.net.NetworkTopology: Adding a
new node: /default-rack/192.168.100.2:50010

Then, subsequently, they were removed again. No clue why this happened.

Ever since, I've been seeing the following in the logs:

2010-04-06 10:00:49,052 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 54310, call
create(/data/listing/image/4/43734/5af88437f6c6a88d62c5f900b06ab8dd_high.jpg,
rwxr-xr-x, DFSClient_1226879860, true, 2, 67108864) from 192.168.100.5:40437:
error: org.apache.hadoop.dfs.SafeModeException: Cannot create
file/data/listing/image/4/43734/5af88437f6c6a88d62c5f900b06ab8dd_high.jpg.
Name node is in safe mode.
The ratio of reported blocks 0.0000 has not reached the threshold 0.9990.
Safe mode will be turned off automatically.
org.apache.hadoop.dfs.SafeModeException: Cannot create
file/data/listing/image/4/43734/5af88437f6c6a88d62c5f900b06ab8dd_high.jpg.
Name node is in safe mode.
The ratio of reported blocks 0.0000 has not reached the threshold 0.9990.
Safe mode will be turned off automatically.
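
(If I understand the docs correctly, that 0.9990 threshold comes from the
dfs.safemode.threshold.pct property, which could in principle be tuned in
conf/hadoop-site.xml, e.g.:)

<property>
  <name>dfs.safemode.threshold.pct</name>
  <value>0.999</value>
  <description>Fraction of blocks that must be reported before the
  namenode leaves safe mode on its own.</description>
</property>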

I also ran fsck across the cluster (bin/hadoop fsck /data); the summary output is below.

Total size:    540525108291 B
Total dirs:    53298
Total files:    1617706
Total blocks (validated):    1601927 (avg. block size 337421 B)
  ********************************
  CORRUPT FILES:    1601525
  MISSING BLOCKS:    1601927
  MISSING SIZE:        540525108291 B
  CORRUPT BLOCKS:     1601927
  ********************************
Minimally replicated blocks:    0 (0.0 %)
Over-replicated blocks:    0 (0.0 %)
Under-replicated blocks:    0 (0.0 %)
Mis-replicated blocks:        0 (0.0 %)
Default replication factor:    2
Average block replication:    0.0
Corrupt blocks:        1601927

The filesystem under path '/data' is CORRUPT

I'm using hadoop-0.18.3 on Ubuntu.

I'm completely clueless as to why it's taking this long to come out of Safe
Mode. Suggestions / comments appreciated.

- Manish

Re: Cluster in Safe Mode

Posted by Manish N <m1...@gmail.com>.
On Wed, Apr 7, 2010 at 7:27 PM, Edson Ramiro <er...@gmail.com> wrote:

> To solve the safe mode problem, you may first start the DFS, leave safe
> mode and run an fsck:
>
> ./bin/start-dfs.sh
> ./bin/hadoop dfsadmin -safemode leave
> ./bin/hadoop fsck /
>
> After this, restart the DFS.
>
> You can configure HADOOP_OPTS in conf/hadoop-env.sh to give the JVM more
> memory. Also configure HADOOP_HEAPSIZE.
>


Yes, that's exactly what I did; the DataNodes are back and I've taken the
cluster out of safe mode.

I'm planning to upgrade the Hadoop instance to the latest stable release. How
wise would that be?
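
The rough sequence I have in mind (pieced together from the upgrade notes, so
treat it as a sketch rather than a tested procedure):

./bin/stop-all.sh
# back up the dfs.name.dir contents somewhere safe, then install the new release
./bin/start-dfs.sh -upgrade
./bin/hadoop dfsadmin -upgradeProgress status
./bin/hadoop dfsadmin -finalizeUpgrade    # only once everything checks out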




- Manish.

Re: Cluster in Safe Mode

Posted by Edson Ramiro <er...@gmail.com>.
To solve the safe mode problem, you may first start the DFS, leave safe mode
and run an fsck:

./bin/start-dfs.sh
./bin/hadoop dfsadmin -safemode leave
./bin/hadoop fsck /

After this, restart the DFS.

You can configure HADOOP_OPTS in conf/hadoop-env.sh to give the JVM more
memory. Also configure HADOOP_HEAPSIZE.

# export HADOOP_OPTS="-server -XX:+HeapDumpOnOutOfMemoryError -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseParallelGC -XX:ParallelGCThreads=4 -XX:NewSize=1G -XX:MaxNewSize=1G"
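
For the overall daemon heap, you can also set the line below in
conf/hadoop-env.sh (the value is in MB, so 2000 is roughly 2 GB; pick whatever
fits your machines):

export HADOOP_HEAPSIZE=2000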

Edson Ramiro



Re: Cluster in Safe Mode

Posted by Manish N <m1...@gmail.com>.
On Wed, Apr 7, 2010 at 10:59 AM, Sagar Shukla <sagar_shukla@persistent.co.in> wrote:

> Hi Manish,
>      Do you see any errors in the DataNode log files? It is quite likely
> that after the namenode starts, the processes on the datanode are failing
> to start, causing the namenode to wait in safe mode for the datanode
> services to start.
>

I do see the following in the DataNode .out file whenever I start a DataNode
on either of my two DataNodes; after some time they are marked as dead, as
expected.

Exception in thread "DataNode: [/root/Datadir/hadoop/dfs/data]"
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2786)
        at
java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:71)
        at java.io.DataOutputStream.writeByte(DataOutputStream.java:136)
        at org.apache.hadoop.io.UTF8.writeChars(UTF8.java:274)
        at org.apache.hadoop.io.UTF8.writeString(UTF8.java:246)
        at
org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:120)
        at
org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:126)
        at org.apache.hadoop.ipc.RPC$Invocation.write(RPC.java:109)
        at
org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:474)
        at org.apache.hadoop.ipc.Client.call(Client.java:706)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at org.apache.hadoop.dfs.$Proxy4.blockReport(Unknown Source)
        at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:744)
        at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2967)
        at java.lang.Thread.run(Thread.java:619)

- Manish

RE: Cluster in Safe Mode

Posted by Sagar Shukla <sa...@persistent.co.in>.
Hi Manish,
      Do you see any errors in the DataNode log files? It is quite likely that after the namenode starts, the processes on the datanode are failing to start, causing the namenode to wait in safe mode for the datanode services to start.
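
Something like the following on each node should surface the error quickly
(the log file names vary with the user and hostname the daemons run as, so
adjust the pattern as needed):

tail -n 200 logs/hadoop-*-datanode-*.log
grep -iE "error|exception" logs/hadoop-*-datanode-*.log | tail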

Thanks,
Sagar


Re: Cluster in Safe Mode

Posted by Ravi Phulari <rp...@yahoo-inc.com>.
Looks like all your data nodes are down. Please make sure your data nodes are up and running (check from the NameNode web UI and by running jps on the data nodes).
Fsck is showing that there are 0 minimally replicated blocks and that the average block replication is 0.
Also, please verify whether your data nodes' data dir actually has any blocks in it.
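
Something along these lines should confirm it (the data dir path below is just
an example; use whatever dfs.data.dir is set to in your config):

jps                                    # on each data node; a DataNode process should be listed
./bin/hadoop dfsadmin -report          # from the name node; shows live vs. dead data nodes
ls /path/to/dfs/data/current | head    # block files show up here as blk_*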

-
Ravi


On 4/6/10 10:16 PM, "Manish N" <m1...@gmail.com> wrote:

CORRUPT FILES:    1601525
  MISSING BLOCKS:    1601927
  MISSING SIZE:        540525108291 B
  CORRUPT BLOCKS:     1601927
  ********************************
Minimally replicated blocks:    0 (0.0 %)
Over-replicated blocks:    0 (0.0 %)
Under-replicated blocks:    0 (0.0 %)
Mis-replicated blocks:        0 (0.0 %)
Default replication factor:    2
Average block replication:    0.0
Corrupt blocks:        1601927

Ravi
--