Posted to hdfs-user@hadoop.apache.org by Wei Wu <wo...@gmail.com> on 2011/07/07 15:19:38 UTC

Re: NameNode stuck in safemode with few missing blocks

Sorry, changed the title: without -> with

On Thu, Jul 7, 2011 at 9:14 PM, Wei Wu <wo...@gmail.com> wrote:

> Hi,
>
> We encountered a strange situation when restarting the NameNode: it cannot
> leave safe mode automatically. It reports "The ratio of reported blocks
> 0.9986 has not reached the threshold 0.999". Our cluster has 83,276,820
> blocks in total, so if the counter is right, we are missing about 116,587
> blocks. But fsck reported that 83,276,779 blocks were healthy and 37 blocks
> were in open files. Only 4 blocks were marked as corrupt, because their
> lengths are shorter than those of the existing ones. If the fsck result is
> to be believed, the ratio is higher than 0.999999, so the threshold should
> have been reached.
>
> I suspect the blockSafe counter is not being maintained accurately. Is that
> possible? Our case is similar to the situation described in JIRA:
> https://issues.apache.org/jira/browse/HADOOP-2159 (our Hadoop release
> already includes this patch).
>
> Any suggestions?
>
> Wei
>
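
For reference, the arithmetic in the quoted message can be re-checked with a short script. This is only a sketch of the calculation, not the NameNode's actual safe-mode code; the function and variable names are made up for illustration, and it assumes the 0.9986 in the log message is rounded to four decimal places, so the missing-block count is approximate.

```python
# Figures taken from the quoted message.
TOTAL_BLOCKS = 83_276_820
REPORTED_RATIO = 0.9986   # from the safe-mode message (assumed rounded)
THRESHOLD = 0.999         # from the safe-mode message

FSCK_HEALTHY = 83_276_779
FSCK_OPEN_FILE_BLOCKS = 37
FSCK_CORRUPT = 4


def implied_missing(total, ratio):
    """Blocks not yet counted as safe, if the safe-mode ratio is right."""
    return total * (1 - ratio)


def max_missing_allowed(total, threshold):
    """Largest whole number of missing blocks that still meets the threshold."""
    return int(total * (1 - threshold))


def fsck_ratio(healthy, total):
    """Ratio of healthy blocks according to fsck."""
    return healthy / total


if __name__ == "__main__":
    # The three fsck categories add up to the cluster total.
    assert FSCK_HEALTHY + FSCK_OPEN_FILE_BLOCKS + FSCK_CORRUPT == TOTAL_BLOCKS

    print("missing per safe-mode counter: about %.0f"
          % implied_missing(TOTAL_BLOCKS, REPORTED_RATIO))
    print("missing allowed at threshold:  %d"
          % max_missing_allowed(TOTAL_BLOCKS, THRESHOLD))
    print("not healthy per fsck:          %d"
          % (TOTAL_BLOCKS - FSCK_HEALTHY))
    print("ratio per fsck:                %.7f"
          % fsck_ratio(FSCK_HEALTHY, TOTAL_BLOCKS))
```

The two sources disagree by several orders of magnitude: the safe-mode counter implies more than a hundred thousand unreported blocks, while fsck accounts for all but 41.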