Posted to common-user@hadoop.apache.org by "Eason.Lee" <le...@gmail.com> on 2009/10/13 03:41:50 UTC

How to deal with safemode?

There was something wrong with the network, so I killed all the Hadoop processes with
"kill -9 pid".
When I tried to start Hadoop today, it could not leave safe mode automatically!
The web UI shows:
Safe mode is ON. The ratio of reported blocks 0.9951 has not reached the
threshold 0.9990. Safe mode will be turned off automatically.
1849 files and directories, 1826 blocks = 3675 total. Heap Size is 26.69
MB / 888.94 MB (3%)
It seems some blocks are missing.
I don't know how to deal with this.
Any suggestions? Thanks.

Re: How to deal with safemode?

Posted by "Eason.Lee" <le...@gmail.com>.
2009/10/13 Amandeep Khurana <am...@gmail.com>

> The NN is expecting more blocks than are being reported. The kill
> must have happened when some file handles were still open and you lost
> those files/blocks.
>
> Force it out of safe mode and put the data into hdfs again.
>

I don't know which blocks are lost...
I wasn't doing anything when the network went wrong.
I have installed HBase on this HDFS.
Does that mean I have lost the data in HBase?

Re: How to deal with safemode?

Posted by Amandeep Khurana <am...@gmail.com>.
The NN is expecting more blocks than are being reported. The kill
must have happened when some file handles were still open and you lost
those files/blocks.

Force it out of safe mode and put the data into hdfs again.
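
To see which files were still open when the processes were killed, fsck can be asked to include files open for write. The Python sketch below builds that command and filters its output. It assumes the `hadoop` launcher is on the PATH and that fsck tags such files with the marker `OPENFORWRITE` in its per-file lines; the helper names are made up for illustration, and the exact output format may differ between Hadoop versions.

```python
import subprocess

def fsck_open_files_cmd(path="/"):
    """Build an fsck command that also lists files open for write."""
    return ["hadoop", "fsck", path, "-files", "-blocks", "-openforwrite"]

def open_for_write(fsck_output):
    """Return the first field (assumed to be the path) of every line
    that carries the OPENFORWRITE marker. Paths with spaces would be
    truncated by this simple split."""
    return [
        line.split()[0]
        for line in fsck_output.splitlines()
        if "OPENFORWRITE" in line
    ]

def run_fsck(path="/"):
    """Run fsck and return its stdout as text (needs a live cluster)."""
    result = subprocess.run(
        fsck_open_files_cmd(path), capture_output=True, text=True
    )
    return result.stdout

# Example against a live cluster (not executed here):
#   for p in open_for_write(run_fsck("/")):
#       print(p)
```

Files listed this way are the candidates to copy into HDFS again after leaving safe mode.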

-- 


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

Re: How to deal with safemode?

Posted by "Eason.Lee" <le...@gmail.com>.
Thanks for the reply.
More info, as shown by fsck:

Status: HEALTHY
 Total size:    72991113326 B (Total open files size: 46882304 B)
 Total dirs:    820
 Total files:   1019 (Files currently being written: 9)
 Total blocks (validated):      1817 (avg. block size 40171223 B) (Total open file blocks (not validated): 9)
 (is there anything wrong here?)
 Minimally replicated blocks:   1817 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.8233352
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          4
 Number of racks:               1

It seems everything is OK.

2009/10/13 Amandeep Khurana <am...@gmail.com>

> 1. You can force the cluster out of safe mode if it's needed.

I have tried that.
But when I restart the cluster, it still can't leave safe mode.

> 2. Check if all your datanodes are coming up. Could be that there's
> some DN that isn't coming up - causing the under-reporting of blocks.

All the datanodes are coming up.
I have only 4 datanodes.
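
The figures in the web UI and in the fsck report appear to line up: 1817 validated blocks plus 9 open-file blocks gives the 1826 blocks the NameNode counts. A minimal Python sketch of that arithmetic follows. It assumes the safe-mode ratio is simply reported blocks divided by total blocks; the variable names are illustrative and are not taken from Hadoop.

```python
import math

# Figures quoted in this thread.
total_blocks = 1826        # "1826 blocks" from the NameNode web UI
validated_blocks = 1817    # "Total blocks (validated)" from fsck
open_file_blocks = 9       # "Total open file blocks (not validated)" from fsck
threshold = 0.999          # safe-mode threshold shown in the web UI

# Assumption: ratio = reported blocks / total blocks known to the NameNode.
ratio = validated_blocks / total_blocks

# Smallest reported-block count whose ratio would reach the threshold.
needed = math.ceil(threshold * total_blocks)

print(f"validated + open = {validated_blocks + open_file_blocks}")
print(f"ratio            = {ratio:.4f}")
print(f"blocks needed    = {needed}")
print(f"shortfall        = {needed - validated_blocks}")
```

Under that assumption, the blocks of the 9 files that were being written when the processes were killed would account for the gap, which would also explain why fsck still reports HEALTHY for the closed files.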



Re: How to deal with safemode?

Posted by Amandeep Khurana <am...@gmail.com>.
1. You can force the cluster out of safe mode if it's needed.

2. Check if all your datanodes are coming up. Could be that there's
some DN that isn't coming up - causing the under-reporting of blocks.
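
Both suggestions can be carried out with the `hadoop dfsadmin` command: `-safemode get` and `-safemode leave` for the first, `-report` for the second. The sketch below wraps those calls from Python. It assumes the `hadoop` launcher is on the PATH, and the helper names are hypothetical.

```python
import subprocess

SAFEMODE_ACTIONS = ("get", "enter", "leave", "wait")

def safemode_cmd(action):
    """Build the dfsadmin command for one safe-mode action."""
    if action not in SAFEMODE_ACTIONS:
        raise ValueError(f"unsupported safemode action: {action}")
    return ["hadoop", "dfsadmin", "-safemode", action]

def report_cmd():
    """Build the dfsadmin command that reports on the datanodes."""
    return ["hadoop", "dfsadmin", "-report"]

def run(cmd):
    """Run a command and return its stdout as text (needs a live cluster)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout

# Example against a live cluster (not executed here):
#   print(run(safemode_cmd("get")))    # is safe mode on?
#   print(run(report_cmd()))           # are all datanodes reporting?
#   print(run(safemode_cmd("leave")))  # force the NameNode out of safe mode
```

Leaving safe mode by hand does not change which blocks are reported, so a NameNode that is still below its threshold may be expected to enter safe mode again on the next restart.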
