Posted to user@hbase.apache.org by Himanish Kushary <hi...@gmail.com> on 2011/05/23 16:50:44 UTC

HBase Not Starting after improper shutdown

Hi,

Our HBase/Hadoop server machines were shut down without stopping the Hadoop
and HBase services properly. Now when we try to bring up HBase we get
the following error in the master log:

org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
-ROOT-,,0

The Hadoop services (namenode, jobtracker, datanode, etc.) have come up properly
and we are able to see the files in HDFS, but the HBase Master keeps throwing
this exception and finally fails with a Java heap space error.

Note: We have two datanodes with replication set to 2, and around 900 blocks
are shown as under-replicated.

---------------------------------
Thanks & Regards
Himanish

Re: HBase Not Starting after improper shutdown

Posted by Stack <st...@duboce.net>.

On Tue, May 24, 2011 at 7:19 AM, Himanish Kushary <hi...@gmail.com> wrote:
> The Region Server logs also show the same -ROOT- Region not online error.
>

The above does not give us any information that we can use to help us
diagnose your issue.  Can you pastebin the master log?
Yours,
St.Ack

Re: HBase Not Starting after improper shutdown

Posted by Himanish Kushary <hi...@gmail.com>.

The Region Server logs also show the same -ROOT- Region not online error.

On Mon, May 23, 2011 at 1:10 PM, Bill Graham <bi...@gmail.com> wrote:

> Is there anything meaningful in the RS logs? I've seen situations like this
> where a RS is failing to start due to issues reading the WAL. If this is the
> case it would list which WAL is problematic, which is zero-length in my
> experience, so I delete it from HDFS and things start up.



-- 
Thanks & Regards
Himanish

Re: HBase Not Starting after improper shutdown

Posted by Bill Graham <bi...@gmail.com>.

Is there anything meaningful in the RS logs? I've seen situations like this
where a RS is failing to start due to issues reading the WAL. If this is the
case it would list which WAL is problematic, which is zero-length in my
experience, so I delete it from HDFS and things start up.
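
The check described above can be sketched as a filter over a recursive HDFS
listing. This is a minimal sketch, not a tested recovery procedure. It assumes
the WALs live under /hbase/.logs (the usual location when hbase.rootdir ends in
/hbase), that the listing comes from `hadoop fs -lsr /hbase/.logs`, and that
each line has the usual eight columns. The host and file names in
SAMPLE_LISTING are made up for illustration.

```python
# Hypothetical listing in the shape printed by `hadoop fs -lsr /hbase/.logs`.
# Columns: permissions, replication, owner, group, size, date, time, path.
SAMPLE_LISTING = """\
drwxr-xr-x   - hbase supergroup          0 2011-05-23 10:40 /hbase/.logs/rs1.example.com,60020,1306160000000
-rw-r--r--   2 hbase supergroup          0 2011-05-23 10:40 /hbase/.logs/rs1.example.com,60020,1306160000000/wal.1306160000001
-rw-r--r--   2 hbase supergroup   12345678 2011-05-23 09:40 /hbase/.logs/rs1.example.com,60020,1306160000000/wal.1306150000001
"""


def zero_length_files(listing):
    """Return the paths of zero-byte regular files in a recursive listing."""
    paths = []
    for line in listing.splitlines():
        fields = line.strip().split(None, 7)
        if len(fields) < 8:
            continue  # skip blank lines and anything that is not a file entry
        perms, size, path = fields[0], fields[4], fields[7]
        # Directories also show a size of 0, so keep regular files only.
        if perms.startswith("-") and size == "0":
            paths.append(path)
    return paths


for candidate in zero_length_files(SAMPLE_LISTING):
    print(candidate)
```

A file that is still open for writing can also report a length of zero, so
treat the result as a list of candidates to compare against the region server
log rather than as files that are safe to delete.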


On Mon, May 23, 2011 at 9:16 AM, Himanish Kushary <hi...@gmail.com> wrote:

> Both the Master and the hbck command print:
>
> org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
> -ROOT-,,0
>
> After the master thread exits due to the heap space error, the hbck command
> throws:
>
> org.apache.hadoop.hbase.MasterNotRunningException
>
> Is there any way to fix this kind of issue? We are keeping the datanodes up
> to see whether the under-replicated blocks can be recovered. Does improper
> shutdown of the Hadoop/HBase services cause this kind of issue? What
> happens in a disaster recovery situation, and how are those situations
> handled?
>
> Thanks

Re: HBase Not Starting after improper shutdown

Posted by Himanish Kushary <hi...@gmail.com>.

Both the Master and the hbck command print:

org.apache.hadoop.hbase.NotServingRegionException:
org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
-ROOT-,,0

After the master thread exits due to the heap space error, the hbck command
throws:

org.apache.hadoop.hbase.MasterNotRunningException

Is there any way to fix this kind of issue? We are keeping the datanodes up to
see whether the under-replicated blocks can be recovered. Does improper
shutdown of the Hadoop/HBase services cause this kind of issue? What
happens in a disaster recovery situation, and how are those situations
handled?

Thanks


On Mon, May 23, 2011 at 11:36 AM, Stack <st...@duboce.net> wrote:

> What does hbase hbck say?  (http://hbase.apache.org/book.html#hbck).
>
> What does the master log have in it?  Anything of interest.
>
> St.Ack



-- 
Thanks & Regards
Himanish

Re: HBase Not Starting after improper shutdown

Posted by Stack <st...@duboce.net>.

What does hbase hbck say?  (http://hbase.apache.org/book.html#hbck).

What does the master log have in it?  Anything of interest.

St.Ack
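
A quick way to see whether the master log holds anything of interest is to
tally the exception classes that appear in it. The sketch below is only an
illustration: the regular expression is a heuristic for fully qualified Java
class names ending in Exception or Error, and the timestamps and line layout in
SAMPLE_LOG are invented. java.lang.OutOfMemoryError is assumed to be the class
behind the "Java Heap Space" error mentioned in this thread.

```python
import re
from collections import Counter

# Heuristic: dotted lowercase package path, then a class name that ends in
# "Exception" or "Error". Not a full parser for Java stack traces.
EXCEPTION_NAME = re.compile(r"\b(?:[a-z_]\w*\.)+[A-Z]\w*(?:Exception|Error)\b")

# Invented log lines; only the exception messages mirror the thread.
SAMPLE_LOG = """\
2011-05-23 10:40:01,123 WARN master: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: -ROOT-,,0
2011-05-23 10:40:02,456 WARN master: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: -ROOT-,,0
2011-05-23 10:45:00,000 FATAL master: java.lang.OutOfMemoryError: Java heap space
"""


def count_exceptions(log_text):
    """Count occurrences of each exception or error class name in the text."""
    return Counter(EXCEPTION_NAME.findall(log_text))


for name, count in count_exceptions(SAMPLE_LOG).most_common():
    print(count, name)
```

The first occurrence of the least frequent entry is often a better starting
point than the exception that repeats, since the repeated one may only be a
symptom.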

On Mon, May 23, 2011 at 7:53 AM, Himanish Kushary <hi...@gmail.com> wrote:
> Pressed the send button too soon...
>
> Also here is the output from hadoop fsck
>
> Status: HEALTHY
>  Total size: 37678848280 B
>  Total dirs: 941
>  Total files: 902 (Files currently being written: 1)
>  Total blocks (validated): 1141 (avg. block size 33022654 B) (Total open file blocks (not validated): 1)
>  Minimally replicated blocks: 1141 (100.0 %)
>  Over-replicated blocks: 0 (0.0 %)
>  Under-replicated blocks: 906 (79.40403 %)
>  Mis-replicated blocks: 0 (0.0 %)
>  Default replication factor: 2
>  Average block replication: 2.0
>  Corrupt blocks: 0
>  Missing replicas: 1886 (82.646805 %)
>  Number of data-nodes: 2
>  Number of racks: 1
> FSCK ended at Mon May 23 10:51:13 EDT 2011 in 257 milliseconds
>
> The filesystem under path '/' is HEALTHY
>
>
> Could anybody please help us recover from this scenario?
>
> Thanks

Re: HBase Not Starting after improper shutdown

Posted by Himanish Kushary <hi...@gmail.com>.

Pressed the send button too soon...

Also here is the output from hadoop fsck

Status: HEALTHY
 Total size: 37678848280 B
 Total dirs: 941
 Total files: 902 (Files currently being written: 1)
 Total blocks (validated): 1141 (avg. block size 33022654 B) (Total open file blocks (not validated): 1)
 Minimally replicated blocks: 1141 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 906 (79.40403 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 2
 Average block replication: 2.0
 Corrupt blocks: 0
 Missing replicas: 1886 (82.646805 %)
 Number of data-nodes: 2
 Number of racks: 1
FSCK ended at Mon May 23 10:51:13 EDT 2011 in 257 milliseconds

The filesystem under path '/' is HEALTHY
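
The percentages in this summary can be reproduced from its raw counts, which
also hints at why the under-replicated blocks are unlikely to clear just by
waiting. A small sketch using only the figures reported by fsck; the reading of
the two percentages (under-replicated blocks over total blocks, missing
replicas over existing replicas) is an assumption that happens to match the
reported values.

```python
# Figures copied from the fsck summary in this message.
total_blocks = 1141
under_replicated_blocks = 906
missing_replicas = 1886
average_replication = 2.0
datanodes = 2

# Replicas that currently exist across the cluster.
existing_replicas = total_blocks * average_replication

under_replicated_pct = 100.0 * under_replicated_blocks / total_blocks
missing_replicas_pct = 100.0 * missing_replicas / existing_replicas

# Average number of replicas each under-replicated block is still waiting for.
missing_per_block = missing_replicas / under_replicated_blocks

print("under-replicated blocks: %.2f %%" % under_replicated_pct)
print("missing replicas: %.2f %%" % missing_replicas_pct)
print("missing replicas per under-replicated block: %.2f" % missing_per_block)
print("replicas per block already equal datanode count:",
      average_replication == datanodes)
```

On average each under-replicated block is short about two replicas even though
every block already has two. HDFS keeps at most one replica of a block per
datanode, so with two datanodes these blocks presumably belong to files that
request a replication factor above 2, and they will stay flagged until more
datanodes are added or the requested replication is lowered. The summary
reports no corrupt blocks and 100% minimally replicated blocks, so this by
itself does not indicate lost data.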


Could anybody please help us recover from this scenario?

Thanks


On Mon, May 23, 2011 at 10:50 AM, Himanish Kushary <hi...@gmail.com> wrote:

> Hi,
>
> Our HBase/Hadoop server machines were shut down without stopping the Hadoop
> and HBase services properly. Now when we try to bring up HBase we get
> the following error in the master log:
>
> org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
> -ROOT-,,0
>
> The Hadoop services (namenode, jobtracker, datanode, etc.) have come up properly
> and we are able to see the files in HDFS, but the HBase Master keeps throwing
> this exception and finally fails with a Java heap space error.
>
> Note: We have two datanodes with replication set to 2, and around 900 blocks
> are shown as under-replicated.
>
> ---------------------------------
> Thanks & Regards
> Himanish
>



-- 
Thanks & Regards
Himanish