Posted to user@hbase.apache.org by Yabo-Arber Xu <ar...@gmail.com> on 2009/06/07 15:21:02 UTC

Again, HBase Data Lost!!

Hi there,

I lost HBase data a couple of days ago after HBase crashed. At that time I
asked on the list and was told that it may have been due to the loss of META
data on the master (not flushed to disk during the crash). Just several days
later, it happened to me again. This time all the data is gone, while all the
tables are still there, and this time it is on a 10-node HBase cluster.

I checked the log but did not find anything strange. Could anybody shed some
light on this? Given such stability, I am really worried about whether we can
use it in production.

Best,
Arber

On Mon, May 25, 2009 at 10:25 AM, Yabo-Arber Xu <ar...@gmail.com>wrote:

> Hi there,
>
> I had a single-node cluster up and running. And yesterday the node crashed
> for unknown reason, and when I restart it, everything appears to work except
> that all the tables are LOST( UI says that there is no user tables)!! I
> checked the log file and didn't find any clue; while I found the tables
> files are still there on HDFS.
>
> Anybody has any clue?
>
> It's quite urgent. Any help will be really appreciated.
>
> Best,
> Arber
>
>

Re: Again, HBase Data Lost!!

Posted by Yabo-Arber Xu <ar...@gmail.com>.

Hi J-D,

Thanks for your reply. We have a 10-node cluster running HBase/Hadoop 0.19.1.
We wrote about 1.8M records into HBase, and it appeared fine: we could use
'count' in the shell to get the number of rows. But after a while (not sure
exactly when) all nodes crashed right away. I checked the log but did not find
anything particular at that moment. When I restarted the cluster, all the rows
in the tables were lost, but the tables were still there.

I have difficulty connecting to the IRC channel right now (will try again
later). For now, I have shared the master's log here:
<http://cid-c31c56409ed94abd.skydrive.live.com/self.aspx/Public/Summba/hbase-cloudadmin-master-A6.log.2009-06-07>
The crash happened before the line "18:46:54 CST Starting master on A6".
You will notice that the period from "2009-06-07 17:00:57" to "2009-06-07
18:46:54" is empty (due to the crash).
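
A silent period like this can be located mechanically in a large log. The following is a minimal Python sketch, not part of the original exchange. It assumes that each log entry starts with a "YYYY-MM-DD HH:MM:SS" timestamp, as in the two timestamps quoted above; the function name find_gaps and the 10-minute threshold are illustrative choices only.

```python
import re
from datetime import datetime, timedelta

# Assumed line prefix: "2009-06-07 17:00:57" (anything after it is ignored).
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (previous, current) timestamp pairs further apart than threshold."""
    gaps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            # Stack traces and continuation lines carry no timestamp.
            continue
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps
```

Passing an open file object as `lines` works, since the function only iterates over it. Each returned pair marks the last entry before a silence and the first entry after it, which is where to start reading.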

Thanks for your help!

Best,
Arber

On Sun, Jun 7, 2009 at 11:56 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> Arber,
>
> From your email I have a hard time understanding exactly what happened
> on your cluster. But, before asking any question, I have to say that I
> never saw data just "disappearing" like this on the production cluster
> I've been managing for the last year.
>
> So, what happened? Did you lose the region server holding .META or
> -ROOT-? Was it after an importing job and all nodes crashed right
> away? If any crash, do you know why it happened?
>
> Giving us those details (and more) will help us solve your problem.
> You can also drop by the hbase IRC channel, there's always people
> there happy to take a look at your logs and debugging goes much
> faster.
>
> Thx,
>
> J-D
>
> On Sun, Jun 7, 2009 at 9:21 AM, Yabo-Arber Xu<ar...@gmail.com>
> wrote:
> > Hi there,
> >
> > I had a hbase data lost couple of days ago due to the crash of HBase. At
> > that time i asked on the list and was told that it may be due to the lost
> of
> > META data on the master ( not flushed into disk during crash ). Just
> several
> > days after, this happened to me again, this time all the data is gone,
> while
> > all the tables are still there.  This time is on a 10-node hbase cluster.
> >
> > I checked the log but did not find anything strange. Could anybody shed a
> > light on this? Given such stability, i really worried whether we can use
> it
> > in production phase.
> >
> > Best,
> > Arber
> >
> > On Mon, May 25, 2009 at 10:25 AM, Yabo-Arber Xu <
> arber.research@gmail.com>wrote:
> >
> >> Hi there,
> >>
> >> I had a single-node cluster up and running. And yesterday the node
> crashed
> >> for unknown reason, and when I restart it, everything appears to work
> except
> >> that all the tables are LOST( UI says that there is no user tables)!! I
> >> checked the log file and didn't find any clue; while I found the tables
> >> files are still there on HDFS.
> >>
> >> Anybody has any clue?
> >>
> >> It's quite urgent. Any help will be really appreciated.
> >>
> >> Best,
> >> Arber
> >>
> >>
> >
>

Re: Again, HBase Data Lost!!

Posted by Billy Pearson <sa...@pearsonwholesale.com>.

If the region servers are dying, then their logs are more likely to be
helpful than the master's.
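
As a starting point for going through those logs, here is a minimal Python sketch that keeps only the most recent severe entries from a log. It is an illustration rather than an HBase tool: it assumes the log lines contain a level token such as ERROR or FATAL (the usual log4j convention), and the function name severe_tail and its defaults are hypothetical.

```python
import re
from collections import deque


def severe_tail(lines, levels=("ERROR", "FATAL"), limit=20):
    """Return the last `limit` lines that mention one of the given log levels."""
    # Match the level as a whole word so "ERROR" does not match "ERRORS".
    pattern = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, levels)))
    tail = deque(maxlen=limit)  # older matches fall off the front
    for line in lines:
        if pattern.search(line):
            tail.append(line.rstrip("\n"))
    return list(tail)
```

Running this over each region server's log and comparing the final entries with the time of the crash should show whether the servers reported a problem before they went down.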

Billy




"Yabo-Arber Xu" <yx...@summba.com> wrote in 
message news:382e1efc0906072003i9ea9733h5ab51ce12a370d5@mail.gmail.com...
> Hi J-D,
>
> Thanks for your reply. We have a 10-node cluster installed with 
> HBase/Hadoop
> 0.19.1. We wrote about 1.8M records into HBase, and it appeared fine. We 
> can
> use 'count' on shell to get the number. But after a while ( not sure 
> exactly
> when ) all nodes crashed right away. I checked the log and but did not 
> find
> anything particular at that moment. When I restart it again, all the rows 
> in
> the tables are lost but the tables are still there.
>
> I have difficulty to connect IRC channel now ( will try it later ). For 
> now,
> I shared the master's log
> here<http://cid-c31c56409ed94abd.skydrive.live.com/self.aspx/Public/Summba/hbase-cloudadmin-master-A6.log.2009-06-07>,
> and the crash happens before the line "18:46:54 CST Starting master on 
> A6".
> You would notice that the period from "2009-06-07 17:00:57" to "2009-06-07
> 18:46:54" is empty ( due to the crash).
>
> Thanks for your help!
>
> Best,
> Arber
>
>
> On Sun, Jun 7, 2009 at 11:56 PM, Jean-Daniel Cryans 
> <jd...@apache.org>wrote:
>
>> Arber,
>>
>> From your email I have a hard time understanding exactly what happened
>> on your cluster. But, before asking any question, I have to say that I
>> never saw data just "disappearing" like this on the production cluster
>> I've been managing for the last year.
>>
>> So, what happened? Did you lose the region server holding .META or
>> -ROOT-? Was it after an importing job and all nodes crashed right
>> away? If any crash, do you know why it happened?
>>
>> Giving us those details (and more) will help us solve your problem.
>> You can also drop by the hbase IRC channel, there's always people
>> there happy to take a look at your logs and debugging goes much
>> faster.
>>
>> Thx,
>>
>> J-D
>>
>> On Sun, Jun 7, 2009 at 9:21 AM, Yabo-Arber 
>> Xu<ar...@gmail.com>
>> wrote:
>> > Hi there,
>> >
>> > I had a hbase data lost couple of days ago due to the crash of HBase. 
>> > At
>> > that time i asked on the list and was told that it may be due to the 
>> > lost
>> of
>> > META data on the master ( not flushed into disk during crash ). Just
>> several
>> > days after, this happened to me again, this time all the data is gone,
>> while
>> > all the tables are still there.  This time is on a 10-node hbase 
>> > cluster.
>> >
>> > I checked the log but did not find anything strange. Could anybody shed 
>> > a
>> > light on this? Given such stability, i really worried whether we can 
>> > use
>> it
>> > in production phase.
>> >
>> > Best,
>> > Arber
>> >
>> > On Mon, May 25, 2009 at 10:25 AM, Yabo-Arber Xu <
>> arber.research@gmail.com>wrote:
>> >
>> >> Hi there,
>> >>
>> >> I had a single-node cluster up and running. And yesterday the node
>> crashed
>> >> for unknown reason, and when I restart it, everything appears to work
>> except
>> >> that all the tables are LOST( UI says that there is no user tables)!! 
>> >> I
>> >> checked the log file and didn't find any clue; while I found the 
>> >> tables
>> >> files are still there on HDFS.
>> >>
>> >> Anybody has any clue?
>> >>
>> >> It's quite urgent. Any help will be really appreciated.
>> >>
>> >> Best,
>> >> Arber
>> >>
>> >>
>> >
>>
>

Re: Again, HBase Data Lost!!

Posted by Jean-Daniel Cryans <jd...@apache.org>.

Arber,

From your email I have a hard time understanding exactly what happened
on your cluster. But, before asking any question, I have to say that I
never saw data just "disappearing" like this on the production cluster
I've been managing for the last year.

So, what happened? Did you lose the region server holding .META or
-ROOT-? Was it after an importing job and all nodes crashed right
away? If any crash, do you know why it happened?

Giving us those details (and more) will help us solve your problem.
You can also drop by the hbase IRC channel, there's always people
there happy to take a look at your logs and debugging goes much
faster.

Thx,

J-D

On Sun, Jun 7, 2009 at 9:21 AM, Yabo-Arber Xu<ar...@gmail.com> wrote:
> Hi there,
>
> I had a hbase data lost couple of days ago due to the crash of HBase. At
> that time i asked on the list and was told that it may be due to the lost of
> META data on the master ( not flushed into disk during crash ). Just several
> days after, this happened to me again, this time all the data is gone, while
> all the tables are still there.  This time is on a 10-node hbase cluster.
>
> I checked the log but did not find anything strange. Could anybody shed a
> light on this? Given such stability, i really worried whether we can use it
> in production phase.
>
> Best,
> Arber
>
> On Mon, May 25, 2009 at 10:25 AM, Yabo-Arber Xu <ar...@gmail.com>wrote:
>
>> Hi there,
>>
>> I had a single-node cluster up and running. And yesterday the node crashed
>> for unknown reason, and when I restart it, everything appears to work except
>> that all the tables are LOST( UI says that there is no user tables)!! I
>> checked the log file and didn't find any clue; while I found the tables
>> files are still there on HDFS.
>>
>> Anybody has any clue?
>>
>> It's quite urgent. Any help will be really appreciated.
>>
>> Best,
>> Arber
>>
>>
>