Posted to user@hbase.apache.org by Bradford Stephens <br...@gmail.com> on 2009/08/21 00:48:17 UTC

Story of my HBase Bugs / Feature Suggestions

Hey there,

I'm sending out this summary of how I diagnosed what was wrong with my
cluster, in hopes that you can glean some knowledge/suggestions from it :)
Thanks to everyone for the diagnostic footwork.

A few days ago, I noticed that simple MR jobs I was running against 0.20-RC2
were failing. Scanners were reaching the end of a region and then simply
freezing. The only indication I had of this was the Mapper timing out after
1000 seconds -- there were no error messages in the logs for either Hadoop
or HBase.

It turns out that my table was corrupt:

1. Doing a 'GET' from the shell on a row near the end of a region resulted
in an error -- "Row not in expected region", or something to that effect. The
error re-appeared several times, and I never got the row content.
2. The region distribution the Master UI indicated was totally different
from what the regionservers reported: row key ranges were on different
servers than the UI knew about, and the nodes reported different start and
end keys for a region than the UI did.

I'm not sure how this arose. I did notice that after a heavy insert job,
when we tried to shut down our cluster, it took 30 dots and more -- so we
manually killed the master. Could that have led to the corruption?

I finally resolved the problem by dropping the table and re-loading the data.

A few suggestions going forward:
1. More useful scanner error messages: GET reported that there was a problem
finding a certain row, so why couldn't the Scanner? There wasn't even a
timeout -- it just sat there.
2. An fsck/restore tool would be useful for HBase. I imagine you could
recreate .META. using each region's .regioninfo and scanning blocks out of
HDFS. This would play nicely with the HBase bulk loader story, I suppose.
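For what it's worth, the rebuild step in suggestion 2 can be sketched in
miniature. This is a toy Python model, not real HBase code: the local
directory layout and the JSON records stand in for HDFS and HBase's binary
.regioninfo format, and all names here are invented for illustration.

```python
# Toy sketch of an fsck/restore: walk a table directory, read each region's
# .regioninfo-style record, and rebuild a META-like index from what is
# actually on disk (rather than trusting the possibly-corrupt catalog).

import json
import os
import tempfile

def write_fake_region(table_dir, name, start_key, end_key):
    """Create a fake region directory with a JSON stand-in for .regioninfo."""
    region_dir = os.path.join(table_dir, name)
    os.makedirs(region_dir)
    with open(os.path.join(region_dir, ".regioninfo"), "w") as f:
        json.dump({"name": name, "start_key": start_key, "end_key": end_key}, f)

def rebuild_meta(table_dir):
    """Scan every region directory and rebuild META entries, sorted by start key."""
    entries = []
    for region in os.listdir(table_dir):
        with open(os.path.join(table_dir, region, ".regioninfo")) as f:
            entries.append(json.load(f))
    return sorted(entries, key=lambda e: e["start_key"])

# Simulate two regions recovered purely from on-disk metadata.
table_dir = tempfile.mkdtemp()
write_fake_region(table_dir, "region-b", "g", "p")
write_fake_region(table_dir, "region-a", "", "g")
meta = rebuild_meta(table_dir)
print([e["name"] for e in meta])  # ['region-a', 'region-b']
```

The real tool would of course also have to verify that the recovered regions
tile the key space without gaps or overlaps before rewriting .META.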

I'll be happy to work on these in my spare time, if I ever get any ;)

Cheers,
Bradford


-- 
http://www.roadtofailure.com -- The Fringes of Scalability, Social Media,
and Computer Science

Re: Story of my HBase Bugs / Feature Suggestions

Posted by Jonathan Gray <jl...@streamy.com>.
That is excellent news, Bradford!  Thanks for sticking with us :)

JG

On Tue, August 25, 2009 7:14 pm, Bradford Stephens wrote:
> As a side note, we've been beating on RC2 for a week solid, and it's very
>  stable. We're really only limited by our RAM and GC, now :)


Re: Story of my HBase Bugs / Feature Suggestions

Posted by Bradford Stephens <br...@gmail.com>.
As a side note, we've been beating on RC2 for a week solid, and it's very
stable. We're really only limited by our RAM and GC, now :)

On Sat, Aug 22, 2009 at 6:59 AM, Andrew Purtell <ap...@apache.org> wrote:

> Jon,
>
> Cool. I suspected as much. I'm really glad to see those bugs were found and
> fixed...
>
>   - Andy



-- 
http://www.roadtofailure.com -- The Fringes of Scalability, Social Media,
and Computer Science

Re: Story of my HBase Bugs / Feature Suggestions

Posted by Andrew Purtell <ap...@apache.org>.
Jon,

Cool. I suspected as much. I'm really glad to see those bugs were found and fixed... 

   - Andy




________________________________
From: Jonathan Gray <jl...@streamy.com>
To: hbase-user@hadoop.apache.org
Sent: Saturday, August 22, 2009 12:24:51 AM
Subject: Re: Story of my HBase Bugs / Feature Suggestions

Andy,

Bradford ran his imports when there was both a Scanner bug related to snapshotting that opened up a race condition, as well as the nasty bugs in getClosestBefore used to look things up in META.

It was most likely a combination of both of these things making for some rather nasty behavior.

JG


Re: Story of my HBase Bugs / Feature Suggestions

Posted by Jonathan Gray <jl...@streamy.com>.
Andy,

Bradford ran his imports when there was both a Scanner bug related to
snapshotting that opened up a race condition and the nasty bugs in
getClosestBefore used to look things up in META.
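For anyone following along, that META lookup is essentially a "closest row
before" search. The following is an illustrative Python sketch of the
semantic, not the actual getClosestBefore implementation, and the
"table,start_key" key format is a simplified assumption -- but it shows why
a bug here hands back the wrong region, which then surfaces as "Row not in
expected region" errors.

```python
# Locating the region for a row means finding the greatest META key
# that is <= "table,row_key". A sorted list plus bisect models that.

import bisect

def get_closest_row_before(meta_keys, search_key):
    """Return the greatest key in sorted meta_keys that is <= search_key."""
    idx = bisect.bisect_right(meta_keys, search_key) - 1
    if idx < 0:
        return None  # search key sorts before every META entry
    return meta_keys[idx]

# META keys for one table with regions starting at "", "g", and "p".
meta_keys = ["mytable,", "mytable,g", "mytable,p"]
print(get_closest_row_before(meta_keys, "mytable,horse"))  # mytable,g
```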

It was most likely a combination of both of these things making for some 
rather nasty behavior.

JG

Andrew Purtell wrote:
> There are plans to host live region assignments in ZK and keep only an up-to-date copy of this state in META for use on cold boot. This is on the roadmap for 0.21 but perhaps could be considered for 0.20.1 also. This may help here. 
> 
> A TM development group saw the same behavior on a 0.19 cluster. We
> postponed looking at this because 0.20 has a significant rewrite of
> region assignment. However, it is interesting to hear such a similar
> description. I worry the underlying cause may be scanners getting stale data on the RS as opposed to some master problem which could be solved by the above, a more pervasive problem. Bradford, any chance you kept around logs or similar which may provide clues?
> 
>    - Andy

Re: Story of my HBase Bugs / Feature Suggestions

Posted by Bradford Stephens <br...@gmail.com>.
Sure, I've got a ton of logs. I'll try to grab what's most pertinent and put
it on RapidShare, but there will be a ton of data to sift through :)

On Thu, Aug 20, 2009 at 8:57 PM, Andrew Purtell <ap...@apache.org> wrote:

> There are plans to host live region assignments in ZK and keep only an
> up-to-date copy of this state in META for use on cold boot. This is on the
> roadmap for 0.21 but perhaps could be considered for 0.20.1 also. This may
> help here.
>
> A TM development group saw the same behavior on a 0.19 cluster. We
> postponed looking at this because 0.20 has a significant rewrite of
> region assignment. However, it is interesting to hear such a similar
> description. I worry the underlying cause may be scanners getting stale
> data on the RS as opposed to some master problem which could be solved by
> the above, a more pervasive problem. Bradford, any chance you kept around
> logs or similar which may provide clues?
>
>   - Andy
>



-- 
http://www.roadtofailure.com -- The Fringes of Scalability, Social Media,
and Computer Science

Re: Story of my HBase Bugs / Feature Suggestions

Posted by Andrew Purtell <ap...@apache.org>.
There are plans to host live region assignments in ZK, keeping only an
up-to-date copy of that state in META for use on cold boot. This is on the
roadmap for 0.21 but perhaps could be considered for 0.20.1 also. It may
help here.

A TM development group saw the same behavior on a 0.19 cluster. We
postponed looking at it because 0.20 has a significant rewrite of
region assignment. However, it is interesting to hear such a similar
description. I worry the underlying cause may be scanners getting stale data
on the RS -- a more pervasive problem -- rather than some master-side problem
that could be solved by the above. Bradford, any chance you kept around logs
or anything similar that may provide clues?
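To make the division of responsibility concrete, here is a highly
simplified in-memory model of the plan above: live assignments live in a
ZK-like store, and META holds only a checkpoint for cold boot. This is my
own sketch of the idea, not HBase's design; every class and method name is
invented for illustration.

```python
# Model: live region->server assignments change only in the ZK stand-in;
# META is a periodically refreshed snapshot, consulted only at cold boot.

class AssignmentManager:
    def __init__(self):
        self.zk = {}    # live truth: region -> server (stand-in for ZooKeeper)
        self.meta = {}  # cold-boot snapshot; may lag behind self.zk

    def assign(self, region, server):
        self.zk[region] = server      # live state changes only in "ZK"

    def checkpoint(self):
        self.meta = dict(self.zk)     # persist snapshot for cold boot

    def lookup(self, region):
        return self.zk.get(region)    # readers always see live state

    def cold_boot(self):
        self.zk = dict(self.meta)     # recover last snapshot, then reconcile

am = AssignmentManager()
am.assign("region-a", "rs1")
am.checkpoint()
am.assign("region-a", "rs2")          # reassignment after the checkpoint
print(am.lookup("region-a"))          # rs2 -- lookups never see the stale META copy
```

The point is that a stale META copy can no longer mislead clients about
where a region currently lives, since live lookups bypass it entirely.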

   - Andy



