Posted to user@hbase.apache.org by Geoff Hendrey <gh...@decarta.com> on 2011/08/13 02:37:27 UTC

"unlink" orphan row from .META.

Our table inconsistency is due to an orphaned row in .META.

 

Here is what I mean:

 

startkey  endkey
========  ======
A         B
B         C
C         X
D         E

 

Notice that endkey "X" doesn't exist anywhere as a startkey. I want to
fix this by doing a put that will replace start/end pair {B,C} with
{B,D} to "unlink" the orphaned row {C,X} from the ".META.". I have
already backed up all the data in the orphaned region. Then I intend to
delete the unlinked orphaned row {C,X} and subsequently PUT all the
backed-up data back into the table.
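
An editorial sketch of the consistency check described above: sort the rows by start key and confirm that each end key equals the next row's start key. It is plain Ruby over hard-coded pairs. The Region struct and the chain_breaks helper are hypothetical names, and nothing here reads .META. itself.

```ruby
# Hypothetical stand-in for the (startkey, endkey) pairs listed in the post.
Region = Struct.new(:start_key, :end_key)

# Return every adjacent pair [prev, nxt] where prev's end key does not match
# nxt's start key. The last region has no successor, so it is never reported.
def chain_breaks(regions)
  sorted = regions.sort_by(&:start_key)
  sorted.each_cons(2).reject { |prev, nxt| prev.end_key == nxt.start_key }
end

regions = [
  Region.new("A", "B"),
  Region.new("B", "C"),
  Region.new("C", "X"),
  Region.new("D", "E"),
]

chain_breaks(regions).each do |prev, nxt|
  puts "{#{prev.start_key},#{prev.end_key}} ends at #{prev.end_key} " \
       "but the next region starts at #{nxt.start_key}"
end
```

With the four rows from the post, the only break reported is between {C,X} and {D,E}.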

 

My concern is that the "ENCODED" column in .META. encodes the endrow.
Is this the case? (In which case I need the encoding function in order
to make my proposed fix work).

 

Looking for someone to ack that my repair strategy is viable. Please
advise.

 

-geoff


Re: "unlink" orphan row from .META.

Posted by Stack <st...@duboce.net>.
You could ... problem though is getting all the imports right and
then getting the bit of script all lined up. You'll make mistakes. Easier to
write the script out in a file, then echo it to the shell, pass it on STDIN, or
hand it to jruby directly (see the head of bin/check_meta.rb for how to do the
latter).

St.Ack

On Fri, Aug 12, 2011 at 8:57 PM, Geoff Hendrey <gh...@decarta.com> wrote:
> This is good information, thanks. I was planning to make the repairs just using the shell though. Per previous email, I'll first delete the offending orphan row from .META., then put a new endkey column into the row above it. Could I do this all from the shell?
>
> Sent from my iPhone
>
> On Aug 12, 2011, at 8:22 PM, "Stack" <st...@duboce.net> wrote:
>
>> One thing you might not realize is that the .META. row is the region
>> name as a byte array and then the info:regioninfo is the serialized
>> HRegionInfo.  To see how to do the deserialization, see samples in the
>> bin directory; e.g. check_meta.rb
>>
>> St.Ack
>>
>> On Fri, Aug 12, 2011 at 6:08 PM, Stack <st...@duboce.net> wrote:
>> > On Fri, Aug 12, 2011 at 5:37 PM, Geoff Hendrey <gh...@decarta.com> wrote:
>> >> Notice that endkey "X" doesn't exist anywhere as a startkey. I want to
>> >> fix this by doing a put that will replace start/end pair {B,C} with
>> >> {B,D} to "unlink" the orphaned row {C,X} from the ".META.".
>> >
>> > You should remove the {C,X} row first (I see that you do).
>> >
>> > Is it really an X as in an end row that sorts after 'D'?  If so, that
>> > could be a problem.  You'll have entries in this {C,X} region that
>> > will be outside of the new {C,D} boundary.
>> >
>> > If it is outside of the boundary, then we need to be careful putting
>> > back the store files after you make the new {C,D} region.  We need to
>> > not put back storefiles with entries that sort after the 'D'.
>> >
>> >> I have
>> >> already backed up all the data in the orphaned region. Then I intend to
>> >> delete the unlinked orphaned row {C,X} and subsequently PUT all the
>> >> backed-up data back into the table.
>> >>
>> >
>> > This should work w/ above caveat
>> >
>> >
>> >
>> >> My concern is that the "ENCODED" column in .META. encodes the endrow.
>> >> Is this the case? (In which case I need the encoding function in order
>> >> to make my proposed fix work).
>> >>
>> >
>> > When you create a region, it will do the encoding for you.  See here
>> > http://hbase.apache.org/xref/org/apache/hadoop/hbase/HRegionInfo.html#225
>> >
>> > You should be able to create a region in the shell and toString the
>> > output to find the encoding as in:
>> >
>> > hbase> import org.apache.hadoop.hbase.HRegionInfo
>> > hbase> hri = HRegionInfo.new(tablename, startrow, endrow)
>> > hbase> puts hri.toString()
>> >
>> > You might have to mess around to get byte array versions of the string and
>> > row names (use to_java_bytes if you have strings).
>> >
>> > St.Ack
>> >
>> >>
>> >>
>> >> Looking for someone to ack that my repair strategy is viable. Please
>> >> advise.
>> >>
>> >>
>> >>
>> >> -geoff
>> >>
>> >>
>> >
>

Re: "unlink" orphan row from .META.

Posted by Geoff Hendrey <gh...@decarta.com>.
This is good information, thanks. I was planning to make the repairs just using the shell though. Per previous email, I'll first delete the offending orphan row from .META., then put a new endkey column into the row above it. Could I do this all from the shell?

Sent from my iPhone

Re: "unlink" orphan row from .META.

Posted by Stack <st...@duboce.net>.
One thing you might not realize is that the .META. row is the region
name as a byte array and then the info:regioninfo is the serialized
HRegionInfo.  To see how to do the deserialization, see samples in the
bin directory; e.g. check_meta.rb

St.Ack

Re: "unlink" orphan row from .META.

Posted by Geoff Hendrey <gh...@decarta.com>.
Thanks. I can see from the code link you sent that the encoding does not include the endrow. So we are good to go.  
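
An editorial sketch of why the end row does not matter here. It assumes the newer region-name layout, in which the encoded name is an MD5 hex digest computed over "<table>,<startkey>,<regionid>". That layout is an assumption about the HBase version in use, and the helper names, table name, and region id below are hypothetical. This is plain Ruby that mimics the idea; it does not call HRegionInfo.

```ruby
require "digest/md5"

# Assumed layout of the hashed prefix: "<table>,<startkey>,<regionid>".
# Note that the end key is not one of the inputs.
def region_name_prefix(table, start_key, region_id)
  "#{table},#{start_key},#{region_id}"
end

# Assumed encoding: MD5 hex digest of the prefix above.
def encoded_name(table, start_key, region_id)
  Digest::MD5.hexdigest(region_name_prefix(table, start_key, region_id))
end

# Hypothetical table name and region id, for illustration only.
puts encoded_name("mytable", "B", 1313195847)
```

Under that assumption, changing the end key of {B,C} to D leaves the encoded name untouched, because only the table name, start key, and region id feed the hash.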

About "x": poor choice of example on my part; the real endrow would have been less than the next startrow.


We'll give this a shot and let you know how it goes.

Sent from my iPhone

Re: "unlink" orphan row from .META.

Posted by Stack <st...@duboce.net>.
On Fri, Aug 12, 2011 at 5:37 PM, Geoff Hendrey <gh...@decarta.com> wrote:
> Notice that endkey "X" doesn't exist anywhere as a startkey. I want to
> fix this by doing a put that will replace start/end pair {B,C} with
> {B,D} to "unlink" the orphaned row {C,X} from the ".META.".

You should remove the {C,X} row first (I see that you do).

Is it really an X as in an end row that sorts after 'D'?  If so, that
could be a problem.  You'll have entries in this {C,X} region that
will be outside of the new {C,D} boundary.

If it is outside of the boundary, then we need to be careful putting
back the store files after you make the new {C,D} region.  We need to
not put back storefiles with entries that sort after the 'D'.
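
An editorial sketch of the boundary caveat above, applied to row keys rather than whole store files: before putting backed-up data back, split the keys into those that fall inside the target range [start, end) and those that sort at or after the end key. The helper names and sample keys are hypothetical. The comparison is byte-wise, which is how HBase orders row keys.

```ruby
# True when row lies in [start_key, end_key); an empty end_key means
# "no upper bound". String#b forces a byte-wise comparison.
def in_region?(row, start_key, end_key)
  r = row.b
  (r <=> start_key.b) >= 0 && (end_key.empty? || (r <=> end_key.b) < 0)
end

# Split backed-up row keys into [inside, outside] for the given range.
def partition_rows(rows, start_key, end_key)
  rows.partition { |row| in_region?(row, start_key, end_key) }
end

# Hypothetical backed-up keys from the orphaned {C,X} region.
backed_up = ["C", "Caa", "Czz", "D", "Dab", "M"]
inside, outside = partition_rows(backed_up, "C", "D")
puts "inside [C,D): #{inside.inspect}"
puts "outside:      #{outside.inspect}"
```

Anything in the second list would need to go somewhere other than the {C,D} range.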

> I have
> already backed up all the data in the orphaned region. Then I intend to
> delete the unlinked orphaned row {C,X} and subsequently PUT all the
> backed-up data back into the table.
>

This should work w/ above caveat



> My concern is that the "ENCODED" column in .META. encodes the endrow.
> Is this the case? (In which case I need the encoding function in order
> to make my proposed fix work).
>

When you create a region, it will do the encoding for you.  See here
http://hbase.apache.org/xref/org/apache/hadoop/hbase/HRegionInfo.html#225

You should be able to create a region in the shell and toString the
output to find the encoding as in:

hbase> import org.apache.hadoop.hbase.HRegionInfo
hbase> hri = HRegionInfo.new(tablename, startrow, endrow)
hbase> puts hri.toString()

You might have to mess around to get byte array versions of the string and
row names (use to_java_bytes if you have strings).

St.Ack

>
>
> Looking for someone to ack that my repair strategy is viable. Please
> advise.
>
>
>
> -geoff
>
>