Posted to user@accumulo.apache.org by Denis <de...@camfex.cz> on 2012/08/19 05:08:42 UTC

METADATA recovery

Hi.

I have a problem with my Accumulo installation.
After a hardware failure on the NameNode, the !METADATA table's root_tablet is broken :(

From "fsck /" output:
....
/accumulo/tables/!0/root_tablet/A000ornd.rf: CORRUPT block blk_-8590712379082603283
/accumulo/tables/!0/root_tablet/A000ornd.rf: MISSING 1 blocks of total size 896 B..
....


What would you recommend to recover the data?
Is it possible to reconstruct the !METADATA root_tablet from the
rest of the !METADATA files?
Or is it possible to reconstruct the whole !METADATA table from the
content of all the tablets that were found?
Are there any ready-made tools to do it?

Thanks.

Re: METADATA recovery

Posted by Eric Newton <er...@gmail.com>.
$ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /some/walog/filename

The log reader truncates long mutations by default. Tablet names are
compressed to unique ids, so you will see DEFINE_TABLET entries, which map
a tablet to a tablet id; that id is then used in the recorded
mutations.

You will want to keep the Accumulo garbage collector offline until you are
done.

-Eric


Re: METADATA recovery

Posted by William Slacum <wi...@accumulo.net>.
I know that in trunk you can run `./bin/accumulo rfile-info -d
/accumulo/path/to/rfile`. If that's unavailable, you can run
`./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo
-d /accumulo/path/to/rfile`. I'll defer to someone else for the walogs.


Re: METADATA recovery

Posted by Keith Turner <ke...@deenlo.com>.
On Sun, Aug 19, 2012 at 12:50 AM, Denis <de...@camfex.cz> wrote:
> Hi.
>
> That is what I am going to do.
> I still have terabytes of .rf files (!METADATA was the only table
> affected by the crash) and gigabytes of walog files, and I am trying to
> extract the data from them and insert it into a new database.
>
> Do you know of any dumping tools for .rf and walog files? (I started to
> write my own, but existing ones would save time.)
> Am I right in understanding that if all the content of the .rf and walog
> files is simply inserted into a new database, VersioningIterator will
> resolve any collisions they may have?

One thing to be aware of is that deleted data could come back if you
just import all of the files. This can happen because a tablet can
reference files that contain data outside of the tablet's range. This
happens as a result of splits and bulk imports. Below is an example
of this.

 * Tablet1 refs file F1, which contains X
 * Tablet1 splits into Tablet2 and Tablet3. Both tablets reference
   file F1. X falls within Tablet3.
 * X is deleted from Tablet3
 * Tablet3 compacts to file F2; F2 does not contain X

After the sequence of events above, X still exists in file F1 even
though it was deleted and compacted away in Tablet3. Normally this is
not a problem because Tablet2 does not read the part of file F1 that
contains X. However, if you just take F1 and import the file, then X
will come back. The fact that you are only interested in a portion of
file F1 is lost along with the !METADATA table.
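
The sequence above can be replayed as a small Python sketch. This is a toy
model, not Accumulo code: the dicts, the helper functions, the split point
"M", and the extra rows "A" and "Z" are all invented for illustration. Only
the names F1, F2, X, and Tablet1 to Tablet3 come from the example.

```python
# Toy model of the split / delete / compact sequence. None of this is
# Accumulo API; every name and structure here is hypothetical.

# File name -> set of row keys stored in that file.
files = {"F1": {"A", "X", "Z"}}

# Tablet -> (row range, referenced files). A range is (low, high],
# where None means unbounded on that side.
metadata = {"Tablet1": ((None, None), ["F1"])}


def in_range(row, rng):
    low, high = rng
    return (low is None or row > low) and (high is None or row <= high)


def tablet_read(tablet):
    """Rows a tablet serves: only the part of each file inside its range."""
    rng, refs = metadata[tablet]
    return {row for f in refs for row in files[f] if in_range(row, rng)}


def naive_import(file_names):
    """Import whole files, ignoring the ranges the metadata used to hold."""
    return set().union(*(files[f] for f in file_names))


# Tablet1 splits at "M"; both children still reference F1.
del metadata["Tablet1"]
metadata["Tablet2"] = ((None, "M"), ["F1"])
metadata["Tablet3"] = (("M", None), ["F1"])

# X is deleted from Tablet3, then Tablet3 compacts into F2 without X.
files["F2"] = tablet_read("Tablet3") - {"X"}
metadata["Tablet3"] = (("M", None), ["F2"])

# What the live system serves versus what a blind import of all files yields.
live = tablet_read("Tablet2") | tablet_read("Tablet3")
resurrected = naive_import(["F1", "F2"]) - live

print("live rows:", sorted(live))
print("rows that come back on naive import:", sorted(resurrected))
```

In this model the only thing keeping X hidden is the range stored next to
Tablet2's reference to F1, which is exactly the information that goes away
when the metadata is lost.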




Re: METADATA recovery

Posted by Denis <de...@camfex.cz>.
Hi.

That is what I am going to do.
I still have terabytes of .rf files (!METADATA was the only table
affected by the crash) and gigabytes of walog files, and I am trying to
extract the data from them and insert it into a new database.

Do you know of any dumping tools for .rf and walog files? (I started to
write my own, but existing ones would save time.)
Am I right in understanding that if all the content of the .rf and walog
files is simply inserted into a new database, VersioningIterator will
resolve any collisions they may have?
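
For reference, the behavior being asked about can be sketched in Python. This
is a simplified illustration, not the VersioningIterator implementation: it
assumes the default setting of keeping one version per key, collapses the
column family, qualifier, and visibility into a single string, and ignores
delete markers entirely. The function name and the sample entries are
hypothetical.

```python
from itertools import groupby


def keep_newest_versions(entries, max_versions=1):
    """Keep the newest max_versions entries per (row, column).

    entries: iterable of (row, column, timestamp, value) tuples.
    """
    # Sort by key, then newest timestamp first.
    ordered = sorted(entries, key=lambda e: (e[0], e[1], -e[2]))
    kept = []
    for _, versions in groupby(ordered, key=lambda e: (e[0], e[1])):
        kept.extend(list(versions)[:max_versions])
    return kept


# The same cell recovered from both an .rf file and a walog, plus a newer
# write that only made it into the walog.
rfile_entries = [("row1", "cf:cq", 10, "old"), ("row2", "cf:cq", 5, "only")]
walog_entries = [("row1", "cf:cq", 20, "new"), ("row1", "cf:cq", 10, "old")]

merged = keep_newest_versions(rfile_entries + walog_entries)
for entry in merged:
    print(entry)
```

Under these assumptions, duplicate and older versions of a cell collapse to
the newest timestamp. Deletions are a separate matter that this sketch does
not model.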


Re: METADATA recovery

Posted by John Vines <vi...@apache.org>.
When you have a NameNode failure and recover with the Secondary
NameNode info, you're dealing with one level of potentially expired
pointers. On top of that, you have more layers of pointers with respect
to the root tablet and the !METADATA tablets. You can attempt to recover,
but what is more apt to happen is that you'll get a root tablet up that has
some, but not all, of the current !METADATA table files. And the !METADATA
tablets you do get up may or may not point to the existing files for your
tablets.

What I'm ultimately trying to say is that you have already lost some files,
and you are more apt to lose more by trying to recover your old information
than by taking what you have and starting over. I would suggest moving your
accumulo directory to accumulo_old or something along those lines,
initializing a new instance, and bulk importing the remaining old data
into the new system.

John
