Posted to user@hbase.apache.org by mike anderson <sa...@gmail.com> on 2009/12/03 16:29:01 UTC

hbase crashed, table missing

HBase crashed on me this weekend, and after restarting, one of the tables is
just completely gone. All of the table data is still in HDFS and my missing
table is still mentioned in .META.. I tried restarting HBase a few times,
but the table didn't show up. What else can I do to debug this? I looked
through the logs, but nothing really jumped out at me. Is there something I
should look for?

I took a look at this ticket,
http://issues.apache.org/jira/browse/HBASE-1342, but don't know enough about
the inner workings of hbase to make sense of it.


thanks in advance.

Re: hbase crashed, table missing

Posted by mike anderson <sa...@gmail.com>.
I posted the master log up on a web server:

http://assets0.pubget.com/data/hbase-pubget-master-carr.projectlounge.com.log.2009-12-02

The crash happened around 22:00, though I can start to see a few exceptions
at 21:27, continuing on through the night. Again, the power took out all of
the nodes, but left the master intact, so it looks like most of the log is
the master trying to regain communication with the nodes.

Would love to see any insight you have into the mystery.

Thanks,
Mike



Re: hbase crashed, table missing

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Mike,

I'm glad it worked out for you! And I'm curious too, this shouldn't be
happening. I'd love to take a look at your master's log from the
day of the failure. You could put it on a web server or try to attach
it to a reply (but that usually gets filtered).

J-D


Re: hbase crashed, table missing

Posted by mike anderson <sa...@gmail.com>.
Wow! Thanks for all your help. I just took the add_table.rb script for a run
and it worked flawlessly. Kudos to the community!

I'm still curious as to what might have happened. Was the .META. table just
slightly out of whack?

-mike

On Thu, Dec 3, 2009 at 3:36 PM, mike anderson <sa...@gmail.com> wrote:

> This was a table that had been around for almost two months now and had
> many regions. The web UI reports 231 regions, and I am certain that the
> tables being reported don't have nearly that many regions, so perhaps this
> count includes those from the missing table.
>
> In the folder: /hbase/cached_web_pages/1102708773/http is a single 130MB
> file full of rows/columns. We are caching the full html of websites into the
> columns so copying and pasting some of the rows won't be very useful, but
> the chunk starts with this:
>
> "DATABLK*f #ŸRhttp%3A%2F%2Fwww.informaworld.com%2Fsmpp%2Ftitle%7Edb%3Dall%7Econtent%3Dg903750466
> httpdata $í ó "
>
> I tried to enable a region, but get:
>
> from (hbase):3
> hbase(main):003:0> enable_region 'cached_web_pages,metapress_ris_120417,1257429337740'
> NativeException: java.lang.NullPointerException: null
>  from org/apache/hadoop/hbase/util/Writables.java:74:in `getWritable'
> from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
>  from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
> from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>  from java/lang/reflect/Method.java:597:in `invoke'
> from org/jruby/javasupport/JavaMethod.java:298:in
> `invokeWithExceptionHandling'
>  from org/jruby/javasupport/JavaMethod.java:278:in `invoke_static'
> from org/jruby/java/invokers/StaticMethodInvoker.java:57:in `call'
>  from org/jruby/runtime/callsite/CachingCallSite.java:150:in `call'
> from org/jruby/ast/CallTwoArgNode.java:59:in `interpret'
>  from org/jruby/ast/LocalAsgnNode.java:123:in `interpret'
> from org/jruby/ast/NewlineNode.java:104:in `interpret'
>  from org/jruby/ast/BlockNode.java:71:in `interpret'
> from org/jruby/internal/runtime/methods/InterpretedMethod.java:201:in
> `call'
>  from org/jruby/internal/runtime/methods/DefaultMethod.java:162:in `call'
> from org/jruby/runtime/callsite/CachingCallSite.java:150:in `call'
> ... 112 levels...
> from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
> from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
>  from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
> from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>  from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
> from usr/local/hbase/bin/$_dot_dot_/bin/hirb.rb:487:in `__file__'
>  from usr/local/hbase/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
> from org/jruby/Ruby.java:577:in `runScript'
>  from org/jruby/Ruby.java:480:in `runNormally'
> from org/jruby/Ruby.java:354:in `runFromMain'
>  from org/jruby/Main.java:229:in `run'
> from org/jruby/Main.java:110:in `run'
>  from org/jruby/Main.java:94:in `main'
> from /usr/local/hbase/bin/../bin/HBase.rb:138:in `enable_region'
>  from /usr/local/hbase/bin/../bin/hirb.rb:350:in `enable_region'
> from (hbase):4
> hbase(main):004:0>
>
> Thanks again.
>
> -mike
>
> On Thu, Dec 3, 2009 at 3:21 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> What's in the HDFS folder of that table? Here I see that you should
>> have something like:
>>
>> /hbase/cached_web_pages/1325672518/http/  stuff...
>>
>> Was there only this one region?
>>
>> Also are you able to enable a region in the shell? Take one of the row
>> keys from .META. and do
>>
>> > enable_region 'region name'
>>
>> J-D
>>
>> On Thu, Dec 3, 2009 at 12:11 PM, mike anderson <sa...@gmail.com>
>> wrote:
>> > Here's a snippet from the meta table (I can send you the whole thing,
>> but
>> > it's quite large),
>> >
>> > cached_web_pages,http%3A%2F column=info:serverstartcode,
>> > timestamp=1259853027975, value=1259852967063
>> >  %2Fdx.doi.org%2F10.1002%252
>> >
>> >  Fajpa.21214,1259739437144
>> >
>> >  cached_web_pages,http%3A%2F column=historian:assignment,
>> > timestamp=1259807436758, value=Region assigned to se
>> >  %2Fdx.doi.org%2F10.1002%252 rver
>> > ghetto169.projectlounge.com,60020,1256139356112
>> >
>> >  Fejoc.200900768,12555040994
>> >
>> >  35
>> >
>> >  cached_web_pages,http%3A%2F column=historian:open,
>> timestamp=1259807436723,
>> > value=Region opened on server : g
>> >  %2Fdx.doi.org%2F10.1002%252 hetto169.projectlounge.com
>> >
>> >  Fejoc.200900768,12555040994
>> >
>> >  35
>> >
>> >  cached_web_pages,http%3A%2F column=historian:assignment,
>> > timestamp=1259853024917, value=Region assigned to se
>> >  %2Fdx.doi.org%2F10.1002%252 rver
>> > ghetto167.projectlounge.com,60020,1259852967063
>> >
>> >  Fsmi.1285,1258589376676
>> >
>> >  cached_web_pages,http%3A%2F column=historian:open,
>> timestamp=1259853027984,
>> > value=Region opened on server : g
>> >  %2Fdx.doi.org%2F10.1002%252 hetto167.projectlounge.com
>> >
>> >  Fsmi.1285,1258589376676
>> >
>> >  cached_web_pages,http%3A%2F column=info:regioninfo,
>> > timestamp=1258589203875, value=REGION => {NAME => 'cached
>> >  %2Fdx.doi.org%2F10.1002%252 _web_pages,http\\x253A\\x252F\\
>> x252Fdx.doi.org
>> > \\x252F10.1002\\x25252Fsmi.1285,125
>> >  Fsmi.1285,1258589376676     8589376676', STARTKEY =>
>> 'http\\x253A\\x252F\\
>> > x252Fdx.doi.org\\x252F10.1002\\x252
>> >                             52Fsmi.1285', ENDKEY =>
>> 'http\\x253A\\x252F\\
>> > x252Fdx.doi.org\\x252F10.1016\\x252F
>> >                             j.apergo.2009.09.005', ENCODED =>
>> 1325672518,
>> > TABLE => {{NAME => 'cached_web_page
>> >                             s', FAMILIES => [{NAME => 'http', VERSIONS
>> =>
>> > '1', COMPRESSION => 'NONE', TTL =>
>> >                             '2147483647', BLOCKSIZE => '65536',
>> IN_MEMORY
>> > => 'false', BLOCKCACHE => 'true'}]}
>> >                             }
>> >
>> >
>> > and you can see the table which has gone missing 'cached_web_pages' in
>> the
>> > key spot. The crash over the weekend was pretty traumatic. Complete
>> power
>> > outage to the entire cluster except(!) for the master.  The data is
>> > definitely still on HDFS, I will take a look at the add_table script and
>> > upgrade to 0.20.2.
>> >
>> >
>> > Cheers and thanks a lot.
>> >
>> > mike
>> >
>> >
>> > On Thu, Dec 3, 2009 at 2:51 PM, Jean-Daniel Cryans <jdcryans@apache.org
>> >wrote:
>> >
>> >> This is weird if the table is in .META. and still not showing up...
>> >> could you pastebin the .META. rows?
>> >>
>> >> Also was it a new table that was just created or has it been there for
>> >> some time?
>> >>
>> >> What kind of crash did you get this weekend?
>> >>
>> >> The best way to recover your data, if it's still on HDFS, will be to
>> >> upgrade to 0.20.2 and use the script bin/add_table.rb to rebuild
>> >> .META.
>> >>
>> >> J-D
>> >>
>> >> On Thu, Dec 3, 2009 at 11:29 AM, mike anderson <saidtherobot@gmail.com
>> >
>> >> wrote:
>> >> > From the web UI and from calling 'list' in the shell I can't see the
>> >> table
>> >> > name.
>> >> >
>> >> > Hadoop/Hbase 0.20/0.20.1, distributed setup, 10 nodes.
>> >> >
>> >> > -mike
>> >> >
>> >> > On Thu, Dec 3, 2009 at 1:54 PM, Jean-Daniel Cryans <
>> jdcryans@apache.org
>> >> >wrote:
>> >> >
>> >> >> Mike,
>> >> >>
>> >> >> So if you looked in .META. and the rows are there, how did you
>> figure
>> >> >> that the table is missing?
>> >> >>
>> >> >> Also the usuals: which version of Hadoop/HBase, what kind of setup,
>> etc
>> >> >>
>> >> >> J-D
>> >> >>
>> >> >> On Thu, Dec 3, 2009 at 7:29 AM, mike anderson <
>> saidtherobot@gmail.com>
>> >> >> wrote:
>> >> >> > Hbase crashed on me this weekend, and upon restarting one of the
>> >> tables
>> >> >> is
>> >> >> > just completely gone. All of the table data is still in HDFS and
>> my
>> >> >> missing
>> >> >> > table is still mentioned in .META.. I tried restarting hbase a few
>> >> times,
>> >> >> > but the table didn't show up. What else can I do to debug this? I
>> >> looked
>> >> >> > through the logs, but nothing really jumped out at me. Is there
>> >> something
>> >> >> I
>> >> >> > should look for?
>> >> >> >
>> >> >> > I took a look at this ticket,
>> >> >> > http://issues.apache.org/jira/browse/HBASE-1342, but don't know
>> >> enough
>> >> >> about
>> >> >> > the inner workings of hbase to make sense of it.
>> >> >> >
>> >> >> >
>> >> >> > thanks in advance.
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>
>

Re: hbase crashed, table missing

Posted by mike anderson <sa...@gmail.com>.
This was a table that had been around for almost two months now and had many
regions. The web UI reports 231 regions, and I am certain that the tables
being reported don't have nearly that many regions, so perhaps this count
includes those from the missing table.

In the folder: /hbase/cached_web_pages/1102708773/http is a single 130MB
file full of rows/columns. We are caching the full html of websites into the
columns so copying and pasting some of the rows won't be very useful, but
the chunk starts with this:

"DATABLK*f#ŸRhttp%3A%2F%2Fwww.informaworld.com%2Fsmpp%2Ftitle%7Edb%3Dall%7Econtent%3Dg903750466httpdata$íó"
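
For anyone following along: the folder above appears to follow the layout
/hbase/<table>/<encoded region name>/<column family>, which lines up with the
ENCODED value and the 'http' family in the .META. row quoted in this thread.
Below is a minimal Python sketch, assuming that layout and an HBase root of
/hbase. The names StoreDir and parse_store_dir are hypothetical helpers for
illustration, not part of HBase.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class StoreDir(NamedTuple):
    table: str
    encoded_region: str
    family: str


def parse_store_dir(path: str, root: str = "/hbase") -> StoreDir:
    """Split an HDFS store directory into table, encoded region and family.

    Assumes the layout <root>/<table>/<encoded region>/<family>[/...],
    as seen in the paths mentioned in this thread.
    """
    # relative_to raises ValueError if the path is not under the root.
    rel = PurePosixPath(path).relative_to(root)
    if len(rel.parts) < 3:
        raise ValueError(f"expected <table>/<region>/<family> under {root}: {path}")
    table, encoded_region, family = rel.parts[:3]
    return StoreDir(table, encoded_region, family)


if __name__ == "__main__":
    print(parse_store_dir("/hbase/cached_web_pages/1102708773/http"))
```

Listing the second path component for every directory under the table and
comparing it against the ENCODED values in .META. is one way to see which
regions exist on disk but have no catalog row.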

I tried to enable a region, but get:

from (hbase):3
hbase(main):003:0> enable_region 'cached_web_pages,metapress_ris_120417,1257429337740'
NativeException: java.lang.NullPointerException: null
from org/apache/hadoop/hbase/util/Writables.java:74:in `getWritable'
from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
from java/lang/reflect/Method.java:597:in `invoke'
from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
from org/jruby/javasupport/JavaMethod.java:278:in `invoke_static'
from org/jruby/java/invokers/StaticMethodInvoker.java:57:in `call'
from org/jruby/runtime/callsite/CachingCallSite.java:150:in `call'
from org/jruby/ast/CallTwoArgNode.java:59:in `interpret'
from org/jruby/ast/LocalAsgnNode.java:123:in `interpret'
from org/jruby/ast/NewlineNode.java:104:in `interpret'
from org/jruby/ast/BlockNode.java:71:in `interpret'
from org/jruby/internal/runtime/methods/InterpretedMethod.java:201:in `call'
from org/jruby/internal/runtime/methods/DefaultMethod.java:162:in `call'
from org/jruby/runtime/callsite/CachingCallSite.java:150:in `call'
... 112 levels...
from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
from usr/local/hbase/bin/$_dot_dot_/bin/hirb.rb:487:in `__file__'
from usr/local/hbase/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
from org/jruby/Ruby.java:577:in `runScript'
from org/jruby/Ruby.java:480:in `runNormally'
from org/jruby/Ruby.java:354:in `runFromMain'
from org/jruby/Main.java:229:in `run'
from org/jruby/Main.java:110:in `run'
from org/jruby/Main.java:94:in `main'
from /usr/local/hbase/bin/../bin/HBase.rb:138:in `enable_region'
from /usr/local/hbase/bin/../bin/hirb.rb:350:in `enable_region'
from (hbase):4
hbase(main):004:0>
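
A side note on the region name passed to enable_region above: it looks like
three comma-separated fields, <table>,<start key>,<region id>. The Python
sketch below splits a name on that assumption and also assumes the table name
contains no comma and that the trailing id is a millisecond Unix timestamp
(it has the right magnitude, but the thread does not confirm it). RegionName,
parse_region_name and region_id_as_utc are hypothetical names, not HBase APIs.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class RegionName(NamedTuple):
    table: str
    start_key: str
    region_id: int


def parse_region_name(name: str) -> RegionName:
    """Split '<table>,<start key>,<region id>' into its three fields."""
    # Assumption: the table name has no comma, so the first comma ends it.
    table, first_sep, rest = name.partition(",")
    # The start key may itself contain commas, so the id follows the last one.
    start_key, last_sep, region_id = rest.rpartition(",")
    if not first_sep or not last_sep or not region_id.isdigit():
        raise ValueError(f"not a '<table>,<start key>,<id>' region name: {name!r}")
    return RegionName(table, start_key, int(region_id))


def region_id_as_utc(region_id: int) -> datetime:
    """Interpret the id as milliseconds since the Unix epoch (an assumption)."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer milliseconds keep the conversion exact (no float rounding).
    return epoch + timedelta(milliseconds=region_id)


if __name__ == "__main__":
    region = parse_region_name("cached_web_pages,metapress_ris_120417,1257429337740")
    print(region)
    print(region_id_as_utc(region.region_id).isoformat())
```

Splitting on the first and last comma, rather than on every comma, keeps a
start key such as a URL with embedded commas in one piece.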

Thanks again.

-mike

On Thu, Dec 3, 2009 at 3:21 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> What's in the HDFS folder of that table? Here I see that you should
> have something like:
>
> /hbase/cached_web_pages/1325672518/http/  stuff...
>
> Was there only this one region?
>
> Also are you able to enable a region in the shell? Take one of the row
> keys from .META. and do
>
> > enable_region 'region name'
>
> J-D
>
> On Thu, Dec 3, 2009 at 12:11 PM, mike anderson <sa...@gmail.com>
> wrote:
> > Here's a snippit from the meta table (I can send you the whole thing, but
> > it's quite large),
> >
> > cached_web_pages,http%3A%2F column=info:serverstartcode,
> > timestamp=1259853027975, value=1259852967063
> >  %2Fdx.doi.org%2F10.1002%252
> >
> >  Fajpa.21214,1259739437144
> >
> >  cached_web_pages,http%3A%2F column=historian:assignment,
> > timestamp=1259807436758, value=Region assigned to se
> >  %2Fdx.doi.org%2F10.1002%252 rver
> > ghetto169.projectlounge.com,60020,1256139356112
> >
> >  Fejoc.200900768,12555040994
> >
> >  35
> >
> >  cached_web_pages,http%3A%2F column=historian:open,
> timestamp=1259807436723,
> > value=Region opened on server : g
> >  %2Fdx.doi.org%2F10.1002%252 hetto169.projectlounge.com
> >
> >  Fejoc.200900768,12555040994
> >
> >  35
> >
> >  cached_web_pages,http%3A%2F column=historian:assignment,
> > timestamp=1259853024917, value=Region assigned to se
> >  %2Fdx.doi.org%2F10.1002%252 rver
> > ghetto167.projectlounge.com,60020,1259852967063
> >
> >  Fsmi.1285,1258589376676
> >
> >  cached_web_pages,http%3A%2F column=historian:open,
> timestamp=1259853027984,
> > value=Region opened on server : g
> >  %2Fdx.doi.org%2F10.1002%252 hetto167.projectlounge.com
> >
> >  Fsmi.1285,1258589376676
> >
> >  cached_web_pages,http%3A%2F column=info:regioninfo,
> > timestamp=1258589203875, value=REGION => {NAME => 'cached
> >  %2Fdx.doi.org%2F10.1002%252 _web_pages,http\\x253A\\x252F\\
> x252Fdx.doi.org
> > \\x252F10.1002\\x25252Fsmi.1285,125
> >  Fsmi.1285,1258589376676     8589376676', STARTKEY =>
> 'http\\x253A\\x252F\\
> > x252Fdx.doi.org\\x252F10.1002\\x252
> >                             52Fsmi.1285', ENDKEY => 'http\\x253A\\x252F\\
> > x252Fdx.doi.org\\x252F10.1016\\x252F
> >                             j.apergo.2009.09.005', ENCODED => 1325672518,
> > TABLE => {{NAME => 'cached_web_page
> >                             s', FAMILIES => [{NAME => 'http', VERSIONS =>
> > '1', COMPRESSION => 'NONE', TTL =>
> >                             '2147483647', BLOCKSIZE => '65536', IN_MEMORY
> > => 'false', BLOCKCACHE => 'true'}]}
> >                             }
> >
> >
> > and you can see the table which has gone missing 'cached_web_pages' in
> the
> > key spot. The crash over the weekend was pretty traumatic. Complete power
> > outage to the entire cluster except(!) for the master.  The data is
> > definitely still on HDFS, I will take a look at the add_table script and
> > upgrade to 0.20.2.
> >
> >
> > Cheers and thanks a lot.
> >
> > mike
> >
> >
> > On Thu, Dec 3, 2009 at 2:51 PM, Jean-Daniel Cryans <jdcryans@apache.org
> >wrote:
> >
> >> This is weird if the table is in .META. and still not showing up...
> >> could you pastebin the .META. rows?
> >>
> >> Also was it a new table that was just created or has it been there for
> >> some time?
> >>
> >> What kind of crash did you get this weekend?
> >>
> >> The best way to recover your data, if it's still on HDFS, will be to
> >> upgrade to 0.20.2 and use the script bin/add_table.rb to rebuild
> >> .META.
> >>
> >> J-D
> >>
> >> On Thu, Dec 3, 2009 at 11:29 AM, mike anderson <sa...@gmail.com>
> >> wrote:
> >> > From the web UI and from calling 'list' in the shell I can't see the
> >> table
> >> > name.
> >> >
> >> > Hadoop/Hbase 0.20/0.20.1, distributed setup, 10 nodes.
> >> >
> >> > -mike
> >> >
> >> > On Thu, Dec 3, 2009 at 1:54 PM, Jean-Daniel Cryans <
> jdcryans@apache.org
> >> >wrote:
> >> >
> >> >> Mike,
> >> >>
> >> >> So if you looked in .META. and the rows are there, how did you figure
> >> >> that the table is missing?
> >> >>
> >> >> Also the usuals: which version of Hadoop/HBase, what kind of setup,
> etc
> >> >>
> >> >> J-D
> >> >>
> >> >> On Thu, Dec 3, 2009 at 7:29 AM, mike anderson <
> saidtherobot@gmail.com>
> >> >> wrote:
> >> >> > Hbase crashed on me this weekend, and upon restarting one of the
> >> tables
> >> >> is
> >> >> > just completely gone. All of the table data is still in HDFS and my
> >> >> missing
> >> >> > table is still mentioned in .META.. I tried restarting hbase a few
> >> times,
> >> >> > but the table didn't show up. What else can I do to debug this? I
> >> looked
> >> >> > through the logs, but nothing really jumped out at me. Is there
> >> something
> >> >> I
> >> >> > should look for?
> >> >> >
> >> >> > I took a look at this ticket,
> >> >> > http://issues.apache.org/jira/browse/HBASE-1342, but don't know
> >> enough
> >> >> about
> >> >> > the inner workings of hbase to make sense of it.
> >> >> >
> >> >> >
> >> >> > thanks in advance.
> >> >> >
> >> >>
> >> >
> >>
> >
>

Re: hbase crashed, table missing

Posted by Jean-Daniel Cryans <jd...@apache.org>.
What's in the HDFS folder of that table? Here I see that you should
have something like:

/hbase/cached_web_pages/1325672518/http/  stuff...

Was there only this one region?

Also, are you able to enable a region in the shell? Take one of the row
keys from .META. and run:

> enable_region 'region name'

J-D
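
For reference, the path above follows from the info:regioninfo cell posted in
this thread: table name, then the ENCODED value, then the column family. Here
is a minimal Python sketch of that mapping. It assumes the HBase root
directory is /hbase, as in the path above, and region_family_dir is a made-up
helper name, not part of HBase.

```python
import posixpath


def region_family_dir(table, encoded_name, family, root="/hbase"):
    """Return the HDFS directory expected to hold one family of one region.

    Layout assumed from the path quoted in this thread:
    <root>/<table>/<encoded region name>/<family>/
    The default root of "/hbase" is an assumption; use your own hbase.rootdir.
    """
    return posixpath.join(root, table, str(encoded_name), family) + "/"


# Values taken from the info:regioninfo cell posted in this thread.
print(region_family_dir("cached_web_pages", 1325672518, "http"))
```

The printed path can then be listed with the usual HDFS tools to check
whether store files are actually present for that region.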


Re: hbase crashed, table missing

Posted by mike anderson <sa...@gmail.com>.
Here's a snippet from the meta table (I can send you the whole thing, but
it's quite large),

 cached_web_pages,http%3A%2F%2Fdx.doi.org%2F10.1002%252Fajpa.21214,1259739437144
   column=info:serverstartcode, timestamp=1259853027975, value=1259852967063

 cached_web_pages,http%3A%2F%2Fdx.doi.org%2F10.1002%252Fejoc.200900768,1255504099435
   column=historian:assignment, timestamp=1259807436758,
   value=Region assigned to server ghetto169.projectlounge.com,60020,1256139356112

 cached_web_pages,http%3A%2F%2Fdx.doi.org%2F10.1002%252Fejoc.200900768,1255504099435
   column=historian:open, timestamp=1259807436723,
   value=Region opened on server : ghetto169.projectlounge.com

 cached_web_pages,http%3A%2F%2Fdx.doi.org%2F10.1002%252Fsmi.1285,1258589376676
   column=historian:assignment, timestamp=1259853024917,
   value=Region assigned to server ghetto167.projectlounge.com,60020,1259852967063

 cached_web_pages,http%3A%2F%2Fdx.doi.org%2F10.1002%252Fsmi.1285,1258589376676
   column=historian:open, timestamp=1259853027984,
   value=Region opened on server : ghetto167.projectlounge.com

 cached_web_pages,http%3A%2F%2Fdx.doi.org%2F10.1002%252Fsmi.1285,1258589376676
   column=info:regioninfo, timestamp=1258589203875,
   value=REGION => {NAME => 'cached_web_pages,http\\x253A\\x252F\\x252Fdx.doi.org\\x252F10.1002\\x25252Fsmi.1285,1258589376676',
     STARTKEY => 'http\\x253A\\x252F\\x252Fdx.doi.org\\x252F10.1002\\x25252Fsmi.1285',
     ENDKEY => 'http\\x253A\\x252F\\x252Fdx.doi.org\\x252F10.1016\\x252Fj.apergo.2009.09.005',
     ENCODED => 1325672518,
     TABLE => {{NAME => 'cached_web_pages', FAMILIES => [{NAME => 'http',
       VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
       BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}


and you can see the missing table, 'cached_web_pages', in the row key. The
crash over the weekend was pretty traumatic: a complete power outage to the
entire cluster except(!) for the master. The data is definitely still on
HDFS. I will take a look at the add_table script and upgrade to 0.20.2.


Cheers and thanks a lot.

mike
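
A note on reading those row keys: each .META. row key has the form
<table>,<start key>,<region id>. The sketch below (Python, illustrative only;
parse_meta_row_key is a hypothetical helper, not an HBase API) splits one of
the keys shown above, with the shell's line wrapping undone, and
percent-decodes the start key once. It assumes the table name contains no
comma and that the application percent-encoded its URL row keys, which is a
guess based on how the keys look.

```python
from urllib.parse import unquote


def parse_meta_row_key(row_key):
    """Split a .META. row key of the form <table>,<start key>,<region id>."""
    # Assumption: the table name has no comma and the region id is the last
    # field, so split once from each end. The start key may contain commas.
    table, rest = row_key.split(",", 1)
    start_key, region_id = rest.rsplit(",", 1)
    return table, start_key, int(region_id)


# One row key from the scan output above, reassembled onto a single line.
row = (
    "cached_web_pages,"
    "http%3A%2F%2Fdx.doi.org%2F10.1002%252Fsmi.1285,"
    "1258589376676"
)

table, start_key, region_id = parse_meta_row_key(row)
print(table)               # table the region belongs to
print(unquote(start_key))  # start key with one level of percent-encoding removed
print(region_id)           # trailing numeric field of the row key
```

Splitting from both ends keeps the parse correct even when the start key
itself contains commas.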



Re: hbase crashed, table missing

Posted by Jean-Daniel Cryans <jd...@apache.org>.
This is weird if the table is in .META. and still not showing up...
could you pastebin the .META. rows?

Also was it a new table that was just created or has it been there for
some time?

What kind of crash did you get this weekend?

The best way to recover your data, if it's still on HDFS, will be to
upgrade to 0.20.2 and use the script bin/add_table.rb to rebuild
.META.

J-D


Re: hbase crashed, table missing

Posted by mike anderson <sa...@gmail.com>.
From the web UI and from calling 'list' in the shell I can't see the table
name.

Hadoop 0.20 / HBase 0.20.1, distributed setup, 10 nodes.

-mike


Re: hbase crashed, table missing

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Mike,

So if you looked in .META. and the rows are there, how did you figure
that the table is missing?

Also the usual questions: which versions of Hadoop and HBase, what kind of
setup, etc.?

J-D
