Posted to user@hbase.apache.org by Stuart Smith <st...@yahoo.com> on 2010/08/23 22:04:09 UTC

WARN add_table: Missing .regioninfo:.. No server address.. what to do?

Hello,

  I'm missing the .regioninfo files for several of my regions. Not sure why or when. But after an add_table rebuild of the .META. entries, I get errors about "No server address listed in .META. for region".

Which I guess would make sense if add_table failed to update the .META. table for regions that were missing .regioninfo files.
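
A minimal sketch of how the affected regions could be listed before attempting a rebuild. It assumes a recursive listing of the table directory has already been collected into a list of path strings, and that each region lives in its own directory directly under the table directory with a `.regioninfo` file at its top level. The function name and the example paths are hypothetical, not part of HBase.

```python
from collections import defaultdict


def regions_missing_regioninfo(paths, table_dir):
    """Return the sorted region directories under table_dir with no .regioninfo.

    paths is any iterable of absolute path strings from a recursive listing.
    Assumed layout (an assumption, not verified here):
    <table_dir>/<region>/.regioninfo
    """
    table_dir = table_dir.rstrip("/")
    prefix = table_dir + "/"
    has_info = defaultdict(bool)
    for path in paths:
        if not path.startswith(prefix):
            continue  # belongs to another table
        parts = path[len(prefix):].split("/")
        if not parts[0]:
            continue  # the table directory itself
        region_dir = prefix + parts[0]
        # Only a .regioninfo directly inside the region directory counts.
        has_info[region_dir] |= len(parts) == 2 and parts[1] == ".regioninfo"
    return sorted(d for d, ok in has_info.items() if not ok)
```

Regions returned by this check are the ones a .META. rebuild would have nothing to read from, so they are the ones worth inspecting by hand first.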

However, this means my table is basically broken, since I get errors whenever I'm running a task that reads from it.

Any thoughts on recovery? This table is essential, and I'm losing a lot of work if it's dead :(

Take care,
  -stu


      

Re: WARN add_table: Missing .regioninfo:.. No server address.. what to do?

Posted by Luke Forehand <lu...@networkedinsights.com>.
Hi Stu,

I'm also interested in knowing exactly what it means when .regioninfo files are
missing.  We experienced this problem recently (we believe due to the region
server with META crashing), but we were unsure when the .regioninfo files went
missing, or if they were simply never written.  

Our current belief is, when our region server holding the meta table crashed,
other regions were still splitting but unable to update META, so when META
eventually came back online, it held references to parent regions that were gone
due to being split into daughters, and also META was not aware of the new
daughter regions.  It seems a lot like this issue:
https://issues.apache.org/jira/browse/HBASE-869

We had to wipe the table and begin again... which is not a production solution.
We have been tossing around the idea of backing up the META table after a batch
of imports and attempting to recover that way if necessary, but it's not
ideal.  I believe the HMaster rewrite scheduled for 0.92 would address this
issue: https://issues.apache.org/jira/browse/HBASE-1816

-Luke


Re: WARN add_table: Missing .regioninfo:.. No server address.. what to do?

Posted by Ted Yu <yu...@gmail.com>.
Stuart:
If you have separated the domain-specific tweaks out of this layer, you can
definitely put your code on GitHub.

Cheers


Re: WARN add_table: Missing .regioninfo:.. No server address.. what to do?

Posted by Stuart Smith <st...@yahoo.com>.
Hey,

Awesome. Well, this is a research project for work, so I have to ask the powers that be if it's OK to publish the plumbing parts.

It's really just plumbing though, so from the techy perspective it's not the "interesting" part. So hopefully I can sell it as such (selling my work to the boss as not interesting.. hmm... ;) ).

We'll see. I'm not an expert Java coder either, but, hopefully I can get it up and stimulate something...

Take care,
  -stu

--- On Thu, 8/26/10, Stack <st...@duboce.net> wrote:

> From: Stack <st...@duboce.net>
> Subject: Re: WARN add_table: Missing .regioninfo:.. No server address.. what to do?
> To: user@hbase.apache.org
> Date: Thursday, August 26, 2010, 2:11 AM

Re: WARN add_table: Missing .regioninfo:.. No server address.. what to do?

Posted by Stack <st...@duboce.net>.
On Wed, Aug 25, 2010 at 11:22 AM, Stuart Smith <st...@yahoo.com> wrote:
> Just curious, though, (if it happens again) - assume the regions were invalid - I don't know, maybe it was halfway through splitting something and died - but say they're invalid.
>

(See if a failed MR task is associated with the bad region.  You could
also tgz the bad region and we can take a look at it for you.)

> Would the best thing to do in that case be a manual deletion of the hdfs directories containing the invalid regions? Would hbase handle that OK?
>

If it's a 'bad' region, it should be fine.  There'd be no holes in the
loaded table.  But if it's not...
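
A rough sketch of what checking for "holes" could look like, assuming the start and end keys of the surviving regions are already in hand as plain strings. It treats the start key as inclusive, the end key as exclusive, and an empty key as unbounded. The function name is hypothetical and this is not how HBase itself performs the check.

```python
def find_holes(regions):
    """Return (gap_start, gap_end) key ranges not covered by any region.

    regions is an iterable of (start_key, end_key) string pairs. An empty
    start key means "from the beginning"; an empty end key means "to the end".
    """
    holes = []
    covered_to = ""    # every key below this one is covered so far
    unbounded = False  # set once a region reaching the end of the table is seen
    for start, end in sorted(regions):
        if unbounded:
            break
        if start > covered_to:
            holes.append((covered_to, start))
        if end == "":
            unbounded = True
        elif end > covered_to:
            covered_to = end
    if not unbounded:
        holes.append((covered_to, ""))  # nothing covers the tail of the table
    return holes
```

If deleting a suspect region directory would leave this list empty for the remaining regions, the deleted region was redundant; if a gap appears, the region was carrying real key space.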

> And a side question that ties a lot of my issues together - I finally have a (somewhat) clean interface that moves the occasional too big file into hdfs, and stores everything else into hbase - I built this up as a layer in java with a metadata/filestore split in hbase (all file metadata is in hbase, files are directed to hbase/hdfs based on size).
>
> Is there another project that does this? It seems too handy to be the first time someone did this... Or does something like this always end up needing domain-specific tweaks & interfaces?
>

I haven't heard of a project like this (though as you say, you can't
be the first... maybe you are though?)

> Because once you have huge cells in hbase, it really seems to be unhappy. Especially when a good chunk of your tasks are done as M/R tasks or some layer on top of M/R.
>

Yeah, I'd imagine so.  The default configuration, at least, is set up for
cells of roughly 0-50k.  I'd imagine the settings would need to be adjusted
some if cells are MBs.

> Or would this be a good project to open-source? Or pointless to do so?
>

Do it on GitHub as Ted suggests.  It'll either flourish and then
you'll have to figure out how to support it, or it'll wither when you
move on (add it to supporting projects on the wiki so it's easier for
folks to find?)

> I guess in the long-run hbase could absorb these requirements with some tweaks of the file format, but I thought it could be nice to do this with a little library layer on top.
>

You are a good man Stu,
St.Ack



Re: WARN add_table: Missing .regioninfo:.. No server address.. what to do?

Posted by Stuart Smith <st...@yahoo.com>.
Hello St.Ack,

Hmm.. I actually just dropped everything in the database & rebuilt - along with some very much needed cleanups, improvements, code-refactoring, etc.

Just curious, though, (if it happens again) - assume the regions were invalid - I don't know, maybe it was halfway through splitting something and died - but say they're invalid. 

Would the best thing to do in that case be a manual deletion of the hdfs directories containing the invalid regions? Would hbase handle that OK?

And a side question that ties a lot of my issues together - I finally have a (somewhat) clean interface that moves the occasional too big file into hdfs, and stores everything else into hbase - I built this up as a layer in java with a metadata/filestore split in hbase (all file metadata is in hbase, files are directed to hbase/hdfs based on size).
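
A toy sketch of the size-based split described above, using in-memory dictionaries as stand-ins for the hbase tables and the hdfs directory. The class name, the `max_cell_bytes` parameter, and the metadata fields are all hypothetical; the real layer is written in Java and its interface is not shown in this thread.

```python
class SplitFileStore:
    """Route payloads to a 'cell store' or a 'file store' based on size."""

    def __init__(self, max_cell_bytes):
        self.max_cell_bytes = max_cell_bytes  # illustrative threshold only
        self.metadata = {}  # stand-in for the hbase metadata table
        self.cells = {}     # stand-in for payloads stored as hbase cells
        self.files = {}     # stand-in for payloads stored as hdfs files

    def put(self, name, data):
        """Store data under name and return the backend that received it."""
        # Drop any earlier copy so an overwrite never leaves a stale payload.
        self.cells.pop(name, None)
        self.files.pop(name, None)
        backend = "hbase" if len(data) <= self.max_cell_bytes else "hdfs"
        target = self.cells if backend == "hbase" else self.files
        target[name] = data
        # Metadata always lives in one place, whichever backend holds the bytes.
        self.metadata[name] = {"size": len(data), "backend": backend}
        return backend

    def get(self, name):
        """Look up the backend in the metadata, then fetch the payload."""
        backend = self.metadata[name]["backend"]
        source = self.cells if backend == "hbase" else self.files
        return source[name]
```

Keeping every metadata record in one table means readers never need to know where the bytes ended up; they ask the metadata first and follow the pointer.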

Is there another project that does this? It seems too handy to be the first time someone did this... Or does something like this always end up needing domain-specific tweaks & interfaces?

Because once you have huge cells in hbase, it really seems to be unhappy. Especially when a good chunk of your tasks are done as M/R tasks or some layer on top of M/R. 

Or would this be a good project to open-source? Or pointless to do so? 

I guess in the long-run hbase could absorb these requirements with some tweaks of the file format, but I thought it could be nice to do this with a little library layer on top.

Take care,
  -stu


--- On Mon, 8/23/10, Stack <st...@duboce.net> wrote:

> From: Stack <st...@duboce.net>
> Subject: Re: WARN add_table: Missing .regioninfo:.. No server address.. what to do?
> To: user@hbase.apache.org
> Date: Monday, August 23, 2010, 6:08 PM

Re: WARN add_table: Missing .regioninfo:.. No server address.. what to do?

Posted by Stack <st...@duboce.net>.
On Mon, Aug 23, 2010 at 1:35 PM, Stuart Smith <st...@yahoo.com> wrote:
>
> Hmm... AFAICT, if the .regioninfo file is gone from a region directory (and I looked on hdfs, and it is gone), the region is hosed.

Is it a legit region?  Is it wholesome-looking, with hfiles that make
sense (non-zero)?  My guess is that the regions are incomplete and
loadtable is not smart enough to recognize them as such.  If you grep
your master log for the region's encoded name, do you find anything?
Maybe this way you can figure out its provenance?
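
The two checks above can be sketched as follows, assuming the master log is available as an iterable of text lines and the store file sizes for a region have already been collected as integers. Both function names are hypothetical, and treating a region with no store files at all as suspect is an assumption of this sketch.

```python
def grep_region(log_lines, encoded_name):
    """Return the log lines that mention the region's encoded name."""
    return [line.rstrip("\n") for line in log_lines if encoded_name in line]


def looks_wholesome(hfile_sizes):
    """True if the region has at least one store file and none is zero-length."""
    sizes = list(hfile_sizes)
    return bool(sizes) and all(size > 0 for size in sizes)
```

An open file object can be passed directly as `log_lines`, so a large master log does not have to be read into memory first.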

St.Ack

Re: WARN add_table: Missing .regioninfo:.. No server address.. what to do?

Posted by Stuart Smith <st...@yahoo.com>.
Hmm... AFAICT, if the .regioninfo file is gone from a region directory (and I looked on hdfs, and it is gone), the region is hosed.

So I think it's time to rebuild the table. Sigh.

Any idea on why these disappeared? I was subjecting the system to a lot of load - an M/R task scanning a large table, and populating another table. There were some large cells crashing regionservers at a decent clip (trying to identify those large cells was part of the M/R task). So I was babysitting & restarting regionservers. 

After that, there were just some periodic writes over the weekend, but when I came back 4/10 regionservers had died. So a lot of random crashes going on..

Take care,
  -stu
