Posted to user@hbase.apache.org by Stuart Smith <st...@yahoo.com> on 2010/08/07 00:50:38 UTC

Re: Batch puts interrupted ... Requested row out of range for HRegion filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:

Hello Ryan,

  Yup. There's a hole, exactly where it should be.

I used add_table.rb once before, and am no expert on it.
All I have is a note written down:

To recover lost tables:
./hbase org.jruby.Main add_table.rb /hbase/filestore

Anything else I need to know? Do I just run the script like so?
Does anything need to be shut down before I do?

Thanks!

Take care,
  -stu


--- On Fri, 8/6/10, Ryan Rawson <ry...@gmail.com> wrote:

> From: Ryan Rawson <ry...@gmail.com>
> Subject: Re: Batch puts interrupted ... Requested row out of range for HRegion  filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:
> To: user@hbase.apache.org
> Date: Friday, August 6, 2010, 6:08 PM
> Hi,
> 
> When you run into this problem, it's usually a sign of a META problem,
> specifically you have a 'hole' in the META table.
> 
> The META table contains a series of keys like so:
> table,start_row1,<timestamp>    [data]
> table,start_row2,<timestamp>    [data]
> 
> etc
> 
> When we search for a region for a given row, we build a key like so:
> 'table,my_row,9*19' and do a search called 'closestRowBefore'.  This
> finds the region that contains this row.
> 
> Now notice that we only put the start row in the key. Each region
> has a start_row and an end_row, and all the regions are mutually
> exclusive and form complete coverage.  If the META row for a region
> were missing, we'd consistently find the wrong region and the
> regionserver would reject the request (correctly so).
> 
> That is what is probably happening here.  Check the table dump in the
> master web-ui and see if you can find a 'hole', where one region's
> end-key doesn't match up with the next region's start-key.
> 
> If that is the case, there is a script, add_table.rb, which is used to
> fix these things.
> 
> -ryan
> 
> On Fri, Aug 6, 2010 at 2:59 PM, Stuart Smith <st...@yahoo.com> wrote:
> > Hello,
> >
> >  I'm running hbase 0.20.5, and seeing Puts() fail repeatedly when
> > trying to insert a specific item into the database.
> >
> > Client side I see:
> >
> >
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=0, listsize=1, region=filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b,1279604506836 for region filestore,
> >
> > I then looked up which node was hosting the given region
> > (filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b)
> > on the gui, and found the following debug message in the
> > regionserver log:
> >
> > 2010-08-06 14:23:47,414 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Batch puts interrupted at index=0 because:Requested row out of range for HRegion filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b,1279604506836, startKey='bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b', getEndKey()='be0bc7b3f8bc2a30910b9c758b47cdb730a4691e93f92abb857a2dcc7aefa633', row='be1681910b02db5da061659c2cb08f501a135c2f065559a37a1761bf6e203d1d'
> >
> >
> > Which appears to be coming from:
> >
> > /regionserver/HRegionServer.java:1786:
> >      LOG.debug("Batch puts interrupted at index=" + i + " because:" +
> >
> > Which is coming from:
> >
> >
> > ./java/org/apache/hadoop/hbase/regionserver/HRegion.java:1658:
> >      throw new WrongRegionException("Requested row out of range for " +
> >
> > This happens repeatedly on a specific item over at least a day or
> > so, even when not much is happening with the cluster.
> >
> > As far as I can tell, it looks like the logic to select the correct
> > region for a given row is wrong. The row is indeed not in the
> > correct range (at least from what I can tell of the exception
> > thrown), and the check in HRegion.java:1658:
> >
> >  /** Make sure this is a valid row for the HRegion */
> >  private void checkRow(final byte [] row) throws IOException {
> >    if(!rowIsInRange(regionInfo, row)) {
> >
> > Is correctly rejecting the Put().
> >
> > So it appears the error would be somewhere in:
> > HRegion.java:1550:
> >  private void put(final Map<byte [],List<KeyValue>> familyMap,
> >      boolean writeToWAL) throws IOException {
> >
> > Which appears to be the actual guts of the insert operation.
> > However, I don't know enough about the design of HRegions to really
> > decipher this method. I'll dig into it more, but I thought it might
> > be more efficient just to ask you guys first.
> >
> > Any ideas?
> >
> > I can update to 0.20.6, but I don't see any fixed JIRAs on 0.20.6
> > that seem related. I could be wrong. I'm not sure what I should do
> > next. Any more information you guys need?
> >
> > Note that I am inserting files into the database, and using each
> > file's sha256sum as the key. And the file that is failing does
> > indeed have a sha that corresponds to the key in the message above
> > (and is out of range).
> >
> > Take care,
> >  -stu
> >
> >
> >
> >
> >
> >
> 
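The lookup Ryan describes in the quoted message can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not HBase code: a TreeMap keyed by start row stands in for META, floorEntry stands in for 'closestRowBefore', the class and method names are made up for the example, the keys are shortened prefixes of the ones in the regionserver log, and the "bf00" boundary is invented.

```java
import java.util.Map;
import java.util.TreeMap;

public class MetaLookupSketch {
    /** Hypothetical stand-in for one META entry: a region's key range. */
    static final class Region {
        final String startKey;
        final String endKey; // empty string = open-ended last region (assumption of this sketch)

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        /** Same idea as the range check on the regionserver: start <= row < end. */
        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    /** Index regions by start key only, mirroring how META rows are keyed. */
    static TreeMap<String, Region> index(Region... regions) {
        TreeMap<String, Region> meta = new TreeMap<>();
        for (Region r : regions) {
            meta.put(r.startKey, r);
        }
        return meta;
    }

    /** floorEntry plays the part of closestRowBefore: nearest start key <= row. */
    static Region locate(TreeMap<String, Region> meta, String row) {
        Map.Entry<String, Region> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        TreeMap<String, Region> meta = index(
            new Region("", "bdfa"),
            new Region("bdfa", "be0b"),
            new Region("be0b", "bf00"),
            new Region("bf00", ""));
        String row = "be16";

        Region hit = locate(meta, row);
        System.out.println("complete META: start=" + hit.startKey
            + ", in range=" + hit.contains(row));

        meta.remove("be0b"); // punch a hole: the entry for [be0b, bf00) is gone
        Region wrong = locate(meta, row);
        System.out.println("with hole:     start=" + wrong.startKey
            + ", in range=" + wrong.contains(row));
    }
}
```

With the entry for the "be0b" region removed, the lookup falls back to the region starting at "bdfa", whose end key is below the requested row. That is the same shape as the startKey, getEndKey() and row values in the WrongRegionException message.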


      

Re: Batch puts interrupted ... Requested row out of range for HRegion filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:

Posted by Stuart Smith <st...@yahoo.com>.
Just to follow up - I ran add_table as I had done when I lost a table before - and it fixed the error.

Thanks!

Take care,
  -stu
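
For anyone who wants to check for a hole without reading the table dump by eye, the rule Ryan gave (each region's end key should match the next region's start key) can be sketched as below. It is a minimal sketch, assuming the regions have already been listed and sorted by start key; the Region class, the findHoles name and the shortened keys are all hypothetical and do not come from HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RegionHoleCheck {
    /** Hypothetical holder for the start and end key of one region. */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /**
     * Walks regions sorted by start key and reports every place where a
     * region's end key differs from the next region's start key.
     */
    static List<String> findHoles(List<Region> sortedByStartKey) {
        List<String> holes = new ArrayList<>();
        for (int i = 0; i + 1 < sortedByStartKey.size(); i++) {
            Region cur = sortedByStartKey.get(i);
            Region next = sortedByStartKey.get(i + 1);
            if (!cur.endKey.equals(next.startKey)) {
                holes.add(cur.endKey + " .. " + next.startKey);
            }
        }
        return holes;
    }

    public static void main(String[] args) {
        // Shortened, made-up keys; the region for [be0b, bf00) is missing.
        List<Region> regions = Arrays.asList(
            new Region("", "bdfa"),
            new Region("bdfa", "be0b"),
            new Region("bf00", ""));
        for (String hole : findHoles(regions)) {
            System.out.println("hole between " + hole);
        }
    }
}
```

An empty result means the listed regions form a contiguous chain. It says nothing about whether the first and last regions are open-ended, which would need a separate check.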
