Posted to user@hbase.apache.org by charan kumar <ch...@gmail.com> on 2011/01/26 04:32:50 UTC

NotServingRegionException

Hi,

  MapReduce tasks are failing with the following exception.  A major
compaction was running on the region server around the same time.

  The number of retries is not customized, so it is the default of 10. But
the task appears to fail the first time it gets this exception. Any suggestions?

   org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: NotServingRegionException: 1 time, servers with issues: XXXXXXX:60020,
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1220)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
    at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:126)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:81)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at com.ask.af.segscan.SegmentScanner$WebTableReducer.reduce(SegmentScanner.java:284)
    at com.ask.af.segscan.SegmentScanner$WebTableReducer.reduce(SegmentScanner.java:91)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

Thanks,
Charan

Re: NotServingRegionException

Posted by charan kumar <ch...@gmail.com>.
Ryan,

   All regions seem to be online, but hbck reports the state as inconsistent:

 Region webtable,XXXXXXXXXXXXXXXX,1295614226878.00465edbdfd73ad89de4a7cd6c0dc4ff.
is listed in META on region server <HOSTX>:60020
but is multiply assigned to region servers <HOSTX>:60020, <HOSTX>:60020

It shows the region as assigned twice to the same region server. I am hoping
this is related to the reverse DNS requirement, as I am running into
intermittent reverse DNS lookup issues.

   I don't see anything interesting related to this in the master log. The
following is the complete log from the region server around that time frame.

2011-01-25 19:05:59,717 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Starting compaction on region webtable,XXXXXXXXXXXXXXXXXXXXXXX., size=739.7k
2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Major compaction triggered on store c; time since last major compaction
104139818msfc25d432e079d45/qa/2628506464708923281, keycount=3597,
bloomtype=NONE, size=84.9k
2011-01-25 19:05:59,719 INFO org.apache.hadoop.hbase.regionserver.Store:
Started compaction of 4 file(s) in cf=c  into
hdfs://XX.XX.XX:8020/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/.tmp,
seqid=9186434, totalSize=275.0m
2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Compacting
hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/c/9205102805597630312,
keycount=42464, bloomtype=NONE, size=253.9m size=824.1k; total size for
store is 824.1k
2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Compacting
hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/c/6125196014617064476,
keycount=444, bloomtype=NONE, size=2.7m
2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Compacting
hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/c/601650198232659486,
keycount=2630, bloomtype=NONE, size=15.7m
2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Compacting
hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/c/837285093771597419,
keycount=456, bloomtype=NONE, size=2.7m
************************ for almost 1 minute the region server logs nothing
2011-01-25 19:06:53,693 INFO org.apache.hadoop.hbase.regionserver.Store:
Completed major compaction of 4 file(s), new
file=hdfs://XX.XX.XX:8020/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/c/3351637132834457815,
size=274.9m; total size for store is 274.9m
2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Major compaction triggered on store mtdt; time since last major compaction
104108312msed6b73705d16/c/9205102805597630312, keycount=42464,
bloomtype=NONE, size=253.9m
2011-01-25 19:06:53,695 INFO org.apache.hadoop.hbase.regionserver.Store:
Started compaction of 4 file(s) in cf=mtdt  into
hdfs://XX.XX.XX:8020/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/.tmp,
seqid=9186434, totalSize=1.3m
2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Compacting
hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/mtdt/1448570682542813090,
keycount=21232, bloomtype=NONE, size=1.2m
2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Compacting
hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/mtdt/9176424692900683560,
keycount=222, bloomtype=NONE, size=13.7ksize=274.9m; total size for store is
274.9m
2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Compacting
hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/mtdt/2626470500007036267,
keycount=1315, bloomtype=NONE, size=77.9klSize=1.3m
2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Compacting
hdfs://XX.XX.XX/hbase/webtable/d8c98179e0d6257fe0e0ed6b73705d16/mtdt/6494685694636691529,
keycount=228, bloomtype=NONE, size=13.8k
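
The quiet period in the log above can be measured from the timestamps rather than estimated by eye. Below is a minimal Python sketch, assuming every log entry starts with the "yyyy-MM-dd HH:mm:ss,SSS" prefix seen in these lines. The helper name largest_gap is hypothetical and not part of any HBase tool.

```python
import re
from datetime import datetime

# Assumed timestamp prefix, matching the region server lines quoted above.
TS_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def largest_gap(lines):
    """Return (seconds, start, end) for the longest silence between timestamped lines."""
    stamps = []
    for line in lines:
        match = TS_PATTERN.match(line)
        if match:  # continuation lines without a timestamp are skipped
            stamps.append(datetime.strptime(match.group(1), TS_FORMAT))
    best = (0.0, None, None)
    for earlier, later in zip(stamps, stamps[1:]):
        gap = (later - earlier).total_seconds()
        if gap > best[0]:
            best = (gap, earlier, later)
    return best


# Timestamped lines taken from the log above (message text shortened).
sample = [
    "2011-01-25 19:05:59,717 INFO org.apache.hadoop.hbase.regionserver.HRegion:",
    "Starting compaction on region webtable,XXXXXXXXXXXXXXXXXXXXXXX., size=739.7k",
    "2011-01-25 19:05:59,719 DEBUG org.apache.hadoop.hbase.regionserver.Store:",
    "2011-01-25 19:06:53,693 INFO org.apache.hadoop.hbase.regionserver.Store:",
    "2011-01-25 19:06:53,695 DEBUG org.apache.hadoop.hbase.regionserver.Store:",
]
seconds, start, end = largest_gap(sample)
print(seconds)  # prints 53.974
```

On these lines the longest silence is about 54 seconds, between 19:05:59,719 and 19:06:53,693, which is the window of the first major compaction.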



On Tue, Jan 25, 2011 at 9:03 PM, Ryan Rawson <ry...@gmail.com> wrote:

> The exception text:
>
> Failed 1 action: NotServingRegionException: 1 time, servers with issues:
> XXXXXXX:60020,
>
> is a summary of potentially dozens if not hundreds of
> exceptions, and '1 time' means the NSRE exception appeared only once in it.
> The client did try multiple times.
>
> Are you sure every region is online?  Try hbck?
>
> -ryan

Re: NotServingRegionException

Posted by Ryan Rawson <ry...@gmail.com>.
The exception text:

Failed 1 action: NotServingRegionException: 1 time, servers with issues:
XXXXXXX:60020,

is a summary of potentially dozens if not hundreds of
exceptions, and '1 time' means the NSRE exception appeared only once in it.
It is not a retry count: the client did try multiple times.
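
To illustrate why '1 time' is not a retry count: a summary like this is built after retries are exhausted, by counting the failed actions per exception type. A minimal Python sketch of that kind of aggregation follows. It is an illustration only, not HBase's actual code, and summarize_failures and its input shape are hypothetical.

```python
from collections import Counter


def summarize_failures(failures):
    """Build a one-line summary from (exception_name, server) pairs.

    Each pair stands for one action that still failed after all retries;
    the number of attempts made per action is not part of the input.
    """
    counts = Counter(name for name, _ in failures)
    servers = sorted({server for _, server in failures})
    parts = [
        f"{name}: {n} time{'s' if n != 1 else ''}"
        for name, n in sorted(counts.items())
    ]
    total = len(failures)
    return (
        f"Failed {total} action{'s' if total != 1 else ''}: "
        + ", ".join(parts)
        + ", servers with issues: "
        + ", ".join(servers)
        + ","
    )


# One failed Put, however many times it was retried, yields "1 time".
print(summarize_failures([("NotServingRegionException", "XXXXXXX:60020")]))
```

With a single failed action the sketch reproduces the text quoted above, whether that action was attempted once or ten times.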

Are you sure every region is online?  Try hbck?

-ryan

On Tue, Jan 25, 2011 at 8:51 PM, Charan K <ch...@gmail.com> wrote:
> Hi Ryan,
>
>  The table is online, since other MapReduce tasks continue to run without failing.
>
>  There was a major compaction running on the region server, which took
> almost a minute. I am assuming one minute because there was no log entry for
> one minute before the compaction completed.
>
>  And from the exception it looks like the client tried only once, because
> it says "1 time".
>
> Thanks
> Charan
>
> Sent from my iPhone

Re: NotServingRegionException

Posted by Charan K <ch...@gmail.com>.
Hi Ryan,

  The table is online, since other MapReduce tasks continue to run without failing.

  There was a major compaction running on the region server, which took
almost a minute. I am assuming one minute because there was no log entry for
one minute before the compaction completed.

  And from the exception it looks like the client tried only once, because
it says "1 time".

Thanks
Charan 

Sent from my iPhone

On Jan 25, 2011, at 7:42 PM, Ryan Rawson <ry...@gmail.com> wrote:

> The problem is that the client was talking to the given regionserver, and
> that regionserver kept rejecting the requests with NSRE.  Are you sure
> your table is online?  Are all regions online?  Anything interesting
> in the master log?
> 
> -ryan

Re: NotServingRegionException

Posted by Ryan Rawson <ry...@gmail.com>.
The problem is that the client was talking to the given regionserver, and
that regionserver kept rejecting the requests with NSRE.  Are you sure
your table is online?  Are all regions online?  Anything interesting
in the master log?

-ryan

On Tue, Jan 25, 2011 at 7:32 PM, charan kumar <ch...@gmail.com> wrote:
> Hi,
>
>  MapReduce tasks are failing with the following exception.  A major
> compaction was running on the region server around the same time.
>
>  The number of retries is not customized, so it is the default of 10. But
> the task appears to fail the first time it gets this exception. Any suggestions?
>
>   org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: NotServingRegionException: 1 time, servers with issues: XXXXXXX:60020,
>    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1220)
>    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
>    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
>    at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
>    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
>    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:126)
>    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:81)
>    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
>    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>    at com.ask.af.segscan.SegmentScanner$WebTableReducer.reduce(SegmentScanner.java:284)
>    at com.ask.af.segscan.SegmentScanner$WebTableReducer.reduce(SegmentScanner.java:91)
>    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>    at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
> Thanks,
> Charan
>