You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Mahesh Garg <m_...@hotmail.com> on 2014/03/25 22:52:14 UTC

HBase bulk load failing with RegionTooBusyException

Hi,

We are trying to bulk load into HBase. But, it it failing. We are using a very standard process:

      1. Create HFiles using an MR job.
      2. Call "completebulkload" to move HFiles into regions.

The table is pre-split. We are using HBase version "0.94.15-cdh4.6.0".

The load process runs in a loop and, about 10 million rows are loaded in each attempt. However, the process fails in
"completebulkload" with RegionTooBusyException. This error happen at different times after loading about 120 M rows.

Any suggestion on how to avoid/debug this problem? Any parameters to tune?
The stack track is listed below.

Thanks for your help!
Garg


14/03/25 03:10:39 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://XXX.XXX.XXX.XXX:8020/user/userName/outputdir/cf2/e2680aee969a47ae97c6980bbd76ea66 first=\x06\x00\x00\x00\x00\x00\x00\x00\x07_0_\x00\x00\x01D\xF73\xB7\x1F_\x00\x00\x00\x01 last=\x06\x00\x00\x00\x00\x00w[B_25_\x00\x00\x01D\xF74\xD7\xAC_\x00\x00\x00\x01
14/03/25 03:21:50 ERROR mapreduce.LoadIncrementalHFiles: Encountered unrecoverable error from region server
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Tue Mar 25 03:11:39 UTC 2014, org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3@29324479, org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 60000 ms. regionName=table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f., server=XXX.XXX.XXX.XXX,60020,1395216882104
        at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:5898)
        at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:5884)
        at org.apache.hadoop.hbase.regionserver.HRegion.startBulkRegionOperation(HRegion.java:5843)
        at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3640)
        at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3623)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFiles(HRegionServer.java:2978)
        at sun.reflect.GeneratedMethodAccessor61.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRp

 		 	   		  

RE: HBase bulk load failing with RegionTooBusyException

Posted by Mahesh Garg <m_...@hotmail.com>.
Our rows are evenly distributed among all the initial number of regions.
The row key is prefixed with a unique identifier for each split. However,
that changes are some regions are split.

I will try with a smaller region size.
Thanks!


> Date: Tue, 25 Mar 2014 16:15:59 -0700
> Subject: Re: HBase bulk load failing with RegionTooBusyException
> From: yuzhihong@gmail.com
> To: user@hbase.apache.org
> 
> I am not familiar with your schema so I don't know if some subset of
> regions would receive above average writes.
> 
> Please consider splitting large regions so that the impact from compaction
> doesn't affect bulk loading.
> 
> Cheers
> 
> 
> On Tue, Mar 25, 2014 at 4:04 PM, Mahesh Garg <m_...@hotmail.com> wrote:
> 
> > Yes. It took 132 seconds (from 03:11:26 -> 03:13:12). I also do see some
> > warnings saying "operationTooSlow".
> > Also, please note that the "major compaction" has been disabled.
> > Here is the log. (Please excuse for length of the log. But, I hope it will
> > be helpful.)
> >
> > Thanks again.
> > Garg
> >
> >
> > --------------------------------------------------------------------------------------------------------------------------
> > 2014-03-25 03:11:26,650 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > Starting compaction on cf2 in region
> > table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de.
> > 2014-03-25 03:11:26,651 INFO org.apache.hadoop.hbase.regionserver.Store:
> > Starting compaction of 10 file(s) in cf2 of
> > table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de.
> > into
> > tmpdir=hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp,
> > seqid=0, totalSize=327.5 M
> > 2014-03-25 03:11:26,668 INFO
> > org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter
> > type for
> > hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp/5951be6151474f369126cd38fc4b8f60:
> > CompoundBloomFilterWriter
> > 2014-03-25 03:11:41,029 WARN org.apache.hadoop.ipc.HBaseServer:
> > (operationTooSlow):
> > {"processingtimems":25293,"client":"XX.XX.XX.XX:54862","timeRange":[0,9223372036854775807],"starttimems":1395717075733,"responsesize":0,"class":"HRegionServer","cacheBlocks":false,"table":"table50","families":{"cf2":["ALL"],"cf":["ALL"]},"batch":1,"startRow":"\\x05\\x80\\x00\\x00\\x00\\x00\\x1E\\x84\\x80\\xDF\\x00\\x00\\x01D\\xFD\\xFD\\x06\\x7F^\\xFF\\xFF\\xFF\\xFD","queuetimems":0,"stopRow":"\\x06j\\xAA\\xAA\\xAA\\xAA\\xCEE@
> > \\xC9\\xAA\\xAA\\xAB\\xEF\\xA8\\xA7\\xB1*\\x09\\xAA\\xAA\\xAA\\xA7","method":"openScanner","totalColumns":2,"maxVersions":1,"filter":"KeyOnlyFilter","caching":1}
> > 2014-03-25 03:11:41,030 WARN org.apache.hadoop.ipc.HBaseServer:
> > (operationTooSlow):
> > {"processingtimems":56342,"client":"XX.XX.XX.XX:54577","timeRange":[0,9223372036854775807],"starttimems":1395717044685,"responsesize":0,"class":"HRegionServer","cacheBlocks":false,"table":"table50","families":{"cf2":["ALL"],"cf":["ALL"]},"batch":1,"startRow":"\\x05\\x80\\x00\\x00\\x00\\x00\\x1E\\x84\\x80\\xDF\\x00\\x00\\x01D\\xFD\\xFD\\x06\\x7F^\\xFF\\xFF\\xFF\\xFD","queuetimems":0,"stopRow":"\\x06j\\xAA\\xAA\\xAA\\xAA\\xCEE@
> > \\xC9\\xAA\\xAA\\xAB\\xEF\\xA8\\xA7\\xB1*\\x09\\xAA\\xAA\\xAA\\xA7","method":"openScanner","totalColumns":2,"maxVersions":1,"filter":"KeyOnlyFilter","caching":1}
> > 2014-03-25 03:12:12,085 INFO
> > org.apache.hadoop.hbase.regionserver.HRegionServer: Client tried to access
> > missing scanner 7468301718262159439 (no lease)
> > 2014-03-25 03:12:12,086 ERROR
> > org.apache.hadoop.hbase.regionserver.HRegionServer:
> > org.apache.hadoop.hbase.UnknownScannerException: No lease for
> > 7468301718262159439: lease '7468301718262159439' does not exist
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegionServer.internalNext(HRegionServer.java:2640)
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2609)
> >         ......
> > 2014-03-25 03:12:12,088 ERROR
> > org.apache.hadoop.hbase.regionserver.HRegionServer:
> > org.apache.hadoop.hbase.regionserver.LeaseException: lease
> > '7468301718262159439' does not exist
> >         at
> > org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:231)
> >         at
> > org.apache.hadoop.hbase.regionserver.Leases.cancelLease(Leases.java:214)
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegionServer.internalCloseScanner(HRegionServer.java:2749)
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegionServer.close(HRegionServer.java:2723)
> >         ......
> > 2014-03-25 03:12:40,981 ERROR
> > org.apache.hadoop.hbase.regionserver.HRegionServer:
> > org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting on
> > region
> > table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f.,
> > call next(7468301718262159439, 1), rpc version=1, client version=29,
> > methodsFingerPrint=-1105746420 from XX.XX.XX.XX:54928 after 59937 ms, since
> > caller disconnected
> >         at
> > org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:437)
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3985)
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3891)
> >         ......
> > 2014-03-25 03:12:41,027 INFO
> > org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
> > 9121928292667158981 lease expired on region
> > table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f.
> > 2014-03-25 03:13:12,561 INFO
> > org.apache.hadoop.hbase.regionserver.StoreFile: NO General Bloom and NO
> > DeleteFamily was added to HFile
> > (hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp/5951be6151474f369126cd38fc4b8f60)
> > 2014-03-25 03:13:12,608 INFO org.apache.hadoop.hbase.regionserver.Store:
> > Renaming compacted file at
> > hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp/5951be6151474f369126cd38fc4b8f60
> > to
> > hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/cf2/5951be6151474f369126cd38fc4b8f60
> > 2014-03-25 03:13:12,696 INFO org.apache.hadoop.hbase.regionserver.Store:
> > Completed compaction of 10 file(s) in cf2 of
> > table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de.
> > into 5951be6151474f369126cd38fc4b8f60, size=112.3 M; total size for store
> > is 669.1 M
> > 2014-03-25 03:13:12,696 INFO
> > org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest:
> > completed compaction:
> > regionName=table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de.,
> > storeName=cf2, fileCount=10, fileSize=327.5 M, priority=-19,
> > time=1134091055662452; duration=1mins, 46sec
> >
> >
> > --------------------------------------------------------------------------------------------------------------------------------
> >
> > > Date: Tue, 25 Mar 2014 15:03:16 -0700
> > > Subject: Re: HBase bulk load failing with RegionTooBusyException
> > > From: yuzhihong@gmail.com
> > > To: user@hbase.apache.org
> > >
> > > Can you check region server log to see if there was concurrent compaction
> > > going on for table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\
> > > xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.
> > > 5a772656accd69dd0354a09fc0c4415f. ?
> > >
> > > If so, how long did the compaction take ?
> > >
> > > Cheers
> > >
> > >
> > > On Tue, Mar 25, 2014 at 2:52 PM, Mahesh Garg <m_...@hotmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > We are trying to bulk load into HBase. But, it it failing. We are
> > using a
> > > > very standard process:
> > > >
> > > >       1. Create HFiles using an MR job.
> > > >       2. Call "completebulkload" to move HFiles into regions.
> > > >
> > > > The table is pre-split. We are using HBase version "0.94.15-cdh4.6.0".
> > > >
> > > > The load process runs in a loop and, about 10 million rows are loaded
> > in
> > > > each attempt. However, the process fails in
> > > > "completebulkload" with RegionTooBusyException. This error happen at
> > > > different times after loading about 120 M rows.
> > > >
> > > > Any suggestion on how to avoid/debug this problem? Any parameters to
> > tune?
> > > > The stack track is listed below.
> > > >
> > > > Thanks for your help!
> > > > Garg
> > > >
> > > >
> > > > 14/03/25 03:10:39 INFO mapreduce.LoadIncrementalHFiles: Trying to load
> > > >
> > hfile=hdfs://XXX.XXX.XXX.XXX:8020/user/userName/outputdir/cf2/e2680aee969a47ae97c6980bbd76ea66
> > > >
> > first=\x06\x00\x00\x00\x00\x00\x00\x00\x07_0_\x00\x00\x01D\xF73\xB7\x1F_\x00\x00\x00\x01
> > > >
> > last=\x06\x00\x00\x00\x00\x00w[B_25_\x00\x00\x01D\xF74\xD7\xAC_\x00\x00\x00\x01
> > > > 14/03/25 03:21:50 ERROR mapreduce.LoadIncrementalHFiles: Encountered
> > > > unrecoverable error from region server
> > > > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
> > > > attempts=10, exceptions:
> > > > Tue Mar 25 03:11:39 UTC 2014,
> > > > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3@29324479,
> > > > org.apache.hadoop.hbase.RegionTooBusyException:
> > > > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in
> > > > 60000 ms.
> > > >
> > regionName=table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f.,
> > > > server=XXX.XXX.XXX.XXX,60020,1395216882104
> > > >         at
> > > > org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:5898)
> > > >         at
> > > > org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:5884)
> > > >         at
> > > >
> > org.apache.hadoop.hbase.regionserver.HRegion.startBulkRegionOperation(HRegion.java:5843)
> > > >         at
> > > >
> > org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3640)
> > > >         at
> > > >
> > org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3623)
> > > >         at
> > > >
> > org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFiles(HRegionServer.java:2978)
> > > >         at sun.reflect.GeneratedMethodAccessor61.invoke(Unknown Source)
> > > >         at
> > > >
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >         at java.lang.reflect.Method.invoke(Method.java:597)
> > > >         at
> > > > org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRp
> > > >
> > > >
> >
 		 	   		  

Re: HBase bulk load failing with RegionTooBusyException

Posted by Ted Yu <yu...@gmail.com>.
I am not familiar with your schema so I don't know if some subset of
regions would receive above average writes.

Please consider splitting large regions so that the impact from compaction
doesn't affect bulk loading.

Cheers


On Tue, Mar 25, 2014 at 4:04 PM, Mahesh Garg <m_...@hotmail.com> wrote:

> Yes. It took 132 seconds (from 03:11:26 -> 03:13:12). I also do see some
> warnings saying "operationTooSlow".
> Also, please note that the "major compaction" has been disabled.
> Here is the log. (Please excuse for length of the log. But, I hope it will
> be helpful.)
>
> Thanks again.
> Garg
>
>
> --------------------------------------------------------------------------------------------------------------------------
> 2014-03-25 03:11:26,650 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Starting compaction on cf2 in region
> table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de.
> 2014-03-25 03:11:26,651 INFO org.apache.hadoop.hbase.regionserver.Store:
> Starting compaction of 10 file(s) in cf2 of
> table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de.
> into
> tmpdir=hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp,
> seqid=0, totalSize=327.5 M
> 2014-03-25 03:11:26,668 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter
> type for
> hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp/5951be6151474f369126cd38fc4b8f60:
> CompoundBloomFilterWriter
> 2014-03-25 03:11:41,029 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":25293,"client":"XX.XX.XX.XX:54862","timeRange":[0,9223372036854775807],"starttimems":1395717075733,"responsesize":0,"class":"HRegionServer","cacheBlocks":false,"table":"table50","families":{"cf2":["ALL"],"cf":["ALL"]},"batch":1,"startRow":"\\x05\\x80\\x00\\x00\\x00\\x00\\x1E\\x84\\x80\\xDF\\x00\\x00\\x01D\\xFD\\xFD\\x06\\x7F^\\xFF\\xFF\\xFF\\xFD","queuetimems":0,"stopRow":"\\x06j\\xAA\\xAA\\xAA\\xAA\\xCEE@
> \\xC9\\xAA\\xAA\\xAB\\xEF\\xA8\\xA7\\xB1*\\x09\\xAA\\xAA\\xAA\\xA7","method":"openScanner","totalColumns":2,"maxVersions":1,"filter":"KeyOnlyFilter","caching":1}
> 2014-03-25 03:11:41,030 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":56342,"client":"XX.XX.XX.XX:54577","timeRange":[0,9223372036854775807],"starttimems":1395717044685,"responsesize":0,"class":"HRegionServer","cacheBlocks":false,"table":"table50","families":{"cf2":["ALL"],"cf":["ALL"]},"batch":1,"startRow":"\\x05\\x80\\x00\\x00\\x00\\x00\\x1E\\x84\\x80\\xDF\\x00\\x00\\x01D\\xFD\\xFD\\x06\\x7F^\\xFF\\xFF\\xFF\\xFD","queuetimems":0,"stopRow":"\\x06j\\xAA\\xAA\\xAA\\xAA\\xCEE@
> \\xC9\\xAA\\xAA\\xAB\\xEF\\xA8\\xA7\\xB1*\\x09\\xAA\\xAA\\xAA\\xA7","method":"openScanner","totalColumns":2,"maxVersions":1,"filter":"KeyOnlyFilter","caching":1}
> 2014-03-25 03:12:12,085 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Client tried to access
> missing scanner 7468301718262159439 (no lease)
> 2014-03-25 03:12:12,086 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: No lease for
> 7468301718262159439: lease '7468301718262159439' does not exist
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.internalNext(HRegionServer.java:2640)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2609)
>         ......
> 2014-03-25 03:12:12,088 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.regionserver.LeaseException: lease
> '7468301718262159439' does not exist
>         at
> org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:231)
>         at
> org.apache.hadoop.hbase.regionserver.Leases.cancelLease(Leases.java:214)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.internalCloseScanner(HRegionServer.java:2749)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.close(HRegionServer.java:2723)
>         ......
> 2014-03-25 03:12:40,981 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting on
> region
> table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f.,
> call next(7468301718262159439, 1), rpc version=1, client version=29,
> methodsFingerPrint=-1105746420 from XX.XX.XX.XX:54928 after 59937 ms, since
> caller disconnected
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:437)
>         at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3985)
>         at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3891)
>         ......
> 2014-03-25 03:12:41,027 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
> 9121928292667158981 lease expired on region
> table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f.
> 2014-03-25 03:13:12,561 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile: NO General Bloom and NO
> DeleteFamily was added to HFile
> (hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp/5951be6151474f369126cd38fc4b8f60)
> 2014-03-25 03:13:12,608 INFO org.apache.hadoop.hbase.regionserver.Store:
> Renaming compacted file at
> hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp/5951be6151474f369126cd38fc4b8f60
> to
> hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/cf2/5951be6151474f369126cd38fc4b8f60
> 2014-03-25 03:13:12,696 INFO org.apache.hadoop.hbase.regionserver.Store:
> Completed compaction of 10 file(s) in cf2 of
> table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de.
> into 5951be6151474f369126cd38fc4b8f60, size=112.3 M; total size for store
> is 669.1 M
> 2014-03-25 03:13:12,696 INFO
> org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest:
> completed compaction:
> regionName=table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de.,
> storeName=cf2, fileCount=10, fileSize=327.5 M, priority=-19,
> time=1134091055662452; duration=1mins, 46sec
>
>
> --------------------------------------------------------------------------------------------------------------------------------
>
> > Date: Tue, 25 Mar 2014 15:03:16 -0700
> > Subject: Re: HBase bulk load failing with RegionTooBusyException
> > From: yuzhihong@gmail.com
> > To: user@hbase.apache.org
> >
> > Can you check region server log to see if there was concurrent compaction
> > going on for table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\
> > xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.
> > 5a772656accd69dd0354a09fc0c4415f. ?
> >
> > If so, how long did the compaction take ?
> >
> > Cheers
> >
> >
> > On Tue, Mar 25, 2014 at 2:52 PM, Mahesh Garg <m_...@hotmail.com> wrote:
> >
> > > Hi,
> > >
> > > We are trying to bulk load into HBase. But, it it failing. We are
> using a
> > > very standard process:
> > >
> > >       1. Create HFiles using an MR job.
> > >       2. Call "completebulkload" to move HFiles into regions.
> > >
> > > The table is pre-split. We are using HBase version "0.94.15-cdh4.6.0".
> > >
> > > The load process runs in a loop and, about 10 million rows are loaded
> in
> > > each attempt. However, the process fails in
> > > "completebulkload" with RegionTooBusyException. This error happen at
> > > different times after loading about 120 M rows.
> > >
> > > Any suggestion on how to avoid/debug this problem? Any parameters to
> tune?
> > > The stack track is listed below.
> > >
> > > Thanks for your help!
> > > Garg
> > >
> > >
> > > 14/03/25 03:10:39 INFO mapreduce.LoadIncrementalHFiles: Trying to load
> > >
> hfile=hdfs://XXX.XXX.XXX.XXX:8020/user/userName/outputdir/cf2/e2680aee969a47ae97c6980bbd76ea66
> > >
> first=\x06\x00\x00\x00\x00\x00\x00\x00\x07_0_\x00\x00\x01D\xF73\xB7\x1F_\x00\x00\x00\x01
> > >
> last=\x06\x00\x00\x00\x00\x00w[B_25_\x00\x00\x01D\xF74\xD7\xAC_\x00\x00\x00\x01
> > > 14/03/25 03:21:50 ERROR mapreduce.LoadIncrementalHFiles: Encountered
> > > unrecoverable error from region server
> > > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
> > > attempts=10, exceptions:
> > > Tue Mar 25 03:11:39 UTC 2014,
> > > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3@29324479,
> > > org.apache.hadoop.hbase.RegionTooBusyException:
> > > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in
> > > 60000 ms.
> > >
> regionName=table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f.,
> > > server=XXX.XXX.XXX.XXX,60020,1395216882104
> > >         at
> > > org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:5898)
> > >         at
> > > org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:5884)
> > >         at
> > >
> org.apache.hadoop.hbase.regionserver.HRegion.startBulkRegionOperation(HRegion.java:5843)
> > >         at
> > >
> org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3640)
> > >         at
> > >
> org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3623)
> > >         at
> > >
> org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFiles(HRegionServer.java:2978)
> > >         at sun.reflect.GeneratedMethodAccessor61.invoke(Unknown Source)
> > >         at
> > >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >         at java.lang.reflect.Method.invoke(Method.java:597)
> > >         at
> > > org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRp
> > >
> > >
>

RE: HBase bulk load failing with RegionTooBusyException

Posted by Mahesh Garg <m_...@hotmail.com>.
Yes. It took 132 seconds (from 03:11:26 -> 03:13:12). I also do see some warnings saying "operationTooSlow". 
Also, please note that the "major compaction" has been disabled.
Here is the log. (Please excuse for length of the log. But, I hope it will be helpful.)

Thanks again.
Garg

--------------------------------------------------------------------------------------------------------------------------
2014-03-25 03:11:26,650 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on cf2 in region table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de.
2014-03-25 03:11:26,651 INFO org.apache.hadoop.hbase.regionserver.Store: Starting compaction of 10 file(s) in cf2 of table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de. into tmpdir=hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp, seqid=0, totalSize=327.5 M
2014-03-25 03:11:26,668 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter type for hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp/5951be6151474f369126cd38fc4b8f60: CompoundBloomFilterWriter
2014-03-25 03:11:41,029 WARN org.apache.hadoop.ipc.HBaseServer: (operationTooSlow): {"processingtimems":25293,"client":"XX.XX.XX.XX:54862","timeRange":[0,9223372036854775807],"starttimems":1395717075733,"responsesize":0,"class":"HRegionServer","cacheBlocks":false,"table":"table50","families":{"cf2":["ALL"],"cf":["ALL"]},"batch":1,"startRow":"\\x05\\x80\\x00\\x00\\x00\\x00\\x1E\\x84\\x80\\xDF\\x00\\x00\\x01D\\xFD\\xFD\\x06\\x7F^\\xFF\\xFF\\xFF\\xFD","queuetimems":0,"stopRow":"\\x06j\\xAA\\xAA\\xAA\\xAA\\xCEE@\\xC9\\xAA\\xAA\\xAB\\xEF\\xA8\\xA7\\xB1*\\x09\\xAA\\xAA\\xAA\\xA7","method":"openScanner","totalColumns":2,"maxVersions":1,"filter":"KeyOnlyFilter","caching":1}
2014-03-25 03:11:41,030 WARN org.apache.hadoop.ipc.HBaseServer: (operationTooSlow): {"processingtimems":56342,"client":"XX.XX.XX.XX:54577","timeRange":[0,9223372036854775807],"starttimems":1395717044685,"responsesize":0,"class":"HRegionServer","cacheBlocks":false,"table":"table50","families":{"cf2":["ALL"],"cf":["ALL"]},"batch":1,"startRow":"\\x05\\x80\\x00\\x00\\x00\\x00\\x1E\\x84\\x80\\xDF\\x00\\x00\\x01D\\xFD\\xFD\\x06\\x7F^\\xFF\\xFF\\xFF\\xFD","queuetimems":0,"stopRow":"\\x06j\\xAA\\xAA\\xAA\\xAA\\xCEE@\\xC9\\xAA\\xAA\\xAB\\xEF\\xA8\\xA7\\xB1*\\x09\\xAA\\xAA\\xAA\\xA7","method":"openScanner","totalColumns":2,"maxVersions":1,"filter":"KeyOnlyFilter","caching":1}
2014-03-25 03:12:12,085 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Client tried to access missing scanner 7468301718262159439 (no lease)
2014-03-25 03:12:12,086 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: No lease for 7468301718262159439: lease '7468301718262159439' does not exist
        at org.apache.hadoop.hbase.regionserver.HRegionServer.internalNext(HRegionServer.java:2640)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2609)
        ......
2014-03-25 03:12:12,088 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.regionserver.LeaseException: lease '7468301718262159439' does not exist
        at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:231)
        at org.apache.hadoop.hbase.regionserver.Leases.cancelLease(Leases.java:214)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.internalCloseScanner(HRegionServer.java:2749)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.close(HRegionServer.java:2723)
        ......
2014-03-25 03:12:40,981 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting on region table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f., call next(7468301718262159439, 1), rpc version=1, client version=29, methodsFingerPrint=-1105746420 from XX.XX.XX.XX:54928 after 59937 ms, since caller disconnected
        at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:437)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3985)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3891)
        ......
2014-03-25 03:12:41,027 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 9121928292667158981 lease expired on region table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f.
2014-03-25 03:13:12,561 INFO org.apache.hadoop.hbase.regionserver.StoreFile: NO General Bloom and NO DeleteFamily was added to HFile (hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp/5951be6151474f369126cd38fc4b8f60)
2014-03-25 03:13:12,608 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming compacted file at hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/.tmp/5951be6151474f369126cd38fc4b8f60 to hdfs://AA.BB.CC.DD:8020/hbase/table50/be366e9e6be9675af7be72baf29f03de/cf2/5951be6151474f369126cd38fc4b8f60
2014-03-25 03:13:12,696 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 10 file(s) in cf2 of table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de. into 5951be6151474f369126cd38fc4b8f60, size=112.3 M; total size for store is 669.1 M
2014-03-25 03:13:12,696 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed compaction: regionName=table50,\x01\xD5UUUU_\x81\x814UUV\x9ASR[\xD4\xB4UUUU,1395709055737.be366e9e6be9675af7be72baf29f03de., storeName=cf2, fileCount=10, fileSize=327.5 M, priority=-19, time=1134091055662452; duration=1mins, 46sec

--------------------------------------------------------------------------------------------------------------------------------

> Date: Tue, 25 Mar 2014 15:03:16 -0700
> Subject: Re: HBase bulk load failing with RegionTooBusyException
> From: yuzhihong@gmail.com
> To: user@hbase.apache.org
> 
> Can you check region server log to see if there was concurrent compaction
> going on for table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\
> xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.
> 5a772656accd69dd0354a09fc0c4415f. ?
> 
> If so, how long did the compaction take ?
> 
> Cheers
> 
> 
> On Tue, Mar 25, 2014 at 2:52 PM, Mahesh Garg <m_...@hotmail.com> wrote:
> 
> > Hi,
> >
> > We are trying to bulk load into HBase. But, it it failing. We are using a
> > very standard process:
> >
> >       1. Create HFiles using an MR job.
> >       2. Call "completebulkload" to move HFiles into regions.
> >
> > The table is pre-split. We are using HBase version "0.94.15-cdh4.6.0".
> >
> > The load process runs in a loop and, about 10 million rows are loaded in
> > each attempt. However, the process fails in
> > "completebulkload" with RegionTooBusyException. This error happen at
> > different times after loading about 120 M rows.
> >
> > Any suggestion on how to avoid/debug this problem? Any parameters to tune?
> > The stack track is listed below.
> >
> > Thanks for your help!
> > Garg
> >
> >
> > 14/03/25 03:10:39 INFO mapreduce.LoadIncrementalHFiles: Trying to load
> > hfile=hdfs://XXX.XXX.XXX.XXX:8020/user/userName/outputdir/cf2/e2680aee969a47ae97c6980bbd76ea66
> > first=\x06\x00\x00\x00\x00\x00\x00\x00\x07_0_\x00\x00\x01D\xF73\xB7\x1F_\x00\x00\x00\x01
> > last=\x06\x00\x00\x00\x00\x00w[B_25_\x00\x00\x01D\xF74\xD7\xAC_\x00\x00\x00\x01
> > 14/03/25 03:21:50 ERROR mapreduce.LoadIncrementalHFiles: Encountered
> > unrecoverable error from region server
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
> > attempts=10, exceptions:
> > Tue Mar 25 03:11:39 UTC 2014,
> > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3@29324479,
> > org.apache.hadoop.hbase.RegionTooBusyException:
> > org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in
> > 60000 ms.
> > regionName=table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f.,
> > server=XXX.XXX.XXX.XXX,60020,1395216882104
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:5898)
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:5884)
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegion.startBulkRegionOperation(HRegion.java:5843)
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3640)
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3623)
> >         at
> > org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFiles(HRegionServer.java:2978)
> >         at sun.reflect.GeneratedMethodAccessor61.invoke(Unknown Source)
> >         at
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at
> > org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRp
> >
> >
 		 	   		  

Re: HBase bulk load failing with RegionTooBusyException

Posted by Ted Yu <yu...@gmail.com>.
Can you check region server log to see if there was concurrent compaction
going on for table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\
xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.
5a772656accd69dd0354a09fc0c4415f. ?

If so, how long did the compaction take ?

Cheers


On Tue, Mar 25, 2014 at 2:52 PM, Mahesh Garg <m_...@hotmail.com> wrote:

> Hi,
>
> We are trying to bulk load into HBase. But, it it failing. We are using a
> very standard process:
>
>       1. Create HFiles using an MR job.
>       2. Call "completebulkload" to move HFiles into regions.
>
> The table is pre-split. We are using HBase version "0.94.15-cdh4.6.0".
>
> The load process runs in a loop and, about 10 million rows are loaded in
> each attempt. However, the process fails in
> "completebulkload" with RegionTooBusyException. This error happen at
> different times after loading about 120 M rows.
>
> Any suggestion on how to avoid/debug this problem? Any parameters to tune?
> The stack track is listed below.
>
> Thanks for your help!
> Garg
>
>
> 14/03/25 03:10:39 INFO mapreduce.LoadIncrementalHFiles: Trying to load
> hfile=hdfs://XXX.XXX.XXX.XXX:8020/user/userName/outputdir/cf2/e2680aee969a47ae97c6980bbd76ea66
> first=\x06\x00\x00\x00\x00\x00\x00\x00\x07_0_\x00\x00\x01D\xF73\xB7\x1F_\x00\x00\x00\x01
> last=\x06\x00\x00\x00\x00\x00w[B_25_\x00\x00\x01D\xF74\xD7\xAC_\x00\x00\x00\x01
> 14/03/25 03:21:50 ERROR mapreduce.LoadIncrementalHFiles: Encountered
> unrecoverable error from region server
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
> attempts=10, exceptions:
> Tue Mar 25 03:11:39 UTC 2014,
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3@29324479,
> org.apache.hadoop.hbase.RegionTooBusyException:
> org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in
> 60000 ms.
> regionName=table50,\x05\x80\x00\x00\x00\x00\x1E\x84\x80\xDF\x00\x00\x01D\xFD\xFD\x06\x7F^\xFF\xFF\xFF\xFD,1395709055737.5a772656accd69dd0354a09fc0c4415f.,
> server=XXX.XXX.XXX.XXX,60020,1395216882104
>         at
> org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:5898)
>         at
> org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:5884)
>         at
> org.apache.hadoop.hbase.regionserver.HRegion.startBulkRegionOperation(HRegion.java:5843)
>         at
> org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3640)
>         at
> org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3623)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFiles(HRegionServer.java:2978)
>         at sun.reflect.GeneratedMethodAccessor61.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRp
>
>