Posted to user@hbase.apache.org by Chen Wang <ch...@gmail.com> on 2014/06/20 03:09:37 UTC

mapreduce.LoadIncrementalHFiles hangs..

Last piece of the puzzle!

My MapReduce job succeeded in generating the HDFS file. However, the bulk load with the
following code:

LoadIncrementalHFiles loader = new LoadIncrementalHFiles(hbaseConf);

 loader.doBulkLoad(newExecutionOutput, candidateSendTable);

Just hangs there without any output. I tried to run

hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles
<hdfs://storefileoutput> <tablename>

It seems to get stuck in some kind of infinite loop:

*2014-06-19 18:06:29,990 DEBUG [LoadIncrementalHFiles-1]
client.HConnectionManager$HConnectionImplementation: Removed
cluster-04:60020 as a location of
[tablenae],1403133308612.060ff9282b3b653c59c1e6be82d2521a. for
tableName=[table] from cache*

*2014-06-19 18:06:30,004 DEBUG [LoadIncrementalHFiles-1]
mapreduce.LoadIncrementalHFiles: Going to connect to server
region=[tablename],,1403133308612.060ff9282b3b653c59c1e6be82d2521a.,
hostname=cluster-04,60020,1403211430209, seqNum=1 for row  with hfile group
[{[B@3b5d5e0d,hdfs://mypath}]*

*2014-06-19 18:06:45,839 DEBUG [LruStats #0] hfile.LruBlockCache:
Total=3.17 MB, free=383.53 MB, max=386.70 MB, blocks=0, accesses=0, hits=0,
hitRatio=0, cachingAccesses=0, cachingHits=0,
cachingHitsRatio=0,evictions=0, evicted=0, evictedPerRun=NaN*


*Any guidance on how I can debug this?*

*Thanks much!*

*Chen*

Re: mapreduce.LoadIncrementalHFiles hangs..

Posted by Ted Yu <yu...@gmail.com>.

You need to give user hbase write permission on the subdirectory under /tmp.

Cheers
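
A minimal Java sketch of why the write is denied, assuming the simplified owner/group/other permission model (no ACLs, no superuser bypass). The class and method names are hypothetical, and the assumption that user hbase belongs only to a group named hbase is illustrative; the owner, group, and mode are taken from the AccessControlException quoted in this thread.

```java
import java.util.Collections;
import java.util.Set;

/** Hypothetical helper: evaluates a mode string such as "drwxr-xr-x" for write access. */
final class InodeWriteCheck {

    /**
     * Returns true if the 10-character mode string grants write access to the user.
     * Simplified model: owner bits apply to the owner, group bits to members of the
     * inode's group, and the "other" bits to everyone else.
     */
    static boolean canWrite(String user, Set<String> userGroups,
                            String owner, String group, String mode) {
        if (mode.length() != 10) {
            throw new IllegalArgumentException("expected a 10-character mode string: " + mode);
        }
        int writeIndex;
        if (user.equals(owner)) {
            writeIndex = 2;   // owner write bit
        } else if (userGroups.contains(group)) {
            writeIndex = 5;   // group write bit
        } else {
            writeIndex = 8;   // other write bit
        }
        return mode.charAt(writeIndex) == 'w';
    }

    public static void main(String[] args) {
        // Values from the log: inode owned by hdfs:supergroup with mode drwxr-xr-x.
        // Assumption: user hbase is only in a group called "hbase".
        Set<String> hbaseGroups = Collections.singleton("hbase");
        boolean allowed = canWrite("hbase", hbaseGroups, "hdfs", "supergroup", "drwxr-xr-x");
        System.out.println("hbase can write: " + allowed);
    }
}
```

Under these assumptions user hbase falls into the "other" class, which has no write bit, so the rename shown in the region server stack trace is rejected. Changing the owner or mode of the output directory (for example with hdfs dfs -chown or hdfs dfs -chmod, run as a user allowed to do so) is one way to grant the access.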

On Jun 19, 2014, at 7:33 PM, Chen Wang <ch...@gmail.com> wrote:

> Ted,
> Thanks for the pointer!! After checking the 04 regionserver log, they are
> flushed with permission denied error. Does it mean that user hbase does not
> have permission to access to the hdfs?
> 
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=hbase, access=WRITE, inode="/tmp/campaign_generator/2014-06-20-00-52-19/campaign":hdfs:supergroup:drwxr-xr-x
>        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
>        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
>        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
>        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5489)
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInternal(FSNamesystem.java:3196)
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInt(FSNamesystem.java:3166)
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3134)
>        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rename(NameNodeRpcServer.java:680)
>        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rename(ClientNamenodeProtocolServerSideTranslatorPB.java:523)
> 
> Hbase should really throw exception instead of hanging there trying...
> 
> In any case, big step for me. Thanks for the debugging pointers. I used to
> be in .net scope.:-)
> Chen
> 
> 
> On Thu, Jun 19, 2014 at 7:25 PM, Ted Yu <yu...@gmail.com> wrote:
> 
>> Was cluster-04 always showing up in the log ?
>> 
>> Have you checked region server log on cluster-04 ?
>> 
>> Cheers
>> 
>> 
>> On Thu, Jun 19, 2014 at 6:09 PM, Chen Wang <ch...@gmail.com>
>> wrote:
>> 
>>> Last piece of the puzzle!
>>> 
>>> My Mapreduce succeeded in generating hdfs file, However, bulk load with the
>>> following code:
>>> 
>>> LoadIncrementalHFiles loader = new LoadIncrementalHFiles(hbaseConf);
>>> 
>>> loader.doBulkLoad(newExecutionOutput, candidateSendTable);
>>> 
>>> Just hangs there without any output. I tried to run
>>> 
>>> hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles
>>> <hdfs://storefileoutput> <tablename>
>>> 
>>> It seems to get into some kind of infinite loop..
>>> 
>>> *2014-06-19 18:06:29,990 DEBUG [LoadIncrementalHFiles-1]
>>> client.HConnectionManager$HConnectionImplementation: Removed
>>> cluster-04:60020 as a location of
>>> [tablenae],1403133308612.060ff9282b3b653c59c1e6be82d2521a. for
>>> tableName=[table] from cache*
>>> 
>>> *2014-06-19 18:06:30,004 DEBUG [LoadIncrementalHFiles-1]
>>> mapreduce.LoadIncrementalHFiles: Going to connect to server
>>> region=[tablename],,1403133308612.060ff9282b3b653c59c1e6be82d2521a.,
>>> hostname=cluster-04,60020,1403211430209, seqNum=1 for row  with hfile group
>>> [{[B@3b5d5e0d,hdfs://mypath}]*
>>> 
>>> *2014-06-19 18:06:45,839 DEBUG [LruStats #0] hfile.LruBlockCache:
>>> Total=3.17 MB, free=383.53 MB, max=386.70 MB, blocks=0, accesses=0, hits=0,
>>> hitRatio=0, cachingAccesses=0, cachingHits=0,
>>> cachingHitsRatio=0,evictions=0, evicted=0, evictedPerRun=NaN*
>>> 
>>> 
>>> *Any guidance on how I can debug this?*
>>> 
>>> *Thanks much!*
>>> 
>>> *Chen*
>>

Re: mapreduce.LoadIncrementalHFiles hangs..

Posted by Chen Wang <ch...@gmail.com>.

Ted,
Thanks for the pointer!! After checking the region server log on cluster-04, I see
it is flooded with permission denied errors. Does this mean that user hbase does not
have permission to access HDFS?

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=hbase, access=WRITE, inode="/tmp/campaign_generator/2014-06-20-00-52-19/campaign":hdfs:supergroup:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5489)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInternal(FSNamesystem.java:3196)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInt(FSNamesystem.java:3166)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3134)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rename(NameNodeRpcServer.java:680)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rename(ClientNamenodeProtocolServerSideTranslatorPB.java:523)

HBase should really throw an exception instead of hanging there, retrying...

In any case, big step for me. Thanks for the debugging pointers. I used to
work in the .NET world. :-)
Chen
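
One way to avoid waiting indefinitely on the client side is to run the blocking call under a timeout. The following is a minimal sketch using only java.util.concurrent; BoundedCall and callWithTimeout are hypothetical names, and wrapping doBulkLoad this way is an assumption, not something HBase provides. Whether the underlying RPC reacts promptly to the interrupt is not guaranteed.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a blocking task and gives up after a time limit. */
final class BoundedCall {

    static <T> T callWithTimeout(Callable<T> task, long timeout, TimeUnit unit) throws Exception {
        // Daemon thread, so a task that ignores interruption cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true);
            return t;
        });
        try {
            Future<T> future = executor.submit(task);
            try {
                return future.get(timeout, unit);
            } catch (TimeoutException e) {
                future.cancel(true);   // interrupt the worker, then surface the timeout
                throw e;
            } catch (ExecutionException e) {
                // Rethrow the task's own exception when possible.
                Throwable cause = e.getCause();
                if (cause instanceof Exception) {
                    throw (Exception) cause;
                }
                throw e;
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A fast task returns its value.
        String result = callWithTimeout(() -> "done", 1, TimeUnit.SECONDS);
        System.out.println(result);

        // A task that outlives the limit raises TimeoutException instead of hanging.
        try {
            callWithTimeout(() -> { Thread.sleep(5_000); return null; }, 100, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }

        // Hypothetical usage with the loader from this thread:
        //   callWithTimeout(() -> { loader.doBulkLoad(newExecutionOutput, candidateSendTable); return null; },
        //                   10, TimeUnit.MINUTES);
    }
}
```

A timeout only bounds the wait; the cause still has to be found in the region server log, as happened here.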


On Thu, Jun 19, 2014 at 7:25 PM, Ted Yu <yu...@gmail.com> wrote:

> Was cluster-04 always showing up in the log ?
>
> Have you checked region server log on cluster-04 ?
>
> Cheers
>
>
> On Thu, Jun 19, 2014 at 6:09 PM, Chen Wang <ch...@gmail.com>
> wrote:
>
> > Last piece of the puzzle!
> >
> > My Mapreduce succeeded in generating hdfs file, However, bulk load with the
> > following code:
> >
> > LoadIncrementalHFiles loader = new LoadIncrementalHFiles(hbaseConf);
> >
> >  loader.doBulkLoad(newExecutionOutput, candidateSendTable);
> >
> > Just hangs there without any output. I tried to run
> >
> > hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles
> > <hdfs://storefileoutput> <tablename>
> >
> > It seems to get into some kind of infinite loop..
> >
> > *2014-06-19 18:06:29,990 DEBUG [LoadIncrementalHFiles-1]
> > client.HConnectionManager$HConnectionImplementation: Removed
> > cluster-04:60020 as a location of
> > [tablenae],1403133308612.060ff9282b3b653c59c1e6be82d2521a. for
> > tableName=[table] from cache*
> >
> > *2014-06-19 18:06:30,004 DEBUG [LoadIncrementalHFiles-1]
> > mapreduce.LoadIncrementalHFiles: Going to connect to server
> > region=[tablename],,1403133308612.060ff9282b3b653c59c1e6be82d2521a.,
> > hostname=cluster-04,60020,1403211430209, seqNum=1 for row  with hfile group
> > [{[B@3b5d5e0d,hdfs://mypath}]*
> >
> > *2014-06-19 18:06:45,839 DEBUG [LruStats #0] hfile.LruBlockCache:
> > Total=3.17 MB, free=383.53 MB, max=386.70 MB, blocks=0, accesses=0, hits=0,
> > hitRatio=0, cachingAccesses=0, cachingHits=0,
> > cachingHitsRatio=0,evictions=0, evicted=0, evictedPerRun=NaN*
> >
> >
> > *Any guidance on how I can debug this?*
> >
> > *Thanks much!*
> >
> > *Chen*
> >
>

Re: mapreduce.LoadIncrementalHFiles hangs..

Posted by Ted Yu <yu...@gmail.com>.

Was cluster-04 always showing up in the log?

Have you checked the region server log on cluster-04?

Cheers


On Thu, Jun 19, 2014 at 6:09 PM, Chen Wang <ch...@gmail.com>
wrote:

> Last piece of the puzzle!
>
> My Mapreduce succeeded in generating hdfs file, However, bulk load with the
> following code:
>
> LoadIncrementalHFiles loader = new LoadIncrementalHFiles(hbaseConf);
>
>  loader.doBulkLoad(newExecutionOutput, candidateSendTable);
>
> Just hangs there without any output. I tried to run
>
> hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles
> <hdfs://storefileoutput> <tablename>
>
> It seems to get into some kind of infinite loop..
>
> *2014-06-19 18:06:29,990 DEBUG [LoadIncrementalHFiles-1]
> client.HConnectionManager$HConnectionImplementation: Removed
> cluster-04:60020 as a location of
> [tablenae],1403133308612.060ff9282b3b653c59c1e6be82d2521a. for
> tableName=[table] from cache*
>
> *2014-06-19 18:06:30,004 DEBUG [LoadIncrementalHFiles-1]
> mapreduce.LoadIncrementalHFiles: Going to connect to server
> region=[tablename],,1403133308612.060ff9282b3b653c59c1e6be82d2521a.,
> hostname=cluster-04,60020,1403211430209, seqNum=1 for row  with hfile group
> [{[B@3b5d5e0d,hdfs://mypath}]*
>
> *2014-06-19 18:06:45,839 DEBUG [LruStats #0] hfile.LruBlockCache:
> Total=3.17 MB, free=383.53 MB, max=386.70 MB, blocks=0, accesses=0, hits=0,
> hitRatio=0, cachingAccesses=0, cachingHits=0,
> cachingHitsRatio=0,evictions=0, evicted=0, evictedPerRun=NaN*
>
>
> *Any guidance on how I can debug this?*
>
> *Thanks much!*
>
> *Chen*
>

Re: mapreduce.LoadIncrementalHFiles hangs..

Posted by Chen Wang <ch...@gmail.com>.

Here you go.
Thanks for looking!!

Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007f8130c49800 nid=0x5826 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"IPC Client (669337652) connection to cluster-trgt04 /10.93.81.96:60020 from cwang" daemon prio=10 tid=0x0000000001b26000 nid=0x5769 runnable [0x00007f811654b000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000c26ffda0> (a sun.nio.ch.Util$2)
        - locked <0x00000000c26ffd90> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000c26ffc78> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
        at java.io.FilterInputStream.read(FilterInputStream.java:133)
        at java.io.FilterInputStream.read(FilterInputStream.java:133)
        at org.apache.hadoop.hbase.ipc.RpcClient$Connection$PingInputStream.read(RpcClient.java:555)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
        - locked <0x00000000c5593a78> (a java.io.BufferedInputStream)
        at java.io.DataInputStream.readInt(DataInputStream.java:387)
        at org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1059)
        at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)

"LoadIncrementalHFiles-1" prio=10 tid=0x00007f8130c63800 nid=0x5768 in Object.wait() [0x00007f811664c000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000c55913f8> (a org.apache.hadoop.hbase.ipc.RpcClient$Call)
        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1435)
        - locked <0x00000000c55913f8> (a org.apache.hadoop.hbase.ipc.RpcClient$Call)
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1653)
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1711)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.bulkLoadHFile(ClientProtos.java:27344)
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.bulkLoadHFile(ProtobufUtil.java:1430)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:589)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:578)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:120)
        - locked <0x00000000c54eeca8> (a org.apache.hadoop.hbase.client.RpcRetryingCaller)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:96)
        - locked <0x00000000c54eeca8> (a org.apache.hadoop.hbase.client.RpcRetryingCaller)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.tryAtomicRegionLoad(LoadIncrementalHFiles.java:629)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:342)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:340)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"org.apache.hadoop.hdfs.PeerCache@aa3f4a1" daemon prio=10 tid=0x0000000001cf4000 nid=0x5767 waiting on condition [0x00007f811674d000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hdfs.PeerCache.run(PeerCache.java:245)
        at org.apache.hadoop.hdfs.PeerCache.access$000(PeerCache.java:41)
        at org.apache.hadoop.hdfs.PeerCache$1.run(PeerCache.java:119)
        at java.lang.Thread.run(Thread.java:745)

"LruStats #0" daemon prio=10 tid=0x0000000001ccf000 nid=0x5766 waiting on condition [0x00007f811684e000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000c3deaad8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"LoadIncrementalHFiles-0.LruBlockCache.EvictionThread" daemon prio=10 tid=0x0000000001ccd800 nid=0x5765 in Object.wait() [0x00007f811694f000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000c3e001e0> (a org.apache.hadoop.hbase.io.hfile.LruBlockCache$EvictionThread)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.hadoop.hbase.io.hfile.LruBlockCache$EvictionThread.run(LruBlockCache.java:678)
        - locked <0x00000000c3e001e0> (a org.apache.hadoop.hbase.io.hfile.LruBlockCache$EvictionThread)
        at java.lang.Thread.run(Thread.java:745)

"LoadIncrementalHFiles-0" prio=10 tid=0x00007f8130c62000 nid=0x5764 waiting on condition [0x00007f8116a50000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000c2cf0c58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"IPC Parameter Sending Thread #0" daemon prio=10 tid=0x00007f8130f0c800 nid=0x5762 waiting on condition [0x00007f8116d53000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000c6f2e310> (a java.util.concurrent.SynchronousQueue$TransferStack)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
        at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
        at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"Thread-2" daemon prio=10 tid=0x00007f8130eb1000 nid=0x575d runnable [0x00007f8116f55000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native Method)
        at org.apache.hadoop.net.unix.DomainSocketWatcher.access$800(DomainSocketWatcher.java:52)
        at org.apache.hadoop.net.unix.DomainSocketWatcher$1.run(DomainSocketWatcher.java:457)
        at java.lang.Thread.run(Thread.java:745)

"main-EventThread" daemon prio=10 tid=0x00007f8130d0e000 nid=0x575c waiting on condition [0x00007f8117056000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000c6ec0988> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)

"main-SendThread(trgt-thrift01:2181)" daemon prio=10 tid=0x00007f8130d0d000 nid=0x575b runnable [0x00007f81191ad000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000c6ec2118> (a sun.nio.ch.Util$2)
        - locked <0x00000000c6ec2128> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000c6ec20d0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:338)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)

"process reaper" daemon prio=10 tid=0x00007f8130c3e000 nid=0x5759 waiting on condition [0x00007f81191e6000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000c6ec0a78> (a java.util.concurrent.SynchronousQueue$TransferStack)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
        at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
        at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"Service Thread" daemon prio=10 tid=0x00007f813011f800 nid=0x5756 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007f813011d000 nid=0x5755 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007f813011a800 nid=0x5754 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f8130118800 nid=0x5753 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Surrogate Locker Thread (Concurrent GC)" daemon prio=10 tid=0x00007f813010e800 nid=0x5752 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f81300f7000 nid=0x5751 in Object.wait() [0x00007f8124c51000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000c6ec2378> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x00000000c6ec2378> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

"Reference Handler" daemon prio=10 tid=0x00007f81300f3000 nid=0x5750 in Object.wait() [0x00007f8124d52000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000c6ec2410> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        - locked <0x00000000c6ec2410> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f8130016800 nid=0x5749 waiting on condition [0x00007f813533e000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000c528aa60> (a java.util.concurrent.FutureTask)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
        at java.util.concurrent.FutureTask.get(FutureTask.java:187)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.bulkLoadPhase(LoadIncrementalHFiles.java:353)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:292)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:842)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.main(LoadIncrementalHFiles.java:847)

"VM Thread" prio=10 tid=0x00007f81300f0800 nid=0x574f runnable

"Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x00007f8130027800 nid=0x574a runnable

"Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x00007f8130029000 nid=0x574b runnable

"Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x00007f813002b000 nid=0x574c runnable

"Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x00007f813002d000 nid=0x574d runnable

"Concurrent Mark-Sweep GC Thread" prio=10 tid=0x00007f81300ad800 nid=0x574e runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f813012b000 nid=0x5757 waiting on condition

JNI global references: 212


On Thu, Jun 19, 2014 at 6:13 PM, Ted Yu <yu...@gmail.com> wrote:

> You're using 0.96, right ?
>
> Can you jstack the LoadIncrementalHFiles process and pastebin the stack ?
>
>
> On Thu, Jun 19, 2014 at 6:09 PM, Chen Wang <ch...@gmail.com>
> wrote:
>
> > Last piece of the puzzle!
> >
> > My Mapreduce succeeded in generating hdfs file, However, bulk load with the
> > following code:
> >
> > LoadIncrementalHFiles loader = new LoadIncrementalHFiles(hbaseConf);
> >
> >  loader.doBulkLoad(newExecutionOutput, candidateSendTable);
> >
> > Just hangs there without any output. I tried to run
> >
> > hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles
> > <hdfs://storefileoutput> <tablename>
> >
> > It seems to get into some kind of infinite loop..
> >
> > *2014-06-19 18:06:29,990 DEBUG [LoadIncrementalHFiles-1]
> > client.HConnectionManager$HConnectionImplementation: Removed
> > cluster-04:60020 as a location of
> > [tablenae],1403133308612.060ff9282b3b653c59c1e6be82d2521a. for
> > tableName=[table] from cache*
> >
> > *2014-06-19 18:06:30,004 DEBUG [LoadIncrementalHFiles-1]
> > mapreduce.LoadIncrementalHFiles: Going to connect to server
> > region=[tablename],,1403133308612.060ff9282b3b653c59c1e6be82d2521a.,
> > hostname=cluster-04,60020,1403211430209, seqNum=1 for row  with hfile group
> > [{[B@3b5d5e0d,hdfs://mypath}]*
> >
> > *2014-06-19 18:06:45,839 DEBUG [LruStats #0] hfile.LruBlockCache:
> > Total=3.17 MB, free=383.53 MB, max=386.70 MB, blocks=0, accesses=0, hits=0,
> > hitRatio=0, cachingAccesses=0, cachingHits=0,
> > cachingHitsRatio=0,evictions=0, evicted=0, evictedPerRun=NaN*
> >
> >
> > *Any guidance on how I can debug this?*
> >
> > *Thanks much!*
> >
> > *Chen*
> >
>

Re: mapreduce.LoadIncrementalHFiles hangs..

Posted by Ted Yu <yu...@gmail.com>.

You're using 0.96, right?

Can you run jstack on the LoadIncrementalHFiles process and pastebin the stack?
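
jstack is normally run from a shell against the process id (jstack <pid>; jps lists candidate ids). When attaching from outside is inconvenient, a similar dump can be produced from inside the JVM. Below is a minimal sketch using the standard java.lang.management API; the class name InProcessThreadDump is hypothetical, and the output format only approximates what jstack prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: builds a jstack-like dump of all live threads in this JVM. */
final class InProcessThreadDump {

    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked-monitor and synchronizer details, which not every JVM supports.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

In a dump like this, the threads to look at first are the ones blocked inside the call that appears to hang, here the LoadIncrementalHFiles worker threads.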


On Thu, Jun 19, 2014 at 6:09 PM, Chen Wang <ch...@gmail.com>
wrote:

> Last piece of the puzzle!
>
> My Mapreduce succeeded in generating hdfs file, However, bulk load with the
> following code:
>
> LoadIncrementalHFiles loader = new LoadIncrementalHFiles(hbaseConf);
>
>  loader.doBulkLoad(newExecutionOutput, candidateSendTable);
>
> Just hangs there without any output. I tried to run
>
> hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles
> <hdfs://storefileoutput> <tablename>
>
> It seems to get into some kind of infinite loop..
>
> *2014-06-19 18:06:29,990 DEBUG [LoadIncrementalHFiles-1]
> client.HConnectionManager$HConnectionImplementation: Removed
> cluster-04:60020 as a location of
> [tablenae],1403133308612.060ff9282b3b653c59c1e6be82d2521a. for
> tableName=[table] from cache*
>
> *2014-06-19 18:06:30,004 DEBUG [LoadIncrementalHFiles-1]
> mapreduce.LoadIncrementalHFiles: Going to connect to server
> region=[tablename],,1403133308612.060ff9282b3b653c59c1e6be82d2521a.,
> hostname=cluster-04,60020,1403211430209, seqNum=1 for row  with hfile group
> [{[B@3b5d5e0d,hdfs://mypath}]*
>
> *2014-06-19 18:06:45,839 DEBUG [LruStats #0] hfile.LruBlockCache:
> Total=3.17 MB, free=383.53 MB, max=386.70 MB, blocks=0, accesses=0, hits=0,
> hitRatio=0, cachingAccesses=0, cachingHits=0,
> cachingHitsRatio=0,evictions=0, evicted=0, evictedPerRun=NaN*
>
>
> *Any guidance on how I can debug this?*
>
> *Thanks much!*
>
> *Chen*
>