Posted to issues@hbase.apache.org by "binlijin (JIRA)" <ji...@apache.org> on 2016/08/11 00:45:20 UTC

[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

    [ https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416293#comment-15416293 ] 

binlijin commented on HBASE-16393:
----------------------------------

{code}
"hostmaster:60100.activeMasterManager" daemon prio=10 tid=0x00007f3df900a000 nid=0x36a51 in Object.wait() [0x00007f3df86fc000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.hadoop.ipc.Client.call(Client.java:1484)
        - locked <0x00000005fdaea6c8> (a org.apache.hadoop.ipc.Client$Call)
        at org.apache.hadoop.ipc.Client.call(Client.java:1429)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
        at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:330)
        at com.sun.proxy.$Proxy17.getFileInfo(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:330)
        at com.sun.proxy.$Proxy17.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1974)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
        at org.apache.hadoop.hbase.regionserver.StoreFileInfo.getReferencedFileStatus(StoreFileInfo.java:337)
        at org.apache.hadoop.hbase.regionserver.StoreFileInfo.computeHDFSBlocksDistributionInternal(StoreFileInfo.java:290)
        at org.apache.hadoop.hbase.regionserver.StoreFileInfo.computeHDFSBlocksDistribution(StoreFileInfo.java:284)
        at org.apache.hadoop.hbase.regionserver.HRegion.computeHDFSBlocksDistribution(HRegion.java:1083)
        at org.apache.hadoop.hbase.regionserver.HRegion.computeHDFSBlocksDistribution(HRegion.java:1058)
        at org.apache.hadoop.hbase.master.balancer.RegionLocationFinder.internalGetTopBlockLocation(RegionLocationFinder.java:127)
        at org.apache.hadoop.hbase.master.balancer.RegionLocationFinder$1.load(RegionLocationFinder.java:65)
        at org.apache.hadoop.hbase.master.balancer.RegionLocationFinder$1.load(RegionLocationFinder.java:61)
        at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3584)
        at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2372)
        at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2335)
        - locked <0x00000005fda4e288> (a com.google.common.cache.LocalCache$StrongAccessEntry)
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2250)
        at com.google.common.cache.LocalCache.get(LocalCache.java:3985)
        at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3989)
        at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4873)
        at org.apache.hadoop.hbase.master.balancer.RegionLocationFinder.getTopBlockLocations(RegionLocationFinder.java:105)
        at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.registerRegion(BaseLoadBalancer.java:433)
        at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.<init>(BaseLoadBalancer.java:281)
        at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:1098)
        at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.retainAssignment(BaseLoadBalancer.java:1235)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2793)
        at org.apache.hadoop.hbase.master.AssignmentManager.assignAllUserRegions(AssignmentManager.java:2900)
        at org.apache.hadoop.hbase.master.AssignmentManager.processDeadServersAndRegionsInTransition(AssignmentManager.java:670)
        at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:492)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:764)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:183)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1664)
        at java.lang.Thread.run(Thread.java:756)
{code}

> Improve computeHDFSBlocksDistribution
> -------------------------------------
>
>                 Key: HBASE-16393
>                 URL: https://issues.apache.org/jira/browse/HBASE-16393
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: binlijin
>
> Our cluster is big, and I can see that the balancer is slow from time to time. The balancer is also called on master startup, so startup is slow as well.
> First, I wonder whether we can compute the HDFSBlocksDistribution of different regions in parallel.
> Second, I think we can improve how a single region's HDFSBlocksDistribution is computed.
> To compute a storefile's HDFSBlocksDistribution, we first call FileSystem#getFileStatus(path) and then FileSystem#getFileBlockLocations(status, start, length), which means two namenode RPC calls for every storefile. Instead, we can use FileSystem#listLocatedStatus to get a LocatedFileStatus with the information we need, which reduces this to one namenode RPC call per storefile.
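
The RPC saving described in the last paragraph can be illustrated with a small model. This is only a sketch and does not use Hadoop: FakeNameNode, Block, twoCalls and oneCall are hypothetical names, and each FakeNameNode query simply stands in for the FileSystem method named in its comment and counts as one namenode RPC. The per-host byte weights are an assumption about what a block distribution needs to record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class BlockDistributionRpcSketch {

    /** One HDFS block: its length in bytes and the hosts holding a replica. */
    static class Block {
        final long length;
        final List<String> hosts;

        Block(long length, String... hosts) {
            this.length = length;
            this.hosts = Arrays.asList(hosts);
        }
    }

    /** Hypothetical in-memory namenode; every query below counts as one RPC. */
    static class FakeNameNode {
        private final Map<String, List<Block>> files = new HashMap<>();
        int rpcCount = 0;

        void addFile(String path, Block... blocks) {
            files.put(path, Arrays.asList(blocks));
        }

        // Stands in for FileSystem#getFileStatus: returns only the file length.
        long getFileLength(String path) {
            rpcCount++;
            long total = 0;
            for (Block b : files.get(path)) {
                total += b.length;
            }
            return total;
        }

        // Stands in for FileSystem#getFileBlockLocations(status, start, length).
        List<Block> getBlockLocations(String path, long start, long length) {
            rpcCount++;
            List<Block> out = new ArrayList<>();
            long offset = 0;
            for (Block b : files.get(path)) {
                // Keep blocks that overlap the requested byte range.
                if (offset < start + length && offset + b.length > start) {
                    out.add(b);
                }
                offset += b.length;
            }
            return out;
        }

        // Stands in for FileSystem#listLocatedStatus: status and locations together.
        List<Block> listLocated(String path) {
            rpcCount++;
            return files.get(path);
        }
    }

    /** Adds each block's length to the weight of every host that stores it. */
    static void addWeights(Map<String, Long> weights, List<Block> blocks) {
        for (Block b : blocks) {
            for (String host : b.hosts) {
                weights.merge(host, b.length, Long::sum);
            }
        }
    }

    /** Current approach from the description: two RPCs per storefile. */
    static Map<String, Long> twoCalls(FakeNameNode nn, List<String> storeFiles) {
        Map<String, Long> weights = new TreeMap<>();
        for (String path : storeFiles) {
            long len = nn.getFileLength(path);
            addWeights(weights, nn.getBlockLocations(path, 0, len));
        }
        return weights;
    }

    /** Proposed approach from the description: one RPC per storefile. */
    static Map<String, Long> oneCall(FakeNameNode nn, List<String> storeFiles) {
        Map<String, Long> weights = new TreeMap<>();
        for (String path : storeFiles) {
            addWeights(weights, nn.listLocated(path));
        }
        return weights;
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();
        nn.addFile("/store/f1", new Block(100, "host-a", "host-b"), new Block(50, "host-b"));
        nn.addFile("/store/f2", new Block(10, "host-a"));
        List<String> storeFiles = Arrays.asList("/store/f1", "/store/f2");

        Map<String, Long> viaTwoCalls = twoCalls(nn, storeFiles);
        System.out.println("two-call RPCs: " + nn.rpcCount + " -> " + viaTwoCalls);

        nn.rpcCount = 0;
        Map<String, Long> viaOneCall = oneCall(nn, storeFiles);
        System.out.println("one-call RPCs: " + nn.rpcCount + " -> " + viaOneCall);
    }
}
```

In this model both methods produce the same per-host weights, and the RPC count drops from two per storefile to one. Whether a real namenode shows the same saving depends on how the FileSystem implementation in use serves listLocatedStatus.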

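The first suggestion in the description, computing the distributions of different regions in parallel, could look roughly like the sketch below. It is not HBase code: computeAll and its parameters are hypothetical, the per-region computation is passed in as a plain function, and the thread count is an assumption that would need tuning against namenode load.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ParallelDistributionSketch {

    /**
     * Runs computeOne for every region on a fixed-size thread pool and
     * returns the results keyed by region, in the order the regions were given.
     */
    static <R, D> Map<R, D> computeAll(List<R> regions, Function<R, D> computeOne, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per region so the slow namenode calls overlap.
            List<Future<D>> futures = new ArrayList<>();
            for (R region : regions) {
                Callable<D> task = () -> computeOne.apply(region);
                futures.add(pool.submit(task));
            }
            // Collect results; get() blocks until the matching task is done.
            Map<R, D> result = new LinkedHashMap<>();
            for (int i = 0; i < regions.size(); i++) {
                result.put(regions.get(i), futures.get(i).get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while computing distributions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("distribution computation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would pass the list of regions and a function that computes one region's distribution. A bounded pool is used here so that a large cluster does not flood the namenode with concurrent requests.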


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)