Posted to dev@hbase.apache.org by "Wellington Chevreuil (Jira)" <ji...@apache.org> on 2019/09/26 08:18:00 UTC

[jira] [Resolved] (HBASE-23076) [HBOSS] ZKTreeLockManager shouldn't try to acquire a lock from the InterProcessMutex instance when checking if other processes hold it.

     [ https://issues.apache.org/jira/browse/HBASE-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil resolved HBASE-23076.
------------------------------------------
    Release Note: Thanks for finding and reviewing it, [~elserj]!
      Resolution: Fixed

> [HBOSS] ZKTreeLockManager shouldn't try to acquire a lock from the InterProcessMutex instance when checking if other processes hold it.
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-23076
>                 URL: https://issues.apache.org/jira/browse/HBASE-23076
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: hbase-filesystem-1.0.0-alpha1
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Critical
>             Fix For: hbase-filesystem-1.0.0-alpha2
>
>
> While going through some internal tests, [~elserj] faced major bottleneck problems when creating tables with a reasonably large number of pre-split regions:
> {noformat} create 'josh', 'f1', {SPLITS=> (1..500).map {|i| "user#{1000+i*(9999-1000)/500}"}}
> {noformat}
> The above resulted in RSes taking a long time to complete all assignments, leading to assign procedure (AP) timeout failures from the Master's point of view. The Master then submitted further APs in a cascading fashion, until the RSes' RPC queues got flooded and started throwing CallQueueFullException, leaving the Master with loads of procedures to complete and many regions in transition (RITs).
> Jstack analysis pointed to potential lock contention inside the *ZKTreeLockManager.isLocked* method. To quote [~elserj]'s report:
> {quote}Specifically, lots of threads that look like this:
> {noformat}
> "RpcServer.priority.FPBQ.Fifo.handler=8,queue=0,port=16020" #100 daemon prio=5 os_prio=0 tid=0x00007f5d6dc3a000 nid=0x6b1 waiting for monitor entry [0x00007f5d3bafb000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
> 	at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.LockInternals.internalLockLoop(LockInternals.java:289)
> 	- waiting to lock <0x000000074ddd0d10> (a org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.LockInternals)
> 	at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.LockInternals.attemptLock(LockInternals.java:219)
> 	at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.InterProcessMutex.internalLock(InterProcessMutex.java:237)
> 	at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.InterProcessMutex.acquire(InterProcessMutex.java:108)
> 	at org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.isLocked(ZKTreeLockManager.java:310)
> 	at org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.writeLockAbove(ZKTreeLockManager.java:183)
> 	at org.apache.hadoop.hbase.oss.sync.TreeLockManager.treeReadLock(TreeLockManager.java:282)
> 	at org.apache.hadoop.hbase.oss.sync.TreeLockManager.lock(TreeLockManager.java:449)
> 	at org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics.open(HBaseObjectStoreSemantics.java:181)
> 	at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:166)
> 	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:911)
> 	at org.apache.hadoop.hbase.util.FSTableDescriptors.readTableDescriptor(FSTableDescriptors.java:566)
> 	at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:559)
> 	at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:545)
> 	at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:241)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeOpenRegionProcedures(RSRpcServices.java:3626)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.lambda$executeProcedures$2(RSRpcServices.java:3694)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices$$Lambda$107/1985163471.accept(Unknown Source)
> 	at java.util.ArrayList.forEach(ArrayList.java:1257)
> 	at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeProcedures(RSRpcServices.java:3694)
> 	at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29774)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:132)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
>    Locked ownable synchronizers:
> 	- None
> {noformat}
> This means that we can only open Regions in a table one at a time now (across all regionservers). That's pretty bad and would explain why that part was so slow.
> Two thoughts already:
> 1) Having to grab the lock to determine if it's held is sub-optimal. That's what the top of this stacktrace is and I think we need to come up with some other approach because this doesn't scale.
>  2) We're all blocked in reading the TableDescriptor. Maybe the Master can include the TableDescriptor in the OpenRegionRequest so the RS's don't have to read it back?
> {quote}
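> To illustrate the first thought above: the contention comes from answering "is this path locked?" by competing for the lock itself. A simplified sketch of that shape (inferred from the jstack above, assuming a timed acquire/release inside *isLocked*; not copied from the HBOSS source, and HBOSS shades Curator so the real import path differs):
> {noformat}
> // Anti-pattern: probing lock ownership by trying to take the lock.
> // Every RPC handler doing this queues up behind the same InterProcessMutex.
> import java.util.concurrent.TimeUnit;
> import org.apache.curator.framework.recipes.locks.InterProcessMutex;
>
> class NaiveLockCheck {
>   static boolean isLocked(InterProcessMutex mutex) throws Exception {
>     // Blocks behind every other caller contending for the same znode.
>     if (mutex.acquire(30, TimeUnit.MILLISECONDS)) {
>       mutex.release();
>       return false; // we could take it, so nobody else held it
>     }
>     return true; // timed out, so someone else holds it
>   }
> }
> {noformat}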
> From [~elserj]'s suggestions above, #2 would require changes on the hbase project side, but we can still try to optimize HBOSS's *ZKTreeLockManager.isLocked* method as mentioned in #1.
> Looking at Curator's *InterProcessMutex*, we can use its *getParticipantNodes()* method to check whether any process holds a lock on the given node.
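> A minimal sketch of that check (class and method names here are illustrative, not the actual HBOSS patch):
> {noformat}
> // Proposed shape: inspect the lock's participant znodes instead of acquiring it.
> import java.util.Collection;
> import org.apache.curator.framework.recipes.locks.InterProcessMutex;
>
> class ParticipantLockCheck {
>   static boolean isLocked(InterProcessMutex mutex) throws Exception {
>     // getParticipantNodes() lists the znodes of all current holders and waiters
>     // without joining the queue for the lock.
>     Collection<String> participants = mutex.getParticipantNodes();
>     return !participants.isEmpty();
>   }
> }
> {noformat}
> Note that *getParticipantNodes()* also lists this process's own participation, so checking whether *other* processes hold the lock would likely need to be combined with something like *isAcquiredInThisProcess()*.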



--
This message was sent by Atlassian Jira
(v8.3.4#803005)