Posted to issues@hbase.apache.org by "Wellington Chevreuil (Jira)" <ji...@apache.org> on 2020/08/27 10:58:00 UTC

[jira] [Work started] (HBASE-24961) [HBOSS] HBaseObjectStoreSemantics.close should call super.close to make sure its own instance always get removed from FileSystem.CACHE

     [ https://issues.apache.org/jira/browse/HBASE-24961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-24961 started by Wellington Chevreuil.
----------------------------------------------------
> [HBOSS] HBaseObjectStoreSemantics.close should call super.close to make sure its own instance always get removed from FileSystem.CACHE
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-24961
>                 URL: https://issues.apache.org/jira/browse/HBASE-24961
>             Project: HBase
>          Issue Type: Bug
>          Components: Filesystem Integration, hboss
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>
> This came up when running bulkloads on HBase deployments using HBOSS. The fixes introduced by HBASE-23679 use *_FileSystem.closeAllForUGI(ugi)_* to ensure _*FileSystem*_ instances get cleared for the specific running UGI. The problem is that _*FileSystem.closeAllForUGI*_ does not remove instances from _*FileSystem.CACHE*_ explicitly; instead it calls _*FileSystem.close*_ on each instance, and it is _*FileSystem.close*_ that removes the instance from _*FileSystem.CACHE*_. Here, though, the _*FileSystem*_ implementation is _*HBaseObjectStoreSemantics*_, whose _*close*_ override does not call _*super.close()*_. As a result, _*FileSystem.closeAllForUGI*_ closes the instance but never removes it from _*FileSystem.CACHE*_, so every subsequent _*FileSystem.get*_ by the same UGI retrieves the already closed _*HBaseObjectStoreSemantics*_ instance and ultimately fails as below:
>  
> {noformat}
> 2020-08-26 12:43:57,528 ERROR org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager: Failed to complete bulk load
> java.io.IOException: Exception while testing a lock
>         at org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.isLocked(ZKTreeLockManager.java:312)
>         at org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.writeLockAbove(ZKTreeLockManager.java:183)
>         at org.apache.hadoop.hbase.oss.sync.TreeLockManager.treeReadLock(TreeLockManager.java:282)
>         at org.apache.hadoop.hbase.oss.sync.TreeLockManager.lock(TreeLockManager.java:449)
>         at org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics.exists(HBaseObjectStoreSemantics.java:498)
>         at org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager$1.run(SecureBulkLoadManager.java:281)
>         at org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager$1.run(SecureBulkLoadManager.java:266)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:360)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1856)
>         at org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager.secureBulkLoadHFiles(SecureBulkLoadManager.java:266)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.bulkLoadHFile(RSRpcServices.java:2445)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42280)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> Caused by: java.lang.IllegalStateException: Expected state [STARTED] was [STOPPED] {noformat}
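For illustration only (not part of the ticket), the caching failure can be modeled in plain Java with no Hadoop dependency. CachedResource, LeakyResource and FixedResource below are hypothetical stand-ins for FileSystem, HBaseObjectStoreSemantics before the fix, and the fixed implementation:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class CacheDemo {
    // Stand-in for Hadoop's FileSystem: instances live in a static cache,
    // and the base close() evicts the instance from that cache.
    static class CachedResource {
        static final Map<String, CachedResource> CACHE = new HashMap<>();

        static CachedResource get(String key, Supplier<CachedResource> factory) {
            return CACHE.computeIfAbsent(key, k -> factory.get());
        }

        public void close() {
            // Mirrors FileSystem.close() removing itself from FileSystem.CACHE.
            CACHE.values().remove(this);
        }
    }

    // Stand-in for HBaseObjectStoreSemantics before the fix: close() tears
    // down its own state but never calls super.close(), so the instance
    // stays cached.
    static class LeakyResource extends CachedResource {
        @Override
        public void close() {
            // ... stop subclass-specific resources here ...
            // BUG: missing super.close() -> instance remains in CACHE
        }
    }

    // Stand-in for the fixed implementation: delegating to super.close()
    // guarantees eviction from the cache.
    static class FixedResource extends CachedResource {
        @Override
        public void close() {
            // ... stop subclass-specific resources first ...
            super.close();
        }
    }

    public static void main(String[] args) {
        CachedResource leaky = CachedResource.get("ugi-1", LeakyResource::new);
        leaky.close();
        // The closed instance is still served from the cache, so later
        // callers get a stopped object, matching the failure above.
        System.out.println(CachedResource.get("ugi-1", LeakyResource::new) == leaky);  // true

        CachedResource fixed = CachedResource.get("ugi-2", FixedResource::new);
        fixed.close();
        // Eviction worked: the cache hands out a fresh instance.
        System.out.println(CachedResource.get("ugi-2", FixedResource::new) == fixed);  // false
    }
}
```

The sketch suggests why calling super.close() from HBaseObjectStoreSemantics.close (as the issue title proposes) is sufficient: the base-class close is what performs the cache eviction.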



--
This message was sent by Atlassian Jira
(v8.3.4#803005)