Posted to issues@ozone.apache.org by "Pratyush Bhatt (Jira)" <ji...@apache.org> on 2023/06/26 15:59:00 UTC

[jira] [Created] (HDDS-8932) [Hbase-Ozone] Hbase on Ozone EC Should gracefully exit

Pratyush Bhatt created HDDS-8932:
------------------------------------

             Summary: [Hbase-Ozone] Hbase on Ozone EC Should gracefully exit
                 Key: HDDS-8932
                 URL: https://issues.apache.org/jira/browse/HDDS-8932
             Project: Apache Ozone
          Issue Type: Bug
            Reporter: Pratyush Bhatt


While running HBase on top of Ozone with the WAL directory on an EC bucket, the HBase HMaster shuts down abruptly instead of exiting gracefully.
{noformat}
[root@ozn-lease111-3 ~]# ozone sh bucket info vol1/bucket1
{
  "metadata" : { },
  "volumeName" : "vol1",
  "name" : "bucket1",
  "storageType" : "DISK",
  "versioning" : false,
  "usedBytes" : 253311,
  "usedNamespace" : 131,
  "creationTime" : "2023-06-25T09:18:45.283Z",
  "modificationTime" : "2023-06-25T10:58:02.310Z",
  "sourcePathExist" : true,
  "quotaInBytes" : -1,
  "quotaInNamespace" : -1,
  "bucketLayout" : "FILE_SYSTEM_OPTIMIZED",
  "link" : false,
  "replicationConfig" : {
    "data" : 3,
    "parity" : 2,
    "ecChunkSize" : 1048576,
    "codec" : "RS",
    "replicationType" : "EC",
    "requiredNodes" : 5
  }
}{noformat}

Error log from RegionServer:
{noformat}
2023-06-25 12:04:54,912 WARN org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufLogWriter: Init output failed, path=ofs://ozone1/vol1/bucket1/hbase/MasterData/WALs/ozn-lease111-4.ozn-lease111.root.hwx.site,22001,1687694687449/ozn-lease111-4.ozn-lease111.root.hwx.site%2C22001%2C1687694687449.1687694694845
org.apache.hadoop.hbase.util.CommonFSUtils$StreamLacksCapabilityException: hflush
        at org.apache.hadoop.hbase.io.asyncfs.AsyncFSOutputHelper.createOutput(AsyncFSOutputHelper.java:71)
        at org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.initOutput(AsyncProtobufLogWriter.java:190)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufLogWriter.init(AbstractProtobufLogWriter.java:160)
        at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createAsyncWriter(AsyncFSWALProvider.java:116)
        at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:726)
        at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:129)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:886)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:575)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.init(AbstractFSWAL.java:516)
        at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:160)
        at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:62)
        at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:295)
        at org.apache.hadoop.hbase.master.region.MasterRegion.createWAL(MasterRegion.java:211)
        at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:304)
        at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:424)
        at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:122)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:848)
        at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2216)
        at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528)
        at java.lang.Thread.run(Thread.java:748)
2023-06-25 12:04:54,913 ERROR org.apache.hadoop.hbase.wal.AsyncFSWALProvider: The RegionServer async write ahead log provider relies on the ability to call hflush for proper operation during component failures, but the current FileSystem does not support doing so. Please check the config value of 'hbase.wal.dir' and ensure it points to a FileSystem mount that has suitable capabilities for output streams.
2023-06-25 12:04:54,920 ERROR org.apache.hadoop.hbase.master.HMaster: Failed to become active master
org.apache.hadoop.hbase.FailedCloseWALAfterInitializedErrorException: Failed close after init wal failed.
        at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:167)
        at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:62)
        at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:295)
        at org.apache.hadoop.hbase.master.region.MasterRegion.createWAL(MasterRegion.java:211)
        at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:304)
        at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:424)
        at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:122)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:848)
        at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2216)
        at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.shutdown(AbstractFSWAL.java:986)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.close(AbstractFSWAL.java:1013)
        at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:165)
        ... 10 more
Caused by: java.lang.NullPointerException
        at java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011)
        at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006)
        at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.closeWriter(AsyncFSWAL.java:750)
        at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.doShutdown(AsyncFSWAL.java:807)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL$2.call(AbstractFSWAL.java:958)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL$2.call(AbstractFSWAL.java:953)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        ... 1 more
2023-06-25 12:04:54,922 ERROR org.apache.hadoop.hbase.master.HMaster: ***** ABORTING master ozn-lease111-4.ozn-lease111.root.hwx.site,22001,1687694687449: Unhandled exception. Starting shutdown. *****
org.apache.hadoop.hbase.FailedCloseWALAfterInitializedErrorException: Failed close after init wal failed.
        at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:167)
        at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:62)
        at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:295)
        at org.apache.hadoop.hbase.master.region.MasterRegion.createWAL(MasterRegion.java:211)
        at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:304)
        at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:424)
        at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:122)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:848)
        at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2216)
        at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.shutdown(AbstractFSWAL.java:986)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.close(AbstractFSWAL.java:1013)
        at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:165)
        ... 10 more{noformat}
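
The innermost NullPointerException in the trace above is thrown by ConcurrentHashMap.put, which rejects null keys and null values. The following is a minimal Java sketch, not HBase code, of that failure mode and of one way a close path could guard against it, so that only the original hflush capability error is reported. The class, field, and method names are hypothetical. It is also an assumption that a writer which was never initialized is what reached the map in AsyncFSWAL.closeWriter; the trace alone does not show whether the key or the value was null.

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullSafeCloseSketch {
    // Hypothetical stand-in for a registry of writers that are being closed.
    // This is NOT the actual field used by HBase.
    static final ConcurrentHashMap<String, Object> closing = new ConcurrentHashMap<>();

    // Unguarded variant: ConcurrentHashMap.put throws NullPointerException
    // when either the key or the value is null.
    static void unguardedRegister(String path, Object writer) {
        closing.put(path, writer);
    }

    // Guarded variant: if the writer (or its path) was never set because
    // initialization failed, there is nothing to close, so skip the put.
    // Returns true only when an entry was actually registered.
    static boolean guardedRegister(String path, Object writer) {
        if (path == null || writer == null) {
            return false;
        }
        closing.put(path, writer);
        return true;
    }

    public static void main(String[] args) {
        try {
            unguardedRegister("wal-1", null); // writer was never created
            System.out.println("unguarded: registered");
        } catch (NullPointerException e) {
            System.out.println("unguarded: NullPointerException");
        }
        System.out.println("guarded, null writer: " + guardedRegister("wal-1", null));
        System.out.println("guarded, real writer: " + guardedRegister("wal-2", new Object()));
    }
}
```

With a guard of this kind, a failed WAL initialization would leave the close path with nothing to do, and the StreamLacksCapabilityException would remain the only error the operator has to act on.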


