Posted to issues@ozone.apache.org by "Wei-Chiu Chuang (Jira)" <ji...@apache.org> on 2024/04/09 17:36:00 UTC
[jira] [Assigned] (HDDS-10649) [LeaseRecovery] Auto Lease recovery failed when Hard limit is expired.
[ https://issues.apache.org/jira/browse/HDDS-10649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang reassigned HDDS-10649:
--------------------------------------
Assignee: Ashish Kumar
> [LeaseRecovery] Auto Lease recovery failed when Hard limit is expired.
> ----------------------------------------------------------------------
>
> Key: HDDS-10649
> URL: https://issues.apache.org/jira/browse/HDDS-10649
> Project: Apache Ozone
> Issue Type: Bug
> Components: OM
> Reporter: Pratyush Bhatt
> Assignee: Ashish Kumar
> Priority: Major
>
> Below are the hard limit and related configs:
> {code:java}
> ozone getconf -confKey ozone.om.lease.hard.limit
> 8m
> ozone getconf -confKey ozone.om.open.key.cleanup.service.interval
> 5m
> ozone getconf -confKey ozone.om.open.key.expire.threshold
> 6m{code}
> Created a file {_}/hsyncvol/hsyncbuck/hsync/File_0.txt{_}, wrote some data to it, called hsync, and then kept it open. The last modification was at _2024-04-04T16:12:39_:
> {code:java}
> {
> "volumeName" : "hsyncvol",
> "bucketName" : "hsyncbuck",
> "name" : "hsync/File_0.txt",
> "dataSize" : 26214400,
> "creationTime" : "2024-04-04T16:12:38.263Z",
> "modificationTime" : "2024-04-04T16:12:39.660Z",
> "replicationConfig" : {
> "replicationFactor" : "THREE",
> "requiredNodes" : 3,
> "replicationType" : "RATIS"
> },
> "metadata" : {
> "hsyncClientId" : "112213829764055054"
> },
> "ozoneKeyLocations" : [ {
> "containerID" : 11,
> "localID" : 113750153625603015,
> "length" : 26214400,
> "offset" : 0,
> "keyOffset" : 0
> } ],
> "file" : true
> } {code}
> More than an hour has passed and the file is still in the OpenKeyTable:
> {code:java}
> > date
> Thu Apr 4 17:22:06 UTC 2024
> > ozone admin om lof --service-id=ozone1712158888 --prefix=/hsyncvol/hsyncbuck/
> 0 total open files (est.). Showing 1 open files (limit 100) under path prefix:
> /hsyncvol/hsyncbuck/
> Client ID Creation time Hsync'ed Open File Path
> 112213829764055054 1712247158263 Yes /hsyncvol/hsyncbuck/-9223372036851973887/File_0.txt
> Reached the end of the list. {code}
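As a quick sanity check on the timeline above, here is a minimal sketch (not part of the original report) that computes how far past the hard limit the key was when `lof` was run, and decodes the epoch creation time from the `lof` output. The class and method names are hypothetical; the timestamps and the 8m limit are copied from the report.

```java
import java.time.Duration;
import java.time.Instant;

public class LeaseAgeCheck {

    // How long a key has outlived the hard limit; negative if the limit has not been reached.
    static Duration overHardLimit(Instant lastModified, Instant observedAt, Duration hardLimit) {
        return Duration.between(lastModified, observedAt).minus(hardLimit);
    }

    public static void main(String[] args) {
        // modificationTime from the key info in the report
        Instant lastModified = Instant.parse("2024-04-04T16:12:39.660Z");
        // `date` output taken just before running `ozone admin om lof`
        Instant observedAt = Instant.parse("2024-04-04T17:22:06Z");
        // ozone.om.lease.hard.limit = 8m
        Duration hardLimit = Duration.ofMinutes(8);

        // "Creation time" column of the lof output, in epoch milliseconds
        Instant created = Instant.ofEpochMilli(1712247158263L);

        System.out.println("creation time from lof: " + created);
        System.out.println("minutes past hard limit: "
            + overHardLimit(lastModified, observedAt, hardLimit).toMinutes());
    }
}
```

With the values from this report, the key is about 61 minutes past the hard limit, and the epoch value in the `lof` output corresponds to the creationTime shown in the key info.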
> The OM leader logs show the following error recurring every 5 minutes:
> {code:java}
> 2024-04-04 17:18:17,437 ERROR [om74-OMStateMachineApplyTransactionThread - 0]-org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest: Key committed failed. Volume:hsyncvol, Bucket:hsyncbuck, Key:File_0.txt. Exception:{}
> KEY_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Failed to commit key, as /-9223372036851974912/-9223372036851974400/-9223372036851974400/File_0.txt/112213829764055054 entry is not found in the OpenKey table
> at org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequestWithFSO.validateAndUpdateCache(OMKeyCommitRequestWithFSO.java:163)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.lambda$0(OzoneManagerRequestHandler.java:406)
> at org.apache.hadoop.util.MetricUtil.captureLatencyNs(MetricUtil.java:45)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequestImpl(OzoneManagerRequestHandler.java:404)
> at org.apache.hadoop.ozone.protocolPB.RequestHandler.handleWriteRequest(RequestHandler.java:63)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:525)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:343)
> at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> .
> .
> .
> 2024-04-04 17:23:17,436 ERROR [om74-OMStateMachineApplyTransactionThread - 0]-org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest: Key committed failed. Volume:hsyncvol, Bucket:hsyncbuck, Key:File_0.txt. Exception:{}
> KEY_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Failed to commit key, as /-9223372036851974912/-9223372036851974400/-9223372036851974400/File_0.txt/112213829764055054 entry is not found in the OpenKey table
> at org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequestWithFSO.validateAndUpdateCache(OMKeyCommitRequestWithFSO.java:163)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.lambda$0(OzoneManagerRequestHandler.java:406)
> at org.apache.hadoop.util.MetricUtil.captureLatencyNs(MetricUtil.java:45)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequestImpl(OzoneManagerRequestHandler.java:404)
> at org.apache.hadoop.ozone.protocolPB.RequestHandler.handleWriteRequest(RequestHandler.java:63)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:525)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:343)
> at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> .
> .
> .
> . {code}
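Two details in the output above can be checked mechanically; the sketch below is one way to do it and is not part of the original report. It measures the gap between the two logged failures (for comparison with the 5m `ozone.om.open.key.cleanup.service.interval`), and it compares the parent ID in the key that the commit request looked up with the parent ID in the path that `lof` printed. The class and method names are hypothetical, and reading the third path segment as the parent directory ID is an assumption about the FSO key layout, not something the report states.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class CommitFailureCheck {

    // Timestamp layout used by the OM log lines quoted in the report.
    private static final DateTimeFormatter LOG_TS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Time between two log entries.
    static Duration gapBetween(String first, String second) {
        return Duration.between(
            LocalDateTime.parse(first, LOG_TS),
            LocalDateTime.parse(second, LOG_TS));
    }

    // ASSUMPTION: the third segment of the path is the parent directory ID.
    static long parentId(String path) {
        return Long.parseLong(path.split("/")[3]);
    }

    public static void main(String[] args) {
        Duration gap = gapBetween("2024-04-04 17:18:17,437", "2024-04-04 17:23:17,436");
        System.out.println("gap between failures (ms): " + gap.toMillis());

        // Key the commit request failed to find, copied from the exception message
        String missingKey = "/-9223372036851974912/-9223372036851974400/"
            + "-9223372036851974400/File_0.txt/112213829764055054";
        // Open file path printed by `ozone admin om lof`
        String listedPath = "/hsyncvol/hsyncbuck/-9223372036851973887/File_0.txt";

        System.out.println("parent ID in failed lookup: " + parentId(missingKey));
        System.out.println("parent ID in lof output:    " + parentId(listedPath));
    }
}
```

Under that assumption the two parent IDs differ, and the one in the failed lookup repeats the second segment of the same key. This may be related to the log line reporting the key as File_0.txt rather than hsync/File_0.txt, but the report itself does not confirm a root cause.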
> cc: [~weichiu] , [~Sammi] [~ashishk]
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)