Posted to issues@ozone.apache.org by "Wei-Chiu Chuang (Jira)" <ji...@apache.org> on 2024/04/09 17:36:00 UTC

[jira] [Assigned] (HDDS-10649) [LeaseRecovery] Auto Lease recovery failed when Hard limit is expired.

     [ https://issues.apache.org/jira/browse/HDDS-10649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang reassigned HDDS-10649:
--------------------------------------

    Assignee: Ashish Kumar

> [LeaseRecovery] Auto Lease recovery failed when Hard limit is expired.
> ----------------------------------------------------------------------
>
>                 Key: HDDS-10649
>                 URL: https://issues.apache.org/jira/browse/HDDS-10649
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: OM
>            Reporter: Pratyush Bhatt
>            Assignee: Ashish Kumar
>            Priority: Major
>
> Below are the hard limit and related configs:
> {code:java}
> ozone getconf -confKey ozone.om.lease.hard.limit
> 8m
> ozone getconf -confKey ozone.om.open.key.cleanup.service.interval
> 5m
> ozone getconf -confKey ozone.om.open.key.expire.threshold
> 6m{code}
> Created a file {_}/hsyncvol/hsyncbuck/hsync/File_0.txt{_}, wrote some data to it, called hsync, and then kept it open. The final modification was at _2024-04-04T16:12:39_:
> {code:java}
> {
>   "volumeName" : "hsyncvol",
>   "bucketName" : "hsyncbuck",
>   "name" : "hsync/File_0.txt",
>   "dataSize" : 26214400,
>   "creationTime" : "2024-04-04T16:12:38.263Z",
>   "modificationTime" : "2024-04-04T16:12:39.660Z",
>   "replicationConfig" : {
>     "replicationFactor" : "THREE",
>     "requiredNodes" : 3,
>     "replicationType" : "RATIS"
>   },
>   "metadata" : {
>     "hsyncClientId" : "112213829764055054"
>   },
>   "ozoneKeyLocations" : [ {
>     "containerID" : 11,
>     "localID" : 113750153625603015,
>     "length" : 26214400,
>     "offset" : 0,
>     "keyOffset" : 0
>   } ],
>   "file" : true
> } {code}
> More than an hour has passed and the file is still in the OpenKeyTable:
> {code:java}
> > date
> Thu Apr  4 17:22:06 UTC 2024
> > ozone admin om lof --service-id=ozone1712158888  --prefix=/hsyncvol/hsyncbuck/
> 0 total open files (est.). Showing 1 open files (limit 100) under path prefix:
>   /hsyncvol/hsyncbuck/
> Client ID             Creation time    Hsync'ed    Open File Path
> 112213829764055054    1712247158263    Yes         /hsyncvol/hsyncbuck/-9223372036851973887/File_0.txt
> Reached the end of the list. {code}
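
The timeline implied by the values quoted above can be checked with a short calculation. This is only a sketch: it assumes the hard limit is counted from the key's last modification time, which the report does not state explicitly, and the variable names are illustrative rather than taken from Ozone.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# Values copied from the report.
HARD_LIMIT = timedelta(minutes=8)  # ozone.om.lease.hard.limit
last_modified = datetime(2024, 4, 4, 16, 12, 39, 660000, tzinfo=UTC)  # modificationTime
checked_at = datetime(2024, 4, 4, 17, 22, 6, tzinfo=UTC)  # `date` output before the lof call
lof_creation_ms = 1712247158263  # "Creation time" column in the lof output

# Assumption: the hard limit starts counting at the last modification time.
hard_limit_expiry = last_modified + HARD_LIMIT

# How long the key had remained open past that point when lof was run.
overdue = checked_at - hard_limit_expiry

# The lof column is epoch milliseconds; integer arithmetic avoids float rounding.
lof_creation = datetime(1970, 1, 1, tzinfo=UTC) + timedelta(milliseconds=lof_creation_ms)

print("hard limit expired at:", hard_limit_expiry.isoformat())
print("overdue when checked: ", overdue)
print("lof creation time:    ", lof_creation.isoformat())
```

Under that assumption the hard limit passed at 16:20:39.660 UTC, so the key was about 1 hour and 1 minute overdue when it was still listed as open. The converted lof creation time matches the `creationTime` field in the key info, which suggests both outputs describe the same open key.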
> Checked the OM leader logs; errors like the following recur every 5 minutes:
> {code:java}
> 2024-04-04 17:18:17,437 ERROR [om74-OMStateMachineApplyTransactionThread - 0]-org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest: Key committed failed. Volume:hsyncvol, Bucket:hsyncbuck, Key:File_0.txt. Exception:{}
> KEY_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Failed to commit key, as /-9223372036851974912/-9223372036851974400/-9223372036851974400/File_0.txt/112213829764055054 entry is not found in the OpenKey table
>     at org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequestWithFSO.validateAndUpdateCache(OMKeyCommitRequestWithFSO.java:163)
>     at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.lambda$0(OzoneManagerRequestHandler.java:406)
>     at org.apache.hadoop.util.MetricUtil.captureLatencyNs(MetricUtil.java:45)
>     at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequestImpl(OzoneManagerRequestHandler.java:404)
>     at org.apache.hadoop.ozone.protocolPB.RequestHandler.handleWriteRequest(RequestHandler.java:63)
>     at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:525)
>     at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:343)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> .
> .
> .
> 2024-04-04 17:23:17,436 ERROR [om74-OMStateMachineApplyTransactionThread - 0]-org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest: Key committed failed. Volume:hsyncvol, Bucket:hsyncbuck, Key:File_0.txt. Exception:{}
> KEY_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Failed to commit key, as /-9223372036851974912/-9223372036851974400/-9223372036851974400/File_0.txt/112213829764055054 entry is not found in the OpenKey table
>     at org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequestWithFSO.validateAndUpdateCache(OMKeyCommitRequestWithFSO.java:163)
>     at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.lambda$0(OzoneManagerRequestHandler.java:406)
>     at org.apache.hadoop.util.MetricUtil.captureLatencyNs(MetricUtil.java:45)
>     at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequestImpl(OzoneManagerRequestHandler.java:404)
>     at org.apache.hadoop.ozone.protocolPB.RequestHandler.handleWriteRequest(RequestHandler.java:63)
>     at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:525)
>     at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:343)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> .
> .
> .
> . {code}
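
The spacing of the two error entries shown above can be compared with the configured cleanup interval. This is a rough check using only the two timestamps quoted in the log excerpt (entries between them are elided in the report). A match would be consistent with the errors being driven by the open key cleanup service, but it does not prove it.

```python
from datetime import datetime, timedelta

CLEANUP_INTERVAL = timedelta(minutes=5)  # ozone.om.open.key.cleanup.service.interval

# Timestamps of the two ERROR entries quoted from the OM leader log.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
error_times = ["2024-04-04 17:18:17,437", "2024-04-04 17:23:17,436"]

parsed = [datetime.strptime(t, LOG_TIME_FORMAT) for t in error_times]

# Gap between the two entries and its deviation from the configured interval.
gap = parsed[1] - parsed[0]
drift = abs(gap - CLEANUP_INTERVAL)

print("gap between errors:", gap)
print("deviation from 5m: ", drift)
```

The two entries are one millisecond short of five minutes apart, which lines up with the 5m cleanup service interval rather than with the 8m hard limit or the 6m expire threshold.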
> cc: [~weichiu] , [~Sammi] [~ashishk] 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)