Posted to issues@ozone.apache.org by "Kaijie Chen (Jira)" <ji...@apache.org> on 2022/08/31 07:17:00 UTC

[jira] [Comment Edited] (HDDS-7187) EC: Retry failed writes before rewrite to a new block group

    [ https://issues.apache.org/jira/browse/HDDS-7187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598216#comment-17598216 ] 

Kaijie Chen edited comment on HDDS-7187 at 8/31/22 7:16 AM:
------------------------------------------------------------

I found that XceiverClient uses {{ozone.client.read.timeout}} as the timeout for each EC WriteChunk and PutBlock request,
and the error code under heavy load is {{DEADLINE_EXCEEDED}}. Increasing this timeout may also help.

{noformat}
java.util.concurrent.ExecutionException: java.util.concurrent.CompletionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 29.9996156685. [closed=[], committed=[buffered_nanos=1259694370, remote_addr=9.37.156.222/9.37.156.222:12009]]
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at org.apache.hadoop.ozone.client.io.ECBlockOutputStreamEntry.isFailed(ECBlockOutputStreamEntry.java:366)
at org.apache.hadoop.ozone.client.io.ECBlockOutputStreamEntry.getFailedStreams(ECBlockOutputStreamEntry.java:343)
at org.apache.hadoop.ozone.client.io.ECBlockOutputStreamEntry.streamsWithWriteFailure(ECBlockOutputStreamEntry.java:307)
at org.apache.hadoop.ozone.client.io.ECKeyOutputStream.handleParityWrites(ECKeyOutputStream.java:237)
at org.apache.hadoop.ozone.client.io.ECKeyOutputStream.encodeAndWriteParityCells(ECKeyOutputStream.java:206)
at org.apache.hadoop.ozone.client.io.ECKeyOutputStream.handleWrite(ECKeyOutputStream.java:383)
at org.apache.hadoop.ozone.client.io.ECKeyOutputStream.write(ECKeyOutputStream.java:163)
at org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:50)
at org.apache.hadoop.ozone.freon.ContentGenerator.write(ContentGenerator.java:77)
at org.apache.hadoop.ozone.freon.OzoneClientKeyGenerator.lambda$createKey$0(OzoneClientKeyGenerator.java:115)
at com.codahale.metrics.Timer.time(Timer.java:101)
at org.apache.hadoop.ozone.freon.OzoneClientKeyGenerator.createKey(OzoneClientKeyGenerator.java:112)
at org.apache.hadoop.ozone.freon.BaseFreonGenerator.tryNextTask(BaseFreonGenerator.java:184)
at org.apache.hadoop.ozone.freon.BaseFreonGenerator.taskLoop(BaseFreonGenerator.java:164)
at org.apache.hadoop.ozone.freon.BaseFreonGenerator.lambda$startTaskRunners$0(BaseFreonGenerator.java:147)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}
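
The trace above shows the gRPC status buried under {{ExecutionException}} and {{CompletionException}}. As a rough illustration only, the following sketch walks the cause chain and looks for the {{DEADLINE_EXCEEDED}} marker in the exception messages. The class and method names are hypothetical and not part of Ozone; real client code would more likely inspect the status code of the gRPC {{StatusRuntimeException}} rather than match on message text.

```java
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;

/** Hypothetical helper, not part of Ozone: detects a deadline failure in a wrapped exception. */
final class DeadlineCheck {

  private DeadlineCheck() {
  }

  /**
   * Returns true if any exception in the cause chain mentions DEADLINE_EXCEEDED.
   * Matching on message text is an assumption made to keep the sketch free of
   * gRPC dependencies.
   */
  static boolean isDeadlineExceeded(Throwable t) {
    // Bound the walk so a cyclic cause chain cannot loop forever.
    for (int depth = 0; t != null && depth < 16; depth++, t = t.getCause()) {
      String message = t.getMessage();
      if (message != null && message.contains("DEADLINE_EXCEEDED")) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Rebuild the same nesting as in the trace, with a plain RuntimeException
    // standing in for StatusRuntimeException.
    Throwable wrapped = new ExecutionException(
        new CompletionException(
            new RuntimeException("DEADLINE_EXCEEDED: deadline exceeded")));
    System.out.println(isDeadlineExceeded(wrapped));
  }
}
```

A check like this could let the client tell timeouts under load apart from other write failures before deciding how to react.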



was (Author: ckj996):
I found that XceiverClient uses {{ozone.client.read.timeout}} as the timeout for each EC WriteChunk and PutBlock request,
and the error code under heavy load is {{DEADLINE_EXCEEDED}}. Increasing this timeout may help.

{noformat}
java.util.concurrent.ExecutionException: java.util.concurrent.CompletionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 29.9996156685. [closed=[], committed=[buffered_nanos=1259694370, remote_addr=9.37.156.222/9.37.156.222:12009]]
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at org.apache.hadoop.ozone.client.io.ECBlockOutputStreamEntry.isFailed(ECBlockOutputStreamEntry.java:366)
at org.apache.hadoop.ozone.client.io.ECBlockOutputStreamEntry.getFailedStreams(ECBlockOutputStreamEntry.java:343)
at org.apache.hadoop.ozone.client.io.ECBlockOutputStreamEntry.streamsWithWriteFailure(ECBlockOutputStreamEntry.java:307)
at org.apache.hadoop.ozone.client.io.ECKeyOutputStream.handleParityWrites(ECKeyOutputStream.java:237)
at org.apache.hadoop.ozone.client.io.ECKeyOutputStream.encodeAndWriteParityCells(ECKeyOutputStream.java:206)
at org.apache.hadoop.ozone.client.io.ECKeyOutputStream.handleWrite(ECKeyOutputStream.java:383)
at org.apache.hadoop.ozone.client.io.ECKeyOutputStream.write(ECKeyOutputStream.java:163)
at org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:50)
at org.apache.hadoop.ozone.freon.ContentGenerator.write(ContentGenerator.java:77)
at org.apache.hadoop.ozone.freon.OzoneClientKeyGenerator.lambda$createKey$0(OzoneClientKeyGenerator.java:115)
at com.codahale.metrics.Timer.time(Timer.java:101)
at org.apache.hadoop.ozone.freon.OzoneClientKeyGenerator.createKey(OzoneClientKeyGenerator.java:112)
at org.apache.hadoop.ozone.freon.BaseFreonGenerator.tryNextTask(BaseFreonGenerator.java:184)
at org.apache.hadoop.ozone.freon.BaseFreonGenerator.taskLoop(BaseFreonGenerator.java:164)
at org.apache.hadoop.ozone.freon.BaseFreonGenerator.lambda$startTaskRunners$0(BaseFreonGenerator.java:147)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}


> EC: Retry failed writes before rewrite to a new block group
> -----------------------------------------------------------
>
>                 Key: HDDS-7187
>                 URL: https://issues.apache.org/jira/browse/HDDS-7187
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Kaijie Chen
>            Assignee: Kaijie Chen
>            Priority: Major
>
> When WriteChunk or PutBlock fails while writing a stripe, the client should retry the operation before rewriting the stripe to a new block, so that the block files on disk will be as large as possible.
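
The retry-before-rewrite idea in the description can be sketched roughly as follows. This is an illustration under stated assumptions, not the change proposed in HDDS-7187: the class, interface, enum, and parameter names are all hypothetical, and the real EC write path in Ozone is considerably more involved.

```java
/** Hypothetical sketch, not Ozone code: retry a stripe write before falling back. */
final class StripeWriteRetry {

  /** A single attempt to write the stripe (stands in for WriteChunk plus PutBlock). */
  interface StripeWrite {
    void run() throws Exception;
  }

  /** What the caller should do after the attempts are exhausted or succeed. */
  enum Outcome { WRITTEN, REWRITE_TO_NEW_BLOCK_GROUP }

  private StripeWriteRetry() {
  }

  /**
   * Tries the write up to maxAttempts times. Only when every attempt fails
   * does it tell the caller to rewrite the stripe to a new block group.
   */
  static Outcome writeWithRetry(StripeWrite write, int maxAttempts)
      throws InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        write.run();
        return Outcome.WRITTEN;
      } catch (InterruptedException e) {
        // Do not swallow interruption; let the caller handle it.
        throw e;
      } catch (Exception e) {
        System.err.println("attempt " + attempt + " failed: " + e.getMessage());
      }
    }
    return Outcome.REWRITE_TO_NEW_BLOCK_GROUP;
  }

  public static void main(String[] args) throws InterruptedException {
    int[] calls = {0};
    // Simulated write that times out twice and then succeeds.
    Outcome outcome = writeWithRetry(() -> {
      if (++calls[0] < 3) {
        throw new java.io.IOException("DEADLINE_EXCEEDED");
      }
    }, 3);
    System.out.println(outcome + " after " + calls[0] + " calls");
  }
}
```

Keeping the fallback as an explicit return value, instead of performing the rewrite inside the loop, leaves the decision about allocating a new block group with the caller.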



