Posted to issues@ozone.apache.org by "Xu Shao Hong (Jira)" <ji...@apache.org> on 2022/01/19 09:37:00 UTC

[jira] [Updated] (HDDS-6203) CleanUp gz files failed to be fully written during Container move

     [ https://issues.apache.org/jira/browse/HDDS-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xu Shao Hong updated HDDS-6203:
-------------------------------
    Description: 
Currently, during container re-replication, containers are sent over gRPC as gz files and written to a temporary directory. If the temporary directory is small, the concurrent threads downloading containers contend for the remaining space, since each of them writes its downloaded byte buffers to its own file through a FileOutputStream.

When a thread fails to write a buffer to its file, the current logic does not clean up the incomplete file; it only completes the response exceptionally, as the following code shows.

 
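{code:java}
// GrpcReplicationClient

@Override
public void onNext(CopyContainerResponseProto chunk) {
  try {
    // Append this chunk of the container archive to the local file.
    chunk.getData().writeTo(stream);
  } catch (IOException e) {
    // The failure is only propagated; the partially written file is left behind.
    response.completeExceptionally(e);
  }
}
{code}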
The exception is then caught in {{getContainerDataFromReplicas}}, where it is only logged as an error.

The incomplete files left behind by such failed writes therefore need to be cleaned up.
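
One possible shape of such a cleanup is sketched below. This is not the actual patch: CleaningChunkWriter is a hypothetical stand-in for the observer in GrpcReplicationClient, chunks are plain byte arrays instead of CopyContainerResponseProto, and the outputPath field is an assumption about how the client tracks the file it is writing.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

/**
 * Hypothetical stand-in for the response observer in GrpcReplicationClient.
 * Chunks are plain byte arrays here instead of CopyContainerResponseProto.
 */
class CleaningChunkWriter {
  private final Path outputPath;
  private final OutputStream stream;
  private final CompletableFuture<Path> response;

  CleaningChunkWriter(Path outputPath, OutputStream stream,
      CompletableFuture<Path> response) {
    this.outputPath = outputPath;
    this.stream = stream;
    this.response = response;
  }

  /** Writes one chunk; on failure, removes the partial file before failing the future. */
  void onNext(byte[] chunk) {
    try {
      stream.write(chunk);
    } catch (IOException e) {
      deletePartialFile();
      response.completeExceptionally(e);
    }
  }

  /** Closes the file and reports success once all chunks have arrived. */
  void onCompleted() {
    try {
      stream.close();
      response.complete(outputPath);
    } catch (IOException e) {
      deletePartialFile();
      response.completeExceptionally(e);
    }
  }

  private void deletePartialFile() {
    try {
      stream.close(); // release the handle so the delete can succeed on all platforms
    } catch (IOException ignored) {
      // Closing can fail for the same reason the write did; still try to delete.
    }
    try {
      Files.deleteIfExists(outputPath);
    } catch (IOException deleteFailure) {
      // Best effort only: a real fix would log this so the file can be removed later.
    }
  }
}
```

The delete happens before completeExceptionally so that a caller reacting to the failed future never observes the partial file. Whether the real client should also cancel the gRPC call, or run the same cleanup from its error callback, is outside the scope of this sketch.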

 

 

Following https://issues.apache.org/jira/browse/HDDS-5188, we should perhaps also improve the protocol in the future.

 

In addition, I reproduced this contention manually and confirmed that the incomplete files remain; an example can be seen in the attachment. For the test, I manually created and mounted a 5G disk to use as the temporary file directory.
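
The same situation can also be approximated without a small mounted disk. The sketch below is not the test that was actually run: SharedQuotaOutputStream and LeftoverFileDemo are hypothetical names, and the shared byte budget only simulates free space in the temporary directory. Two threads write through the budget the way the current onNext does, with no cleanup on failure.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical test helper: fails writes once a shared byte budget is exhausted. */
class SharedQuotaOutputStream extends OutputStream {
  private final OutputStream delegate;
  private final AtomicLong remaining; // shared by all writers, like free disk space

  SharedQuotaOutputStream(OutputStream delegate, AtomicLong remaining) {
    this.delegate = delegate;
    this.remaining = remaining;
  }

  @Override
  public void write(int b) throws IOException {
    write(new byte[] {(byte) b}, 0, 1);
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    // Reserve the bytes atomically; give them back if the budget is exceeded.
    if (remaining.addAndGet(-len) < 0) {
      remaining.addAndGet(len);
      throw new IOException("No space left on device (simulated)");
    }
    delegate.write(b, off, len);
  }

  @Override
  public void close() throws IOException {
    delegate.close();
  }
}

class LeftoverFileDemo {
  /** Writes chunks with no cleanup on failure; returns true if the download failed. */
  static boolean downloadWithoutCleanup(Path target, AtomicLong budget,
      int chunks, int chunkSize) {
    byte[] chunk = new byte[chunkSize];
    try (OutputStream out =
        new SharedQuotaOutputStream(Files.newOutputStream(target), budget)) {
      for (int i = 0; i < chunks; i++) {
        out.write(chunk);
      }
      return false;
    } catch (IOException e) {
      return true; // the failure is only reported; the partial file stays on disk
    }
  }

  public static void main(String[] args) throws Exception {
    Path dir = Files.createTempDirectory("replication-tmp");
    AtomicLong budget = new AtomicLong(10 * 1024); // simulated free space
    Path[] targets = {
        dir.resolve("container-1.tar.gz"), dir.resolve("container-2.tar.gz")};
    Thread[] workers = new Thread[targets.length];
    for (int i = 0; i < targets.length; i++) {
      Path target = targets[i];
      workers[i] = new Thread(() -> {
        boolean failed = downloadWithoutCleanup(target, budget, 8, 1024);
        System.out.println(target.getFileName() + " failed=" + failed);
      });
      workers[i].start();
    }
    for (Thread worker : workers) {
      worker.join();
    }
    for (Path target : targets) {
      System.out.println(target.getFileName() + " size=" + Files.size(target));
    }
  }
}
```

Because the two downloads together need more bytes than the budget allows, at least one of them fails, and its truncated file is still present in the directory afterwards. Which thread fails, and how many bytes it wrote first, depends on scheduling.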

 

  was:
Right now, the container re-replication will be sent with GRPC as gz files to the temporary dir. If the temporary dir is small, there will be a contest for space left that concurrent threads downloading the containers will compete to write the downloaded byte buffer to the actual files with FileOutputStream. 

Once the thread fails to write the buffer to file, the current logic will not clean up the failed and incomplete file and just complete exceptionally as the code shows.

 
{code:java}
GrpcReplicationClient

@Override
public void onNext(CopyContainerResponseProto chunk) {
  try {
    chunk.getData().writeTo(stream);
  } catch (IOException e) {
    response.completeExceptionally(e);
  }
} {code}
the exception will be caught at ```getContainerDataFromReplicas``` and only will be logged as an error.

Thus it is necessary to clean up the possible incomplete files which failed in such case.

(I have tested manually to mimic such a contest case and proved the incomplete files remained, the example could be seen in attachment)

 


> CleanUp gz files failed to be fully written during Container move
> -----------------------------------------------------------------
>
>                 Key: HDDS-6203
>                 URL: https://issues.apache.org/jira/browse/HDDS-6203
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Xu Shao Hong
>            Assignee: Xu Shao Hong
>            Priority: Major
>


