You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2016/09/12 11:35:20 UTC

[jira] [Commented] (HADOOP-13560) S3ABlockOutputStream to support huge (many GB) file writes

    [ https://issues.apache.org/jira/browse/HADOOP-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483862#comment-15483862 ] 

Steve Loughran commented on HADOOP-13560:
-----------------------------------------

latest PR update includes documentation.

One feature we could also consider is caching write failures in the background threads and raising them in the output stream thread once one one has been caught and logged. This will not let the caller re-attempt the write but it will make it visible, *and it will make it visible fast*.

As it is, failures are delayed until the close operation and waitForAllPartUploads(), which is at risk of having the exception swallowed, or, at best, logged —so risking hiding the fact that a write has failed and that data has been lost.

> S3ABlockOutputStream to support huge (many GB) file writes
> ----------------------------------------------------------
>
>                 Key: HADOOP-13560
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13560
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>         Attachments: HADOOP-13560-branch-2-001.patch, HADOOP-13560-branch-2-002.patch
>
>
> An AWS SDK [issue|https://github.com/aws/aws-sdk-java/issues/367] highlights that metadata isn't copied on large copies.
> 1. Add a test to do that large copy/rname and verify that the copy really works
> 2. Verify that metadata makes it over.
> Verifying large file rename is important on its own, as it is needed for very large commit operations for committers using rename



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org