Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2020/03/03 19:56:00 UTC

[jira] [Commented] (HADOOP-16900) Very large files can be truncated when written through S3AFileSystem

    [ https://issues.apache.org/jira/browse/HADOOP-16900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17050512#comment-17050512 ] 

Steve Loughran commented on HADOOP-16900:
-----------------------------------------

This is bad. You are probably the first person writing quite so much data.

What should we do: fail fast without writing anything, or explicitly save the file as truncated before failing?

+[~gabor.bota]
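
As a strawman for the fail-fast option, something along these lines in the block output path would turn the silent truncation into a hard error. This is only a minimal sketch: {{PartLimitTracker}} and {{addPart()}} are invented names for illustration, not the actual S3ABlockOutputStream internals.

{code:java}
import java.io.IOException;

/**
 * Illustrative sketch only: refuse to queue a part once the S3 hard limit
 * of 10,000 parts per multipart upload would be exceeded, instead of
 * letting the upload complete silently truncated.
 * Class and method names are invented for this example.
 */
public final class PartLimitTracker {

  /** Hard limit on parts per multipart upload imposed by the S3 API. */
  private static final int MAX_PARTS = 10_000;

  private int partsQueued = 0;

  /**
   * Record that another part is about to be uploaded.
   * @throws IOException if uploading it would exceed the S3 part limit
   */
  public synchronized void addPart() throws IOException {
    if (partsQueued >= MAX_PARTS) {
      throw new IOException("Cannot upload more than " + MAX_PARTS
          + " parts in one multipart upload; the write would be truncated."
          + " Increase fs.s3a.multipart.size or fail the write here.");
    }
    partsQueued++;
  }
}
{code}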

> Very large files can be truncated when written through S3AFileSystem
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16900
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16900
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.2.1
>            Reporter: Andrew Olson
>            Priority: Major
>              Labels: s3
>
> If a written file's size exceeds 10,000 * {{fs.s3a.multipart.size}}, the S3 object is silently and corruptly truncated: the S3 API caps a multipart upload at 10,000 parts, and there is an apparent bug where exceeding that limit is not treated as fatal, so the multipart upload is still allowed to be marked as completed.
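
To make the size boundary concrete, here is a back-of-the-envelope sketch of the limit described in the report above. The 64M part size is only an example value, not a statement of the Hadoop default, and {{MultipartLimitCheck}} is an invented class name.

{code:java}
/**
 * Back-of-the-envelope check: the largest object a single multipart upload
 * can carry is 10,000 * fs.s3a.multipart.size.
 */
public class MultipartLimitCheck {
  public static void main(String[] args) {
    final long maxParts = 10_000L;                  // S3 API hard limit on parts
    final long partSizeBytes = 64L * 1024 * 1024;   // example fs.s3a.multipart.size = 64M
    final long maxObjectBytes = maxParts * partSizeBytes;

    // With 64 MB parts this works out to 625 GB; anything larger is at risk.
    System.out.printf("part size = %d MB, truncation risk above %d GB%n",
        partSizeBytes >> 20, maxObjectBytes >> 30);
  }
}
{code}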



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org