Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2021/01/14 14:06:01 UTC

[jira] [Updated] (HADOOP-17434) Improve S3A upload statistics collection from ProgressEvent callbacks

     [ https://issues.apache.org/jira/browse/HADOOP-17434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-17434:
------------------------------------
    Parent: HADOOP-17469  (was: HADOOP-16830)

> Improve S3A upload statistics collection from ProgressEvent callbacks
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-17434
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17434
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> Collection of S3A upload statistics from ProgressEvent callbacks can be improved.
> There are two similar but different listener implementations:
> * org.apache.hadoop.fs.s3a.S3ABlockOutputStream.BlockUploadProgress
> * org.apache.hadoop.fs.s3a.ProgressableProgressListener. Used on simple PUT calls.
> Both call back into the S3A FS to incrementWriteOperations; BlockUploadProgress also updates S3AInstrumentation/IOStatistics.
> * I'm not 100% confident that BlockUploadProgress is updating things (especially the gauges of pending bytes) at the right time,
> * or that completion is being handled.
> * The other listener, ProgressableProgressListener, doesn't update S3AInstrumentation, so its numbers are lost.
> * There's no incremental updating during {{CommitOperations.uploadFileToPendingCommit()}}, which only calls Progressable.progress() once per block,
> * or in MultipartUploader.
> Proposed:
> * A single progress listener which updates BlockOutputStreamStatistics, used by all interfaces.
> * WriteOperations to help set this up for callers.
> * Its uploadPart API to take a Progressable (or the progress listener to use for uploading that part).
> * The multipart upload API to also take a Progressable, which would help distcp-like applications.
> Plus ITests (integration tests) to verify that the gauges come out right: at the end of each operation, the number of bytes pending upload == 0 and the number of bytes uploaded == the original size.
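
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Hadoop implementation: UploadProgressSketch, EventType, UploadStatistics and StatisticsProgressListener are hypothetical stand-ins for the AWS SDK progress events and for BlockOutputStreamStatistics, and the Runnable stands in for Progressable.progress(). It shows one listener moving bytes from a "pending" gauge to an "uploaded" counter, so that the two invariants the tests would check hold once a part completes.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these types are real Hadoop or AWS SDK classes.
class UploadProgressSketch {

  /** Stand-in for the SDK's progress event types. */
  enum EventType { PART_STARTED, BYTES_TRANSFERRED, PART_COMPLETED }

  /** Stand-in for the statistics every upload path should update. */
  static final class UploadStatistics {
    final AtomicLong bytesPendingUpload = new AtomicLong();
    final AtomicLong bytesUploaded = new AtomicLong();
  }

  /** One listener per uploaded part; shared by all callers (assumption). */
  static final class StatisticsProgressListener {
    private final UploadStatistics stats;
    private final Runnable progress;  // stand-in for Progressable.progress()
    private final long partSize;
    private long reported;            // bytes already moved to "uploaded"

    StatisticsProgressListener(UploadStatistics stats, Runnable progress,
        long partSize) {
      this.stats = stats;
      this.progress = progress;
      this.partSize = partSize;
    }

    synchronized void progressChanged(EventType type, long bytes) {
      switch (type) {
        case PART_STARTED:
          // The whole part is now pending upload.
          stats.bytesPendingUpload.addAndGet(partSize);
          break;
        case BYTES_TRANSFERRED:
          // Incremental update, capped so the part is never over-counted.
          moveToUploaded(Math.min(bytes, partSize - reported));
          break;
        case PART_COMPLETED:
          // Account for any bytes that were not reported incrementally.
          moveToUploaded(partSize - reported);
          break;
      }
      progress.run();  // report liveness on every callback
    }

    private void moveToUploaded(long delta) {
      reported += delta;
      stats.bytesPendingUpload.addAndGet(-delta);
      stats.bytesUploaded.addAndGet(delta);
    }
  }

  public static void main(String[] args) {
    UploadStatistics stats = new UploadStatistics();
    StatisticsProgressListener listener =
        new StatisticsProgressListener(stats, () -> { }, 10);
    listener.progressChanged(EventType.PART_STARTED, 0);
    listener.progressChanged(EventType.BYTES_TRANSFERRED, 4);
    listener.progressChanged(EventType.BYTES_TRANSFERRED, 6);
    listener.progressChanged(EventType.PART_COMPLETED, 0);
    // prints "pending=0 uploaded=10"
    System.out.println("pending=" + stats.bytesPendingUpload.get()
        + " uploaded=" + stats.bytesUploaded.get());
  }
}
```

Handling completion inside the listener means a caller that only ever receives the completion event (for example a simple PUT) still ends with pending == 0 and uploaded == the original size. Failure and retry handling are left out of the sketch.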



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
