Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2021/01/14 14:06:01 UTC
[jira] [Updated] (HADOOP-17434) Improve S3A upload statistics collection from ProgressEvent callbacks
[ https://issues.apache.org/jira/browse/HADOOP-17434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-17434:
------------------------------------
Parent: HADOOP-17469 (was: HADOOP-16830)
> Improve S3A upload statistics collection from ProgressEvent callbacks
> ---------------------------------------------------------------------
>
> Key: HADOOP-17434
> URL: https://issues.apache.org/jira/browse/HADOOP-17434
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Priority: Minor
>
> Collection of S3A upload stats from ProgressEvent callbacks can be improved.
> There are two similar but different listener implementations:
> * org.apache.hadoop.fs.s3a.S3ABlockOutputStream.BlockUploadProgress
> * org.apache.hadoop.fs.s3a.ProgressableProgressListener. Used on simple PUT calls.
> Both call back into S3A FS to incrementWriteOperations; BlockUploadProgress also updates S3AInstrumentation/IOStatistics.
> * I'm not 100% confident that BlockUploadProgress is updating things (especially gauges of pending bytes) at the right time,
> * or that completion is being handled.
> * The other listener doesn't update S3AInstrumentation, so its numbers are lost.
> * There's no incremental updating during {{CommitOperations.uploadFileToPendingCommit()}}, which only calls Progressable.progress() once per block,
> * or in MultipartUploader.
> Proposed:
> * A single progress listener which updates BlockOutputStreamStatistics, used by all interfaces.
> * WriteOperations to help set this up for callers,
> * and its uploadPart API to take a Progressable (or the progress listener to use for uploading that part).
> * Multipart upload API to also take a Progressable; this would help distcp-like applications.
> Plus integration tests (ITests) to verify that the gauges come out right: at the end of each operation, the number of bytes pending upload == 0 and the number of bytes uploaded == the original size.
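
The proposal above could be sketched roughly as follows. This is a minimal illustration and not the S3A implementation: every type here (UploadProgressSketch, EventType, Stats, BlockListener) is a hypothetical stand-in, with Stats playing the role the issue assigns to BlockOutputStreamStatistics and EventType standing in for the SDK's progress event types. It assumes each block reports a start, optional byte-transfer events, and then either completion or failure.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch only: all types are hypothetical stand-ins, not Hadoop or AWS SDK classes. */
final class UploadProgressSketch {

  /** Stand-in for the progress event types a listener would react to. */
  enum EventType { PART_STARTED, BYTES_TRANSFERRED, PART_COMPLETED, PART_FAILED }

  /** Stand-in for the shared statistics that every upload path updates. */
  static final class Stats {
    private final AtomicLong bytesPending = new AtomicLong();
    private final AtomicLong bytesUploaded = new AtomicLong();

    long bytesPending() { return bytesPending.get(); }
    long bytesUploaded() { return bytesUploaded.get(); }
  }

  /** One listener per block; the same class would serve PUT, multipart and commit uploads. */
  static final class BlockListener {
    private final Stats stats;
    private final long blockSize;
    private boolean active;    // true between PART_STARTED and completion/failure
    private long transferred;  // bytes of this block already moved to "uploaded"

    BlockListener(Stats stats, long blockSize) {
      this.stats = stats;
      this.blockSize = blockSize;
    }

    synchronized void onEvent(EventType type, long bytes) {
      switch (type) {
        case PART_STARTED: {
          if (!active) {
            active = true;
            transferred = 0;
            stats.bytesPending.addAndGet(blockSize);
          }
          break;
        }
        case BYTES_TRANSFERRED: {
          if (active) {
            // Never move more than the block size from pending to uploaded.
            long n = Math.max(0, Math.min(bytes, blockSize - transferred));
            transferred += n;
            stats.bytesPending.addAndGet(-n);
            stats.bytesUploaded.addAndGet(n);
          }
          break;
        }
        case PART_COMPLETED: {
          if (active) {
            // Account for any bytes that were not reported incrementally.
            long rest = blockSize - transferred;
            transferred = blockSize;
            stats.bytesPending.addAndGet(-rest);
            stats.bytesUploaded.addAndGet(rest);
            active = false;
          }
          break;
        }
        case PART_FAILED: {
          if (active) {
            // Clear what is still pending and undo this block's partial upload count.
            stats.bytesPending.addAndGet(-(blockSize - transferred));
            stats.bytesUploaded.addAndGet(-transferred);
            transferred = 0;
            active = false;
          }
          break;
        }
      }
    }
  }

  public static void main(String[] args) {
    Stats stats = new Stats();
    BlockListener block = new BlockListener(stats, 100);
    block.onEvent(EventType.PART_STARTED, 0);
    block.onEvent(EventType.BYTES_TRANSFERRED, 60);
    block.onEvent(EventType.PART_COMPLETED, 0);
    System.out.println("pending=" + stats.bytesPending()
        + " uploaded=" + stats.bytesUploaded());
  }
}
```

Reconciling on completion and rolling back on failure is one way to get the invariant the issue asks the tests to check (pending == 0 and uploaded == original size at the end of each operation); whether a failed part should roll back the uploaded count is a design choice made here for illustration only.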
--
This message was sent by Atlassian Jira
(v8.3.4#803005)