Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2021/02/28 16:15:00 UTC

[jira] [Assigned] (HADOOP-17553) FileSystem.close() to optionally log IOStats; save to local dir

     [ https://issues.apache.org/jira/browse/HADOOP-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran reassigned HADOOP-17553:
---------------------------------------

    Assignee: Mehakmeet Singh

> FileSystem.close() to optionally log IOStats; save to local dir
> ---------------------------------------------------------------
>
>                 Key: HADOOP-17553
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17553
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/azure, fs/s3
>    Affects Versions: 3.3.1
>            Reporter: Steve Loughran
>            Assignee: Mehakmeet Singh
>            Priority: Major
>
> We could save the IOStats to a local temp dir as JSON (the snapshot is designed to be serializable, and even has a test for it), under a unique name (iostats-stevel-s3a-bucket1-timestamp-random#.json ... etc).
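
A minimal sketch of how such a unique name might be built, in plain Java with no Hadoop dependencies. The class name IOStatsFileNames, the field order, the zero-padded four-digit random suffix and the sanitizing rule are all assumptions made for illustration; the description only gives one example name and leaves the exact layout open.

```java
import java.util.Locale;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class IOStatsFileNames {
  private IOStatsFileNames() {}

  /** Build iostats-USER-SCHEME-BUCKET-TIMESTAMP-RANDOM.json from explicit values. */
  static String uniqueName(String user, String scheme, String bucket,
      long timestampMillis, int random) {
    return String.format(Locale.ROOT, "iostats-%s-%s-%s-%d-%04d.json",
        sanitize(user), sanitize(scheme), sanitize(bucket),
        timestampMillis, random);
  }

  /** Same, using the current time and a random number in [0, 10000). */
  static String uniqueName(String user, String scheme, String bucket) {
    return uniqueName(user, scheme, bucket, System.currentTimeMillis(),
        ThreadLocalRandom.current().nextInt(10_000));
  }

  /** Replace every character outside [A-Za-z0-9._] with '_'. */
  static String sanitize(String s) {
    return s.replaceAll("[^A-Za-z0-9._]", "_");
  }
}
```

Replacing '-' inside a field keeps the dash as the only field separator, so a saved name can be split back into its fields; the cost is that a bucket name containing dashes no longer appears verbatim.
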
> We can collect these (Rajesh can, anyway), and then
> * look for load on a specific bucket
> * look at what happened at a specific time
> The best bit: the IOStatisticsSnapshot aggregates counters and min/max/mean values, so you could merge iostats-*-s3a-bucket1-*.json to get the IOStats of all principals working with a given bucket.
> This will be local, so low cost: low enough that we could turn it on in production. All that's needed is collection of the stats from the local hosts (or the hosts write to a shared mounted volume).
> We will need some "hadoop iostats merge" command to take multiple files and merge them all together, then print to screen or save to a new file. This is straightforward, as all the load and merge code is already present.
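
To make the merge semantics concrete, here is a small sketch using a simplified stand-in class. StatsSnapshot and its fields are hypothetical and are NOT the Hadoop IOStatisticsSnapshot API; the statistic names in any usage are illustrative only. The point it shows is that counters add, minimums and maximums combine with min and max, and a mean can only be merged exactly if it is stored as a sum plus a sample count rather than as a precomputed average.

```java
import java.util.Map;
import java.util.TreeMap;

/** Simplified stand-in for a statistics snapshot; NOT the Hadoop class. */
final class StatsSnapshot {
  final Map<String, Long> counters = new TreeMap<>();
  final Map<String, Long> minimums = new TreeMap<>();
  final Map<String, Long> maximums = new TreeMap<>();
  /** Each mean is held as {sum, samples} so that merging stays exact. */
  final Map<String, long[]> means = new TreeMap<>();

  /** Fold another snapshot into this one and return this. */
  StatsSnapshot aggregate(StatsSnapshot other) {
    other.counters.forEach((k, v) -> counters.merge(k, v, Long::sum));
    other.minimums.forEach((k, v) -> minimums.merge(k, v, Long::min));
    other.maximums.forEach((k, v) -> maximums.merge(k, v, Long::max));
    // Copy the array on first insert so the two snapshots never share state.
    other.means.forEach((k, v) -> means.merge(k, v.clone(),
        (a, b) -> new long[] {a[0] + b[0], a[1] + b[1]}));
    return this;
  }

  /** Mean of a statistic, or 0.0 if it is missing or has no samples. */
  double mean(String key) {
    long[] m = means.get(key);
    return (m == null || m[1] == 0) ? 0.0 : (double) m[0] / m[1];
  }
}
```

A merge command would then amount to loading each file into a snapshot and folding them together with aggregate(), in any order, since every operation here is commutative and associative.
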
> Needs
> * logging in FS.close
> * new iostats CLI + docs, tests
> * extend IOStatisticsSnapshot with a list of <string, string> options for annotating saved logs (hostname, principal, jobID, ...). I don't know how to merge these.
> If we are going to add a new context map to the IOStatisticsSnapshot then we MUST update it before 3.3.1 ships, so as to avoid breaking the serialization format in the next release, especially the Java one.
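
On the open question of how to merge the <string, string> options: one possible policy, offered only as a sketch and not as anything the issue has decided, is to keep an entry only when both inputs agree on its value. ContextMerge and mergeContext are hypothetical names, and the code assumes neither map contains null values.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical merge policy for annotation maps; not part of Hadoop. */
final class ContextMerge {
  private ContextMerge() {}

  /** Keep only the entries whose key and value are identical in both maps. */
  static Map<String, String> mergeContext(Map<String, String> a,
      Map<String, String> b) {
    Map<String, String> merged = new TreeMap<>();
    a.forEach((key, value) -> {
      if (value.equals(b.get(key))) {
        merged.put(key, value);
      }
    });
    return merged;
  }
}
```

With this policy, merging files from two hosts running the same job would drop hostname but keep principal and jobID, so the merged file is only annotated with what is true of every input. Alternatives, such as collecting all distinct values per key, would keep more information at the cost of a different value type.
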



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
