Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2021/02/28 16:15:00 UTC

[jira] [Created] (HADOOP-17553) FileSystem.close() to optionally log IOStats; save to local dir

Steve Loughran created HADOOP-17553:
---------------------------------------

             Summary: FileSystem.close() to optionally log IOStats; save to local dir
                 Key: HADOOP-17553
                 URL: https://issues.apache.org/jira/browse/HADOOP-17553
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs, fs/azure, fs/s3
    Affects Versions: 3.3.1
            Reporter: Steve Loughran


We could save the IOStats to a local temp dir as JSON (the snapshot is designed to be serializable, even has a test), with a unique name (iostats-stevel-s3a-bucket1-timestamp-random#.json ... etc). 
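
To make the naming idea concrete, here is a minimal sketch of how such a filename could be built. It uses only the JDK; IOStatsFileNames and its methods are hypothetical names for illustration, not existing Hadoop code, and the field order and 4-digit random suffix are assumptions.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class IOStatsFileNames {

  private IOStatsFileNames() {
  }

  /**
   * Build a name of the form
   * iostats-USER-SCHEME-BUCKET-TIMESTAMP-RANDOM.json
   * (layout assumed from the example in this issue).
   */
  static String uniqueName(String user, String scheme, String bucket,
      long timestampMillis, int random) {
    return String.format(Locale.ROOT, "iostats-%s-%s-%s-%d-%04d.json",
        user, scheme, bucket, timestampMillis, random);
  }

  /** Resolve a fresh name under the JVM's temp directory. */
  static Path newReportPath(String user, String scheme, String bucket) {
    int random = ThreadLocalRandom.current().nextInt(10_000);
    String name = uniqueName(user, scheme, bucket,
        System.currentTimeMillis(), random);
    return Paths.get(System.getProperty("java.io.tmpdir"), name);
  }
}
```

The timestamp plus random suffix is what keeps two filesystem instances closed in the same millisecond from colliding; a real implementation might prefer a UUID.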

We can collect these (Rajesh can, anyway), and then
* look for load on a specific bucket
* look at what happened at a specific time

The best bit: the IOStatisticsSnapshot aggregates counters, min/max/mean, so you could merge iostats-*-s3a-bucket1-*.json to get the IOStats of all principals working with a given bucket.
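
To illustrate why this kind of merge is well defined, here is a small stand-alone sketch of the aggregation rules: counters add, minimums take the min, maximums take the max, and means merge exactly if each is held as a (sum, sample count) pair. StatsSnapshotSketch is a hypothetical stand-in built on plain JDK maps; it is not the IOStatisticsSnapshot class and makes no claim about its internals.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a statistics snapshot; not the Hadoop class. */
final class StatsSnapshotSketch {
  final Map<String, Long> counters = new TreeMap<>();
  final Map<String, Long> minimums = new TreeMap<>();
  final Map<String, Long> maximums = new TreeMap<>();
  /** Each mean is stored as {sum, samples} so merging loses nothing. */
  final Map<String, long[]> means = new TreeMap<>();

  /** Record one observation: bump the counter, update min/max/mean. */
  void record(String key, long value) {
    counters.merge(key, 1L, Long::sum);
    minimums.merge(key, value, Long::min);
    maximums.merge(key, value, Long::max);
    long[] m = means.computeIfAbsent(key, k -> new long[2]);
    m[0] += value; // running sum
    m[1] += 1;     // sample count
  }

  /** Fold another snapshot into this one. */
  void aggregate(StatsSnapshotSketch other) {
    other.counters.forEach((k, v) -> counters.merge(k, v, Long::sum));
    other.minimums.forEach((k, v) -> minimums.merge(k, v, Long::min));
    other.maximums.forEach((k, v) -> maximums.merge(k, v, Long::max));
    other.means.forEach((k, v) -> {
      long[] m = means.computeIfAbsent(k, x -> new long[2]);
      m[0] += v[0];
      m[1] += v[1];
    });
  }

  /** Mean of all samples seen for the key, or 0.0 if there are none. */
  double mean(String key) {
    long[] m = means.get(key);
    return (m == null || m[1] == 0) ? 0.0 : (double) m[0] / m[1];
  }
}
```

Because every rule is associative and commutative, the files can be merged in any order and in any grouping with the same result.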

This will be local, so low cost: low enough that we could turn it on in production. All that's needed is collection of the stats from the local hosts (or they write to a shared mounted volume).
We will need some "hadoop iostats merge" command to take multiple files and merge them all together; print to screen or save to a new file. Straightforward as all the load and merge code is present.
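
The file-collection half of such a command could look roughly like this. It is a sketch only: IOStatsFileFinder is a hypothetical name, the filename pattern is the one proposed in this issue, and loading and merging of the matched files is left out.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class IOStatsFileFinder {

  private IOStatsFileFinder() {
  }

  /**
   * List the saved statistics files for one scheme and bucket.
   * Assumes the bucket name contains no glob metacharacters.
   */
  static List<Path> find(Path dir, String scheme, String bucket)
      throws IOException {
    String glob = "iostats-*-" + scheme + "-" + bucket + "-*.json";
    List<Path> matches = new ArrayList<>();
    // The glob is matched against the file name of each directory entry.
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
      for (Path p : stream) {
        matches.add(p);
      }
    }
    Collections.sort(matches); // stable order for reproducible output
    return matches;
  }
}
```

One caveat with this naming scheme: a bucket whose name itself starts with "bucket1-" would also match the bucket1 pattern, so a real command would probably want to read the bucket from inside the file rather than trust the name alone.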


Needs
* logging in FS.close
* new iostats CLI + docs, tests
* extend IOStatisticsSnapshot with list of <string, string> options for use in annotating saved logs (hostname, principal, jobID, ...). Don't know how to merge these.
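
On the open question of merging the annotations, one possible policy (an assumption, not a decision) is to keep only the entries on which both inputs agree, so a merged file retains a shared principal but drops differing hostnames. ContextMergeSketch is a hypothetical name, and null values are assumed not to occur.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of one merge policy; not part of Hadoop. */
final class ContextMergeSketch {

  private ContextMergeSketch() {
  }

  /** Keep only the key/value pairs that are identical in both maps. */
  static Map<String, String> merge(Map<String, String> left,
      Map<String, String> right) {
    Map<String, String> merged = new TreeMap<>();
    left.forEach((key, value) -> {
      if (value.equals(right.get(key))) {
        merged.put(key, value);
      }
    });
    return merged;
  }
}
```

Other policies are equally plausible, such as concatenating distinct values or dropping the map entirely on merge; this one just has the advantage of never stating something untrue about the merged data.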

If we are going to add a new context map to the IOStatisticsSnapshot then we MUST update it before 3.3.1 ships so as to avoid breaking the serialization format on the next release, especially the Java one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
