Posted to common-issues@hadoop.apache.org by "steveloughran (via GitHub)" <gi...@apache.org> on 2023/02/24 14:48:28 UTC

[GitHub] [hadoop] steveloughran commented on pull request #5402: HADOOP-18635 : Expose distcp counters to user via new DistCpConstants "CONF_LABEL_DI…

steveloughran commented on PR #5402:
URL: https://github.com/apache/hadoop/pull/5402#issuecomment-1443788914

   
   I really don't like how the results come back.
   
   I'm going to propose adding IOStatistics support to distcp instead. That lines it up for future work and avoids turning the source configuration into a two-way exchange of data.
   
   1. DistCp to implement IOStatisticsSource
   2. until the job finishes, getIOStatistics() to return null
   3. when the job has finished, build a statistics store and set each counter from the job's value:
   
   ```
   
   // build a statistics store with the counter registered
   IOStatisticsStore iostats = IOStatisticsBinding.iostatisticsStore()
     .withCounters(DISTCP_TOTAL_BYTES_COPIED)
     .build();

   // then set the counter to the value retrieved from the job
   iostats.setCounter(DISTCP_TOTAL_BYTES_COPIED, <counter>);
   ```
   
   This is extra work and you have to learn a new API, but
   
   * IOStatisticsAssertions has the asserts
   * IOStatisticsLogging has pretty printing
   * you can take an IOStatisticsSnapshot and send it over the wire as JSON or as a Java-serialized object
   * lines it up perfectly for us collecting more detailed stats, not just from the workers (trickier...) but also cost of directory scanning, cleanup etc.
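
   To make the lifecycle in steps 1-3 concrete, here is a rough, self-contained sketch. It deliberately does not use the real Hadoop classes: `StatsView` and `StatsSource` are hypothetical stand-ins for `IOStatistics` and `IOStatisticsSource`, `DistCpStatsSketch` and `jobFinished()` are made-up names, and the string value of `DISTCP_TOTAL_BYTES_COPIED` is an assumption. It only shows the "null until the job finishes, read-only counters afterwards" contract.

   ```java
   import java.util.Collections;
   import java.util.HashMap;
   import java.util.Map;

   /** Hypothetical stand-in for a DistCp job that exposes statistics. */
   class DistCpStatsSketch implements StatsSource {

     // Assumed key name; the real constant would live in DistCp's constants.
     static final String DISTCP_TOTAL_BYTES_COPIED = "distcp_total_bytes_copied";

     // Null until the job completes; volatile so other threads see the update.
     private volatile StatsView stats;

     /** Called once the job is done, with the value read from the job counters. */
     void jobFinished(long bytesCopied) {
       Map<String, Long> counters = new HashMap<>();
       counters.put(DISTCP_TOTAL_BYTES_COPIED, bytesCopied);
       Map<String, Long> readOnly = Collections.unmodifiableMap(counters);
       stats = () -> readOnly;
     }

     /** Returns null while the job is still running. */
     @Override
     public StatsView getIOStatistics() {
       return stats;
     }

     public static void main(String[] args) {
       DistCpStatsSketch job = new DistCpStatsSketch();
       System.out.println("before finish: " + job.getIOStatistics());
       job.jobFinished(1024L);
       System.out.println("after finish: "
           + job.getIOStatistics().counters().get(DISTCP_TOTAL_BYTES_COPIED));
     }
   }

   /** Stand-in for IOStatistics: a read-only view of named counters. */
   interface StatsView {
     Map<String, Long> counters();
   }

   /** Stand-in for IOStatisticsSource. */
   interface StatsSource {
     StatsView getIOStatistics();
   }
   ```

   The caller never writes anything back into the job configuration; it just asks the finished job for its statistics and has to handle null if it asks too early.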
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

