Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/02/01 19:01:28 UTC

[GitHub] [hudi] prashantwason commented on pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

prashantwason commented on pull request #2496:
URL: https://github.com/apache/hudi/pull/2496#issuecomment-771082643


   > Do we need to enable metrics FS (time aware, size aware) as well by default whenever buffering is enabled?
   
   There are two parts to metrics in HUDI:
   1. Metrics in-memory (explained below)
   2. Publishing the metrics out (e.g., to Grafana). This needs to be enabled explicitly (it is disabled by default) and requires external infrastructure.
   
   Metrics within HUDI are implemented using Registry, which simply maintains the key-value metric pairs in memory. Each metric is an AtomicLong held in an in-memory HashMap.
   
   Therefore the overhead of incrementing a metric is:
   1. HashMap lookup to find the Counter
   2. AtomicLong.addAndGet()
   
   So this should be negligible overhead on modern processors unless we are maintaining millions of metrics.
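
   To illustrate the two steps above, here is a minimal sketch of a counter registry. `SimpleRegistry` and its method names are hypothetical and are not HUDI's actual Registry implementation; the sketch also assumes a ConcurrentHashMap so that it is safe to call from multiple threads.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for an in-memory metrics registry (not HUDI's Registry class).
final class SimpleRegistry {
    // Metric name -> counter value, held entirely in memory.
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Step 1: map lookup to find the counter (created on first use).
    // Step 2: AtomicLong.addAndGet() to apply the increment.
    long add(String name, long delta) {
        return counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    // Returns the current value, or 0 if the metric was never incremented.
    long get(String name) {
        AtomicLong counter = counters.get(name);
        return counter == null ? 0L : counter.get();
    }

    public static void main(String[] args) {
        SimpleRegistry registry = new SimpleRegistry();
        registry.add("bytesRead", 4096);
        registry.add("bytesRead", 1024);
        System.out.println(registry.get("bytesRead")); // prints 5120
    }
}
```

   Each call to `add` costs one hash lookup plus one atomic add, which is the per-increment overhead described above.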
   
   I feel that checking whether metrics are enabled everywhere (if-metrics-enabled-then-do-something) tends to make the code ugly, and the checks don't provide any performance benefit.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org