Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2019/11/05 14:58:48 UTC

[GitHub] [incubator-iceberg] aokolnychyi opened a new issue #617: Codahale instrumentation

URL: https://github.com/apache/incubator-iceberg/issues/617
 
 
   I think it makes sense to consider using a metrics library to measure things like commit time, job planning time, and the time it takes to connect to the metastore. Codahale seems to be a very popular option for this.
   
   There are a couple of design decisions we need to make:
   
   - Should Iceberg be independent of query engines when it comes to metrics? For example, Spark has `MetricsSystem` with its own sources and sinks. One option is to create only a custom metrics source for Spark and register it through the existing Spark logic. The main question is whether this approach is flexible enough to work with arbitrary query engines. Alternatively, Iceberg could have its own `MetricsSystem` and its own way to register and report metrics.
   - Do we want to instrument `planFiles`/`planTasks`? If so, we will have to modify the iterators. Alternatively, we could say that metrics should be collected at the data source level. However, that restricts which metrics we can actually collect.
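
To make the two questions above more concrete, here is a rough sketch of what an engine-independent approach might look like. Everything in it is hypothetical: `MetricsReporter`, `InMemoryReporter`, `TimedIterable`, and the metric names are illustrative only and are not existing Iceberg, Spark, or Codahale APIs. It uses only the JDK; an adapter that forwards to Codahale or to a Spark metrics source could sit behind the same interface.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical engine-neutral sink. Engine-specific adapters would implement this.
interface MetricsReporter {
  void add(String name, long delta);
}

// Illustrative reporter that just accumulates values in memory.
final class InMemoryReporter implements MetricsReporter {
  private final Map<String, LongAdder> values = new ConcurrentHashMap<>();

  @Override
  public void add(String name, long delta) {
    values.computeIfAbsent(name, k -> new LongAdder()).add(delta);
  }

  long value(String name) {
    LongAdder adder = values.get(name);
    return adder == null ? 0L : adder.sum();
  }
}

// Wraps an Iterable so that time spent inside hasNext()/next() and the number
// of returned items are reported under "<name>.nanos" and "<name>.count".
final class TimedIterable<T> implements Iterable<T> {
  private final Iterable<T> delegate;
  private final String name;
  private final MetricsReporter reporter;

  TimedIterable(Iterable<T> delegate, String name, MetricsReporter reporter) {
    this.delegate = delegate;
    this.name = name;
    this.reporter = reporter;
  }

  @Override
  public Iterator<T> iterator() {
    Iterator<T> inner = delegate.iterator();
    return new Iterator<T>() {
      @Override
      public boolean hasNext() {
        long start = System.nanoTime();
        try {
          return inner.hasNext();
        } finally {
          reporter.add(name + ".nanos", System.nanoTime() - start);
        }
      }

      @Override
      public T next() {
        long start = System.nanoTime();
        try {
          T item = inner.next();
          reporter.add(name + ".count", 1L);
          return item;
        } finally {
          reporter.add(name + ".nanos", System.nanoTime() - start);
        }
      }
    };
  }
}

class PlanningMetricsSketch {
  public static void main(String[] args) {
    InMemoryReporter reporter = new InMemoryReporter();
    // A plain list stands in for the result of planFiles() here.
    Iterable<String> files =
        new TimedIterable<>(Arrays.asList("a.parquet", "b.parquet"), "scan.planFiles", reporter);
    for (String file : files) {
      System.out.println(file);
    }
    System.out.println("files planned: " + reporter.value("scan.planFiles.count"));
  }
}
```

The wrapper only measures time spent inside the delegate iterator, so lazily evaluated planning work is attributed to the scan rather than to the consumer. Collecting at the data source level would avoid touching the iterators but would not see this breakdown.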
