You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/12/02 13:00:35 UTC

[GitHub] [iceberg] gaborkaszab commented on pull request #5837: API,Core: Introduce metrics for data files by file format

gaborkaszab commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-1335193062

   Thanks for the review on https://github.com/gaborkaszab/iceberg/commit/b268f65f3d588f30c637aa3db3f53be2a5f4c80d @nastra! I think I addressed most of them.
   
   The changes since the last patch:
   - Made the MultiDimensionCounter non-generic.
   - The MultiDimensionCounter receives a name in the constructor.
   - The key for the internal counters is created from the MultiDimensionCounter name plus the name of the dimension.
   - Made a MultiDimensionCounter interface plus a DefaultMultiDimensionCounter implementation for it.
   - There is a NOOP implementation as well.
   - The DefaultMultiDimensionCounter lives in ScanMetrics but is created through MetricsContext. I believe this is in line with how the other Counters and Timers are handled.
   - The counters for the dimensions still live within the DefaultMultiDimensionCounter internally, but they are also created through MetricsContext. I figured this is the most natural way to represent the multi-dimensions.
   - The MultiDimensionCounterResult has a function to aggregate the results for all dimensions.
   - Updated the existing tests to cover the new counters, also to/from Json conversions are taken care. All tests should pass now.
   - Wrote some new tests as well to have more coverage on the new code.
   
   What is remaining:
   Currently there is only support for one dimension. I still have to figure out how this would look like having more than one dimensions. I think it has to satisfy the following requirements:
   - The number of dimensions should be decided during instantiation of the multi-counter. Maybe constructor param?
   - The increase() and value() functions should make sure that they receive as many "dimension key"s as many dimensions were decided in the constructor.
   - The key for the internal counters could be constructed from the name of the multi-counter plus appending the name of the dimensions. I think this should be enough to have unique keys when creating through MetricsContext.
   - The order of the dimensions matters.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org