You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/11/25 07:07:16 UTC

[GitHub] [pinot] richardstartin edited a comment on pull request #7828: Fix thread safety issue and add cache to EmptySegmentPruner

richardstartin edited a comment on pull request #7828:
URL: https://github.com/apache/pinot/pull/7828#issuecomment-978891304


   >  I actually considered that, and decided to go with the snapshot approach because we want to optimize the query runtime method as much as possible
   
   I don’t think this likely to be an optimisation unless the removeAll call is made frequently and the number of empty segments is large.
   
   I think when an optimisation makes the code much harder to read we need 4 bits of data:
   
   1. **evidence that there is a problem** - e.g. wall time thresholded tracing events around EmptySegmentPruner.prune, or a sampling profiler reporting lots of samples in this method
   2. **evidence that the thing being optimised solves the problem** - assuming we have data that there is a problem, is it the removeAll call (which this change may optimise) or the copy on line 151 which is the cause?
   3. **evidence the problem is common** - e.g. statistics about number of empty segments from a real cluster, do we have a metric for this?
   4. **evidence the optimisation is effective** - for widely observed set sizes, does it actually make a difference to have a HashSet instead of a concurrent HashSet?
   
   I raise this only because the code is *much* harder to read than if it used a threadsafe Set. If it were as readable I would be inclined to assume that the optimisation is effective.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org