You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/05/01 02:08:48 UTC

[GitHub] [incubator-druid] gianm commented on issue #7571: Optimize coordinator API to retrieve segments with overshadowed status

gianm commented on issue #7571: Optimize coordinator API to retrieve segments with overshadowed status
URL: https://github.com/apache/incubator-druid/issues/7571#issuecomment-488175532
 
 
   The general idea of computing overshadowedness as few times as possible sounds good to me, and in particular computing it on `poll` sounds good. I think we might as well do it eagerly, though, since it's always going to be needed (DruidCoordinatorRuleRunner, for example, always computes the list of overshadowed segments so it can avoid loading them on historicals).
   
   I am skeptical of making DataSegment mutable, though. The alternative is a wrapper object: something like DataSegmentWithTimelineContext or DataSegmentWithOvershadowedState. The main motivation I could see for adding fields to DataSegment, vs. doing a wrapper object, is memory savings. (Is there some other reason I am missing?)
   
   It is nice, I think, that DataSegment objects are self-contained -- they don't depend on other context. 'isOvershadowed' by its nature depends on other context (namely, what _other_ DataSegments exist). It won't be possible to set the field correctly unless _all_ DataSegment objects for a particular datasource are considered. This won't be possible in all cases, which will make it unclear when the field is valid and when it's not, opening up new possible bugs. A wrapper encodes into the type system whether or not the contextual overshadowedness information has been computed or not.
   
   Furthermore, I am not convinced this tactic will be effective at minimizing memory use. As an example, the proposed `Map<DataSegment, DataSegment>` in `MetadataSegmentView` would need to keep two copies of each DataSegment, since the key and value are different objects. A wrapper would be cheaper, since the DataSegment object contained by different wrappers could be shared.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org