Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/03/02 19:26:15 UTC

[GitHub] [druid] maytasm edited a comment on pull request #10900: Add query granularity to compaction task

maytasm edited a comment on pull request #10900:
URL: https://github.com/apache/druid/pull/10900#issuecomment-789152703


   > Assuming the doc changes are coming in a follow up change - we should make it clear what the tradeoffs are for rolling up data.
   > 
   > If someone accidentally rolls up data to a coarser query granularity (MONTH -> YEAR) - do they have any way to get the segments with the finer queryGranularity back (MONTH)?
   > 
   > > Note that the Compaction task ultimately still converts CompactionGranularitySpec to a UniformGranularitySpec when creating the index ingestion spec
   > 
   > I don't understand the granularity specs enough. What's the impact of using UniformGranularitySpec instead of ArbitraryGranularitySpec? Is this something a user should be aware of?
   
   The doc change is coming in a separate PR (https://github.com/apache/druid/pull/10935). I will make it clear that if someone rolls up data to a coarser query granularity (MONTH -> YEAR), the segments with the finer queryGranularity (MONTH) will be overshadowed. Those segments may be removed from deep storage if a kill task is run on those intervals. Hence, a user can lose the data with the finer queryGranularity (MONTH).
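
To illustrate why the finer granularity cannot be recovered from the rolled-up data alone, here is a minimal Python sketch. It is not Druid code: `truncate` and `roll_up` are hypothetical helpers, and the model assumes a single summed metric per timestamp bucket.

```python
from collections import Counter
from datetime import datetime


def truncate(ts, granularity):
    """Floor a timestamp to the start of its MONTH or YEAR bucket."""
    if granularity == "MONTH":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if granularity == "YEAR":
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported granularity: {granularity}")


def roll_up(rows, granularity):
    """Sum the counts of rows whose truncated timestamps coincide."""
    totals = Counter()
    for ts, count in rows:
        totals[truncate(ts, granularity)] += count
    return dict(totals)


# Hypothetical data already rolled up at MONTH granularity.
monthly = {
    datetime(2020, 1, 1): 10,
    datetime(2020, 2, 1): 20,
    datetime(2020, 7, 1): 5,
}

# Rolling up to YEAR collapses all three rows into a single row.
yearly = roll_up(monthly.items(), "YEAR")
```

In this toy model `yearly` holds one row for 2020 with the summed count, and nothing in that row says how the total was split across months. The monthly rows survive only for as long as the original segments do.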
   
   Regarding UniformGranularitySpec vs. ArbitraryGranularitySpec: this was a design choice of the existing implementation and was not changed in this PR. It is not something a user needs to be aware of. UniformGranularitySpec works much better in this case, as Druid will automatically bucket the interval according to the segmentGranularity. An ArbitraryGranularitySpec requires you to explicitly list out all bucket intervals.
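
As a rough illustration of that difference (again plain Python, not Druid's implementation), the sketch below derives month-sized buckets from a single interval, which is roughly what a uniform spec does, and compares them with the explicit list an arbitrary spec would need. The `month_buckets` helper and the sample interval are assumptions made for this example.

```python
from datetime import datetime


def month_buckets(start, end):
    """Split [start, end) into calendar-month intervals."""
    buckets = []
    # Snap to the first instant of the month containing `start`.
    cur = start.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    while cur < end:
        if cur.month == 12:
            nxt = cur.replace(year=cur.year + 1, month=1)
        else:
            nxt = cur.replace(month=cur.month + 1)
        buckets.append((cur, nxt))
        cur = nxt
    return buckets


# Uniform style: one interval plus a granularity; buckets are derived.
uniform = month_buckets(datetime(2020, 11, 1), datetime(2021, 2, 1))

# Arbitrary style: every bucket interval is written out by hand.
arbitrary = [
    (datetime(2020, 11, 1), datetime(2020, 12, 1)),
    (datetime(2020, 12, 1), datetime(2021, 1, 1)),
    (datetime(2021, 1, 1), datetime(2021, 2, 1)),
]
```

Both lists describe the same three buckets. The uniform form only needs the overall interval and the granularity, which is why it suits a task that already knows its segmentGranularity.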


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


