Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/03/11 05:19:00 UTC

[GitHub] [druid] suneet-s commented on pull request #10935: First refactor of compaction

suneet-s commented on pull request #10935:
URL: https://github.com/apache/druid/pull/10935#issuecomment-796461843


   I echo others' comments on this PR. This is a huge improvement - thank you @techdocsmith! I haven't verified the correctness of how exactly compaction works, or the details of the different tuning knobs.
   
   Some overall structural feedback (doesn't need to be addressed in this PR):
   - I think the data management doc should be broken into a few separate docs. With compaction pulled out of it, it feels like it would make a good landing page that points you to "Getting data in", "Optimizing data", "Updating data" (maybe), and "Deleting data". This is obviously beyond the scope of this PR, but I think it's worth mentioning because it adds structure around how to think about data and managing data in Druid.
   - Data management also talks about lookups, while the rest of the doc talks about datasources. This seemed a little out of place when I was reading locally. I don't have a suggestion for how to structure this right now, but wanted to surface it in case you had better ideas.
   - The compaction page currently talks about the "what". I wonder if it needs to be split into two pages (or sections): one that spells out the "why should I care / I want to do..." a little bit more, and another that spells out "how do I do that". Maybe it can be intertwined in the same page?
   - I really like the distinction between auto-compaction and manual compaction. However, the page doesn't link to anything that tells me how to use auto-compaction, though it does link to something about manual compaction. Are there instructions for auto-compaction elsewhere?
   - There are some known differences between auto-compaction and manual compaction. Support for queryGranularity is one right now. Do you think we should call this out in the section that talks about the differences between the two? This is tricky because it's effectively a gap in functionality, but it's a gotcha I think users will want to know about.

