Posted to dev@druid.apache.org by GitBox <gi...@apache.org> on 2018/07/24 20:25:51 UTC

[GitHub] himanshug edited a comment on issue #6016: Druid 'Shapeshifting' Columns

himanshug edited a comment on issue #6016: Druid 'Shapeshifting' Columns
URL: https://github.com/apache/incubator-druid/pull/6016#issuecomment-407536584
 
 
   This is impressive.
   
   I haven't read the code yet, only the description, and I have a few questions.
   
   First, the "Time to select rows" benchmark appears to have a peak just before 3M rows. If that peak is at x M rows, the graph says that selecting (x + delta) rows is faster than selecting x rows. Is there an explanation for that? The peak shows up very consistently in all the similar graphs, though at different points on the x-axis.
   
   In "The bad" section, it seems ShapeShiftingColumn would outperform the current implementation in *all* cases if it could use blocks of varying sizes. That sounds great and deserves at least validating. If true, I think it is well worth it even with a little extra heap required, given that this feature already requires re-tuning heap and maybe other JVM params.
   
   Does the new design for reading data increase heap requirements? If yes, that would not be the end of the world, but it deserves a mention in the "bad" section (since it means re-tuning on historicals as well). Also, the new read design introduces mutation. I hope this doesn't mean requiring locking etc. in the face of concurrent segment reads, or else it might cause more problems than it solves.
   That said, it appears that minor changes here can whack the performance very easily (maybe even different JVM versions and hardware will show different performance). I am unsure whether the same code produces different performance on different JVM versions and hardware. This is a hard problem.
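   
   To make the locking concern concrete, here is a minimal sketch of one way mutation on the read path can stay lock-free: the segment data is shared and read-only, and the mutable decode state lives in a per-reader object that is never shared between threads. This is not the code in this PR. Every name here (`PerReaderStateSketch`, `SharedColumn`, `Reader`, `BLOCK_SIZE`) is hypothetical, and the "decoding" is just a copy of raw ints.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these types exist in Druid.
public class PerReaderStateSketch {
  static final int BLOCK_SIZE = 4; // values per block, chosen arbitrarily

  // Shared column data: written once in the constructor, read-only afterwards.
  static final class SharedColumn {
    private final ByteBuffer data;
    final int numValues;

    SharedColumn(int[] values) {
      ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES);
      for (int v : values) {
        buf.putInt(v);
      }
      buf.flip();
      this.data = buf.asReadOnlyBuffer();
      this.numValues = values.length;
    }

    // Each caller gets its own Reader holding its own mutable state.
    Reader makeReader() {
      return new Reader(data.duplicate(), numValues);
    }
  }

  // Mutable decode buffer, confined to whichever thread created the Reader.
  static final class Reader {
    private final ByteBuffer view;
    private final int numValues;
    private final int[] decoded = new int[BLOCK_SIZE];
    private int currentBlock = -1;

    Reader(ByteBuffer view, int numValues) {
      this.view = view;
      this.numValues = numValues;
    }

    int get(int row) {
      int block = row / BLOCK_SIZE;
      if (block != currentBlock) {
        loadBlock(block); // mutates only this reader's private state
      }
      return decoded[row % BLOCK_SIZE];
    }

    private void loadBlock(int block) {
      int start = block * BLOCK_SIZE;
      int end = Math.min(start + BLOCK_SIZE, numValues);
      for (int i = start; i < end; i++) {
        decoded[i - start] = view.getInt(i * Integer.BYTES); // absolute read
      }
      currentBlock = block;
    }
  }

  // Sums the whole column on several threads at once, with no locks.
  static long[] sumConcurrently(SharedColumn column, int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Long>> futures = new ArrayList<>();
      for (int t = 0; t < threads; t++) {
        Callable<Long> task = () -> {
          Reader reader = column.makeReader();
          long sum = 0;
          for (int row = 0; row < column.numValues; row++) {
            sum += reader.get(row);
          }
          return sum;
        };
        futures.add(pool.submit(task));
      }
      long[] sums = new long[threads];
      for (int t = 0; t < threads; t++) {
        sums[t] = futures.get(t).get();
      }
      return sums;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    int[] values = new int[10];
    for (int i = 0; i < values.length; i++) {
      values[i] = i;
    }
    for (long sum : sumConcurrently(new SharedColumn(values), 4)) {
      System.out.println(sum);
    }
  }
}
```

   The trade-off in a design like this is that each concurrent reader carries its own decode buffer, so the extra heap scales with the number of simultaneous readers, which ties back to the heap question above.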
