You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/06/18 18:13:30 UTC

[GitHub] [incubator-druid] gianm edited a comment on issue #7588: multi-value string column support for expressions

gianm edited a comment on issue #7588: multi-value string column support for expressions
URL: https://github.com/apache/incubator-druid/pull/7588#issuecomment-503250609
 
 
   > I think my main outstanding question for this PR is whether or not this new expression selector behavior should be optional for now, because there is a potential performance impact for expression selectors with bindings from row based selectors since they will have to use `RowBasedExpressionColumnValueSelector` with it's per row check of input binding type. I do hope to eliminate nearly all of these cases in future work by enhancing the type information throughout the query pipeline, but i'm not certain how far in the future that actually is and I haven't had the chance to conduct thorough measurements on performance difference, which is why I brought this up. Also I could see a use for turning off this behavior if certain there are no multi-value dimensions. On the other hand though, we have a lot of options, I'm not sure we need more. Maybe some of the improvements done in this PR, like argument validation for functions being done at parse time instead of for each eval, make up enough for the `RowBasedExpressionColumnValueSelector` phase that it about evens out. Thoughts?
   
   I wouldn't worry too much about this overhead, for two reasons:
   
   - Type check overhead should be very minimal in the grand scheme of everything that needs to happen per row during the input binding phase.
   - Right now the expression layer is quite inefficient due to not having types known in advance, and this is really just one more manifestation of that. We should be addressing this in a holistic way in the future & so I wouldn't worry too much about any particular instance right now. At this point, we should be able to eliminate the overhead completely.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org