Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/05/21 11:24:39 UTC

[GitHub] [incubator-druid] clintropolis opened a new pull request #7719: add bloom filter fallback aggregator when types are unknown

clintropolis opened a new pull request #7719: add bloom filter fallback aggregator when types are unknown
URL: https://github.com/apache/incubator-druid/pull/7719
 
 
   I discovered an issue similar to #7660 while working on #7718 with the bloom filter aggregator: it is even stricter than the quantiles aggregator and does not work at all if `ColumnCapabilities` are not available. This PR remedies the issue by adding a fallback aggregator, `ObjectBloomFilterAggregator`, which examines the objects and aggregates to the best of its ability.
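
   To illustrate the idea of a fallback that inspects each object, here is a minimal sketch. It is not the code in this PR: `ToyBloomFilter`, `ObjectFallbackAggregatorSketch`, and all of their methods are hypothetical stand-ins, the hashing is deliberately simple, and the choices to widen `Float` to `double` and to fall back to `toString()` for other types are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Toy bloom filter used only for illustration; it is NOT Druid's filter class.
final class ToyBloomFilter {
    private static final int NUM_HASHES = 3;
    private final BitSet bits;
    private final int numBits;

    ToyBloomFilter(int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    void addLong(long v) { set(mix(v)); }
    void addDouble(double v) { set(mix(Double.doubleToLongBits(v))); }
    void addString(String s) { set(hashBytes(s.getBytes(StandardCharsets.UTF_8))); }
    void addNull() { set(hashBytes(new byte[0])); }

    boolean testLong(long v) { return test(mix(v)); }
    boolean testDouble(double v) { return test(mix(Double.doubleToLongBits(v))); }
    boolean testString(String s) { return test(hashBytes(s.getBytes(StandardCharsets.UTF_8))); }
    boolean testNull() { return test(hashBytes(new byte[0])); }

    // Derive NUM_HASHES bit positions from one 64-bit hash (double hashing).
    private void set(long h) {
        int h1 = (int) h;
        int h2 = (int) (h >>> 32);
        for (int i = 0; i < NUM_HASHES; i++) {
            bits.set(Math.floorMod(h1 + i * h2, numBits));
        }
    }

    private boolean test(long h) {
        int h1 = (int) h;
        int h2 = (int) (h >>> 32);
        for (int i = 0; i < NUM_HASHES; i++) {
            if (!bits.get(Math.floorMod(h1 + i * h2, numBits))) {
                return false;
            }
        }
        return true;
    }

    // 64-bit finalizer in the style of splitmix64.
    private static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    // FNV-1a over the bytes, followed by the finalizer above.
    private static long hashBytes(byte[] bytes) {
        long h = 0xCBF29CE484222325L;
        for (byte b : bytes) {
            h ^= (b & 0xFF);
            h *= 0x100000001B3L;
        }
        return mix(h);
    }
}

// Hypothetical fallback: with no type information up front, check every row's runtime type.
final class ObjectFallbackAggregatorSketch {
    private final ToyBloomFilter filter;

    ObjectFallbackAggregatorSketch(ToyBloomFilter filter) {
        this.filter = filter;
    }

    // Called once per row with whatever object the selector produced.
    void aggregate(Object value) {
        if (value == null) {
            filter.addNull();
        } else if (value instanceof Long) {
            filter.addLong((Long) value);
        } else if (value instanceof Double) {
            filter.addDouble((Double) value);
        } else if (value instanceof Float) {
            filter.addDouble(((Float) value).doubleValue()); // assumption: widen floats
        } else {
            filter.addString(value.toString()); // assumption: everything else as a string
        }
    }
}
```

   The point of the sketch is only the per-row `instanceof` chain, which is what lets aggregation proceed when no column type is known in advance.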
   
   This aggregator (and many others) could perhaps be improved by using something like a functional interface inside `bufferAdd`, so that the initial version of the function checks types and then locks in a selector-specialized function after the first non-null value. However, since I'm unsure whether the cost of the `if` is insignificant relative to the rest of the work, and since this is not the only aggregator that uses this per-row check, I am saving this optimization for future work that revisits complex value aggregators as a whole.
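
   A rough sketch of what that "lock in after the first non-null value" idea might look like follows. Everything here is hypothetical (`LockInAdderSketch`, its nested `Sink` interface, `RecordingSink`, and the `typeChecks` counter exist only for this example), and it assumes that a column's runtime type never changes once the first non-null value has been seen; if that assumption is violated, the locked-in function throws a `ClassCastException`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: detect the value type once, then reuse a specialized function.
final class LockInAdderSketch {
    // Hypothetical typed sink, standing in for a filter's typed add methods.
    interface Sink {
        void addLong(long v);
        void addDouble(double v);
        void addString(String s);
        void addNull();
    }

    private final Sink sink;
    private Consumer<Object> adder; // starts as the type-detecting version
    int typeChecks = 0;             // counts how often the instanceof chain ran

    LockInAdderSketch(Sink sink) {
        this.sink = sink;
        this.adder = this::detectAndAdd;
    }

    void add(Object value) {
        adder.accept(value);
    }

    // Initial version: nulls carry no type information, so keep detecting until
    // the first non-null value, then swap in a specialized function.
    private void detectAndAdd(Object value) {
        if (value == null) {
            sink.addNull();
            return;
        }
        typeChecks++;
        if (value instanceof Long) {
            adder = nullSafe(v -> sink.addLong((Long) v));
        } else if (value instanceof Double) {
            adder = nullSafe(v -> sink.addDouble((Double) v));
        } else {
            adder = nullSafe(v -> sink.addString(v.toString()));
        }
        adder.accept(value);
    }

    // The specialized function still needs a null check, but no instanceof chain.
    private Consumer<Object> nullSafe(Consumer<Object> typed) {
        return v -> {
            if (v == null) {
                sink.addNull();
            } else {
                typed.accept(v);
            }
        };
    }
}

// Records calls so the behavior can be inspected.
final class RecordingSink implements LockInAdderSketch.Sink {
    final List<String> calls = new ArrayList<>();

    public void addLong(long v) { calls.add("long:" + v); }
    public void addDouble(double v) { calls.add("double:" + v); }
    public void addString(String s) { calls.add("string:" + s); }
    public void addNull() { calls.add("null"); }
}
```

   Note that this trades the `instanceof` chain for an indirect call through the `Consumer`, so whether it is actually faster would need to be measured.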
   
   The added test only works for group by v2 because the bloom filter aggregator only has stub methods for its `ComplexMetricSerde`, which group by v1 requires to be more fully implemented in order to perform nested queries. With v1 this results in some confusing `Bloom filter aggregators are query-time only` error messages that should probably be fixed in a follow-up PR.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
