Posted to issues@kudu.apache.org by "Bankim Bhavsar (Jira)" <ji...@apache.org> on 2021/05/18 17:27:00 UTC

[jira] [Resolved] (KUDU-3286) Add special handling for empty strings for Bloom filter predicate push down

     [ https://issues.apache.org/jira/browse/KUDU-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bankim Bhavsar resolved KUDU-3286.
----------------------------------
    Fix Version/s: 1.15.0
       Resolution: Fixed

> Add special handling for empty strings for Bloom filter predicate push down
> ---------------------------------------------------------------------------
>
>                 Key: KUDU-3286
>                 URL: https://issues.apache.org/jira/browse/KUDU-3286
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.13.0
>            Reporter: Bankim Bhavsar
>            Assignee: Bankim Bhavsar
>            Priority: Major
>             Fix For: 1.15.0
>
>
> The fast hash used with Bloom filter predicate pushdown has special handling for nullptr.
> [https://github.com/apache/kudu/blob/master/src/kudu/util/hash_util.h#L95]
> However, there is no special handling for empty objects/strings. Fast hash for an empty string with seed=0 generates a hash value of 0. This doesn't set any bits in the Bloom filter, and as a result empty strings are reported as not present.
> Impala uses the direct Bloom filter approach and includes special handling for empty strings.
> [https://github.com/apache/impala/blob/master/be/src/runtime/raw-value.inline.h#L352]
> This leads to a discrepancy between Impala and Kudu and produces incorrect join results.
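
The zero hash described above can be reproduced with a minimal sketch of the publicly documented FastHash algorithm. This is not Kudu's or Impala's actual code: the function names Mix, FastHash64 and HashString, and the sentinel constant kHashValEmpty, are assumptions chosen for illustration, and the real implementations in hash_util.h and raw-value.inline.h may differ. The sketch shows why a zero-length input with seed=0 hashes to 0, and one possible way to avoid it by hashing a fixed sentinel in place of the empty string.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Mixing step of FastHash. Note that Mix(0) == 0.
inline uint64_t Mix(uint64_t h) {
  h ^= h >> 23;
  h *= 0x2127599bf4325c37ULL;
  h ^= h >> 47;
  return h;
}

// Sketch of 64-bit FastHash over `len` bytes of `buf`.
inline uint64_t FastHash64(const void* buf, size_t len, uint64_t seed) {
  const uint64_t m = 0x880355f21e6d1965ULL;
  const unsigned char* p = static_cast<const unsigned char*>(buf);
  // With len == 0 and seed == 0 the state starts at 0 and never changes.
  uint64_t h = seed ^ (len * m);
  size_t remaining = len;
  while (remaining >= 8) {
    uint64_t v;
    std::memcpy(&v, p, sizeof(v));  // avoids unaligned reads
    h ^= Mix(v);
    h *= m;
    p += 8;
    remaining -= 8;
  }
  if (remaining > 0) {
    uint64_t v = 0;
    for (size_t i = 0; i < remaining; ++i) {
      v ^= static_cast<uint64_t>(p[i]) << (8 * i);
    }
    h ^= Mix(v);
    h *= m;
  }
  return Mix(h);
}

// Hypothetical sentinel hashed in place of an empty string; the constant
// actually used by Kudu or Impala may be different.
constexpr uint32_t kHashValEmpty = 0x7dca7eee;

// Hashes a string, substituting the sentinel when the string is empty so
// that the result does not collapse to 0 for seed == 0.
inline uint64_t HashString(const std::string& s, uint64_t seed) {
  if (s.empty()) {
    return FastHash64(&kHashValEmpty, sizeof(kHashValEmpty), seed);
  }
  return FastHash64(s.data(), s.size(), seed);
}
```

Under these assumptions, FastHash64("", 0, 0) returns 0, while HashString("", 0) hashes the four sentinel bytes instead. Both sides of a join must apply the same substitution for the Bloom filter results to agree.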



--
This message was sent by Atlassian Jira
(v8.3.4#803005)