Posted to dev@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2020/10/01 19:15:00 UTC

[jira] [Created] (HIVE-24221) Use vectorizable expression to combine multiple columns in semijoin bloom filters

Stamatis Zampetakis created HIVE-24221:
------------------------------------------

             Summary: Use vectorizable expression to combine multiple columns in semijoin bloom filters
                 Key: HIVE-24221
                 URL: https://issues.apache.org/jira/browse/HIVE-24221
             Project: Hive
          Issue Type: Improvement
          Components: Query Planning
         Environment: 

            Reporter: Stamatis Zampetakis
            Assignee: Stamatis Zampetakis


Currently, multi-column semijoin reducers use an n-ary call to GenericUDFMurmurHash to combine multiple values into one, which is then used as an entry in the bloom filter. However, no vectorized operator handles n-ary inputs, and the vectorized implementation of GenericUDFMurmurHash introduced in HIVE-23976 is no exception.

The goal of this issue is to find an alternative way of combining multiple values into a single bloom filter entry, using only expressions that can be vectorized.
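
As an illustration only, one possible direction is to rewrite the n-ary call as a chain of nested two-argument calls, so that each step takes a fixed number of inputs. The sketch below builds such an expression as a string. The class name, the method name, and the function name "hash" are hypothetical and not part of Hive, and nothing here implies that this is the approach the issue will adopt.

```java
import java.util.Arrays;
import java.util.List;

public class NestedHashSketch {

    // Hypothetical name for a two-argument hash function in the generated expression.
    private static final String HASH_FN = "hash";

    // Rewrites the column list [c1, c2, ..., cn] into the left-nested form
    // hash(...hash(hash(c1, c2), c3)..., cn), so every call is binary.
    // Assumption: a single column is returned as-is, with no hash call around it.
    static String nest(List<String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        String expr = columns.get(0);
        for (int i = 1; i < columns.size(); i++) {
            expr = HASH_FN + "(" + expr + ", " + columns.get(i) + ")";
        }
        return expr;
    }

    public static void main(String[] args) {
        // Three join columns become two nested binary calls.
        System.out.println(nest(Arrays.asList("a", "b", "c")));
    }
}
```

With this shape, the build side and the probe side of the semijoin must apply the same nesting order, since a nested hash generally depends on the order of its arguments.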



--
This message was sent by Atlassian Jira
(v8.3.4#803005)