Posted to dev@hive.apache.org by "Deepak Jaiswal (JIRA)" <ji...@apache.org> on 2017/01/20 07:53:26 UTC

[jira] [Created] (HIVE-15676) Remove Bloom Filters from semi join reduction if it is too big.

Deepak Jaiswal created HIVE-15676:
-------------------------------------

             Summary: Remove Bloom Filters from semi join reduction if it is too big.
                 Key: HIVE-15676
                 URL: https://issues.apache.org/jira/browse/HIVE-15676
             Project: Hive
          Issue Type: Improvement
            Reporter: Deepak Jaiswal
            Assignee: Deepak Jaiswal


Bloom filters can become very large when the row count is high, and aggregating them in reducers can be even more expensive. For example, a bloom filter for 100M rows can be as large as 170MB, so aggregating 100 such filters in a reducer could take 17GB of memory.
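
To make the arithmetic concrete, here is a minimal sketch in Python. It is not Hive code: the function names, the false-positive probability, and the entry threshold are all hypothetical and chosen for illustration only. It uses the textbook sizing formula for a standard bloom filter, m = -n * ln(p) / (ln 2)^2 bits, which need not match the exact size Hive allocates (the 170MB figure above depends on settings this issue does not state).

```python
import math


def bloom_filter_bits(expected_entries: int, fpp: float) -> int:
    """Textbook optimal bit count for a standard bloom filter.

    m = -n * ln(p) / (ln 2)^2, rounded up to a whole bit.
    This is the generic formula, not necessarily Hive's sizing.
    """
    if expected_entries <= 0:
        raise ValueError("expected_entries must be positive")
    if not 0.0 < fpp < 1.0:
        raise ValueError("fpp must be strictly between 0 and 1")
    return math.ceil(-expected_entries * math.log(fpp) / (math.log(2) ** 2))


def aggregate_memory_bytes(filter_bytes: int, filter_count: int) -> int:
    """Memory needed if a reducer holds filter_count filters at once."""
    return filter_bytes * filter_count


def should_drop_filter(expected_entries: int, max_entries: int) -> bool:
    """Hypothetical guard: skip the bloom filter when the estimated
    row count exceeds a configured limit (max_entries is an assumed
    parameter, not a real Hive setting)."""
    return expected_entries > max_entries


if __name__ == "__main__":
    # The figures from the issue text: 100 filters of 170MB each.
    total = aggregate_memory_bytes(170 * 10**6, 100)
    print(f"100 x 170MB = {total / 10**9:.0f}GB")

    # Textbook size for 100M rows at an assumed 5% false-positive rate.
    bits = bloom_filter_bits(100_000_000, 0.05)
    print(f"textbook size at fpp=0.05: {bits / 8 / 10**6:.0f}MB")

    # Assumed limit of 50M entries; a 100M-row filter would be dropped.
    print(should_drop_filter(100_000_000, 50_000_000))
```

A guard of this kind trades a possibly larger probe-side scan for bounded reducer memory, since the semi join reduction is an optimization and the query stays correct without the filter.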



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)