Posted to dev@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2014/08/05 20:02:12 UTC
[jira] [Created] (HIVE-7617) optimize bytes mapjoin hash table read
path wrt serialization, at least for common cases
Sergey Shelukhin created HIVE-7617:
--------------------------------------
Summary: optimize bytes mapjoin hash table read path wrt serialization, at least for common cases
Key: HIVE-7617
URL: https://issues.apache.org/jira/browse/HIVE-7617
Project: Hive
Issue Type: Improvement
Reporter: Sergey Shelukhin
The BytesBytes hash table stores keys in a byte array for a compact representation. However, that means the straightforward implementation of lookups has to serialize each lookup key to a byte array, which is relatively expensive.
We can either shortcut the hash code computation and the key comparison for common types on the read path (integral types, which would cover most real-world keys), or specialize the hash table and, from BytesBytes..., create LongBytes, StringBytes, or whatever. The first option seems simpler for now.
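
To make the first option concrete, here is a minimal sketch for a long key. It is not Hive code: the class LongKeyShortcut and all of its methods are hypothetical, the key encoding is assumed to be 8 plain big-endian bytes, and the hash is assumed to be the one java.util.Arrays.hashCode computes over a byte array. The real table may use a different encoding and hash function; the point is only that the hash and the comparison can be computed from the primitive value without allocating a serialized key.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Hive: illustrates shortcutting hash and
// compare for long keys instead of serializing the lookup key first.
final class LongKeyShortcut {
    private LongKeyShortcut() {}

    // Baseline path: serialize the key to 8 big-endian bytes (assumed encoding).
    static byte[] serialize(long key) {
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            out[i] = (byte) (key >>> (56 - 8 * i));
        }
        return out;
    }

    // Baseline hash over the serialized bytes (assumed hash function).
    static int hashSerialized(byte[] bytes) {
        return Arrays.hashCode(bytes);
    }

    // Shortcut: same value as hashSerialized(serialize(key)), with no allocation.
    static int hashLong(long key) {
        int h = 1;
        for (int shift = 56; shift >= 0; shift -= 8) {
            // Cast to byte so sign extension matches hashing a byte[] element.
            h = 31 * h + (byte) (key >>> shift);
        }
        return h;
    }

    // Shortcut: compare the key with the 8 bytes stored at the given offset.
    static boolean equalsLong(byte[] store, int offset, long key) {
        if (offset < 0 || offset + 8 > store.length) {
            return false;
        }
        long stored = 0;
        for (int i = 0; i < 8; i++) {
            stored = (stored << 8) | (store[offset + i] & 0xFFL);
        }
        return stored == key;
    }
}
```

A lookup would call hashLong to pick the slot and equalsLong to confirm the match against the stored key bytes. The shortcut hash must agree bit for bit with the hash used when the table was built, otherwise lookups land in the wrong slot.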