Posted to issues@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/12/24 00:06:01 UTC

[jira] [Resolved] (IMPALA-2277) Investigate alternative hash functions

     [ https://issues.apache.org/jira/browse/IMPALA-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-2277.
-----------------------------------
    Resolution: Later

We switched to FastHash in some places.

* CRC is very cheap to evaluate and is hard to beat for short data types in performance-critical places like hash joins.
* FastHash is a good fit for mixed data where we want a higher-quality hash function, e.g. for data distribution in an exchange.
* Many other hash functions only show benefits on long strings.
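
For readers unfamiliar with what a CRC hash computes, here is a minimal reference sketch of CRC-32C (the Castagnoli polynomial, which is the variant the x86 SSE4.2 CRC32 instruction implements). It is a slow bit-at-a-time version meant only to show the computation; the "very cheap" property above comes from doing this in hardware. Whether Impala's CRC hash uses the same initial value, final XOR, and seeding is an assumption here, and bucket_for is a hypothetical helper, not an Impala function.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected form


def crc32c(data: bytes, seed: int = 0) -> int:
    """Bit-at-a-time CRC-32C with the conventional initial and final XOR."""
    crc = (seed ^ 0xFFFFFFFF) & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            if crc & 1:
                crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


def bucket_for(key: int, num_buckets: int) -> int:
    """Hypothetical helper: map a 64-bit integer key to a hash table bucket."""
    raw = key.to_bytes(8, "little", signed=True)
    return crc32c(raw) % num_buckets
```

The standard check value for CRC-32C is crc32c(b"123456789") == 0xE3069283, which is a convenient sanity test for any reimplementation. For a short fixed-width key such as a 64-bit integer, the hardware version reduces to a single instruction per key, which is consistent with the observation that CRC is hard to beat in hash joins.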

> Investigate alternative hash functions
> --------------------------------------
>
>                 Key: IMPALA-2277
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2277
>             Project: IMPALA
>          Issue Type: Task
>          Components: Backend
>    Affects Versions: Impala 2.2
>            Reporter: Tim Armstrong
>            Priority: Minor
>
> Impala currently uses FNV, Murmur2, and CRC hashes in different places depending on requirements. Newer hash functions are also available, including Murmur3, SpookyHash, CityHash, and others, and these may offer benefits.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)