Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/07/20 19:46:22 UTC

[GitHub] [arrow] save-buffer commented on pull request #13487: ARROW-16945: [C++] Add new scalar compute function for 32-bit hashing

save-buffer commented on PR #13487:
URL: https://github.com/apache/arrow/pull/13487#issuecomment-1190686364

   Weston is correct: the `key_hash` interface is only meant for internal use by hash tables. It subtly doesn't match xxh in some cases (IIRC, for example, it can pad with zeros instead of having special handling for arbitrary lengths). The interface is also optimized for hashing in mini-batches and reusing as much memory as possible. In general I would suggest using `util/hashing` instead of `key_hash`, since `key_hash` is for internal use and we may want to change it at any time (e.g. we had an idea to see if we can make a really crappy hash that's very fast to compute but is "good enough" for a hash table).
   
   I can leave a review of the current code, but I'd suggest moving away from the `key_hash` interface overall.
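
   To illustrate why zero-padding matters for anything beyond an internal hash table, here is a minimal, self-contained C++17 sketch. It is a toy: `Mix`, `HashZeroPadded`, and `HashLengthAware` are hypothetical names invented for this example, and none of them is Arrow's `key_hash`, `util/hashing`, or xxHash. The length-seeded variant is only a stand-in for the "special handling for arbitrary lengths" mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Toy 32-bit mixing step (hypothetical; NOT xxHash and NOT Arrow's key_hash).
// Each operation is invertible, so for a fixed `word` distinct `h` values
// always map to distinct results.
inline uint32_t Mix(uint32_t h, uint32_t word) {
  h ^= word;
  h *= 0x9E3779B1u;               // arbitrary odd constant
  return (h << 13) | (h >> 19);   // rotate left by 13
}

// Shared loop: consume the input 4 bytes at a time, zero-padding the
// final partial word.
inline uint32_t MixAllWords(uint32_t h, std::string_view s) {
  for (std::size_t i = 0; i < s.size(); i += 4) {
    uint32_t word = 0;  // unused high bytes stay zero (the padding)
    const std::size_t n = std::min<std::size_t>(4, s.size() - i);
    std::memcpy(&word, s.data() + i, n);
    h = Mix(h, word);
  }
  return h;
}

// Variant A: zero-pad and ignore the length entirely.
inline uint32_t HashZeroPadded(std::string_view s) {
  return MixAllWords(0, s);
}

// Variant B: same loop, but the byte length seeds the state, so inputs that
// differ only by trailing zero bytes no longer share a hash.
inline uint32_t HashLengthAware(std::string_view s) {
  return MixAllWords(static_cast<uint32_t>(s.size()), s);
}

// Small self-check of the behaviour described above.
inline void CheckPaddingBehaviour() {
  const std::string_view two("ab");
  const std::string_view three("ab\0", 3);  // same bytes plus one trailing zero
  assert(HashZeroPadded(two) == HashZeroPadded(three));    // padded inputs collide
  assert(HashLengthAware(two) != HashLengthAware(three));  // length separates them
}
```

   In the zero-padded variant, `"ab"` and `"ab\0"` hash identically. That is tolerable inside a hash table, where keys are compared again after the hash matches, but it is a poor fit for a user-facing hash function that is expected to match a published algorithm.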

