Posted to dev@drill.apache.org by sohami <gi...@git.apache.org> on 2017/09/25 03:00:40 UTC
[GitHub] drill pull request #959: DRILL-5816: Hash function produces skewed results o...
GitHub user sohami opened a pull request:
https://github.com/apache/drill/pull/959
DRILL-5816: Hash function produces skewed results on String values with same leading prefix
Note: Changing the hash32 computation to use Murmur3.hash32 instead of an int-cast version of Murmur3.hash64
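For readers unfamiliar with the 32-bit variant, here is a minimal, standalone sketch of the public MurmurHash3 x86_32 algorithm, plus a small helper that counts how keys sharing a leading prefix fall into buckets. It is not Drill's code: the class name `Murmur3Sketch`, the helper `bucketCounts`, and the sample prefix are hypothetical, and the sketch does not attempt to reproduce the skew reported in DRILL-5816.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: a standalone MurmurHash3 x86_32, not Drill's implementation.
class Murmur3Sketch {

    // Hashes the whole byte array with the given seed (little-endian block reads).
    static int hash32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h = seed;
        int len = data.length;
        int nblocks = len / 4;

        // Body: mix each full 4-byte block into the state.
        for (int i = 0; i < nblocks; i++) {
            int p = 4 * i;
            int k = (data[p] & 0xff)
                  | ((data[p + 1] & 0xff) << 8)
                  | ((data[p + 2] & 0xff) << 16)
                  | ((data[p + 3] & 0xff) << 24);
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }

        // Tail: mix the remaining 1 to 3 bytes, if any.
        int tail = nblocks * 4;
        int k = 0;
        int rem = len & 3;
        if (rem == 3) k ^= (data[tail + 2] & 0xff) << 16;
        if (rem >= 2) k ^= (data[tail + 1] & 0xff) << 8;
        if (rem >= 1) {
            k ^= (data[tail] & 0xff);
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
        }

        // Finalization: fold in the length and avalanche the bits.
        h ^= len;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Hypothetical helper: hash keys "<prefix>0" .. "<prefix>(n-1)" and count per bucket.
    static int[] bucketCounts(String prefix, int n, int buckets) {
        int[] counts = new int[buckets];
        for (int i = 0; i < n; i++) {
            byte[] key = (prefix + i).getBytes(StandardCharsets.UTF_8);
            counts[Math.floorMod(hash32(key, 0), buckets)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Print how 10,000 keys with a shared prefix spread over 16 buckets.
        int[] counts = bucketCounts("key_prefix_", 10_000, 16);
        for (int b = 0; b < counts.length; b++) {
            System.out.println("bucket " + b + ": " + counts[b]);
        }
    }
}
```

A roughly even spread across buckets is what a distribution test would look for; comparing these counts with those from a truncated 64-bit hash would need the actual Drill hash64 code, which is not reproduced here.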
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sohami/drill DRILL-5816
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/959.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #959
----
commit 4b9dd5be778307138e5fc60041232c66c6671d75
Author: Sorabh Hamirwasia <sh...@maprtech.com>
Date: 2017-09-15T22:07:50Z
DRILL-5816: Hash function produces skewed results on String values with same leading prefix
Note: Changing hash32 computation to use Murmur3.hash32 instead of int casted version of Murmur3.hash64
----
---
[GitHub] drill issue #959: DRILL-5816: Hash function produces skewed results on Strin...
Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on the issue:
https://github.com/apache/drill/pull/959
@amansinha100 - Please help to review.
---
[GitHub] drill pull request #959: DRILL-5816: Hash function produces skewed results o...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/drill/pull/959
---
[GitHub] drill issue #959: DRILL-5816: Hash function produces skewed results on Strin...
Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on the issue:
https://github.com/apache/drill/pull/959
I have created [DRILL-5827](https://issues.apache.org/jira/browse/DRILL-5827) for adding tests.
---
[GitHub] drill issue #959: DRILL-5816: Hash function produces skewed results on Strin...
Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on the issue:
https://github.com/apache/drill/pull/959
@paul-rogers - I have added a few tests and findings in the JIRA, comparing the use of hash32 and hash64 to compute the 32-bit hash codes.
---
[GitHub] drill issue #959: DRILL-5816: Hash function produces skewed results on Strin...
Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:
https://github.com/apache/drill/pull/959
The change looks benign. I wonder: do we have a test case that exercises the hash functions so we can be certain that the fix actually solves the original problem?
---
[GitHub] drill issue #959: DRILL-5816: Hash function produces skewed results on Strin...
Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on the issue:
https://github.com/apache/drill/pull/959
lgtm. +1. @sohami could you pls check with @chunhui-shi if unit tests were added previously for DRILL-4237 (another skew issue)? If not, could you create a new JIRA to add such tests, which should include the data set for DRILL-4237, DRILL-4119, DRILL-5816 among others.
---