Posted to dev@drill.apache.org by sohami <gi...@git.apache.org> on 2017/09/25 03:00:40 UTC

[GitHub] drill pull request #959: DRILL-5816: Hash function produces skewed results o...

GitHub user sohami opened a pull request:

    https://github.com/apache/drill/pull/959

    DRILL-5816: Hash function produces skewed results on String values with same leading prefix

    Note: Changing hash32 computation to use Murmur3.hash32 instead of int casted version of Murmur3.hash64

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sohami/drill DRILL-5816

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/959.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #959
    
----
commit 4b9dd5be778307138e5fc60041232c66c6671d75
Author: Sorabh Hamirwasia <sh...@maprtech.com>
Date:   2017-09-15T22:07:50Z

    DRILL-5816: Hash function produces skewed results on String values with same leading prefix
                Note: Changing hash32 computation to use Murmur3.hash32 instead of int casted version of Murmur3.hash64

----


---

[GitHub] drill issue #959: DRILL-5816: Hash function produces skewed results on Strin...

Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on the issue:

    https://github.com/apache/drill/pull/959
  
    @amansinha100 - Please help to review.


---

[GitHub] drill pull request #959: DRILL-5816: Hash function produces skewed results o...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/959


---

[GitHub] drill issue #959: DRILL-5816: Hash function produces skewed results on Strin...

Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on the issue:

    https://github.com/apache/drill/pull/959
  
    I have created [DRILL-5827](https://issues.apache.org/jira/browse/DRILL-5827) for adding tests.


---

[GitHub] drill issue #959: DRILL-5816: Hash function produces skewed results on Strin...

Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on the issue:

    https://github.com/apache/drill/pull/959
  
    @paul-rogers - I have added a few tests and findings to the JIRA, using hash32 and hash64 to compute the 32-bit hash codes.


---

[GitHub] drill issue #959: DRILL-5816: Hash function produces skewed results on Strin...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/959
  
    The change looks benign. I wonder: do we have a test case that exercises the hash functions, so we can be certain that the fix actually solves the original problem?
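
A check of the kind asked about here could be sketched roughly as follows. This is only an illustration, not code from the pull request: the class `HashSkewCheck`, its helper methods, the key pattern, and the bucket count are all hypothetical, and `murmur3_32` is a standalone version of the public MurmurHash3 x86_32 algorithm rather than Drill's own `Murmur3` class. The idea is to hash many strings that share a leading prefix, place them into buckets, and compare the fullest bucket with the mean.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.ToIntFunction;

public class HashSkewCheck {

    // Standalone MurmurHash3 x86_32 (assumption: stands in for Drill's hash32).
    public static int murmur3_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h = seed;
        int nblocks = data.length / 4;
        // Body: mix one little-endian 4-byte block at a time.
        for (int i = 0; i < nblocks; i++) {
            int o = i * 4;
            int k = (data[o] & 0xff)
                  | (data[o + 1] & 0xff) << 8
                  | (data[o + 2] & 0xff) << 16
                  | (data[o + 3] & 0xff) << 24;
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }
        // Tail: mix the remaining 1 to 3 bytes, if any.
        int tail = nblocks * 4;
        int k = 0;
        switch (data.length & 3) {
            case 3: k ^= (data[tail + 2] & 0xff) << 16; // fall through
            case 2: k ^= (data[tail + 1] & 0xff) << 8;  // fall through
            case 1: k ^= (data[tail] & 0xff);
                    k *= c1;
                    k = Integer.rotateLeft(k, 15);
                    k *= c2;
                    h ^= k;
        }
        // Finalization: fold in the length and avalanche the bits.
        h ^= data.length;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Counts how many keys land in each bucket under the given 32-bit hash.
    public static int[] bucketCounts(String[] keys, int buckets, ToIntFunction<String> hash) {
        int[] counts = new int[buckets];
        for (String key : keys) {
            counts[Math.floorMod(hash.applyAsInt(key), buckets)]++;
        }
        return counts;
    }

    // Ratio of the fullest bucket to the mean bucket size; 1.0 means perfectly even.
    public static double skewRatio(int[] counts) {
        long total = 0;
        int max = 0;
        for (int c : counts) {
            total += c;
            max = Math.max(max, c);
        }
        return total == 0 ? 0.0 : max / ((double) total / counts.length);
    }

    // Hypothetical data set: n keys that differ only after a shared leading prefix.
    public static String[] prefixedKeys(String prefix, int n) {
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            keys[i] = prefix + i;
        }
        return keys;
    }

    public static void main(String[] args) {
        String[] keys = prefixedKeys("common_leading_prefix_", 100_000);
        int[] counts = bucketCounts(keys, 64,
                s -> murmur3_32(s.getBytes(StandardCharsets.UTF_8), 0));
        System.out.printf("skew ratio: %.3f%n", skewRatio(counts));
    }
}
```

Because `bucketCounts` accepts any `ToIntFunction<String>`, the same harness could be pointed at an int-cast 64-bit hash and at a native 32-bit hash to compare their skew ratios on the same keys. A real regression test would presumably use the data sets attached to the JIRAs rather than synthetic keys.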


---

[GitHub] drill issue #959: DRILL-5816: Hash function produces skewed results on Strin...

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on the issue:

    https://github.com/apache/drill/pull/959
  
    LGTM. +1. @sohami, could you please check with @chunhui-shi whether unit tests were previously added for DRILL-4237 (another skew issue)? If not, could you create a new JIRA to add such tests, which should include the data sets for DRILL-4237, DRILL-4119, and DRILL-5816, among others?


---