Posted to mapreduce-issues@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2012/12/18 01:46:13 UTC

[jira] [Commented] (MAPREDUCE-4887) Rehashing partitioner for better distribution

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534470#comment-13534470 ] 

Doug Cutting commented on MAPREDUCE-4887:
-----------------------------------------

This looks like a good addition.  The javadoc might provide more detail, e.g., that a smoother partitioning may improve reduce time in some cases and should not hurt in any, and that this is suggested for Integer and Long keys with simple patterns in their distributions.
                
> Rehashing partitioner for better distribution
> ---------------------------------------------
>
>                 Key: MAPREDUCE-4887
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4887
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Radim Kolar
>         Attachments: rehash1.txt
>
>
> Rehash the value returned by Object.hashCode() to get a better distribution.
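
To illustrate the idea discussed above, here is a minimal, self-contained sketch. It does not reproduce the attached patch: the mixing step (the 32-bit finalizer from MurmurHash3) and the names RehashPartitionSketch, rehash, plainPartition, rehashedPartition and histogram are assumptions chosen for illustration. The plain variant uses the same arithmetic as Hadoop's HashPartitioner; in a real job the logic would sit in getPartition(key, value, numPartitions) of a Partitioner subclass.

```java
import java.util.Arrays;

// Illustrative sketch only; not the code from the attached patch.
final class RehashPartitionSketch {

    // Hypothetical mixing step: the 32-bit finalizer from MurmurHash3.
    // It scrambles all input bits so that regular key patterns do not
    // survive the modulo below.
    public static int rehash(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Same arithmetic as Hadoop's HashPartitioner:
    // clear the sign bit, then take the remainder.
    public static int plainPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // As above, but the hash code is mixed first.
    public static int rehashedPartition(Object key, int numPartitions) {
        return (rehash(key.hashCode()) & Integer.MAX_VALUE) % numPartitions;
    }

    // Counts how many of the Integer keys 0, stride, 2*stride, ...
    // land in each partition.
    public static int[] histogram(boolean useRehash, int numKeys,
                                  int stride, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (int i = 0; i < numKeys; i++) {
            Integer key = i * stride;
            int p = useRehash
                ? rehashedPartition(key, numPartitions)
                : plainPartition(key, numPartitions);
            counts[p]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Integer keys that are all multiples of 8, split over 8 reducers.
        System.out.println("plain:    "
            + Arrays.toString(histogram(false, 1000, 8, 8)));
        System.out.println("rehashed: "
            + Arrays.toString(histogram(true, 1000, 8, 8)));
    }
}
```

With Integer keys that are all multiples of the reducer count, the plain variant sends every key to partition 0, because Integer.hashCode() returns the value itself. The rehashed variant is expected to spread the same keys over several partitions, which is the kind of simple key pattern the comment has in mind.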
