
Posted to issues@spark.apache.org by "Andrew Ash (JIRA)" <ji...@apache.org> on 2017/11/08 10:56:00 UTC

[jira] [Created] (SPARK-22470) Doc that functions.hash is also used internally for shuffle and bucketing

Andrew Ash created SPARK-22470:
----------------------------------

             Summary: Doc that functions.hash is also used internally for shuffle and bucketing
                 Key: SPARK-22470
                 URL: https://issues.apache.org/jira/browse/SPARK-22470
             Project: Spark
          Issue Type: Documentation
          Components: Documentation, SQL
    Affects Versions: 2.2.0
            Reporter: Andrew Ash


SPARK-12480 (https://issues.apache.org/jira/browse/SPARK-12480) added functions.hash, which appears to be the same hash function that Spark uses internally for shuffle and bucketing.

One of my users would like to bake this assumption into code, but is unsure whether it is a guarantee or a coincidence that the two are the same function. Would it be considered an API break if the two functions diverged at some point, or if the implementation of both changed together?
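
For concreteness, here is a minimal sketch of the kind of check such code would depend on. It assumes that hash-partitioning a Dataset with repartition(n, col) places each row in partition pmod(hash(col), n), which is exactly the undocumented behavior in question and not something the scaladoc promises. The object name, method name, row count, and partition count are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, hash, lit, pmod, spark_partition_id}

// Hypothetical helper, not part of Spark.
object HashVsShufflePartition {
  // Counts rows whose actual shuffle partition differs from the partition
  // predicted by pmod(functions.hash(id), numPartitions).
  // ASSUMPTION under test: the shuffle uses the same hash as functions.hash.
  def mismatches(spark: SparkSession, numPartitions: Int): Long = {
    spark.range(0, 1000)
      .repartition(numPartitions, col("id"))        // hash-partition by id
      .withColumn("actual", spark_partition_id())   // partition the row landed in
      .withColumn("predicted", pmod(hash(col("id")), lit(numPartitions)))
      .filter(col("actual") =!= col("predicted"))
      .count()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("hash-vs-shuffle")
      .getOrCreate()
    try println(s"mismatched rows: ${mismatches(spark, 8)}")
    finally spark.stop()
  }
}
```

A non-zero count would mean the two functions differ on that Spark version. A zero count only shows that they agree there; it says nothing about whether they are guaranteed to stay in sync.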

We should add a line to the scaladoc for functions.hash to clarify whether this is guaranteed.


