Posted to issues@spark.apache.org by "Andrew Ash (JIRA)" <ji...@apache.org> on 2017/11/08 10:56:00 UTC
[jira] [Created] (SPARK-22470) Doc that functions.hash is also used
internally for shuffle and bucketing
Andrew Ash created SPARK-22470:
----------------------------------
Summary: Doc that functions.hash is also used internally for shuffle and bucketing
Key: SPARK-22470
URL: https://issues.apache.org/jira/browse/SPARK-22470
Project: Spark
Issue Type: Documentation
Components: Documentation, SQL
Affects Versions: 2.2.0
Reporter: Andrew Ash
https://issues.apache.org/jira/browse/SPARK-12480 added a hash function (functions.hash) that appears to be the same one Spark uses internally for shuffle and bucketing.
One of my users would like to bake this assumption into code, but is unsure whether it is a guarantee or a coincidence that the two are the same function. Would it be considered an API break if the two functions diverged at some point, or if the implementation of both changed together?
We should add a line to the scaladoc for functions.hash to clarify whether this is guaranteed.
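
For illustration only, here is a rough sketch of how a user might check the assumption empirically on a given Spark version. It assumes PySpark is available and that hash partitioning assigns a row to partition pmod(hash(cols), numPartitions), which is my reading of the current source rather than a documented guarantee. The names predicted_partition and count_mismatches are hypothetical helpers, not part of any Spark API.

```python
def predicted_partition(hash_value: int, num_partitions: int) -> int:
    """Map a signed 32-bit hash to a partition index.

    Assumption: Spark uses a non-negative modulus (pmod) here, since
    functions.hash can return negative values. For a positive modulus,
    Python's % operator already returns a non-negative result.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return hash_value % num_partitions


def count_mismatches(spark, num_partitions: int = 8, rows: int = 1000) -> int:
    """Count rows whose actual shuffle partition differs from the prediction.

    `spark` is an existing SparkSession. PySpark is imported lazily so that
    the pure helper above can be used without a Spark installation.
    """
    from pyspark.sql import functions as F

    # Hash-partition by "id", then record which partition each row landed in.
    df = spark.range(rows).repartition(num_partitions, "id")
    checked = (
        df.withColumn("actual", F.spark_partition_id())
        # Same formula as predicted_partition, evaluated by Spark SQL.
        .withColumn("predicted", F.expr(f"pmod(hash(id), {num_partitions})"))
    )
    return checked.filter(F.col("actual") != F.col("predicted")).count()
```

A non-zero count from count_mismatches on some version would mean the two functions differ there. A zero count only shows that they agree for that version and input, which is exactly why the scaladoc should say whether the match is intended.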
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)