Posted to issues@spark.apache.org by "Joseph K. Bradley (JIRA)" <ji...@apache.org> on 2015/05/21 00:16:59 UTC

[jira] [Resolved] (SPARK-7511) PySpark ML seed Param should be varied per class

     [ https://issues.apache.org/jira/browse/SPARK-7511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley resolved SPARK-7511.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.4.0

Issue resolved by pull request 6139
[https://github.com/apache/spark/pull/6139]

> PySpark ML seed Param should be varied per class
> ------------------------------------------------
>
>                 Key: SPARK-7511
>                 URL: https://issues.apache.org/jira/browse/SPARK-7511
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, PySpark
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>             Fix For: 1.4.0
>
>
> Currently, Scala's HasSeed mix-in uses a random Long as the default value for seed, while Python uses the constant 42.  After discussion, we've decided to use a default seed that varies with the class name but is fixed rather than random.  This makes behavior reproducible by default.  Users will still be able to set the seed explicitly.
> The default seed should be produced via some hash of the class name.
> Scala's seed will be fixed in a separate patch.
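
As an illustration of the idea described above (a fixed default seed derived from a hash of the class name), here is a minimal Python sketch. It is not the code from the pull request: the helper `default_seed`, the simplified `HasSeed` class, and the `EstimatorA`/`EstimatorB` classes are hypothetical names, and the choice of MD5 truncated to a signed 64-bit value is an assumption made so the result is stable across processes and fits the range of a JVM Long.

```python
import hashlib


def default_seed(cls):
    """Hypothetical helper: derive a fixed seed from a class name."""
    # Assumption: use a stable digest rather than the built-in hash(),
    # since str hashes can differ between Python 3 processes.
    digest = hashlib.md5(cls.__name__.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a signed 64-bit integer.
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


class HasSeed:
    """Simplified stand-in for a seed mix-in (not the Spark class)."""

    def __init__(self, seed=None):
        # Fall back to the per-class default when no seed is given.
        self.seed = default_seed(type(self)) if seed is None else seed


class EstimatorA(HasSeed):
    pass


class EstimatorB(HasSeed):
    pass


if __name__ == "__main__":
    print(EstimatorA().seed)         # same value on every run
    print(EstimatorB().seed)         # differs from EstimatorA's default
    print(EstimatorA(seed=7).seed)   # explicit seed overrides the default
```

With this approach two instances of the same class share a default seed, different classes get different defaults, and a user-supplied seed always takes precedence.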



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)