Posted to issues@spark.apache.org by "Yanbo Liang (JIRA)" <ji...@apache.org> on 2016/11/02 09:47:59 UTC

[jira] [Updated] (SPARK-17692) Document ML/MLlib behavior changes in Spark 2.1

     [ https://issues.apache.org/jira/browse/SPARK-17692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yanbo Liang updated SPARK-17692:
--------------------------------
    Description: 
This JIRA records behavior changes in ML/MLlib between Spark 2.0 and 2.1, so that we can note those changes (if any) in the Migration Guide section of the user guide. If you find one, please comment below and link the corresponding JIRA here.
* SPARK-17389: The default number of k-means|| init steps in KMeans is reduced from 5 to 2.
* SPARK-17870: ChiSquareSelector uses the pValue rather than the raw statistic when selecting the k best features.
* SPARK-3261: KMeans may return fewer than k cluster centers when k distinct centroids aren't available or aren't selected.
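
To illustrate why the SPARK-17870 item can change which features are selected, here is a minimal sketch in plain Python. It is not Spark's implementation. The feature names, statistics, and degrees of freedom are hypothetical, and the helper functions are invented for this example. It relies only on the fact that the same chi-square statistic corresponds to a larger p-value when the degrees of freedom are higher, so ranking by raw statistic and ranking by p-value can disagree. To stay within the standard library, the sketch handles even degrees of freedom only.

```python
import math


def chi2_sf_even_df(x, df):
    """P(X >= x) for a chi-square variable with an even number of degrees of freedom.

    For df = 2m the closed form is exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2.0
    m = df // 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(m))


# Hypothetical (chi-square statistic, degrees of freedom) per feature.
features = {
    "f0": (9.0, 2),
    "f1": (10.0, 6),
    "f2": (5.0, 2),
}


def top_k_by_statistic(feats, k):
    # Larger raw statistic ranks first.
    return sorted(feats, key=lambda name: -feats[name][0])[:k]


def top_k_by_p_value(feats, k):
    # Smaller p-value ranks first.
    return sorted(feats, key=lambda name: chi2_sf_even_df(*feats[name]))[:k]


if __name__ == "__main__":
    print("by statistic:", top_k_by_statistic(features, 2))
    print("by p-value:  ", top_k_by_p_value(features, 2))
```

With these made-up numbers, f1 has the largest raw statistic but also the most degrees of freedom, so it drops out of the top two once the ranking uses p-values. Users who relied on the 2.0 ordering may therefore see a different set of selected features after upgrading.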

  was:
This JIRA records behavior changes in ML/MLlib between Spark 2.0 and 2.1, so that we can note those changes (if any) in the Migration Guide section of the user guide. If you find one, please comment below and link the corresponding JIRA here.
* SPARK-17389: The default number of k-means|| init steps in KMeans is reduced from 5 to 2.
* SPARK-17870: ChiSquareSelector uses the pValue rather than the raw statistic when selecting the k best features.


> Document ML/MLlib behavior changes in Spark 2.1
> -----------------------------------------------
>
>                 Key: SPARK-17692
>                 URL: https://issues.apache.org/jira/browse/SPARK-17692
>             Project: Spark
>          Issue Type: Documentation
>          Components: ML, MLlib
>            Reporter: Yanbo Liang
>            Assignee: Yanbo Liang
>              Labels: 2.1.0


