Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2017/05/08 08:59:04 UTC

[jira] [Resolved] (SPARK-20634) result of MLlib KMeans cluster is not stabilize

     [ https://issues.apache.org/jira/browse/SPARK-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-20634.
-------------------------------
    Resolution: Invalid

I can't tell what this report is describing; please read http://spark.apache.org/contributing.html . It doesn't state a specific problem. You should not expect k-means results to be identical from run to run: the algorithm is stochastic, because the initial cluster centers are chosen randomly.
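
To illustrate the point about randomness, here is a minimal sketch of Lloyd's k-means in plain Python. It is not Spark's implementation; the function names and the `seed` argument are made up for the example. It only shows that the outcome is a function of the random starting centers, so fixing the seed makes repeated runs reproducible, while leaving it unset may give a different clustering each time.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            for p in points]


def kmeans(points, k, seed=None, max_iter=100):
    """Illustrative Lloyd's k-means; returns (centers, labels).

    seed=None draws a fresh random start on every call (hypothetical
    parameter for this sketch, not a Spark API).
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                # Mean of the members, coordinate by coordinate.
                new_centers.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                # Keep the old center if a cluster ends up empty.
                new_centers.append(centers[c])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, assign(points, centers)


points = [(0.0, 0.0), (0.5, 0.2), (5.0, 5.0), (5.2, 4.8),
          (9.0, 0.0), (9.1, 0.3), (4.0, 4.0), (1.0, 8.0)]

# Same seed, same starting centers, same result.
first = kmeans(points, 3, seed=42)
second = kmeans(points, 3, seed=42)
print(first == second)
```

In PySpark the corresponding knob is the `seed` argument of `pyspark.mllib.clustering.KMeans.train`. The `ml` package's `KMeans` appears to fall back to a fixed default seed when none is given, which would be consistent with the behaviour described in step 10 of the report.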

> result of MLlib KMeans cluster is not stabilize
> -----------------------------------------------
>
>                 Key: SPARK-20634
>                 URL: https://issues.apache.org/jira/browse/SPARK-20634
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 2.0.2
>         Environment: Windows 10
> spark 2.0.2 standalone
> spyder 3.1.4
> Anaconda 4.3.0
> python 3.5.2
>            Reporter: Simon.J
>            Priority: Critical
>
> 1. Get a DataFrame through Python with the cx_Oracle library.
> 2. Start a local Spark session.
> 3. Convert the dataset for KMeans model training.
> 4. Train the KMeans model and predict on the same data, with K = 3.
> 5. Get the ClassifierFeature from the KMeans model's predictions.
> 6. Get the count for each ClassifierFeature.
> 7. Repeat steps 4-6 20 times.
> 8. Compare the results of each run.
> 9. Observe that the KMeans result does not stabilize.
> 10. With the same dataset and parameters, the ML package's KMeans gives the same result every time.
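
For the comparison in steps 6-8, note that cluster IDs are arbitrary: two runs can find the same grouping but number the clusters differently, so raw per-ID counts may differ even when the clustering does not. A rough sketch of a comparison that ignores the numbering follows. The helper names are hypothetical, and the input is assumed to be one list of predicted cluster IDs per run.

```python
from collections import Counter


def size_signature(labels):
    """Cluster sizes sorted descending, ignoring which ID each cluster got."""
    return sorted(Counter(labels).values(), reverse=True)


def runs_agree(runs):
    """True if every run produced the same multiset of cluster sizes."""
    signatures = [size_signature(labels) for labels in runs]
    return all(sig == signatures[0] for sig in signatures)


# Two runs with the same grouping but permuted cluster IDs.
run_a = [0, 0, 1, 2, 2, 2]
run_b = [2, 2, 0, 1, 1, 1]
# A run that really did split the data differently.
run_c = [0, 0, 0, 0, 1, 2]

print(runs_agree([run_a, run_b]))
print(runs_agree([run_a, run_c]))
```

Matching size signatures are a necessary condition for identical clusterings, not a sufficient one; a stricter check would compare the point memberships themselves.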


