Posted to issues@spark.apache.org by "Simon.J (JIRA)" <ji...@apache.org> on 2017/05/08 09:37:04 UTC
[jira] [Updated] (SPARK-20634) result of MLlib KMeans cluster is not stabilize
[ https://issues.apache.org/jira/browse/SPARK-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Simon.J updated SPARK-20634:
----------------------------
Hi,
I agree that k-means is stochastic. But with the same dataset, KMeans from the Spark ML library gives a stable result across runs. Is that expected?
2017-05-08
ffdd-120
From: "Sean Owen (JIRA)" <ji...@apache.org>
Sent: 2017-05-08 16:59
Subject: [jira] [Resolved] (SPARK-20634) result of MLlib KMeans cluster is not stabilize
To: "ffdd-120"<ff...@163.com>
Cc:
[ https://issues.apache.org/jira/browse/SPARK-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-20634.
-------------------------------
Resolution: Invalid
I can't understand what this is describing; please read http://spark.apache.org/contributing.html . This doesn't specify any particular problem. You would not expect k-means results to be the same each time. It's stochastic.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
> result of MLlib KMeans cluster is not stabilize
> -----------------------------------------------
>
> Key: SPARK-20634
> URL: https://issues.apache.org/jira/browse/SPARK-20634
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 2.0.2
> Environment: Windows 10
> spark 2.0.2 standalone
> spyder 3.1.4
> Anaconda 4.3.0
> python 3.5.2
> Reporter: Simon.J
> Priority: Critical
>
> 1. Get a DataFrame through Python with the cx_Oracle library.
> 2. Start a local Spark session.
> 3. Convert the dataset for KMeans model training.
> 4. Train the KMeans model and predict on the same data, with K = 3.
> 5. Get the ClassifierFeature of the KMeans model's predictions.
> 6. Get the count of every ClassifierFeature.
> 7. Repeat steps 4-6 20 times.
> 8. Compare the results of each run.
> 9. Find that the KMeans result does not stabilize.
> 10. With the same dataset and parameters, the ML package's KMeans gives the same result every time.
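
The experiment in steps 4 to 8 can be sketched without Spark. The block below is a minimal plain-Python k-means (Lloyd's algorithm with randomly sampled initial centers), not Spark's implementation. The function name kmeans_cluster_sizes, the synthetic dataset, and the seed value 42 are all hypothetical and chosen only for illustration. It shows the general point made in the thread: with no seed, cluster sizes may differ between runs, while a fixed seed makes every run repeat. Whether seeding is what separates the two Spark packages here is an assumption, not something the thread confirms.

```python
import random
from collections import Counter


def kmeans_cluster_sizes(points, k, seed=None, max_iter=100):
    """Run a basic k-means and return the cluster sizes, sorted ascending.

    Initial centers are sampled from the data, so the outcome depends on
    the random seed. seed=None draws a fresh seed on every call.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        new_assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Move each center to the mean of its members (keep it if empty).
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    counts = Counter(assignment)
    # Sort the sizes so that a mere relabeling of clusters is not a difference.
    return sorted(counts.get(c, 0) for c in range(k))


# Hypothetical stand-in for the reporter's dataset: 200 random 2-D points.
data_rng = random.Random(0)
points = [(data_rng.uniform(0, 10), data_rng.uniform(0, 10)) for _ in range(200)]

# Steps 4-8: train 20 times with K = 3 and collect the distinct outcomes.
unseeded = {tuple(kmeans_cluster_sizes(points, 3)) for _ in range(20)}
seeded = {tuple(kmeans_cluster_sizes(points, 3, seed=42)) for _ in range(20)}

print(len(unseeded), "distinct result(s) without a seed")
print(len(seeded), "distinct result(s) with seed=42")
```

With a fixed seed the 20 runs always collapse to a single outcome. Without one, the number of distinct outcomes depends on the data and on chance, so a comparison across runs is only meaningful when the seed is pinned.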