Posted to user@spark.apache.org by 김태준 <ki...@gmail.com> on 2016/10/21 06:50:10 UTC
kmeans|| waiting issue
Hi guys,
I have a question about k-means||.
I ran Spark ML 2.0 KMeans with k-means|| initialization on my data.
The first few steps completed quickly, as shown in the image below:
[image: inline image 1]
However, the next step did not start for a long time.
It started only after 1 hour had passed.
The environment was as follows:
-- number of feature vectors : 160000
-- feature dimension : 100
-- k-means K : 10000
-- other settings : default
-- cluster machines : 30 (each machine: 4 cores, 8 GB)
I would like to know why so much time is spent waiting.
Re: kmeans|| waiting issue
Posted by 김태준 <ki...@gmail.com>.
Thanks for your reply, Sean.
I will try your suggestion.
Sean, when will version 2.1 be released?
On Fri, Oct 21, 2016 at 5:47 PM, Sean Owen <so...@cloudera.com> wrote:
There are some recent changes to speed up k-means, like
https://issues.apache.org/jira/browse/SPARK-11560. It could be relevant.
These will be in 2.1.
Re: kmeans|| waiting issue
Posted by Sean Owen <so...@cloudera.com>.
There are some recent changes to speed up k-means, like
https://issues.apache.org/jira/browse/SPARK-11560. It could be relevant.
These will be in 2.1.
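
For a sense of scale, here is a rough back-of-the-envelope sketch in Python using the numbers from the original post. It is only an illustration, not a diagnosis: the function names are made up for this example (they are not Spark APIs), the cost model counts nothing but point-to-center distance work, and it assumes that k-means|| samples roughly 2*K candidate centers per initialization step and then reduces those candidates to K centers on a single machine. The value init_steps=5 is also an assumption, so check the default for your Spark version.

```python
# Rough cost model for the workload described in this thread.
# Illustrative only: these helpers are hypothetical, not part of Spark.

def distance_flops(num_points: int, num_centers: int, dim: int) -> int:
    """Multiply-adds needed to compare every point with every center once."""
    return num_points * num_centers * dim


def expected_candidates(k: int, init_steps: int, num_points: int) -> int:
    """Expected number of candidate centers after k-means|| oversampling,
    assuming about 2*k points are sampled per step (capped at the data size)."""
    return min(2 * k * init_steps, num_points)


n, dim, k = 160_000, 100, 10_000  # values from the original post
cores = 30 * 4                    # 30 machines x 4 cores each

# One distributed pass: every point is compared with every center.
per_pass = distance_flops(n, k, dim)
print(f"one full pass over the data: {per_pass:.2e} multiply-adds")
print(f"per core if spread evenly:   {per_pass / cores:.2e}")

# Assumed init_steps; the real default depends on the Spark version.
candidates = expected_candidates(k, init_steps=5, num_points=n)

# One pass over the candidates on a single machine (no parallelism assumed).
local_pass = distance_flops(candidates, k, dim)
print(f"candidate centers:           {candidates}")
print(f"one single-machine pass:     {local_pass:.2e} multiply-adds")
```

Under these assumptions, a single pass over the candidate centers on one machine costs about as much as a full pass over the data spread across all 120 cores, which would be consistent with a long gap in which no new cluster jobs appear. If that is what is happening here, a smaller K or fewer initialization steps should shorten the wait.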