Posted to user@spark.apache.org by 김태준 <ki...@gmail.com> on 2016/10/21 06:50:10 UTC

kmeans|| waiting issue

Hi guys,
I have a question about k-means||.

I ran Spark ML 2.0 k-means|| on my data.
The first few steps completed quickly, as shown in the image below.
[image: inline image 1]

However, the next step did not start for a long time.
It only started after about 1 hour had passed.

The environment was as follows:
-- number of feature vectors : 160000
-- feature dimension : 100
-- k-means K : 10000
-- other settings : default

-- cluster machines : 30 (each machine: 4 cores, 8 GB)


I would like to know why there is such a long wait.
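
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a naive assignment pass that compares every point against every center; the function name is made up for illustration, and a real implementation may skip part of this work, so treat the result as an upper-bound estimate and not as a profile of Spark.

```python
def assignment_cost(num_points, dim, k):
    """Naive cost of one pass that compares every point with every center."""
    distance_evals = num_points * k        # one distance per (point, center) pair
    multiply_adds = distance_evals * dim   # each distance touches every dimension
    return distance_evals, multiply_adds


# Values taken from the post above.
num_points, dim, k = 160_000, 100, 10_000
cores = 30 * 4  # 30 machines with 4 cores each

distance_evals, multiply_adds = assignment_cost(num_points, dim, k)

print(distance_evals)          # 1600000000
print(multiply_adds)           # 160000000000
print(multiply_adds // cores)  # 1333333333 per core, if perfectly parallel
print(num_points / k)          # 16.0 points per cluster on average
```

With K this close to the number of points (about 16 points per cluster), any step that repeats such a pass several times, or that runs on the driver alone, may plausibly account for a long pause.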

Re: kmeans|| waiting issue

Posted by 김태준 <ki...@gmail.com>.
Thanks, Sean, for your reply.
I will try your suggestion.

Sean, when will version 2.1 be released?

On Fri, Oct 21, 2016 at 5:47 PM, Sean Owen <so...@cloudera.com> wrote:

There are some recent changes to speed up k-means, like
https://issues.apache.org/jira/browse/SPARK-11560. It could be relevant.
These will be in 2.1.


Re: kmeans|| waiting issue

Posted by Sean Owen <so...@cloudera.com>.
There are some recent changes to speed up k-means, like
https://issues.apache.org/jira/browse/SPARK-11560. It could be relevant.
These will be in 2.1.
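
For anyone who wants to reproduce or narrow down the wait, the setup from the original post maps roughly onto the PySpark sketch below. The helper names, the seed, and the "features" column name are assumptions for illustration; only K = 10000 and "other settings: default" come from the post. The idea that the pause happens during k-means|| initialization is also an unverified guess, not something this thread confirms.

```python
import time

# Settings from the original post; everything else stays at Spark's defaults.
K = 10000             # "k-means K"
NUM_POINTS = 160000   # number of feature vectors
DIM = 100             # feature dimension


def build_estimator(init_mode="k-means||", seed=42):
    """Return an unfitted pyspark.ml KMeans estimator using the poster's K.

    Needs an active SparkSession; pyspark is imported lazily so that this
    module can be loaded on a machine without Spark installed.
    """
    from pyspark.ml.clustering import KMeans
    return KMeans(k=K, initMode=init_mode, seed=seed, featuresCol="features")


def fit_and_time(df, init_mode="k-means||"):
    """Fit on a DataFrame with a vector column "features"; return (model, seconds)."""
    start = time.time()
    model = build_estimator(init_mode).fit(df)
    return model, time.time() - start
```

Comparing fit_and_time(df, "k-means||") with fit_and_time(df, "random") on the same data might show whether the long pause comes from the initialization step or from the later iterations.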
