Posted to user@hama.apache.org by HuYuesheng <ys...@gmail.com> on 2012/08/27 11:53:03 UTC

RAM usage

Hi,

    I would like to know: if I want to test K-means on a 1 TB dataset, does
that mean I need at least 1 TB of RAM (across the whole cluster)?
    Thank you!

   Best Regards!

Yuesheng Hu
China

Re: RAM usage

Posted by Thomas Jungblut <th...@gmail.com>.
Hi,

The algorithm itself uses memory proportional to the number of your centers.
By default, it sets "k.means.caching.enabled" to true, which caches the
vectors to be clustered on the heap, and thus you would need 1 TB of RAM.
I would suggest setting this to false (you will need to recompile the
KMeansBSP class in the ml package; the line you have to change is 347).
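
To make the difference concrete, here is a rough back-of-the-envelope sketch of the two cases. It is not Hama code: the class and method names are made up for illustration, and it assumes dense vectors of 8-byte doubles with no object overhead, so real heap usage would be somewhat higher. The values for k, the dimensionality, and the number of vectors are example figures only.

```java
// Rough heap estimate for k-means, assuming dense vectors of 8-byte doubles.
// All names here are hypothetical; none of this is part of the Hama API.
class KMeansMemoryEstimate {
    static final long BYTES_PER_DOUBLE = 8L;

    // Memory for the centers alone: k vectors of `dimensions` doubles each.
    static long centersBytes(long k, long dimensions) {
        return k * dimensions * BYTES_PER_DOUBLE;
    }

    // Memory when every input vector is cached in addition to the centers.
    static long cachedBytes(long numVectors, long k, long dimensions) {
        return (numVectors + k) * dimensions * BYTES_PER_DOUBLE;
    }

    public static void main(String[] args) {
        long k = 1_000L;              // example number of centers
        long dimensions = 100L;       // example vector dimensionality
        long numVectors = 1_250_000_000L; // about 1 TB of raw doubles at 100 dimensions

        System.out.println("centers only: " + centersBytes(k, dimensions) + " bytes");
        System.out.println("with caching: " + cachedBytes(numVectors, k, dimensions) + " bytes");
    }
}
```

With these example figures the centers take well under a megabyte, while caching all vectors takes about as much as the dataset itself. Note that each task presumably keeps its own copy of the centers, so the first number would apply per task rather than once for the whole cluster.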

Good luck and let us know if you have problems.
