Posted to user@spark.apache.org by Tim Smith <se...@gmail.com> on 2017/04/27 17:46:53 UTC

Initialize Gaussian Mixture Model using Spark ML dataframe API

Hi,

I am trying to figure out the API to initialize a Gaussian mixture model
using either centroids created by K-means or a previously calculated GMM
model (I am aware that you can "save" a model and "load" it later, but I am
not interested in saving a model to a filesystem).

The Spark MLlib API lets you do this using setInitialModel:
https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.mllib.clustering.GaussianMixture
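
For reference, a minimal sketch of that RDD-based route, seeding a GMM from
K-means centroids and then reusing a fitted GMM as the next starting point.
The toy data, the uniform weights and the identity covariances are
assumptions made for illustration; they are not taken from the Spark docs or
from any recommended recipe.

```scala
import org.apache.spark.mllib.clustering.{GaussianMixture, GaussianMixtureModel, KMeans}
import org.apache.spark.mllib.linalg.{Matrices, Vectors}
import org.apache.spark.mllib.stat.distribution.MultivariateGaussian
import org.apache.spark.sql.SparkSession

object GmmFromKMeans {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("gmm-initial-model")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical toy data: two well-separated blobs in two dimensions.
    val data = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.1), Vectors.dense(0.2, -0.1), Vectors.dense(-0.1, 0.0),
      Vectors.dense(5.0, 5.1), Vectors.dense(5.2, 4.9), Vectors.dense(4.9, 5.0)
    )).cache()

    val k = 2
    val dim = 2

    // Step 1: compute centroids with the RDD-based K-means.
    val kmeansModel = KMeans.train(data, k, 20)

    // Step 2: wrap the centroids in an initial GaussianMixtureModel.
    // Uniform weights and identity covariances are illustrative assumptions.
    val weights = Array.fill(k)(1.0 / k)
    val gaussians = kmeansModel.clusterCenters.map { center =>
      new MultivariateGaussian(center, Matrices.eye(dim))
    }
    val initial = new GaussianMixtureModel(weights, gaussians)

    // Step 3: call setK before setInitialModel, because setInitialModel
    // checks that the model's cluster count matches k.
    val gmm = new GaussianMixture()
      .setK(k)
      .setInitialModel(initial)
      .run(data)

    // A previously fitted GMM can be passed back in the same way.
    val refined = new GaussianMixture()
      .setK(k)
      .setInitialModel(gmm)
      .setMaxIterations(10)
      .run(data)

    refined.weights.zip(refined.gaussians).foreach { case (w, g) =>
      println(s"weight=$w mean=${g.mu}")
    }

    spark.stop()
  }
}
```

Everything here stays in memory: the K-means model and the first GMM are
passed along as ordinary objects, so nothing is saved to a filesystem.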

However, I cannot figure out how to do this using the Spark ML API. Can anyone
please point me in the right direction? I've tried reading the Spark ML
code and was wondering whether the "set" call lets you do that.

--
Thanks,

Tim

Re: Initialize Gaussian Mixture Model using Spark ML dataframe API

Posted by Tim Smith <se...@gmail.com>.
Yes, I noticed these open issues, both with KMeans and GMM:
https://issues.apache.org/jira/browse/SPARK-13025

Thanks,

Tim


On Mon, May 1, 2017 at 9:01 PM, Yanbo Liang <yb...@gmail.com> wrote:

> Hi Tim,
>
> The Spark ML API doesn't currently support setting an initial model for
> GMM. I hope we can get this feature into Spark 2.3.
>
> Thanks
> Yanbo

Re: Initialize Gaussian Mixture Model using Spark ML dataframe API

Posted by Yanbo Liang <yb...@gmail.com>.
Hi Tim,

The Spark ML API doesn't currently support setting an initial model for GMM.
I hope we can get this feature into Spark 2.3.

Thanks
Yanbo
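
Until the DataFrame API supports this, one possible workaround (not suggested
anywhere in this thread, so treat it as an assumption) is to convert the
features column to an RDD of MLlib vectors and run the RDD-based
GaussianMixture on it. A minimal sketch follows; the object name, the helper
`fitWithInitialModel` and its parameters are hypothetical.

```scala
import org.apache.spark.ml.linalg.{Vector => MLVector}
import org.apache.spark.mllib.clustering.{GaussianMixture, GaussianMixtureModel}
import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
import org.apache.spark.sql.DataFrame

object GmmOnDataFrame {
  // Hypothetical helper: fits the RDD-based GMM on a DataFrame column that
  // holds spark.ml vectors, starting from the supplied initial model.
  def fitWithInitialModel(
      df: DataFrame,
      featuresCol: String,
      initial: GaussianMixtureModel): GaussianMixtureModel = {
    // Convert spark.ml vectors to the older spark.mllib vector type.
    val features = df.select(featuresCol).rdd.map { row =>
      OldVectors.fromML(row.getAs[MLVector](0))
    }

    // k must match the initial model, so it is taken from the model itself.
    new GaussianMixture()
      .setK(initial.k)
      .setInitialModel(initial)
      .run(features)
  }
}
```

The result is an MLlib GaussianMixtureModel rather than a Spark ML one, so it
does not plug into an ML Pipeline as a transformer; predictions would have to
go through its own predict and predictSoft methods on an RDD.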
