Posted to user@spark.apache.org by Gianmarco De Francisci Morales <gd...@apache.org> on 2014/12/23 19:01:36 UTC

MLlib + Streaming

Hi,

I have recently seen a demo of Spark where different pieces were put
together (training via MLlib + deploying on Spark Streaming).
I was wondering whether MLlib can currently train directly on streaming data.
And, if so, what are the semantics of the algorithms?
If not, would it be interesting to have ML algorithms developed for the
streaming setting?

Thanks,
--
Gianmarco

Re: MLlib + Streaming

Posted by Jeremy Freeman <fr...@gmail.com>.
Hi Fernando,

There’s currently no streaming ALS in Spark. I’m exploring a streaming singular value decomposition (JIRA) based on this paper (http://www.stat.osu.edu/~dmsl/thinSVDtracking.pdf), which might be one way to think about it.

There has also been some cool recent work explicitly on streaming ALS w/ SGD that we should look into (https://www.cs.utexas.edu/~cjohnson/ParallelCollabFilt.pdf).

— Jeremy

-------------------------
jeremyfreeman.net
@thefreemanlab
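
For readers unfamiliar with the idea of fitting a factorization model with SGD instead of alternating least squares, here is a minimal sketch in plain Python. It is not the method of either linked paper and not anything that exists in MLlib; it only shows a generic per-rating SGD step on the same kind of user/item factor model that ALS fits. The function name, the dictionary layout, and the step_size and reg values are all hypothetical.

```python
from typing import Dict, List

Factors = Dict[str, List[float]]  # hypothetical layout: id -> latent vector


def sgd_rating_update(user_factors: Factors,
                      item_factors: Factors,
                      user: str,
                      item: str,
                      rating: float,
                      step_size: float = 0.05,
                      reg: float = 0.01) -> float:
    """Apply one SGD step for a single (user, item, rating) observation.

    Both factor vectors are updated in place (inside the dictionaries).
    Returns the prediction error measured before the update.
    """
    p = user_factors[user]
    q = item_factors[item]

    # Error of the current model on this rating.
    error = rating - sum(pk * qk for pk, qk in zip(p, q))

    # Gradient step on the regularized squared error; both updates use
    # the old values of p and q.
    user_factors[user] = [pk + step_size * (error * qk - reg * pk)
                          for pk, qk in zip(p, q)]
    item_factors[item] = [qk + step_size * (error * pk - reg * qk)
                          for pk, qk in zip(p, q)]
    return error
```

Because each rating touches only one user vector and one item vector, an update like this can be applied as ratings arrive, which is what makes it a candidate for a streaming setting. How to initialize factors for users or items that first appear mid-stream is left open here.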

On Dec 23, 2014, at 2:47 PM, Fernando O. <fo...@gmail.com> wrote:

> Hey Xiangrui,
>
> Is there any plan to have a streaming-compatible ALS version?
>
> Or if it's currently doable, is there any example?


Re: MLlib + Streaming

Posted by "Fernando O." <fo...@gmail.com>.
Hey Xiangrui,

Is there any plan to have a streaming-compatible ALS version?

Or if it's currently doable, is there any example?



On Tue, Dec 23, 2014 at 4:31 PM, Xiangrui Meng <me...@gmail.com> wrote:

> We have streaming linear regression (since v1.1) and k-means (v1.2) in
> MLlib. You can check the user guide:
>
>
> http://spark.apache.org/docs/latest/mllib-linear-methods.html#streaming-linear-regression
>
> http://spark.apache.org/docs/latest/mllib-clustering.html#streaming-clustering
>
> -Xiangrui

Re: MLlib + Streaming

Posted by Xiangrui Meng <me...@gmail.com>.
We have streaming linear regression (since v1.1) and k-means (v1.2) in
MLlib. You can check the user guide:

http://spark.apache.org/docs/latest/mllib-linear-methods.html#streaming-linear-regression
http://spark.apache.org/docs/latest/mllib-clustering.html#streaming-clustering

-Xiangrui
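
On the question of semantics, the following is a rough sketch in plain Python (no Spark dependency) of how the two streaming algorithms treat each incoming batch, as far as I understand the user guide. For linear regression, each batch triggers a gradient-descent run that is warm-started from the current weights. For k-means, each center is moved toward the mean of the batch points assigned to it, with a decay factor that discounts older data. The function names and signatures are hypothetical, and the constant step size is a simplification of what MLlib's SGD actually does.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
LabeledPoint = Tuple[Vector, float]  # hypothetical alias: (features, label)


def sgd_batch_update(weights: Vector,
                     batch: Sequence[LabeledPoint],
                     step_size: float = 0.1,
                     num_iterations: int = 50) -> Vector:
    """Refit a linear model on one batch, starting from `weights`.

    Simplification: full-batch gradient descent on squared error with a
    constant step size and no intercept.
    """
    w = list(weights)
    if not batch:  # an empty batch leaves the model unchanged
        return w
    for _ in range(num_iterations):
        grad = [0.0] * len(w)
        for features, label in batch:
            error = sum(wi * xi for wi, xi in zip(w, features)) - label
            for j, xj in enumerate(features):
                grad[j] += error * xj
        w = [wi - step_size * gj / len(batch) for wi, gj in zip(w, grad)]
    return w


def kmeans_center_update(center: Vector,
                         weight: float,
                         batch_points: Sequence[Vector],
                         decay: float = 1.0) -> Vector:
    """Move one cluster center toward the points assigned to it in a batch.

    `weight` is the number of points the center has absorbed so far.
    decay=1.0 weights all history equally; decay=0.0 keeps only the
    latest batch.
    """
    m = len(batch_points)
    if m == 0:
        return list(center)
    batch_mean = [sum(p[j] for p in batch_points) / m
                  for j in range(len(center))]
    discounted = weight * decay
    return [(c * discounted + x * m) / (discounted + m)
            for c, x in zip(center, batch_mean)]
```

In both cases the model after batch t depends only on the model after batch t-1 and the data in batch t, so nothing older than the current batch needs to be kept around.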


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org