Posted to dev@spark.apache.org by Zak H <za...@gmail.com> on 2016/11/01 17:00:25 UTC
Question about using collaborative filtering in MLlib
Hi,
I'm using the Java API of the DataFrame-based API for Spark MLlib. Should I be
using the RDD API instead? I'm not sure whether this functionality has been
ported over to DataFrames; correct me if I'm wrong.
My goal is to evaluate Spark's recommendation capabilities. I'm looking at
this example:
http://spark.apache.org/docs/latest/ml-collaborative-filtering.html
Looking at the java docs I can see there is a method:
http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.html
"public RDD<scala.Tuple2<Object,Rating[]>> recommendUsersForProducts(int num)"
For some reason the recommendProductsForUsers method isn't available in the
Java API:
model.recommendProductsForUsers
Is there something I'm missing here?
I've posted my code in the gist below. I am using the DataFrame API for
MLlib. I know there may be ongoing work to port functionality over from the
RDD API.
https://gist.github.com/zmhassan/6ccdda8b4ad86f9b1924477c65ed5d45
Thanks,
Zak
Re: Question about using collaborative filtering in MLlib
Posted by Nick Pentreath <ni...@gmail.com>.
I have a PR for it - https://github.com/apache/spark/pull/12574
Sadly I've been tied up and haven't had a chance to work further on it.
The main issue outstanding is deciding on the transform semantics as well
as performance testing.
Any comments / feedback welcome especially on transform semantics.
N
Re: Question about using collaborative filtering in MLlib
Posted by Yuhao Yang <hh...@gmail.com>.
Hi Zak,
Indeed, the function is missing in the DataFrame-based API. I can probably
provide a quick prototype to see if we can merge the function into the
next release. I will send an update here; feel free to give it a try.
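For reference, here is a minimal plain-Java sketch of what a "top N products for every user" call computes from a factorized model: score each user/item pair by the dot product of their latent factor vectors and keep the N best items per user. This is not Spark code and not the prototype mentioned above. The class name TopNSketch, its method names, and the toy factor values are all hypothetical, and the brute-force loop only illustrates the idea (Spark works on distributed data).

```java
import java.util.Arrays;

// Hypothetical illustration only: not part of Spark's API.
final class TopNSketch {

    // Predicted score = dot product of user and item latent factors.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // For each user (row of userFactors), return the indices of the
    // `num` highest-scoring items, best first.
    static int[][] recommendForAllUsers(double[][] userFactors,
                                        double[][] itemFactors,
                                        int num) {
        int k = Math.min(num, itemFactors.length);
        int[][] result = new int[userFactors.length][];
        for (int u = 0; u < userFactors.length; u++) {
            // Score every item for this user.
            final double[] scores = new double[itemFactors.length];
            Integer[] order = new Integer[itemFactors.length];
            for (int i = 0; i < itemFactors.length; i++) {
                scores[i] = dot(userFactors[u], itemFactors[i]);
                order[i] = i;
            }
            // Sort item indices by descending score (stable, so ties keep index order).
            Arrays.sort(order, (x, y) -> Double.compare(scores[y], scores[x]));
            result[u] = new int[k];
            for (int i = 0; i < k; i++) {
                result[u][i] = order[i];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy factors with rank 2: two users, three items (made-up values).
        double[][] users = {{1.0, 0.0}, {0.0, 1.0}};
        double[][] items = {{0.5, 0.1}, {0.9, 0.2}, {0.1, 0.8}};
        System.out.println(Arrays.deepToString(recommendForAllUsers(users, items, 2)));
        // prints [[1, 0], [2, 1]]
    }
}
```

Sorting every item for every user costs O(users * items * log(items)), which is fine for a toy example but is the reason the performance testing mentioned earlier in this thread matters for a real implementation.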
Regards,
Yuhao