Posted to user@spark.apache.org by Ryan <ry...@gmail.com> on 2017/06/08 03:17:35 UTC
Re: Question about mllib.recommendation.ALS
1. Could you share the job, stage, and task status from the Spark UI? I find
it extremely useful for performance tuning.
2. Use model.transform for predictions. Usually we have a pipeline for
preparing the training data; running the same pipeline on the data you
want predictions for gives you the prediction column.
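
For illustration, the prediction that a matrix-factorization model such as ALS produces for a (user, item) pair is the dot product of the two learned factor vectors. The sketch below shows that computation in plain Python rather than Spark; the function name `predict`, the toy factor values, and the choice to return NaN for ids without factors are assumptions made for this example, not Spark's API.

```python
import math

def predict(user_factors, item_factors, pairs):
    """Score (user, item) pairs as dot products of their factor vectors.

    Hypothetical helper for illustration only. Ids that have no factor
    vector get NaN (an assumption of this sketch).
    """
    scored = []
    for user, item in pairs:
        u = user_factors.get(user)
        v = item_factors.get(item)
        if u is None or v is None:
            score = math.nan
        else:
            score = sum(a * b for a, b in zip(u, v))
        scored.append((user, item, score))
    return scored

# Toy rank-2 factors; a trained model would supply these.
user_factors = {1: [0.5, 1.0], 2: [1.0, 0.0]}
item_factors = {10: [2.0, 1.0], 11: [0.0, 3.0]}

# Only the pairs listed here are scored; user 3 has no factors.
predictions = predict(user_factors, item_factors, [(1, 10), (2, 11), (3, 10)])
```

The point of the sketch is that scoring is driven by the pairs you pass in, so the cost scales with the number of pairs requested rather than with the full user-by-item matrix.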
On Thu, Jun 1, 2017 at 7:48 AM, Sahib Aulakh [Search] <
sahibaulakh@coupang.com> wrote:
> Hello:
>
> I am training the ALS model for recommendations. I have about 200m ratings
> from about 10m users and 3m products. I have a small cluster with 48 cores
> and 120gb cluster-wide memory.
>
> My code is very similar to the example code in
>
> spark/examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala
>
> I have a couple of questions:
>
>
> 1. All steps up to model training run reasonably fast. Model training
> takes under 10 minutes for rank 20. However, the model.recommendProductsForUsers
> step is either very slow or does not work at all: the code seems to hang at
> this point. I have tried user and product block sizes of -1, 20, 40,
> etc., and varied the executor memory size. Can someone shed some light
> on what could be wrong?
> 2. Also, is there any example code for the ml.recommendation.ALS
> algorithm? I can figure out how to train the model, but I don't understand
> (from the documentation) how to perform predictions.
>
> Thanks for any information you can provide.
> Sahib Aulakh.
>
>
> --
> Sahib Aulakh
> Sr. Principal Engineer
>
Re: Question about mllib.recommendation.ALS
Posted by Sahib Aulakh [Search] <sa...@coupang.com>.
Many thanks. Will try it.
On Thu, Jun 8, 2017 at 8:41 AM Nick Pentreath <ni...@gmail.com>
wrote:
> Spark 2.2 will support the recommend-all methods in ML.
>
> Also, the performance of the recommend-all methods has been greatly
> improved in both ML and MLlib.
>
> Perhaps you could check out the current RC of Spark 2.2 or master branch
> to try it out?
>
> N
> --
Sahib Aulakh
Sr. Principal Engineer
Re: Question about mllib.recommendation.ALS
Posted by Nick Pentreath <ni...@gmail.com>.
Spark 2.2 will support the recommend-all methods in ML.
Also, the performance of the recommend-all methods has been greatly
improved in both ML and MLlib.
Perhaps you could check out the current RC of Spark 2.2 or master branch to
try it out?
N
On Thu, 8 Jun 2017 at 17:18, Sahib Aulakh [Search] <
sahibaulakh@coupang.com> wrote:
> Many thanks for your response. I already figured out the details with some
> help from another forum.
>
>
> 1. I was trying to predict ratings for all users and all products.
> This is inefficient, so I am now trying to reduce the number of required
> predictions.
> 2. There is a nice example buried in the Spark source code that shows
> how to use the ML-side ALS.
>
> Regards.
> Sahib Aulakh.
Re: Question about mllib.recommendation.ALS
Posted by Sahib Aulakh [Search] <sa...@coupang.com>.
Many thanks for your response. I already figured out the details with some
help from another forum.
1. I was trying to predict ratings for all users and all products. This
is inefficient, so I am now trying to reduce the number of required
predictions.
2. There is a nice example buried in the Spark source code that shows
how to use the ML-side ALS.
Regards.
Sahib Aulakh.
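
As a back-of-the-envelope check on why scoring every user against every product is expensive, the sketch below multiplies out the sizes quoted in this thread (about 10m users, 3m products, rank 20) and then shows one way to cut the work: score only a chosen subset of users and keep just the top k items for each. It is plain Python, not Spark code; `top_k_for_users` and the toy factor values are hypothetical names and data introduced for this example.

```python
import heapq

# Sizes quoted in the thread.
n_users, n_products, rank = 10_000_000, 3_000_000, 20

pair_count = n_users * n_products   # every (user, product) pair
multiply_adds = pair_count * rank   # one multiply-add per factor dimension

def top_k_for_users(user_factors, item_factors, user_ids, k):
    """Return the k highest-scoring item ids for each requested user.

    Hypothetical helper: scores are dot products of factor vectors, and
    only the users in user_ids are scored.
    """
    recs = {}
    for user in user_ids:
        u = user_factors[user]
        scored = (
            (sum(a * b for a, b in zip(u, v)), item)
            for item, v in item_factors.items()
        )
        # nlargest keeps only k candidates in memory at a time.
        recs[user] = [item for _, item in heapq.nlargest(k, scored)]
    return recs

# Toy rank-2 factors; a trained model would supply these.
user_factors = {1: [1.0, 0.0], 2: [0.0, 1.0]}
item_factors = {10: [0.9, 0.1], 11: [0.2, 0.8], 12: [0.5, 0.5]}

# Score only user 1 instead of all users.
recs = top_k_for_users(user_factors, item_factors, [1], k=2)
```

With these sizes the full cross product is 3 x 10^13 pairs, or 6 x 10^14 multiply-adds at rank 20, which is why restricting the set of users (or candidate items) matters far more than tuning block sizes.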
On Wed, Jun 7, 2017 at 8:17 PM, Ryan <ry...@gmail.com> wrote:
> 1. Could you share the job, stage, and task status from the Spark UI? I find
> it extremely useful for performance tuning.
>
> 2. Use model.transform for predictions. Usually we have a pipeline for
> preparing the training data; running the same pipeline on the data you
> want predictions for gives you the prediction column.
--
Sahib Aulakh
Sr. Principal Engineer