Posted to user@mahout.apache.org by Stanley Ipkiss <sa...@gmail.com> on 2010/09/17 02:59:47 UTC

Evaluator for RecommenderJob (hadoop implementation)?

Has someone already written an evaluator for the Hadoop implementation of CF?
I was looking for something like the RecommenderEvaluator class that could
give me the MAE, RMSE, or some other evaluation metric for the results
I get through the MapReduce implementation of RecommenderJob. I know it
shouldn't be too difficult to write one, but if someone has already written
it, that would save me a day or so. Any pointers will be appreciated.

Thanks! 
-- 
View this message in context: http://lucene.472066.n3.nabble.com/Evaluator-for-RecommenderJob-hadoop-implementation-tp1515638p1515638.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Evaluator for RecommenderJob (hadoop implementation)?

Posted by Sebastian Schelter <ss...@googlemail.com>.

Hi Stanley,

Please don't mail patches around; file a JIRA ticket and upload your
patch there.

I'm looking forward to seeing it :)

--sebastian

On 17.09.2010 20:44, Stanley Ipkiss wrote:
> I agree with Sean. I was thinking of splitting the dataset in the userVector
> formation job and creating a separate MapReduce phase at the end for
> evaluation (I am talking about the current RecommenderJob in
> o.a.m.cf.taste.hadoop.item). I will dig into this sometime next week and
> should be able to put it together.
>
> (Even though my code will require multiple revisions before it can be
> included in Mahout, I will just email a rough patch to Sean and then you can
> fix it whenever you get time.)
>
>

Re: Evaluator for RecommenderJob (hadoop implementation)?

Posted by Stanley Ipkiss <sa...@gmail.com>.

I agree with Sean. I was thinking of splitting the dataset in the userVector
formation job and creating a separate MapReduce phase at the end for
evaluation (I am talking about the current RecommenderJob in
o.a.m.cf.taste.hadoop.item). I will dig into this sometime next week and
should be able to put it together.

(Even though my code will require multiple revisions before it can be
included in Mahout, I will just email a rough patch to Sean and then you can
fix it whenever you get time.)
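
As a rough illustration of the split step only (not the actual patch, and not
Mahout code), here is a minimal Python sketch. It assumes preferences arrive
as "userID,itemID,value" lines. The function names and the 10% test fraction
are hypothetical. Hashing the user/item pair is one way to let every mapper
make the same train/test decision without shared state.

```python
import zlib


def is_test_pref(user_id, item_id, test_fraction=0.1):
    """Deterministically assign a (user, item) pair to the test set.

    The same pair always hashes to the same bucket, so independent
    tasks agree on the split without coordinating.
    """
    key = f"{user_id}:{item_id}".encode("utf-8")
    bucket = zlib.crc32(key) % 1000
    return bucket < int(test_fraction * 1000)


def split_preferences(lines, test_fraction=0.1):
    """Split 'userID,itemID,value' lines into (training, test) lists."""
    training, test = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user_id, item_id, _value = line.split(",")
        if is_test_pref(user_id, item_id, test_fraction):
            test.append(line)
        else:
            training.append(line)
    return training, test
```

The training lines would then be fed to RecommenderJob, and the test lines
kept aside for the evaluation phase.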

-- 
View this message in context: http://lucene.472066.n3.nabble.com/Evaluator-for-RecommenderJob-hadoop-implementation-tp1515638p1517829.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Evaluator for RecommenderJob (hadoop implementation)?

Posted by Sean Owen <sr...@gmail.com>.

It would be neither, I'd imagine. You need a stand-alone class that
can read both the input and the output from HDFS and do the eval
math. There also needs to be some pre-processing stage to segregate
test prefs from training prefs, and run recs appropriately. It's all
straightforward but will take a bit of code.
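
To make the eval math concrete, here is a minimal sketch in Python rather
than a Mahout class. It assumes the held-out test prefs are
"userID,itemID,value" lines and that the job output looks like
"userID<TAB>[itemID:score,itemID:score]"; that output layout and all function
names are assumptions for illustration. Reading from HDFS is left out, so the
functions simply take iterables of lines.

```python
import math


def parse_recommendations(lines):
    """Parse 'userID<TAB>[itemID:score,...]' lines (assumed layout)."""
    estimates = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user, items = line.split("\t", 1)
        for pair in items.strip("[]").split(","):
            if pair:
                item, score = pair.split(":")
                estimates[(user, item)] = float(score)
    return estimates


def parse_preferences(lines):
    """Parse held-out 'userID,itemID,value' lines."""
    prefs = {}
    for line in lines:
        line = line.strip()
        if line:
            user, item, value = line.split(",")
            prefs[(user, item)] = float(value)
    return prefs


def evaluate(test_prefs, estimates):
    """Return (MAE, RMSE, n) over pairs present in both inputs."""
    errors = [estimates[k] - v for k, v in test_prefs.items() if k in estimates]
    if not errors:
        return float("nan"), float("nan"), 0
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return mae, rmse, n
```

Note that only held-out items that actually appear in the recommendation
output can be scored this way, so the count n is worth reporting next to the
error values.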

On Fri, Sep 17, 2010 at 10:40 AM, gabeweb <ga...@htc.com> wrote:
>
> I've actually been planning on doing exactly the same thing in the near
> future (I think it would be a Recommender class, not a RecommenderEvaluator
> class, right?), but I'm not sure if that means in a week, or three.  So I
> would echo the sentiment that if you could contribute that, it would be
> great.  Otherwise, I will certainly be doing it soon, and I would be happy
> to contribute it -- in exchange for some eyes looking at my implementation
> to make sure that it makes sense, as I've only recently started using
> Mahout.

Re: Evaluator for RecommenderJob (hadoop implementation)?

Posted by gabeweb <ga...@htc.com>.

I've actually been planning on doing exactly the same thing in the near
future (I think it would be a Recommender class, not a RecommenderEvaluator
class, right?), but I'm not sure if that means in a week, or three.  So I
would echo the sentiment that if you could contribute that, it would be
great.  Otherwise, I will certainly be doing it soon, and I would be happy
to contribute it -- in exchange for some eyes looking at my implementation
to make sure that it makes sense, as I've only recently started using
Mahout.

-- 
View this message in context: http://lucene.472066.n3.nabble.com/Evaluator-for-RecommenderJob-hadoop-implementation-tp1515638p1516460.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Evaluator for RecommenderJob (hadoop implementation)?

Posted by Sean Owen <sr...@gmail.com>.

No, I don't know of such a thing. It'd be great if you implement it and
are in a position to contribute it.

On Fri, Sep 17, 2010 at 1:59 AM, Stanley Ipkiss <sa...@gmail.com> wrote:
>
> Has someone already written an evaluator for the Hadoop implementation of CF?
> I was looking for something like the RecommenderEvaluator class that could
> give me the MAE, RMSE, or some other evaluation metric for the results
> I get through the MapReduce implementation of RecommenderJob. I know it
> shouldn't be too difficult to write one, but if someone has already written
> it, that would save me a day or so. Any pointers will be appreciated.
>
> Thanks!

Re: Evaluator for RecommenderJob (hadoop implementation)?

Posted by Mark <st...@gmail.com>.

Sorry for the hijack. Reposting.

On 9/18/10 10:25 AM, Software Dev wrote:
> I am trying to run FPGrowth:
>
> hadoop jar /opt/mahout-0.3/mahout-examples-0.3.job
> org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver -i
> output/product/part-r-00000 -o pfp -method mapreduce -regex [\\t] -s 5
> -g 17500 -k 50
>
> However, the 3rd task, "Processing FPTree: Bottom Up FP Growth > reduce",
> will not finish. It's basically stuck at 85% and hasn't budged in over an
> hour. The output of the first task showed there were about 37K features,
> so I set -g to 17500. Does anyone know what's going on and how I can speed
> this up?
>
> Thanks

Re: Evaluator for RecommenderJob (hadoop implementation)?

Posted by Sebastian Schelter <ss...@apache.org>.

When starting a new discussion on a mailing list, please do not reply to
an existing message; instead, start a fresh email. Even if you change the
subject line of your email, other mail headers still track which thread
you replied to, so your question is "hidden" in that thread and gets less
attention. It also makes following discussions in the mailing list archives
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking

> I am trying to run FPGrowth:
>
> hadoop jar /opt/mahout-0.3/mahout-examples-0.3.job
> org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver -i
> output/product/part-r-00000 -o pfp -method mapreduce -regex [\\t] -s 5
> -g 17500 -k 50
>
> However, the 3rd task, "Processing FPTree: Bottom Up FP Growth > reduce",
> will not finish. It's basically stuck at 85% and hasn't budged in over an
> hour. The output of the first task showed there were about 37K features,
> so I set -g to 17500. Does anyone know what's going on and how I can speed
> this up?
>
> Thanks

Re: Evaluator for RecommenderJob (hadoop implementation)?

Posted by Ted Dunning <te...@gmail.com>.

I don't know the answer to this, but previously this kind of problem was
caused by highly skewed statistics in the input data.

If there are things that cooccur with everything, then you will have
problems with the speed of the algorithm.

Can you say something about the distribution of your data?  Can you post a
frequency by rank table?
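
For reference, a minimal sketch of how such a frequency-by-rank table could
be computed, written in Python rather than as part of Mahout. It assumes one
transaction per line with tab-separated items (matching the -regex [\\t]
setting in the command); the function name is hypothetical.

```python
from collections import Counter


def frequency_by_rank(transactions, delimiter="\t"):
    """Return [(rank, item, frequency), ...], most frequent item first.

    Each item is counted at most once per transaction, so the frequency
    is the number of transactions that contain the item.
    """
    counts = Counter()
    for line in transactions:
        items = {token for token in line.strip().split(delimiter) if token}
        counts.update(items)
    # Sort by descending frequency, breaking ties by item name.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, item, freq)
            for rank, (item, freq) in enumerate(ranked, start=1)]
```

If the top-ranked items appear in nearly every transaction, that is the kind
of skew described above.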

On Sat, Sep 18, 2010 at 10:25 AM, Software Dev <st...@gmail.com> wrote:

> I am trying to run FPGrowth:
>
> hadoop jar /opt/mahout-0.3/mahout-examples-0.3.job
> org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver -i
> output/product/part-r-00000 -o pfp -method mapreduce -regex [\\t] -s 5
> -g 17500 -k 50
>
> However, the 3rd task, "Processing FPTree: Bottom Up FP Growth > reduce",
> will not finish. It's basically stuck at 85% and hasn't budged in over an
> hour. The output of the first task showed there were about 37K features,
> so I set -g to 17500. Does anyone know what's going on and how I can speed
> this up?
>
> Thanks

Re: Evaluator for RecommenderJob (hadoop implementation)?

Posted by Software Dev <st...@gmail.com>.

I am trying to run FPGrowth:

hadoop jar /opt/mahout-0.3/mahout-examples-0.3.job
org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver -i
output/product/part-r-00000 -o pfp -method mapreduce -regex [\\t] -s 5
-g 17500 -k 50

However, the 3rd task, "Processing FPTree: Bottom Up FP Growth > reduce",
will not finish. It's basically stuck at 85% and hasn't budged in over an
hour. The output of the first task showed there were about 37K features,
so I set -g to 17500. Does anyone know what's going on and how I can speed
this up?

Thanks