Posted to user@mahout.apache.org by Steven Bourke <sb...@gmail.com> on 2010/10/18 01:17:57 UTC

Why does evaluating a recommender take far less time than actually generating results?

Hi,

I've previously tested a variety of recommenders from Mahout using the
built-in evaluation framework (MAE, precision and recall).

I'm now generating a full list of recommendations for the users in my
dataset. Previously, getting precision and recall results back on this
dataset took a matter of minutes using .7 as the training percentage. I now
notice that generating recommendations for all users in my dataset takes
substantially longer. Any idea what I could be doing wrong?

My code is as follows:

    import java.util.List;

    import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
    import org.apache.mahout.cf.taste.recommender.RecommendedItem;

    // Print the top-5 recommendations for every user in the DataModel.
    // getUserIDs() and recommend() both throw TasteException.
    LongPrimitiveIterator userList = model.getUserIDs();
    while (userList.hasNext()) {
        long id = userList.nextLong(); // primitive long, avoids boxing
        List<RecommendedItem> recommendations = recommender.recommend(id, 5);
        for (RecommendedItem reco : recommendations) {
            System.out.println(id + " likes " + reco);
        }
    }

Re: Why does evaluating a recommender take far less time than actually generating results?

Posted by Sean Owen <sr...@gmail.com>.
No, the test data can't be included in the training data, or else it would
be like giving a student the answers to the exam beforehand.

The evaluation is doing much less work for other reasons. Recommendation is
a bigger problem: producing one set of recommendations may require computing
many estimated preferences, while each evaluation step computes just a
single estimated preference.
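
To make that contrast concrete, here is a rough sketch (not Steven's code;
the userID and itemID of a held-out pair are hypothetical, and both methods
are on org.apache.mahout.cf.taste.recommender.Recommender):

    // One evaluation step: estimate a single held-out preference.
    // This touches one user-item pair.
    float estimate = recommender.estimatePreference(userID, itemID);

    // One recommendation step: the recommender must score many candidate
    // items for this user just to return the top 5.
    List<RecommendedItem> top5 = recommender.recommend(userID, 5);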

There is also an additional parameter, "evaluation percentage" -- are you
setting it to less than 1? That simply throws out some percentage of all
the data entirely, which is a way to make the evaluation quicker (and less
accurate) by shrinking the problem.
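
For reference, a minimal sketch of an evaluator call showing where the two
percentages go (the user-based recommender inside the builder is just an
illustrative choice; the classes are from the org.apache.mahout.cf.taste
packages):

    RecommenderEvaluator evaluator =
        new AverageAbsoluteDifferenceRecommenderEvaluator();

    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel trainingData)
          throws TasteException {
        // Built only from the training split, never the held-out data.
        UserSimilarity similarity =
            new PearsonCorrelationSimilarity(trainingData);
        return new GenericUserBasedRecommender(trainingData,
            new NearestNUserNeighborhood(10, similarity, trainingData),
            similarity);
      }
    };

    // 4th arg = training percentage: 70% of the preferences train the
    // model, the other 30% are estimated and compared to real values.
    // 5th arg = evaluation percentage: 1.0 uses all the data; anything
    // below 1.0 discards a random share of it outright.
    double mae = evaluator.evaluate(builder, null, model, 0.7, 1.0);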

Re: Why does evaluating a recommender take far less time than actually generating results?

Posted by Steven Bourke <sb...@gmail.com>.
What I mean is that my assumption (which of course can be wrong!) is that
when using the evaluation framework, Mahout will inspect all users and
preferences in the dataset (if specified). Therefore using .7 as training
and the remaining .3 to predict on would involve using the dataset in its
entirety. However, when I generate recommendations for all users with the
code attached in the original mail, it appears to take a good bit longer
to complete.

I was wondering: is the evaluation framework doing anything different, or
cutting corners somehow?

On Mon, Oct 18, 2010 at 7:38 AM, Sebastian Schelter <ss...@apache.org> wrote:

> Hi Steve,
>
> If I understand you correctly, your question is why it takes longer to
> compute recommendations for all users than to run an evaluation with 0.7
> as the training percentage?
>
> That would be because if you use 70% of the ratings for training, you
> only need to estimate preferences for the remaining 30% of the ratings.
>
> --sebastian

Re: Why does evaluating a recommender take far less time than actually generating results?

Posted by Sebastian Schelter <ss...@apache.org>.
Hi Steve,

If I understand you correctly, your question is why it takes longer to
compute recommendations for all users than to run an evaluation with 0.7
as the training percentage?

That would be because if you use 70% of the ratings for training, you
only need to estimate preferences for the remaining 30% of the ratings.
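
For illustration, with made-up numbers: given 1,000,000 ratings and a 0.7
training percentage, the evaluation only has to estimate the ~300,000
held-out preferences. Recommending a top-5 list for every user, by
contrast, may force the recommender to score a large share of the candidate
items for each user, which quickly adds up to far more estimated
preferences than the evaluation ever computes.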

--sebastian
