Posted to user@mahout.apache.org by Stanley Ipkiss <sa...@gmail.com> on 2010/09/24 01:12:56 UTC

Evaluation approach in AbstractDifferenceRecommenderEvaluator

In AbstractDifferenceRecommenderEvaluator (in o.a.m.cf.taste.impl.eval), the
method processOneUser contains:


      if (random.nextDouble() < trainingPercentage) {
        if (trainingPrefs == null) {
          trainingPrefs = new ArrayList<Preference>(3);
        }
        trainingPrefs.add(newPref);
      } else {
        if (testPrefs == null) {
          testPrefs = new ArrayList<Preference>(3);
        }
        testPrefs.add(newPref);
      }
    }

Why limit the number of preferences (per user) used in the training or
testing set to 3? Why not raise it to a more significant number (say, 10)
or, better, include all of them? The evaluation results we get with this
limit may not be right. I know that limiting it to 3 would make evaluation
much faster, but I was curious whether it has any other advantage that I am
missing.

Also, the documentation says that training % + evaluation % need not
sum to 1. But here, each of a user's preferences is put into either the
training or the testing preferences. This effectively means that the
training and evaluation percentages sum to 1, since each data point falls
into exactly one of the two categories, never both.
-- 
View this message in context: http://lucene.472066.n3.nabble.com/Evaluation-approach-in-AbstractDifferenceRecommenderEvaluator-tp1571032p1571032.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Evaluation approach in AbstractDifferenceRecommenderEvaluator

Posted by Sean Owen <sr...@gmail.com>.

I agree with that. But that is not what these figures are.

Evaluation percentage is purely a lever to reduce the size of the
input for speed. If evaluation percentage is 0.15 (15%), then 85% of
all data is thrown out upfront.

Training percentage is what you're talking about. If it is 90%, then
90% of the remaining data goes into the training model, and the other
10% of the remaining data is for testing.
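
To make the two roles concrete, here is a minimal sketch of that two-stage split. It is not the Mahout implementation: the class SplitSketch, its Split holder and the split method are hypothetical names, the data points are plain integers, and the evaluation filter is applied per data point purely for illustration (the real evaluator may apply it at a different granularity).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class SplitSketch {

  // Hypothetical holder for the two resulting sets.
  static final class Split {
    final List<Integer> training = new ArrayList<Integer>();
    final List<Integer> test = new ArrayList<Integer>();
  }

  static Split split(List<Integer> data, double evaluationPercentage,
                     double trainingPercentage, Random random) {
    Split result = new Split();
    for (Integer point : data) {
      // Stage 1: keep only a fraction of the input; the rest is discarded.
      if (random.nextDouble() >= evaluationPercentage) {
        continue;
      }
      // Stage 2: divide what was kept between training and test.
      if (random.nextDouble() < trainingPercentage) {
        result.training.add(point);
      } else {
        result.test.add(point);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Integer> data = new ArrayList<Integer>();
    for (int i = 0; i < 100000; i++) {
      data.add(i);
    }
    Split s = split(data, 0.15, 0.9, new Random(42L));
    System.out.println("training=" + s.training.size() + " test=" + s.test.size());
  }
}
```

With evaluationPercentage = 0.15 and trainingPercentage = 0.9, roughly 13.5% of the input should end up in training and roughly 1.5% in test, so the two parameters are independent and need not sum to 1.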

On Fri, Sep 24, 2010 at 3:01 AM, Stanley Ipkiss <sa...@gmail.com> wrote:
>
> But what about the quoted text below? Do you agree that each sample point is
> put in either the test or training set, and hence the two percentages have to
> sum to one for this particular implementation?

Re: Evaluation approach in AbstractDifferenceRecommenderEvaluator

Posted by Stanley Ipkiss <sa...@gmail.com>.

But what about the quoted text below? Do you agree that each sample point is
put in either the test or training set, and hence the two percentages have to
sum to one for this particular implementation?


Stanley Ipkiss wrote:
> 
> Also, the documentation says that training % + evaluation % need not
> sum to 1. But here, each of a user's preferences is put into either the
> training or the testing preferences. This effectively means that the
> training and evaluation percentages sum to 1, since each data point falls
> into exactly one of the two categories, never both.
> 

-- 
View this message in context: http://lucene.472066.n3.nabble.com/Evaluation-approach-in-AbstractDifferenceRecommenderEvaluator-tp1571032p1571633.html

Re: Evaluation approach in AbstractDifferenceRecommenderEvaluator

Posted by Stanley Ipkiss <sa...@gmail.com>.

Ohh! my bad. (*embarrassed!*) 
-- 
View this message in context: http://lucene.472066.n3.nabble.com/Evaluation-approach-in-AbstractDifferenceRecommenderEvaluator-tp1571032p1571437.html

Re: Evaluation approach in AbstractDifferenceRecommenderEvaluator

Posted by gabeweb <ga...@htc.com>.

3 is just the initial capacity; as you add elements, the array grows
automatically:

http://download.oracle.com/javase/6/docs/api/java/util/ArrayList.html
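
A small sketch of that behaviour, using only the standard java.util.ArrayList; the class name InitialCapacityDemo and the method fill are made up for illustration and are not part of Mahout:

```java
import java.util.ArrayList;
import java.util.List;

class InitialCapacityDemo {

  // Adds `count` integers to a list created with an initial capacity of 3.
  static List<Integer> fill(int count) {
    // The constructor argument is a capacity hint, not a size limit.
    List<Integer> prefs = new ArrayList<Integer>(3);
    for (int i = 0; i < count; i++) {
      prefs.add(i); // the backing array is reallocated when it fills up
    }
    return prefs;
  }

  public static void main(String[] args) {
    System.out.println(fill(10).size()); // prints 10
  }
}
```

So the 3 in the quoted code only sizes the first allocation; every preference that is added still ends up in the list.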

-- 
View this message in context: http://lucene.472066.n3.nabble.com/Evaluation-approach-in-AbstractDifferenceRecommenderEvaluator-tp1571032p1571402.html