Posted to user@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2011/10/26 14:56:02 UTC

User vs. Item performance

I seem to recall past discussions on where one hits the bottleneck w/ user based recommendation approaches in Mahout, but I can't seem to locate it anymore.   Anyone know off hand?  Where do user based approaches hit their limits, more or less?

Thanks,
Grant

Re: User vs. Item performance

Posted by Ted Dunning <te...@gmail.com>.

Item-based recommendations can also use more expensive off-line computations,
which can make recommendations more accurate.  SVD-based methods in
particular can be very useful, especially with smaller data sets.

On Wed, Oct 26, 2011 at 6:52 AM, Sean Owen <sr...@gmail.com> wrote:

> Yes, I would still say so. You could still easily find this too slow
> if you're using user-user similarities and there are a lot of users
> and few items behind these 100M data points. Or vice versa. Past this
> point it's almost certainly too slow; before this point it could also
> be slow. You would tend to choose user-based if you have relatively
> fewer users. I don't know if there's a hard-and-fast guideline there.
>
> On Wed, Oct 26, 2011 at 2:50 PM, Grant Ingersoll <gs...@apache.org>
> wrote:
> > Sorry, should have been more clear.  I was referring to using a
> > user-based recommender (e.g. GenericUserBasedRecommender) vs. an item-based
> > recommender.  Our general recommendation is that user-based approaches won't
> > scale; I was wondering what the general cutoff is on a single machine, more
> > or less.  Is it still 100M data points, roughly speaking?
> >
>

Re: User vs. Item performance

Posted by Sean Owen <sr...@gmail.com>.

Yes, I would still say so. You could still easily find this too slow
if you're using user-user similarities and there are a lot of users
and few items behind these 100M data points. Or vice versa. Past this
point it's almost certainly too slow; before this point it could also
be slow. You would tend to choose user-based if you have relatively
fewer users. I don't know if there's a hard-and-fast guideline there.
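
As a rough illustration of why the shape of the data matters and not just the 100M total, here is a back-of-the-envelope sketch. It only counts distinct pairs, n * (n - 1) / 2, on each side. The class name, the method name, and the two data shapes are hypothetical, and this is not how Mahout schedules its work; treat the pair count as an upper bound on the distinct similarities that could ever be needed, not as a measured cost.

```java
/** Back-of-the-envelope pair counts; a hypothetical model, not Mahout code. */
final class SimilarityCostSketch {

    /** Number of distinct unordered pairs among n entities: n * (n - 1) / 2. */
    static long distinctPairs(long n) {
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        // Two assumed shapes that could both hold about 100M data points.
        // Shape A: 10M users with about 10 preferences each, over 10K items.
        long usersA = 10_000_000L;
        long itemsA = 10_000L;
        // Shape B: 10K users with about 10K preferences each, over 10M items.
        long usersB = 10_000L;
        long itemsB = 10_000_000L;

        System.out.printf("Shape A: user-user pairs = %,d, item-item pairs = %,d%n",
                distinctPairs(usersA), distinctPairs(itemsA));
        System.out.printf("Shape B: user-user pairs = %,d, item-item pairs = %,d%n",
                distinctPairs(usersB), distinctPairs(itemsB));
    }
}
```

Under these assumed shapes the side with fewer entities has far fewer pairs to consider, which is in line with choosing user-based when there are relatively fewer users.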

On Wed, Oct 26, 2011 at 2:50 PM, Grant Ingersoll <gs...@apache.org> wrote:
> Sorry, should have been more clear.  I was referring to using a user-based recommender (e.g. GenericUserBasedRecommender) vs. an item-based recommender.  Our general recommendation is that user-based approaches won't scale; I was wondering what the general cutoff is on a single machine, more or less.  Is it still 100M data points, roughly speaking?
>

Re: User vs. Item performance

Posted by Grant Ingersoll <gs...@apache.org>.

Sorry, should have been more clear.  I was referring to using a user-based recommender (e.g. GenericUserBasedRecommender) vs. an item-based recommender.  Our general recommendation is that user-based approaches won't scale; I was wondering what the general cutoff is on a single machine, more or less.  Is it still 100M data points, roughly speaking?

On Oct 26, 2011, at 8:57 AM, Sean Owen wrote:

> Limits in terms of scalability? If you mean, how much can you fit on
> one machine without Hadoop, I usually say 100M data points or so.
> Beyond that you can go as big as you like, but on Hadoop.
> 
> On Wed, Oct 26, 2011 at 1:56 PM, Grant Ingersoll <gs...@apache.org> wrote:
>> I seem to recall past discussions on where one hits the bottleneck w/ user based recommendation approaches in Mahout, but I can't seem to locate it anymore.   Anyone know off hand?  Where do user based approaches hit their limits, more or less?
>> 
>> Thanks,
>> Grant
>>

Re: User vs. Item performance

Posted by Sean Owen <sr...@gmail.com>.

Limits in terms of scalability? If you mean, how much can you fit on
one machine without Hadoop, I usually say 100M data points or so.
Beyond that you can go as big as you like, but on Hadoop.
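
To put the 100M figure in perspective, here is a minimal arithmetic sketch of the raw payload. It assumes each data point needs an 8-byte long ID plus a 4-byte float value, so 12 bytes; that per-preference size and the class and method names are assumptions for illustration. Actual heap use will be higher once object overhead, indexes, and similarity caches are counted.

```java
/** Raw payload estimate for in-memory preferences; a hypothetical sketch. */
final class PreferenceMemorySketch {

    /** Raw bytes for the given number of data points at an assumed per-preference size. */
    static long rawBytes(long dataPoints, int bytesPerPreference) {
        return dataPoints * bytesPerPreference;
    }

    public static void main(String[] args) {
        long dataPoints = 100_000_000L;
        // Assumption: 8-byte long ID + 4-byte float value, no object overhead.
        int bytesPerPreference = Long.BYTES + Float.BYTES;

        long bytes = rawBytes(dataPoints, bytesPerPreference);
        double gib = bytes / (1024.0 * 1024.0 * 1024.0);
        System.out.printf("Raw preference payload: %,d bytes (about %.2f GiB)%n", bytes, gib);
    }
}
```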

On Wed, Oct 26, 2011 at 1:56 PM, Grant Ingersoll <gs...@apache.org> wrote:
> I seem to recall past discussions on where one hits the bottleneck w/ user based recommendation approaches in Mahout, but I can't seem to locate it anymore.   Anyone know off hand?  Where do user based approaches hit their limits, more or less?
>
> Thanks,
> Grant
>