Posted to user@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2011/10/26 14:56:02 UTC
User vs. Item performance
I seem to recall past discussions on where one hits the bottleneck w/ user based recommendation approaches in Mahout, but I can't seem to locate it anymore. Anyone know off hand? Where do user based approaches hit their limits, more or less?
Thanks,
Grant
Re: User vs. Item performance
Posted by Ted Dunning <te...@gmail.com>.
Item based recommendations can also use more expensive off-line computations
which can make recommendations more accurate. SVD based methods in
particular can be very useful, especially with smaller data sets.
On Wed, Oct 26, 2011 at 6:52 AM, Sean Owen <sr...@gmail.com> wrote:
> Yes, I would still say so. You could still easily find this too slow
> if you're using user-user similarities and there are a lot of users
> and few items behind these 100M data points. Or vice versa. Past this
> point it's almost certainly too slow; before this point it could also
> be slow. You would tend to choose user-based if you have relatively
> fewer users. I don't know if there's a hard-and-fast guideline there.
>
> On Wed, Oct 26, 2011 at 2:50 PM, Grant Ingersoll <gs...@apache.org> wrote:
> > Sorry, should have been more clear. I was referring to if one is using a
> > user based recommender (e.g GenericUserBasedRecommender) vs. item based
> > recommender. Our general recommendation is that user based approaches won't
> > scale, I was wondering what the general cutoff is on a single machine, more
> > or less. Is it still 100M data points, roughly speaking?
> >
>
Re: User vs. Item performance
Posted by Sean Owen <sr...@gmail.com>.
Yes, I would still say so. You could still easily find this too slow
if you're using user-user similarities and there are a lot of users
and few items behind these 100M data points. Or vice versa. Past this
point it's almost certainly too slow; before this point it could also
be slow. You would tend to choose user-based if you have relatively
fewer users. I don't know if there's a hard-and-fast guideline there.
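To make the point about user counts concrete, here is a rough back-of-the-envelope sketch in Python (not Mahout code). It assumes the worst case in which every pair of users, or every pair of items, is a candidate for a similarity computation; real recommenders usually compute far fewer pairs because of sparsity and neighborhood limits. The function names and the example counts are hypothetical, chosen only for illustration.

```python
def pair_count(n: int) -> int:
    """Number of unordered pairs among n entities: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def cheaper_neighborhood(num_users: int, num_items: int) -> str:
    """Pick the side with fewer worst-case similarity pairs.

    This is only a heuristic based on pair counts; it ignores sparsity,
    caching, and the cost of each individual similarity computation.
    Ties fall through to "item-based".
    """
    user_pairs = pair_count(num_users)
    item_pairs = pair_count(num_items)
    return "user-based" if user_pairs < item_pairs else "item-based"


if __name__ == "__main__":
    # Hypothetical data set shapes, not measurements.
    for users, items in [(1_000_000, 10_000), (5_000, 200_000)]:
        print(users, items,
              pair_count(users), pair_count(items),
              cheaper_neighborhood(users, items))
```

With many users and few items, the user-user pair count dominates, which is consistent with preferring user-based approaches only when users are relatively few.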
On Wed, Oct 26, 2011 at 2:50 PM, Grant Ingersoll <gs...@apache.org> wrote:
> Sorry, should have been more clear. I was referring to if one is using a user based recommender (e.g GenericUserBasedRecommender) vs. item based recommender. Our general recommendation is that user based approaches won't scale, I was wondering what the general cutoff is on a single machine, more or less. Is it still 100M data points, roughly speaking?
>
Re: User vs. Item performance
Posted by Grant Ingersoll <gs...@apache.org>.
Sorry, I should have been clearer. I was referring to using a user-based recommender (e.g. GenericUserBasedRecommender) vs. an item-based recommender. Our general recommendation is that user-based approaches won't scale; I was wondering what the general cutoff is on a single machine, more or less. Is it still 100M data points, roughly speaking?
On Oct 26, 2011, at 8:57 AM, Sean Owen wrote:
> Limits in terms of scalability? If you mean, how much can you fit on
> one machine without Hadoop, I usually say 100M data points or so.
> Beyond that you can go as big as you like, but on Hadoop.
>
> On Wed, Oct 26, 2011 at 1:56 PM, Grant Ingersoll <gs...@apache.org> wrote:
>> I seem to recall past discussions on where one hits the bottleneck w/ user based recommendation approaches in Mahout, but I can't seem to locate it anymore. Anyone know off hand? Where do user based approaches hit their limits, more or less?
>>
>> Thanks,
>> Grant
>>
Re: User vs. Item performance
Posted by Sean Owen <sr...@gmail.com>.
Limits in terms of scalability? If you mean, how much can you fit on
one machine without Hadoop, I usually say 100M data points or so.
Beyond that you can go as big as you like, but on Hadoop.
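For a feel of what 100M data points means for memory on one machine, a minimal Python sketch of the arithmetic follows. The bytes-per-preference figure is an assumption, not a number from this thread: 12 bytes corresponds to an 8-byte item ID plus a 4-byte value and ignores all object, index, and JVM overhead, so treat the result as a floor rather than a measurement. The function name is hypothetical.

```python
def estimated_gib(num_prefs: int, bytes_per_pref: int) -> float:
    """Raw storage for num_prefs preferences, in GiB (2**30 bytes)."""
    return num_prefs * bytes_per_pref / 2**30


if __name__ == "__main__":
    # Assumed cost per preference: 8-byte ID + 4-byte value, no overhead.
    print(round(estimated_gib(100_000_000, 12), 2), "GiB")
```

Actual heap usage depends on the data model implementation and will be higher than this raw estimate.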
On Wed, Oct 26, 2011 at 1:56 PM, Grant Ingersoll <gs...@apache.org> wrote:
> I seem to recall past discussions on where one hits the bottleneck w/ user based recommendation approaches in Mahout, but I can't seem to locate it anymore. Anyone know off hand? Where do user based approaches hit their limits, more or less?
>
> Thanks,
> Grant
>