Posted to user@mahout.apache.org by Young <wo...@126.com> on 2010/08/29 21:47:05 UTC

CPU Time

Hi all,
 
Based on a 1M dataset, roughly how many requests could be expected to be handled at a time when using an item-based recommender, if the engine runs on a Core 2 2.4 GHz CPU with 4 GB of memory?
 
Thank you very much.
 
-- Young

Re: CPU Time

Posted by Akshay Bhat <ak...@gmail.com>.

On Sun, Aug 29, 2010 at 7:19 PM, Akshay Bhat <ak...@gmail.com> wrote:

> An item-based recommender can be cached, so if you are recommending similar
> items based on the current item being looked at/purchased, it would just be a
> database lookup.
> For an SVD-based recommender, computing similar items for 1M items with 50
> ~ 100 eigenvectors should take ~5-6 hours on a similar machine.
>
Please note that this is the time required to find similar items after the SVD
has been performed. The time for the SVD itself would depend on the number of users.

> You can generate a new model every few days and update the database of similar
> items.

-- 
Akshay Uday Bhat.
Graduate Student, Computer Science, Cornell University
Website: http://www.akshaybhat.com

Re: CPU Time

Posted by Akshay Bhat <ak...@gmail.com>.

An item-based recommender can be cached, so if you are recommending similar
items based on the current item being looked at/purchased, it would just be a
database lookup.
For an SVD-based recommender, computing similar items for 1M items with 50 ~
100 eigenvectors should take ~5-6 hours on a similar machine.
You can generate a new model every few days and update the database of similar
items.



Re: CPU Time

Posted by Sean Owen <sr...@gmail.com>.

It could vary a lot. How many users? How many items? Which similarity metric?

But, that's fairly small. I'd be surprised if you couldn't do
recommendations in under 100ms per request per core.
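
To turn that latency estimate into a throughput figure, here is a back-of-envelope sketch. It assumes each core serves one request at a time and that the 100 ms bound holds. The core count is an assumption, since the thread only says "Core2", and `requests_per_second` is a hypothetical helper, not part of Mahout.

```python
def requests_per_second(latency_ms: float, cores: int) -> float:
    """Upper-bound throughput if each core serves one request at a time."""
    return cores * 1000.0 / latency_ms


# Assumption: a dual-core Core 2; the thread does not state the core count.
print(requests_per_second(latency_ms=100, cores=2))  # prints 20.0
```

So under these assumptions the machine would handle on the order of 10 requests per second per core, before any caching or precomputation.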


Re:Re: CPU Time

Posted by Young <wo...@126.com>.

Thank you very much. That makes sense.
 
-- Young




At 2010-08-30 14:59:14, "Sebastian Schelter" <ss...@apache.org> wrote:

>Hi Young,
>
>There are several tweaks available that can help you reduce response time.
>[...]

Re: CPU Time

Posted by Sebastian Schelter <ss...@apache.org>.

Hi Young,

There are several tweaks available that can help you reduce response time.

* You can precompute the item similarities offline with
ItemSimilarityJob and load the results into memory in the online
recommender, so it can just look them up from RAM.
* In ItemSimilarityJob you can impose a limit on the number of similar
items per single item in the results. This way you can limit the number
of similarities taken into consideration while computing recommendations.
* At the start of the recommendation process, a CandidateItemsStrategy
implementation is used to identify the initial set of items that might
be worth recommending. There are several implementations available that
might be worth trying, and you can also create your own version optimized
for your use case.
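
The three tips above can be sketched together in a few lines. This is a language-neutral illustration in Python, not Mahout's API: `top_k_similarities` and `recommend` are hypothetical names, the input pairs stand in for the output of an offline similarity job, and the candidate rule used here (neighbours of items the user has already rated) is only one possible strategy.

```python
from collections import defaultdict


def top_k_similarities(pairs, k):
    """Build an in-memory index keeping the k most similar neighbours per item.

    pairs: iterable of (item_a, item_b, similarity), as an offline job
    might emit them. Similarity is treated as symmetric.
    """
    neighbours = defaultdict(list)
    for a, b, sim in pairs:
        neighbours[a].append((b, sim))
        neighbours[b].append((a, sim))
    return {
        item: sorted(sims, key=lambda s: s[1], reverse=True)[:k]
        for item, sims in neighbours.items()
    }


def recommend(user_prefs, neighbours, how_many):
    """Score candidates by the similarity-weighted average of the user's ratings.

    user_prefs: {item: rating}. Candidates are the indexed neighbours of
    items the user has already rated, excluding those rated items.
    """
    weighted = defaultdict(float)
    weights = defaultdict(float)
    for item, rating in user_prefs.items():
        for other, sim in neighbours.get(item, []):
            if other in user_prefs:
                continue  # never recommend something the user already has
            weighted[other] += sim * rating
            weights[other] += abs(sim)
    scored = [(c, weighted[c] / weights[c]) for c in weighted if weights[c] > 0]
    scored.sort(key=lambda s: s[1], reverse=True)
    return scored[:how_many]


# Made-up similarities, purely for illustration.
pairs = [("A", "B", 0.9), ("A", "C", 0.5), ("B", "C", 0.8),
         ("C", "D", 0.7), ("A", "D", 0.1)]
index = top_k_similarities(pairs, k=2)
print(recommend({"A": 5.0, "B": 3.0}, index, how_many=3))
```

With k=2 the weak A-D similarity is dropped from A's neighbour list, so D is never scored for this user. That is the trade-off of capping neighbours: less work per request, at the cost of ignoring weak relations.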

Furthermore, one of the key benefits of item-based recommendation is that
the item relations tend to be very static. Thus it might be sufficient
to precompute the item similarities at intervals (once a day, for
example, depending on your use case). If you only update your data once
a day, you can consider the data read-only between those
updates, which makes it an ideal caching candidate. If you manage to
cache your computed recommendations, answering a request from an
in-memory cache should take less than 1 ms.
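
A minimal sketch of that caching idea, assuming the data is refreshed on a fixed interval and is read-only in between. `RecommendationCache` and its parameters are hypothetical names for illustration, not Mahout classes; `compute` stands in for whatever call produces recommendations for a user.

```python
import time


class RecommendationCache:
    """Caches computed recommendations until the next data refresh."""

    def __init__(self, compute, ttl_seconds, clock=time.monotonic):
        self._compute = compute      # function: user_id -> recommendations
        self._ttl = ttl_seconds      # how long an entry stays valid
        self._clock = clock          # injectable for testing
        self._entries = {}           # user_id -> (expires_at, recommendations)

    def get(self, user_id):
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and entry[0] > now:
            return entry[1]          # cache hit: a plain dictionary lookup
        recs = self._compute(user_id)
        self._entries[user_id] = (now + self._ttl, recs)
        return recs

    def invalidate_all(self):
        """Call after reloading the data so stale results are not served."""
        self._entries.clear()


# Usage: cache results for one day, matching a once-a-day data update.
cache = RecommendationCache(lambda user_id: ["item-%d" % user_id],
                            ttl_seconds=24 * 60 * 60)
print(cache.get(42))
```

Only the first request per user after a refresh pays for the computation; repeated requests are served from the dictionary.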

--sebastian
