Posted to user@mahout.apache.org by Wei Li <we...@gmail.com> on 2014/09/24 08:06:55 UTC

RowSimilarityJob countObservations

Hi All:

    In RowSimilarityJob, it looks like the main purpose of the
countObservations job is to count the number of users for each item. Is that
right? If so, why not compute the counts directly, as in the WordCount
example? The current implementation initializes a RandomAccessSparseVector,
which may cause an OutOfMemory error when the number of users is large. Am
I understanding this correctly? Thanks.
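
To make the idea concrete, here is a minimal sketch of the WordCount-style
counting I have in mind. It is plain Java, not Mahout or Hadoop code: the
class name ObservationCountSketch and the method countPerIndex are made up
for illustration, and each row is assumed to be given as the indices of its
non-zero entries. In a real MapReduce job the summing would happen in
reducers keyed by index, so no single task would need to hold all the counts.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not the Mahout implementation.
public class ObservationCountSketch {

    // Each row is represented by the indices of its non-zero entries
    // (an assumption made for this sketch).
    static Map<Integer, Long> countPerIndex(int[][] rows) {
        Map<Integer, Long> counts = new HashMap<>();
        for (int[] row : rows) {
            for (int index : row) {
                // WordCount logic: treat each non-zero entry as (index, 1)
                // and sum the ones per index.
                counts.merge(index, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Three rows with non-zero entries at the listed indices.
        int[][] rows = { {0, 2}, {2, 3}, {2} };
        System.out.println(countPerIndex(rows));
    }
}
```

Here the map only ever holds one entry per index that actually occurs, and
in a distributed version those entries would be spread across reducers.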


Best
Wei

Re: RowSimilarityJob countObservations

Posted by Wei Li <we...@gmail.com>.
Hi All:

     Does anybody know the answer to this, or have any hints? It is
somewhat urgent. Many thanks.

Best
Wei

On Wed, Sep 24, 2014 at 2:06 PM, Wei Li <we...@gmail.com> wrote:

> Hi All:
>
>     In RowSimilarityJob, it looks like the main purpose of the
> countObservations job is to count the number of users for each item. Is that
> right? If so, why not compute the counts directly, as in the WordCount
> example? The current implementation initializes a RandomAccessSparseVector,
> which may cause an OutOfMemory error when the number of users is large. Am
> I understanding this correctly? Thanks.
>
>
> Best
> Wei
>