Posted to user@mahout.apache.org by Lance Norskog <go...@gmail.com> on 2011/11/12 03:48:37 UTC

Cooccurrence job

org.apache.mahout.math.hadoop.similarity.cooccurrence.Vectors.merge(Iterable<VectorWritable>)

This ORs together several (sparse) VectorWritables. It does not sum
overlapping dimensions; it just overwrites them with the value from the
last vector in the list. Is this ok? Should they be summed?
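
To make the two behaviours concrete, here is a minimal sketch that uses plain
Java maps as a stand-in for sparse vectors. It is not the Mahout code: the
class MergeSketch and the methods mergeOverwrite and mergeSum are hypothetical
names chosen for illustration only.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; sparse vectors are modelled as index -> value maps.
public class MergeSketch {

    // Overwrite semantics: an entry from a later vector replaces an earlier one.
    public static Map<Integer, Double> mergeOverwrite(List<Map<Integer, Double>> parts) {
        Map<Integer, Double> out = new TreeMap<>();
        for (Map<Integer, Double> part : parts) {
            out.putAll(part);
        }
        return out;
    }

    // Summing semantics: entries that share an index are added together.
    public static Map<Integer, Double> mergeSum(List<Map<Integer, Double>> parts) {
        Map<Integer, Double> out = new TreeMap<>();
        for (Map<Integer, Double> part : parts) {
            for (Map.Entry<Integer, Double> e : part.entrySet()) {
                out.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, Double> a = Map.of(0, 1.0, 2, 3.0);
        Map<Integer, Double> b = Map.of(2, 5.0, 4, 7.0);
        // Index 2 is present in both vectors, so the two strategies differ there.
        System.out.println(mergeOverwrite(List.of(a, b))); // {0=1.0, 2=5.0, 4=7.0}
        System.out.println(mergeSum(List.of(a, b)));       // {0=1.0, 2=8.0, 4=7.0}
    }
}
```

The two results differ only at index 2, the one dimension both inputs set.
When the inputs share no dimension, both strategies return the same vector.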

-- 
Lance Norskog
goksron@gmail.com

Re: Cooccurrence job

Posted by Lance Norskog <go...@gmail.com>.
Thanks.

On Sat, Nov 12, 2011 at 2:54 AM, Sebastian Schelter <ss...@apache.org> wrote:

> On 12.11.2011 11:26, Sean Owen wrote:
> > Looking at the uses of it, I think it receives as input vectors that
> > are "non overlapping" and is just stitching them together, so yes it's
> > correct.
> > But Sebastian can double-check.
>
> It is used exactly as Sean says. It is applied in the first pass over
> the data which transposes the input matrix and we are sure that there
> are no overlapping dimensions.
>
>
> --sebastian
>
> >
> > On Sat, Nov 12, 2011 at 2:48 AM, Lance Norskog <go...@gmail.com> wrote:
> >> org.apache.mahout.math.hadoop.similarity.cooccurrence.Vectors.merge(Iterable<VectorWritable>)
> >>
> >> This ORs together several (sparse) VectorWritables. It does not sum
> >> together overlapping dimensions, it just overwrites them from the most
> >> final vector in the list. Is this ok? Should they be summed?
> >>
> >> --
> >> Lance Norskog
> >> goksron@gmail.com
> >>
>
>


-- 
Lance Norskog
goksron@gmail.com

Re: Cooccurrence job

Posted by Sebastian Schelter <ss...@apache.org>.
On 12.11.2011 11:26, Sean Owen wrote:
> Looking at the uses of it, I think it receives as input vectors that
> are "non overlapping" and is just stitching them together, so yes it's
> correct.
> But Sebastian can double-check.

It is used exactly as Sean says. It is applied in the first pass over
the data which transposes the input matrix and we are sure that there
are no overlapping dimensions.
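
A rough sketch of why a transposing pass yields non-overlapping partial
vectors, again with plain Java maps rather than the Mahout classes. The names
TransposeSketch, emitPartials, mergeDisjoint and transpose are hypothetical,
and the overlap check is an addition for illustration, not something the
actual job is claimed to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a map/reduce style matrix transposition.
public class TransposeSketch {

    // "Map" step: for one input row, emit a single-entry partial column
    // vector {rowIndex -> value} for every non-zero column of that row.
    public static void emitPartials(int rowIndex, Map<Integer, Double> row,
                                    Map<Integer, List<Map<Integer, Double>>> partialsByColumn) {
        for (Map.Entry<Integer, Double> cell : row.entrySet()) {
            partialsByColumn
                .computeIfAbsent(cell.getKey(), k -> new ArrayList<>())
                .add(Map.of(rowIndex, cell.getValue()));
        }
    }

    // "Reduce" step: stitch the partial vectors of one column together.
    // Throws if two partials set the same dimension, which makes the
    // no-overlap assumption explicit instead of silently overwriting.
    public static Map<Integer, Double> mergeDisjoint(List<Map<Integer, Double>> partials) {
        Map<Integer, Double> merged = new TreeMap<>();
        for (Map<Integer, Double> partial : partials) {
            for (Map.Entry<Integer, Double> e : partial.entrySet()) {
                if (merged.put(e.getKey(), e.getValue()) != null) {
                    throw new IllegalStateException("overlapping dimension " + e.getKey());
                }
            }
        }
        return merged;
    }

    // Transposes a sparse matrix given as rowIndex -> (columnIndex -> value).
    public static Map<Integer, Map<Integer, Double>> transpose(
            Map<Integer, Map<Integer, Double>> rows) {
        Map<Integer, List<Map<Integer, Double>>> partialsByColumn = new TreeMap<>();
        rows.forEach((rowIndex, row) -> emitPartials(rowIndex, row, partialsByColumn));
        Map<Integer, Map<Integer, Double>> columns = new TreeMap<>();
        partialsByColumn.forEach((col, partials) -> columns.put(col, mergeDisjoint(partials)));
        return columns;
    }

    public static void main(String[] args) {
        Map<Integer, Map<Integer, Double>> rows = new TreeMap<>();
        rows.put(0, Map.of(1, 2.0, 3, 4.0));
        rows.put(1, Map.of(1, 5.0));
        System.out.println(transpose(rows)); // {1={0=2.0, 1=5.0}, 3={0=4.0}}
    }
}
```

Each input row contributes at most one entry, keyed by its own row index, to
any given column, so the partial vectors collected for a column cannot set the
same dimension twice as long as every row index occurs once in the input.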


--sebastian

> 
> On Sat, Nov 12, 2011 at 2:48 AM, Lance Norskog <go...@gmail.com> wrote:
>> org.apache.mahout.math.hadoop.similarity.cooccurrence.Vectors.merge(Iterable<VectorWritable>)
>>
>> This ORs together several (sparse) VectorWritables. It does not sum
>> together overlapping dimensions, it just overwrites them from the most
>> final vector in the list. Is this ok? Should they be summed?
>>
>> --
>> Lance Norskog
>> goksron@gmail.com
>>


Re: Cooccurrence job

Posted by Sean Owen <sr...@gmail.com>.
Looking at the uses of it, I think it receives as input vectors that
are "non overlapping" and is just stitching them together, so yes it's
correct.
But Sebastian can double-check.

On Sat, Nov 12, 2011 at 2:48 AM, Lance Norskog <go...@gmail.com> wrote:
> org.apache.mahout.math.hadoop.similarity.cooccurrence.Vectors.merge(Iterable<VectorWritable>)
>
> This ORs together several (sparse) VectorWritables. It does not sum
> together overlapping dimensions, it just overwrites them from the most
> final vector in the list. Is this ok? Should they be summed?
>
> --
> Lance Norskog
> goksron@gmail.com
>