
Posted to user@mahout.apache.org by prasenjit mukherjee <pm...@quattrowireless.com> on 2009/11/23 05:48:42 UTC

Random Indexing for tensor SVD ( was Re: mahout examples)

Hi Jake,
   Do you intend to contribute some of the Random Indexing code?  I
am working on a multi-way clustering problem and was thinking of using
tensor SVD for it. In that context, I was wondering whether anyone has
used Random Indexing to solve the higher-order SVD problem.  I guess we
can extend the current 2D approach to higher dimensions while
generating the context vectors by iterating over the individual
contexts.

My concern (I am still working it out) is whether I would be violating
any other constraints between the non-reducing dimensions.
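
For readers unfamiliar with the 2D case referred to above, here is a
minimal sketch of plain Random Indexing on a term-by-document problem.
The toy corpus, the dimension (64), and the number of nonzeros (4) are
assumptions made for illustration; this is not the code being discussed
for Mahout.

```python
import numpy as np

def index_vector(rng, dim, nonzeros):
    """Sparse ternary random vector: a few +1/-1 entries, zeros elsewhere."""
    v = np.zeros(dim)
    pos = rng.choice(dim, size=nonzeros, replace=False)
    v[pos] = rng.choice([-1.0, 1.0], size=nonzeros)
    return v

# Hypothetical toy corpus: each document is one "context".
docs = {
    "d1": ["svd", "tensor", "svd"],
    "d2": ["tensor", "cluster"],
    "d3": ["cluster", "svd"],
}

rng = np.random.default_rng(0)
dim, nonzeros = 64, 4

# One fixed random index vector per context.
doc_index = {d: index_vector(rng, dim, nonzeros) for d in docs}

# Iterate over the contexts; every occurrence of a term adds the index
# vector of the context it occurred in to that term's context vector.
term_vectors = {}
for d, terms in docs.items():
    for t in terms:
        term_vectors.setdefault(t, np.zeros(dim))
        term_vectors[t] += doc_index[d]
```

Each term ends up with a dense 64-dimensional vector without the full
term-by-document matrix ever being built.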

-Prasen

On Sun, Nov 22, 2009 at 10:37 PM, Jake Mannix <ja...@gmail.com> wrote:

<snipped/>

> The machinery to do the above in parallel on "ridiculously big" data on
> Hadoop
> should be coming in soon with some of the stuff I'm working on contributing
> to Mahout.
>
>  -jake
>

Re: Random Indexing for tensor SVD ( was Re: mahout examples)

Posted by Ted Dunning <te...@gmail.com>.

I should have made a different comment earlier.  What you said was
right in spirit, but off in the details.  I complained about the details
rather than pointing out what was right.

If you have multiple characteristics of an object like a user, what you have
is multiple matrices, not a tensor.  In such a case, you can just adjoin the
matrices column-wise and decompose to your heart's content.  There is no
need for tensor decompositions.
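
As a rough illustration of adjoining the matrices column-wise, here is a
small NumPy sketch.  The two feature matrices (user-by-item and
user-by-tag), their sizes, and the choice of k are made up for the
example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 6 users described by two different kinds of features.
n_users = 6
A = rng.random((n_users, 4))   # e.g. user x item (assumed)
B = rng.random((n_users, 3))   # e.g. user x tag (assumed)

# Adjoin column-wise: every row is still one user, now with 4 + 3 columns.
M = np.hstack([A, B])

# Ordinary thin matrix SVD of the combined matrix.
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Keep the top-k factors; each row of user_factors places one user in a
# space that reflects both feature sets at once.
k = 2
user_factors = U[:, :k] * s[:k]

# The right singular vectors split back into one block per feature matrix.
V_items, V_tags = Vt[:k, :4], Vt[:k, 4:]
```

In practice the blocks may need rescaling before being adjoined so that
one feature set does not dominate the decomposition; that step is left
out here.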

So, Prasenjit, you were absolutely correct that you can decompose on
multiple features at once.  The only catch is that it is easier than you
were suggesting.

On Mon, Nov 23, 2009 at 12:52 AM, prasenjit mukherjee <
pmukherjee@quattrowireless.com> wrote:

> Agreed there isn't a unique definition of SVD for tensors, but there
> have been attempts to  extend matrix SVD to higher order
> decompositions ( although not orthogonal but diagonal )  to achieve
> multi-way clustering.  Geometrical interpretations are still fairly
> difficult to comprehend though.
>
> Found this article a bit relevant :
> www.graphanalysis.org/SIAM-PP08/Dunlavy.pdf
>
> -Prasen
>
> On Mon, Nov 23, 2009 at 11:49 AM, Ted Dunning <te...@gmail.com>
> wrote:
> > I don't think that there is a unique definition for singular value
> > decompositions for either Clifford algebras or for tensors.
> >
> > You can define an analogous decomposition using LDA.
> >
> > On Sun, Nov 22, 2009 at 8:48 PM, prasenjit mukherjee <
> > pmukherjee@quattrowireless.com> wrote:
> >
> >> Hi Jake,
> >>   Do you intend to contribute some of the Random Indexing code ?  I
> >> am working on a multi-way clustering problem and was thinking of using
> >> tensor SVD to do that. In that context was wondering if anyone has
> >> used Random Indexing to solve  Higher Order SVD problem.  I guess we
> >> can extend the current 2d approach to higher dimensions  while
> >> generating  the context vectors via iterating over the individual
> >> contexts.
> >>
> >> My concern is that ( still  working that  out ) whether I am violating
> >> any other constraints between the non-reducing dimensions.
> >>
> >> -Prasen
> >>
> >> On Sun, Nov 22, 2009 at 10:37 PM, Jake Mannix <ja...@gmail.com>
> >> wrote:
> >>
> >> <snipped/>
> >>
> >> > The machinery to do the above in parallel on "ridiculously big" data
> on
> >> > Hadoop
> >> > should be coming in soon with some of the stuff I'm working on
> >> contributing
> >> > to Mahout.
> >> >
> >> >  -jake
> >> >
> >>
> >
> >
> >
> > --
> > Ted Dunning, CTO
> > DeepDyve
> >
>



-- 
Ted Dunning, CTO
DeepDyve

Re: Random Indexing for tensor SVD ( was Re: mahout examples)

Posted by prasenjit mukherjee <pm...@quattrowireless.com>.

Agreed, there isn't a unique definition of SVD for tensors, but there
have been attempts to extend the matrix SVD to higher-order
decompositions (diagonal, although not orthogonal) to achieve
multi-way clustering.  Geometrical interpretations are still fairly
difficult to comprehend, though.
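
To make "diagonal but not orthogonal" concrete, here are the two
standard forms for an order-3 tensor as I understand them.  The symbols
are my own notation, not taken from the linked slides.

```latex
% CP / PARAFAC: a sum of R rank-one terms, i.e. a superdiagonal core
% with weights \lambda_r; the factor vectors a_r, b_r, c_r are not
% required to be orthogonal.
\mathcal{X} \approx \sum_{r=1}^{R} \lambda_r \, a_r \circ b_r \circ c_r

% Tucker / HOSVD: orthogonal factor matrices U^{(n)}, but the core
% tensor \mathcal{G} is in general dense rather than diagonal.
\mathcal{X} = \mathcal{G} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}
```

For a matrix the two coincide in the ordinary SVD; for tensors of order
3 and higher you generally have to give up either the diagonal core or
the orthogonality.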

I found this article somewhat relevant: www.graphanalysis.org/SIAM-PP08/Dunlavy.pdf

-Prasen

On Mon, Nov 23, 2009 at 11:49 AM, Ted Dunning <te...@gmail.com> wrote:
> I don't think that there is a unique definition for singular value
> decompositions for either Clifford algebras or for tensors.
>
> You can define an analogous decomposition using LDA.
>
> On Sun, Nov 22, 2009 at 8:48 PM, prasenjit mukherjee <
> pmukherjee@quattrowireless.com> wrote:
>
>> Hi Jake,
>>   Do you intend to contribute some of the Random Indexing code ?  I
>> am working on a multi-way clustering problem and was thinking of using
>> tensor SVD to do that. In that context was wondering if anyone has
>> used Random Indexing to solve  Higher Order SVD problem.  I guess we
>> can extend the current 2d approach to higher dimensions  while
>> generating  the context vectors via iterating over the individual
>> contexts.
>>
>> My concern is that ( still  working that  out ) whether I am violating
>> any other constraints between the non-reducing dimensions.
>>
>> -Prasen
>>
>> On Sun, Nov 22, 2009 at 10:37 PM, Jake Mannix <ja...@gmail.com>
>> wrote:
>>
>> <snipped/>
>>
>> > The machinery to do the above in parallel on "ridiculously big" data on
>> > Hadoop
>> > should be coming in soon with some of the stuff I'm working on
>> contributing
>> > to Mahout.
>> >
>> >  -jake
>> >
>>
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>

Re: Random Indexing for tensor SVD ( was Re: mahout examples)

Posted by Ted Dunning <te...@gmail.com>.

I don't think that there is a unique definition for singular value
decompositions for either Clifford algebras or for tensors.

You can define an analogous decomposition using LDA.

On Sun, Nov 22, 2009 at 8:48 PM, prasenjit mukherjee <
pmukherjee@quattrowireless.com> wrote:

> Hi Jake,
>   Do you intend to contribute some of the Random Indexing code ?  I
> am working on a multi-way clustering problem and was thinking of using
> tensor SVD to do that. In that context was wondering if anyone has
> used Random Indexing to solve  Higher Order SVD problem.  I guess we
> can extend the current 2d approach to higher dimensions  while
> generating  the context vectors via iterating over the individual
> contexts.
>
> My concern is that ( still  working that  out ) whether I am violating
> any other constraints between the non-reducing dimensions.
>
> -Prasen
>
> On Sun, Nov 22, 2009 at 10:37 PM, Jake Mannix <ja...@gmail.com>
> wrote:
>
> <snipped/>
>
> > The machinery to do the above in parallel on "ridiculously big" data on
> > Hadoop
> > should be coming in soon with some of the stuff I'm working on
> contributing
> > to Mahout.
> >
> >  -jake
> >
>



-- 
Ted Dunning, CTO
DeepDyve

Re: Random Indexing for tensor SVD ( was Re: mahout examples)

Posted by Jake Mannix <ja...@gmail.com>.

Hey Prasen,

  As I was kinda getting at over IM, for the order-3 tensor case, you can do
things like promote the scalars in your "document" vectors to be matrices,
and then run the same kinds of decompositions on the resultant vectors as
in the matrix case, including random pre-projection.  You can only get so
much information this way (you're effectively reducing by a trace across
some sub-tensors along the way), but it is better than nothing.

  But yeah, at least from a procedural standpoint, the
Martinsson/Halko/Tropp Gaussian pre-processing steps should be pretty
portable to whatever fancy techniques are done on your tensors.  Where
you go from there is up to you!
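
A minimal NumPy sketch of the Gaussian pre-projection idea, in the spirit
of the Martinsson/Halko/Tropp randomized range finder.  Applying it to a
mode-0 unfolding of an order-3 tensor is just one assumed way to carry it
over to tensors; the function name, sizes, and oversampling value are
illustrative, and this is not the Mahout code.

```python
import numpy as np

def randomized_svd(M, k, oversample=10, seed=0):
    """Approximate rank-k SVD of M using a Gaussian pre-projection."""
    rng = np.random.default_rng(seed)
    # Gaussian test matrix: maps the columns of M to k + oversample dims.
    omega = rng.standard_normal((M.shape[1], k + oversample))
    Y = M @ omega                 # random sample of the range of M
    Q, _ = np.linalg.qr(Y)        # orthonormal basis for that sample
    B = Q.T @ M                   # small (k + oversample) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub                    # lift left vectors back to original space
    return U[:, :k], s[:k], Vt[:k]

# Hypothetical order-3 tensor built from 5 rank-one terms.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 5))
B_ = rng.standard_normal((20, 5))
C = rng.standard_normal((10, 5))
T = np.einsum("ir,jr,kr->ijk", A, B_, C)

# Mode-0 unfolding: a 30 x 200 matrix whose rank is at most 5.
T0 = T.reshape(T.shape[0], -1)

U, s, Vt = randomized_svd(T0, k=5)
rel_err = np.linalg.norm(T0 - (U * s) @ Vt) / np.linalg.norm(T0)
```

The only pass over the big matrix that matters is the product with the
random matrix (plus one more to form B), which is what makes this style
of pre-processing attractive for very large data.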

  -jake

On Sun, Nov 22, 2009 at 8:48 PM, prasenjit mukherjee <
pmukherjee@quattrowireless.com> wrote:

> Hi Jake,
>   Do you intend to contribute some of the Random Indexing code ?  I
> am working on a multi-way clustering problem and was thinking of using
> tensor SVD to do that. In that context was wondering if anyone has
> used Random Indexing to solve  Higher Order SVD problem.  I guess we
> can extend the current 2d approach to higher dimensions  while
> generating  the context vectors via iterating over the individual
> contexts.
>
> My concern is that ( still  working that  out ) whether I am violating
> any other constraints between the non-reducing dimensions.
>
> -Prasen
>
> On Sun, Nov 22, 2009 at 10:37 PM, Jake Mannix <ja...@gmail.com>
> wrote:
>
> <snipped/>
>
> > The machinery to do the above in parallel on "ridiculously big" data on
> > Hadoop
> > should be coming in soon with some of the stuff I'm working on
> contributing
> > to Mahout.
> >
> >  -jake
> >
>