Posted to user@mahout.apache.org by Lance Norskog <go...@gmail.com> on 2012/09/16 05:33:59 UTC

Using SVD-conditioned matrix

If you condition a vector set with the zero-the-small-singular-values
trick, how do you project a vector from original space to the
conditioned space? This would let you "condition" new data from a
homogeneous dataset.

It would be useful in the Mahout context. For example, with SSVD you
can use the technique to get better vector clustering. You can create
the "conditioning projection" from a sampled dataset, instead of
decomposing and recomposing the whole dataset.

Also asked on stack overflow:
http://stackoverflow.com/questions/12444231/svd-matrix-conditioning-how-to-project-from-original-space-to-conditioned-spac

-- 
Lance Norskog

Re: Using SVD-conditioned matrix

Posted by Ted Dunning <te...@gmail.com>.
Lance,

There is still some confusion.

Strictly speaking, the elements of a sub-space are a sub-set of the
original space.  Vectors projected into the reduced representation are
one-to-one and onto the sub-space spanned by rows of V_k.  In fact, this is
basically just a rotation and/or reflection.

When you actually come to compute with these entities, however, it is
important to be clear which representation you are using.

To be totally anal about this, and assuming that A is n x m and n > m > k,
the rows of A span a sub-space of R^m.  The columns of V in the full SVD form
a basis of this space, which is normally referred to as the span of A.  The
right singular vectors of the reduced SVD, V_k, form a basis of a
k-dimensional subspace of span A which is the most important sub-space in a
least squares sense.

Any m-dimensional point in this k-dimensional sub-space of R^m can be
projected using V_k onto R^k and can be projected back using V_k'.  Any
point in span A not in the k-dimensional subspace of R^m will be projected
to the nearest point (in an L_2 sense) of the k-dimensional sub-space by
V_k V_k'.

So in LSA terms, you can convert any vector (old or new) into the latent
factor space by right multiplying by V_k.  Dot products in that space have
the same value as dot products between the corresponding points in the
k-dimensional sub-space of R^m.  If you use the diagonal weights, then you
can approximate the vector product x A' A y in a least squares sense.
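
For concreteness, here is a minimal numpy sketch of that projection and of
the dot product claim; the matrix A and the sizes n, m and k are invented
purely for illustration:

    import numpy as np

    np.random.seed(0)
    n, m, k = 8, 5, 3                          # assume n > m > k, as above
    A = np.random.randn(n, m)
    V_k = np.linalg.svd(A, full_matrices=False)[2][:k, :].T   # m x k

    # two row vectors that already lie in the k-dimensional sub-space of R^m
    x = np.random.randn(1, k) @ V_k.T
    y = np.random.randn(1, k) @ V_k.T

    x_k, y_k = x @ V_k, y @ V_k                # project into the latent factor space
    print(np.allclose(x, x_k @ V_k.T))         # True: V_k' projects back exactly
    print(np.allclose(x @ y.T, x_k @ y_k.T))   # True: dot products are preserved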

More in-line.

On Sun, Sep 16, 2012 at 6:43 PM, Lance Norskog <go...@gmail.com> wrote:

> > The original question (possibly misphrased and not in so many words)
> asked
> for a way to project any vector into the space spanned by A.
>
> Yes, it was not completely worded. The interpretation above is backwards.
> There is a subspace sA. In this subspace there is a set of vectors.
> These vectors are rows in a matrix A. We condition matrix A to create
> matrix Ak.  All row vectors in A are now in a new subspace sAk.
>
> Now, there is a new row vector in subspace sA. What matrix projects
> this row vector into subspace sAk?
>

V_k.

You can spot this quickly by noting that the new row vector will be 1 x m
in size.  U_k S_k V_k' = A_k, which is n x m.  Thus U_k is n x k, S_k is
k x k and V_k is m x k.  Only V_k has the right shape to be multiplied by
the new row vector, and the result will be 1 x k as desired.
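
A quick shape check in numpy (again with invented sizes) makes the same point:

    import numpy as np

    n, m, k = 8, 5, 3
    A = np.random.randn(n, m)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, S_k, V_k = U[:, :k], np.diag(s[:k]), Vt[:k, :].T
    print(U_k.shape, S_k.shape, V_k.shape)     # (8, 3) (3, 3) (5, 3)

    new_row = np.random.randn(1, m)            # a new 1 x m row vector
    print((new_row @ V_k).shape)               # (1, 3): only V_k fits, giving 1 x k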

> One use case is new user recommendations. The new user gives a small
> set of items. We would like to find other users. For simplicity, take
> users as rows and items as columns in A. We want a conditioned Ak
> where we can take a new user with a few preferences, and do a cosine
> distance search among other users. With the ability to project a new
> user into a conditioned space, we can find other users who may not
> have any of the new user's items. Using a conditioned Ak increases
> recall at the expense of precision. The more conditioned, the more
> recall.
>

The use of the term "conditioned" is a bit misleading.  There is a concept
called the condition number of a matrix, which is the ratio of the largest
to the smallest singular value.  When you set some of the singular values
to 0, you have explicitly set the condition number to infinity.

I don't even think that your claim that decreasing k increases recall is
correct.


> On Sun, Sep 16, 2012 at 4:11 PM, Ted Dunning <te...@gmail.com>
> wrote:
> > On Sun, Sep 16, 2012 at 1:49 PM, Sean Owen <sr...@gmail.com> wrote:
> >
> >> Oh right. It's the columns that are orthogonal. Cancel that.
> >>
> >> But how does this let you get a row of Uk out of a new row in Ak? Uk
> >> != Uk * Uk' * Ak
> >>
> >
> > Well, actually, since a column of A is clearly in the space spanned by A
> so
> > if you take any such column, U_k U_k'  will send it back to where it came
> > from.  Thus,
> >
> >   A_k = U_k U_k' A_k
> >
> > If you want to talk about rows, then use V_k as with this:
> >
> >   A_k = A_k V_k V_k'
> >
> >
> > Forgetting Sk, Uk = Ak * Vk, and I miss how these are equivalent so I
> >> misunderstand the intent of the expression.
> >>
> >
> > Forgetting S_k is a bit dangerous here.  It is true that
> >
> >     U_k S_k = A_k V_k
> >
> > But, of course, right multiplying by V_k' gives us the identity above.
> >
> > I think that the real confusion is that I am talking about projecting
> back
> > into span A and you are talking about expressing things in terms of the
> > latent variables.
> >
> >
> >> On Sun, Sep 16, 2012 at 8:55 PM, Ted Dunning <te...@gmail.com>
> >> wrote:
> >> > U_k ' U_k = I
> >> >
> >> > U_k U_k ' != I
> >>
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>

Re: Using SVD-conditioned matrix

Posted by Lance Norskog <go...@gmail.com>.
> The original question (possibly misphrased and not in so many words) asked
for a way to project any vector into the space spanned by A.

Yes, it was not worded completely. The interpretation above is backwards.
There is a subspace sA. In this subspace there is a set of vectors.
These vectors are rows in a matrix A. We condition matrix A to create
matrix Ak.  All row vectors in A are now in a new subspace sAk.

Now, there is a new row vector in subspace sA. What matrix projects
this row vector into subspace sAk?

One use case is new user recommendations. The new user gives a small
set of items. We would like to find other users. For simplicity, take
users as rows and items as columns in A. We want a conditioned Ak
where we can take a new user with a few preferences, and do a cosine
distance search among other users. With the ability to project a new
user into a conditioned space, we can find other users who may not
have any of the new user's items. Using a conditioned Ak increases
recall at the expense of precision. The more conditioned, the more
recall.

On Sun, Sep 16, 2012 at 4:11 PM, Ted Dunning <te...@gmail.com> wrote:
> On Sun, Sep 16, 2012 at 1:49 PM, Sean Owen <sr...@gmail.com> wrote:
>
>> Oh right. It's the columns that are orthogonal. Cancel that.
>>
>> But how does this let you get a row of Uk out of a new row in Ak? Uk
>> != Uk * Uk' * Ak
>>
>
> Well, actually, since a column of A is clearly in the space spanned by A so
> if you take any such column, U_k U_k'  will send it back to where it came
> from.  Thus,
>
>   A_k = U_k U_k' A_k
>
> If you want to talk about rows, then use V_k as with this:
>
>   A_k = A_k V_k V_k'
>
>
> Forgetting Sk, Uk = Ak * Vk, and I miss how these are equivalent so I
>> misunderstand the intent of the expression.
>>
>
> Forgetting S_k is a bit dangerous here.  It is true that
>
>     U_k S_k = A_k V_k
>
> But, of course, right multiplying by V_k' gives us the identity above.
>
> I think that the real confusion is that I am talking about projecting back
> into span A and you are talking about expressing things in terms of the
> latent variables.
>
>
>> On Sun, Sep 16, 2012 at 8:55 PM, Ted Dunning <te...@gmail.com>
>> wrote:
>> > U_k ' U_k = I
>> >
>> > U_k U_k ' != I
>>



-- 
Lance Norskog
goksron@gmail.com

Re: Using SVD-conditioned matrix

Posted by Lance Norskog <go...@gmail.com>.
Thanks- I will mess around in R with this advice.

On Mon, Sep 17, 2012 at 12:30 AM, Sean Owen <sr...@gmail.com> wrote:
> On Mon, Sep 17, 2012 at 12:11 AM, Ted Dunning <te...@gmail.com> wrote:
>
>>   A_k = U_k U_k' A_k
>
>> Forgetting S_k is a bit dangerous here.  It is true that
>>
>>     U_k S_k = A_k V_k
>>
>> But, of course, right multiplying by V_k' gives us the identity above.
>>
>> I think that the real confusion is that I am talking about projecting back
>> into span A and you are talking about expressing things in terms of the
>> latent variables.
>
> Right, this goes all the way to "making recommendations", back to Ak.
> I think Lance was asking about how to find the new row in latent
> feature space, a row in Uk.
>
> You can't ignore Sk, yes, just trying to get to the essence. Putting
> it back in (half-and-half) does get what I posted which indeed is the
> project I think that was in question..
>
> Uk * sqrt(Sk) = Ak * Vk * 1/sqrt(Sk)



-- 
Lance Norskog
goksron@gmail.com

Re: Using SVD-conditioned matrix

Posted by Sean Owen <sr...@gmail.com>.
On Mon, Sep 17, 2012 at 12:11 AM, Ted Dunning <te...@gmail.com> wrote:

>   A_k = U_k U_k' A_k

> Forgetting S_k is a bit dangerous here.  It is true that
>
>     U_k S_k = A_k V_k
>
> But, of course, right multiplying by V_k' gives us the identity above.
>
> I think that the real confusion is that I am talking about projecting back
> into span A and you are talking about expressing things in terms of the
> latent variables.

Right, this goes all the way to "making recommendations", back to Ak.
I think Lance was asking about how to find the new row in latent
feature space, a row in Uk.

You can't ignore Sk, yes; I was just trying to get to the essence. Putting
it back in (half-and-half) does give what I posted, which indeed is the
projection I think was in question.

Uk * sqrt(Sk) = Ak * Vk * 1/sqrt(Sk)

Re: Using SVD-conditioned matrix

Posted by Ted Dunning <te...@gmail.com>.
On Sun, Sep 16, 2012 at 1:49 PM, Sean Owen <sr...@gmail.com> wrote:

> Oh right. It's the columns that are orthogonal. Cancel that.
>
> But how does this let you get a row of Uk out of a new row in Ak? Uk
> != Uk * Uk' * Ak
>

Well, actually, a column of A is clearly in the space spanned by A, so
if you take any such column, U_k U_k' will send it back to where it came
from.  Thus,

  A_k = U_k U_k' A_k

If you want to talk about rows, then use V_k as with this:

  A_k = A_k V_k V_k'


Forgetting Sk, Uk = Ak * Vk, and I miss how these are equivalent so I
> misunderstand the intent of the expression.
>

Forgetting S_k is a bit dangerous here.  It is true that

    U_k S_k = A_k V_k

But, of course, right multiplying by V_k' gives us the identity above.

I think that the real confusion is that I am talking about projecting back
into span A and you are talking about expressing things in terms of the
latent variables.
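
These identities are easy to check numerically; here is a small numpy sketch
with an invented A:

    import numpy as np

    np.random.seed(1)
    n, m, k = 8, 5, 3
    A = np.random.randn(n, m)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, S_k, V_k = U[:, :k], np.diag(s[:k]), Vt[:k, :].T

    A_k = U_k @ S_k @ V_k.T                     # the rank-k reconstruction of A
    print(np.allclose(A_k, U_k @ U_k.T @ A_k))  # True
    print(np.allclose(A_k, A_k @ V_k @ V_k.T))  # True
    print(np.allclose(U_k @ S_k, A_k @ V_k))    # True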


> On Sun, Sep 16, 2012 at 8:55 PM, Ted Dunning <te...@gmail.com>
> wrote:
> > U_k ' U_k = I
> >
> > U_k U_k ' != I
>

Re: Using SVD-conditioned matrix

Posted by Sean Owen <sr...@gmail.com>.
Oh right. It's the columns that are orthogonal. Cancel that.

But how does this let you get a row of Uk out of a new row in Ak? Uk
!= Uk * Uk' * Ak

Forgetting Sk, Uk = Ak * Vk, and I don't see how these are equivalent, so I
misunderstand the intent of the expression.

On Sun, Sep 16, 2012 at 8:55 PM, Ted Dunning <te...@gmail.com> wrote:
> U_k ' U_k = I
>
> U_k U_k ' != I

Re: Using SVD-conditioned matrix

Posted by Ted Dunning <te...@gmail.com>.
I should point out that for vectors in the space spanned by the columns of A,
U_k U_k' is an identity mapping. For any vector x orthogonal to the columns
of A (that is, in the null space of A'), U_k U_k' x = 0.

You can view every vector u as the sum of a component in span A and an
orthogonal component in null A:

     u = u_A + u_/A

If you shove u through U_k U_k' you get this:

     U_k U_k' u = U_k U_k' (u_A + u_/A) = U_k U_k' (u_A) + 0 = u_A

This is another way of showing that U_k U_k' projects a vector into span A.
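
As a sanity check, here is a small numpy sketch of that decomposition; A,
u_A and the orthogonal component are all invented for illustration:

    import numpy as np

    np.random.seed(2)
    n, m, k = 8, 5, 3
    # invent a rank-k matrix A so that U_k spans the whole column space of A
    A = np.random.randn(n, k) @ np.random.randn(k, m)
    U_k = np.linalg.svd(A, full_matrices=False)[0][:, :k]   # n x k

    u_A = A @ np.random.randn(m)          # a component lying in span A
    u_perp = np.random.randn(n)
    u_perp -= U_k @ (U_k.T @ u_perp)      # an orthogonal component in null A
    u = u_A + u_perp

    P = U_k @ U_k.T                       # the projector U_k U_k'
    print(np.allclose(P @ u_A, u_A))      # True: identity on span A
    print(np.allclose(P @ u_perp, 0))     # True: zero on the orthogonal part
    print(np.allclose(P @ u, u_A))        # True: P u recovers u_A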

On Sun, Sep 16, 2012 at 12:55 PM, Ted Dunning <te...@gmail.com> wrote:

> U_k ' U_k = I
>
> U_k U_k ' != I
>
>
> On Sun, Sep 16, 2012 at 12:54 PM, Sean Owen <sr...@gmail.com> wrote:
>
>> I have a feeling this is a dumb question. But in...
>>
>>  u_k = U_k U_k' u
>>
>> U is orthonormal, so Uk is orthonormal. So U_k U_k' is the identity.
>> So what does this actually say?
>>
>> It's surely on the right track of something, since I find the same
>> expression in that original SVD paper, "Incremental Singular Value
>> Decomposition Algorithms for Highly Scalable Recommender Systems". But
>> it has some serious typos in its notation. (Try to figure out Figure
>> 1.) And it proceeds in its analysis by basically saying that the
>> projection is Uk' times the new vector, so, I never understood this
>> expression.
>>
>> On Sun, Sep 16, 2012 at 7:13 PM, Ted Dunning <te...@gmail.com>
>> wrote:
>> > A is in there implicitly.
>> >
>> > U_k provides a basis of the row space and V_k provides a basis of the
>> > column space of A.  A itself has a representation in either of these
>> > depending on whether you think of it as rows or columns.
>> >
>> > The original question (possibly misphrased and not in so many words)
>> asked
>> > for a way to project any vector into the space spanned by A.  What space
>> > this is depends on whether we have a new column vector (with n rows x 1
>> > column) or a new row vector (with 1 row x m columns).  In either case,
>> the
>> > way to compute this is to project the vector onto the basis of interest
>> > (U_k for column vectors, V_k for row vectors) and then reconstruct that
>> > projection in the original space.
>> >
>> > This is not usually a practical operation when we are working with
>> sparse
>> > vectors because the projected vector is not usually sparse.  Thus it is
>> > usually better to stop before projecting back into the original space.
>> >
>> > On Sun, Sep 16, 2012 at 10:26 AM, Sean Owen <sr...@gmail.com> wrote:
>> >
>> >> I don't quite get these formulations -- shouldn't Ak be in there
>> >> somewhere? you have a new row of that (well, some piece of some new
>> >> row of Ak), and need a new row of Uk. Or: surely the expression
>> >> depends on V?
>> >>
>> >> On Sun, Sep 16, 2012 at 5:33 PM, Ted Dunning <te...@gmail.com>
>> >> wrote:
>> >> > And if you want the reduced rank representation of A, you have it
>> already
>> >> > with
>> >> >
>> >> >     A_k = U_k S_k V_k'
>> >> >
>> >> > Assume that A is n x m in size.  This means that U_k is n x k and
>> V_k is
>> >> m
>> >> > x k
>> >> >
>> >> > The rank reduced projection of an n x 1 column vector is
>> >> >
>> >> >     u_k = U_k U_k' u
>> >> >
>> >> > Beware that v_k is probably not sparse even if v is sparse.
>> >> >
>> >> > Similarly, the rank reduced projection of a 1 x m row vector is
>> >> >
>> >> >     v_k = v V_k V_k'
>> >> >
>> >> > A similar sparsity warning applies to v_k.  This is why it is usually
>> >> > preferable to just work in the reduced space directly.
>> >>
>>
>
>

Re: Using SVD-conditioned matrix

Posted by Ted Dunning <te...@gmail.com>.
U_k ' U_k = I

U_k U_k ' != I

On Sun, Sep 16, 2012 at 12:54 PM, Sean Owen <sr...@gmail.com> wrote:

> I have a feeling this is a dumb question. But in...
>
>  u_k = U_k U_k' u
>
> U is orthonormal, so Uk is orthonormal. So U_k U_k' is the identity.
> So what does this actually say?
>
> It's surely on the right track of something, since I find the same
> expression in that original SVD paper, "Incremental Singular Value
> Decomposition Algorithms for Highly Scalable Recommender Systems". But
> it has some serious typos in its notation. (Try to figure out Figure
> 1.) And it proceeds in its analysis by basically saying that the
> projection is Uk' times the new vector, so, I never understood this
> expression.
>
> On Sun, Sep 16, 2012 at 7:13 PM, Ted Dunning <te...@gmail.com>
> wrote:
> > A is in there implicitly.
> >
> > U_k provides a basis of the row space and V_k provides a basis of the
> > column space of A.  A itself has a representation in either of these
> > depending on whether you think of it as rows or columns.
> >
> > The original question (possibly misphrased and not in so many words)
> asked
> > for a way to project any vector into the space spanned by A.  What space
> > this is depends on whether we have a new column vector (with n rows x 1
> > column) or a new row vector (with 1 row x m columns).  In either case,
> the
> > way to compute this is to project the vector onto the basis of interest
> > (U_k for column vectors, V_k for row vectors) and then reconstruct that
> > projection in the original space.
> >
> > This is not usually a practical operation when we are working with sparse
> > vectors because the projected vector is not usually sparse.  Thus it is
> > usually better to stop before projecting back into the original space.
> >
> > On Sun, Sep 16, 2012 at 10:26 AM, Sean Owen <sr...@gmail.com> wrote:
> >
> >> I don't quite get these formulations -- shouldn't Ak be in there
> >> somewhere? you have a new row of that (well, some piece of some new
> >> row of Ak), and need a new row of Uk. Or: surely the expression
> >> depends on V?
> >>
> >> On Sun, Sep 16, 2012 at 5:33 PM, Ted Dunning <te...@gmail.com>
> >> wrote:
> >> > And if you want the reduced rank representation of A, you have it
> already
> >> > with
> >> >
> >> >     A_k = U_k S_k V_k'
> >> >
> >> > Assume that A is n x m in size.  This means that U_k is n x k and V_k
> is
> >> m
> >> > x k
> >> >
> >> > The rank reduced projection of an n x 1 column vector is
> >> >
> >> >     u_k = U_k U_k' u
> >> >
> >> > Beware that v_k is probably not sparse even if v is sparse.
> >> >
> >> > Similarly, the rank reduced projection of a 1 x m row vector is
> >> >
> >> >     v_k = v V_k V_k'
> >> >
> >> > A similar sparsity warning applies to v_k.  This is why it is usually
> >> > preferable to just work in the reduced space directly.
> >>
>

Re: Using SVD-conditioned matrix

Posted by Sean Owen <sr...@gmail.com>.
I have a feeling this is a dumb question. But in...

 u_k = U_k U_k' u

U is orthonormal, so Uk is orthonormal. So U_k U_k' is the identity.
So what does this actually say?

It's surely on the right track of something, since I find the same
expression in that original SVD paper, "Incremental Singular Value
Decomposition Algorithms for Highly Scalable Recommender Systems". But
it has some serious typos in its notation. (Try to figure out Figure
1.) And it proceeds in its analysis by basically saying that the
projection is Uk' times the new vector, so, I never understood this
expression.

On Sun, Sep 16, 2012 at 7:13 PM, Ted Dunning <te...@gmail.com> wrote:
> A is in there implicitly.
>
> U_k provides a basis of the row space and V_k provides a basis of the
> column space of A.  A itself has a representation in either of these
> depending on whether you think of it as rows or columns.
>
> The original question (possibly misphrased and not in so many words) asked
> for a way to project any vector into the space spanned by A.  What space
> this is depends on whether we have a new column vector (with n rows x 1
> column) or a new row vector (with 1 row x m columns).  In either case, the
> way to compute this is to project the vector onto the basis of interest
> (U_k for column vectors, V_k for row vectors) and then reconstruct that
> projection in the original space.
>
> This is not usually a practical operation when we are working with sparse
> vectors because the projected vector is not usually sparse.  Thus it is
> usually better to stop before projecting back into the original space.
>
> On Sun, Sep 16, 2012 at 10:26 AM, Sean Owen <sr...@gmail.com> wrote:
>
>> I don't quite get these formulations -- shouldn't Ak be in there
>> somewhere? you have a new row of that (well, some piece of some new
>> row of Ak), and need a new row of Uk. Or: surely the expression
>> depends on V?
>>
>> On Sun, Sep 16, 2012 at 5:33 PM, Ted Dunning <te...@gmail.com>
>> wrote:
>> > And if you want the reduced rank representation of A, you have it already
>> > with
>> >
>> >     A_k = U_k S_k V_k'
>> >
>> > Assume that A is n x m in size.  This means that U_k is n x k and V_k is
>> m
>> > x k
>> >
>> > The rank reduced projection of an n x 1 column vector is
>> >
>> >     u_k = U_k U_k' u
>> >
>> > Beware that v_k is probably not sparse even if v is sparse.
>> >
>> > Similarly, the rank reduced projection of a 1 x m row vector is
>> >
>> >     v_k = v V_k V_k'
>> >
>> > A similar sparsity warning applies to v_k.  This is why it is usually
>> > preferable to just work in the reduced space directly.
>>

Re: Using SVD-conditioned matrix

Posted by Ted Dunning <te...@gmail.com>.
A is in there implicitly.

U_k provides a basis of the row space and V_k provides a basis of the
column space of A.  A itself has a representation in either of these
depending on whether you think of it as rows or columns.

The original question (possibly misphrased and not in so many words) asked
for a way to project any vector into the space spanned by A.  What space
this is depends on whether we have a new column vector (with n rows x 1
column) or a new row vector (with 1 row x m columns).  In either case, the
way to compute this is to project the vector onto the basis of interest
(U_k for column vectors, V_k for row vectors) and then reconstruct that
projection in the original space.

This is not usually a practical operation when we are working with sparse
vectors because the projected vector is not usually sparse.  Thus it is
usually better to stop before projecting back into the original space.
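
A short numpy sketch of the row-vector case (with an invented sparse row)
illustrates both the reconstruction and the sparsity warning:

    import numpy as np

    np.random.seed(3)
    n, m, k = 20, 10, 3
    A = np.random.randn(n, m)
    V_k = np.linalg.svd(A, full_matrices=False)[2][:k, :].T   # m x k

    v = np.zeros((1, m))
    v[0, [1, 4]] = 1.0                    # a sparse new 1 x m row vector

    v_reduced = v @ V_k                   # 1 x k: the reduced representation
    v_back = v_reduced @ V_k.T            # projected back into the original space
    print(np.count_nonzero(v))            # 2
    print(np.count_nonzero(v_back))       # almost surely m: the result is dense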

On Sun, Sep 16, 2012 at 10:26 AM, Sean Owen <sr...@gmail.com> wrote:

> I don't quite get these formulations -- shouldn't Ak be in there
> somewhere? you have a new row of that (well, some piece of some new
> row of Ak), and need a new row of Uk. Or: surely the expression
> depends on V?
>
> On Sun, Sep 16, 2012 at 5:33 PM, Ted Dunning <te...@gmail.com>
> wrote:
> > And if you want the reduced rank representation of A, you have it already
> > with
> >
> >     A_k = U_k S_k V_k'
> >
> > Assume that A is n x m in size.  This means that U_k is n x k and V_k is
> m
> > x k
> >
> > The rank reduced projection of an n x 1 column vector is
> >
> >     u_k = U_k U_k' u
> >
> > Beware that v_k is probably not sparse even if v is sparse.
> >
> > Similarly, the rank reduced projection of a 1 x m row vector is
> >
> >     v_k = v V_k V_k'
> >
> > A similar sparsity warning applies to v_k.  This is why it is usually
> > preferable to just work in the reduced space directly.
>

Re: Using SVD-conditioned matrix

Posted by Sean Owen <sr...@gmail.com>.
I don't quite get these formulations -- shouldn't Ak be in there
somewhere? You have a new row of that (well, some piece of some new
row of Ak), and need a new row of Uk. Or: surely the expression
depends on V?

On Sun, Sep 16, 2012 at 5:33 PM, Ted Dunning <te...@gmail.com> wrote:
> And if you want the reduced rank representation of A, you have it already
> with
>
>     A_k = U_k S_k V_k'
>
> Assume that A is n x m in size.  This means that U_k is n x k and V_k is m
> x k
>
> The rank reduced projection of an n x 1 column vector is
>
>     u_k = U_k U_k' u
>
> Beware that v_k is probably not sparse even if v is sparse.
>
> Similarly, the rank reduced projection of a 1 x m row vector is
>
>     v_k = v V_k V_k'
>
> A similar sparsity warning applies to v_k.  This is why it is usually
> preferable to just work in the reduced space directly.

Re: Using SVD-conditioned matrix

Posted by Ted Dunning <te...@gmail.com>.
And if you want the reduced rank representation of A, you have it already
with

    A_k = U_k S_k V_k'

Assume that A is n x m in size.  This means that U_k is n x k and V_k is m
x k

The rank reduced projection of an n x 1 column vector is

    u_k = U_k U_k' u

Beware that u_k is probably not sparse even if u is sparse.

Similarly, the rank reduced projection of a 1 x m row vector is

    v_k = v V_k V_k'

A similar sparsity warning applies to v_k.  This is why it is usually
preferable to just work in the reduced space directly.
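
For reference, here is a minimal numpy sketch of those two projections; the
A and the vectors u and v are invented for illustration:

    import numpy as np

    np.random.seed(4)
    n, m, k = 8, 5, 3
    A = np.random.randn(n, m)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, S_k, V_k = U[:, :k], np.diag(s[:k]), Vt[:k, :].T

    A_k = U_k @ S_k @ V_k.T               # the reduced rank representation of A

    u = np.random.randn(n, 1)             # an n x 1 column vector
    u_k = U_k @ (U_k.T @ u)               # u_k = U_k U_k' u

    v = np.random.randn(1, m)             # a 1 x m row vector
    v_k = (v @ V_k) @ V_k.T               # v_k = v V_k V_k'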


On Sun, Sep 16, 2012 at 2:34 AM, Sean Owen <sr...@gmail.com> wrote:

> This is the same discussion that's been going on here about "fold-in".
>
> If the decomposition is A ~= Ak = Uk * Sk  * Vk', then you can get an
> expression for just Uk by multiplying on the right by right-inverses.
> You want to take off V', and "half" of S, meaning its square root. So
> we're really working with Ak ~= (Uk * sqrt(Sk)) * (sqrt(Sk) * Vk')
>
> The right-inverse of Vk' is Vk, since it's orthonormal. The inverse of
> a diagonal matrix is just the diagonal matrix of its reciprocals. Call
> the inverse of sqrt(S) 1/sqrt(S)
>
> So Uk * sqrt(Sk) = Ak * Vk * 1/sqrt(Sk)
>
> This is how you project a row of Ak. Something entirely similar goes
> for columns:
>
> sqrt(Sk) * Vk' = 1/sqrt(Sk) * Uk' * Ak
>
> Sean
>
> On Sun, Sep 16, 2012 at 4:33 AM, Lance Norskog <go...@gmail.com> wrote:
> > If you condition a vector set with the zero-the-small-singular-values
> > trick, how do you project a vector from original space to the
> > conditioned space? This would let you "condition" new data from a
> > homogeneous dataset.
> >
> > It would be useful in the Mahout context. For example, with SSVD you
> > can use the technique to get better vector clustering. You can create
> > the "conditioning projection" from a sampled dataset, instead of
> > decomposing and recomposing the whole dataset.
> >
> > Also asked on stack overflow:
> >
> http://stackoverflow.com/questions/12444231/svd-matrix-conditioning-how-to-project-from-original-space-to-conditioned-spac
> >
> > --
> > Lance Norskog
>

Re: Using SVD-conditioned matrix

Posted by Sean Owen <sr...@gmail.com>.
This is the same discussion that's been going on here about "fold-in".

If the decomposition is A ~= Ak = Uk * Sk  * Vk', then you can get an
expression for just Uk by multiplying on the right by right-inverses.
You want to take off V', and "half" of S, meaning its square root. So
we're really working with Ak ~= (Uk * sqrt(Sk)) * (sqrt(Sk) * Vk')

The right-inverse of Vk' is Vk, since its columns are orthonormal. The inverse
of a diagonal matrix is just the diagonal matrix of its reciprocals. Call
the inverse of sqrt(Sk) 1/sqrt(Sk).

So Uk * sqrt(Sk) = Ak * Vk * 1/sqrt(Sk)

This is how you project a row of Ak. Something entirely similar goes
for columns:

sqrt(Sk) * Vk' = 1/sqrt(Sk) * Uk' * Ak
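
Here is a small numpy sketch of that fold-in; A, the sizes and the new row
are invented for illustration, and the row identity above is checked directly:

    import numpy as np

    np.random.seed(5)
    n, m, k = 8, 5, 3
    A = np.random.randn(n, m)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, V_k = U[:, :k], Vt[:k, :].T
    S_k = np.diag(s[:k])
    sqrt_Sk = np.diag(np.sqrt(s[:k]))
    inv_sqrt_Sk = np.diag(1.0 / np.sqrt(s[:k]))

    A_k = U_k @ S_k @ V_k.T

    # Uk * sqrt(Sk) = Ak * Vk * 1/sqrt(Sk)
    print(np.allclose(U_k @ sqrt_Sk, A_k @ V_k @ inv_sqrt_Sk))   # True

    # folding in a new 1 x m row gives its 1 x k representation the same way
    new_row = np.random.randn(1, m)
    new_row_features = new_row @ V_k @ inv_sqrt_Sk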

Sean

On Sun, Sep 16, 2012 at 4:33 AM, Lance Norskog <go...@gmail.com> wrote:
> If you condition a vector set with the zero-the-small-singular-values
> trick, how do you project a vector from original space to the
> conditioned space? This would let you "condition" new data from a
> homogeneous dataset.
>
> It would be useful in the Mahout context. For example, with SSVD you
> can use the technique to get better vector clustering. You can create
> the "conditioning projection" from a sampled dataset, instead of
> decomposing and recomposing the whole dataset.
>
> Also asked on stack overflow:
> http://stackoverflow.com/questions/12444231/svd-matrix-conditioning-how-to-project-from-original-space-to-conditioned-spac
>
> --
> Lance Norskog