You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Dimitri Pourbaix <po...@astro.ulb.ac.be> on 2010/01/31 22:22:51 UTC

[math] Work going on on SVD

Hi,

In its present form (CM2.0), SingularValueDecomposition suffers some
problems when the matrix is (numerically) singular.  Luc proposed a
way to improve the situation by limiting the singular values to non
zero ones.  Whereas the result is OK if someone is interested in getting
the singular values, it is not from a purely mathematical point of view.
Indeed, the resulting U matrix does no longer hold the right dimension.
Instead of a number of columns equal to the number of columns of the
original matrix, U now has as many columns as non-zero singular values.
The product U*S*V^T yields the original matrix but the wrong size might
put some users into trouble (that is true for the size of S as well).

I propose to compute U ... without taking advantage of V.  That means
calling EigenDecomposition a second time but should work even in case
of singular matrices.  That is the solution I am working on.  However,
doing so, I notice that EigenDecomposition also suffers major problems
in case of singular matrices.  A 3x3 singular matrix where 0 is an
eigen value with multiplicity 2 ... yields only 2 distinct eigen
vectors.  The vectors associated to the null eigen value are equal!!

So, before I can improve SVD, I have to improve EigenDecomposition!

By the way, going through SVD, EigenDecomposition, I noticed that
BidiagonalTransformer and TridiagonalTransformer both use the
Householder vector computation deeply imbedded in their code.  In
order to make both classes easier to read (and to debug), I wonder
if it might be useful to introduce a class Householder which would
take care of the computation of the vector in a unique place.

Regards,
  Dim.
----------------------------------------------------------------------------
Dimitri Pourbaix                         *
Institut d'Astronomie et d'Astrophysique *      Don't worry, be happy
CP 226, office 2.N4.211, building NO     *         and CARPE DIEM.
Universite Libre de Bruxelles            *
Boulevard du Triomphe                    *      Tel : +32-2-650.35.71
  B-1050 Bruxelles                        *      Fax : +32-2-650.42.26
http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] Work going on on SVD

Posted by Bill Barker <bi...@verizon.net>.

--------------------------------------------------
From: "Luc Maisonobe" <Lu...@free.fr>
Sent: Tuesday, February 02, 2010 11:26 AM
To: "Commons Developers List" <de...@commons.apache.org>
Subject: Re: [math] Work going on on SVD

> Luc Maisonobe a écrit :
>> Dimitri Pourbaix a écrit :
>>> Hi,
>>
>> Hi Dimitri,
>>
>>> In its present form (CM2.0), SingularValueDecomposition suffers some
>>> problems when the matrix is (numerically) singular.  Luc proposed a
>>> way to improve the situation by limiting the singular values to non
>>> zero ones.  Whereas the result is OK if someone is interested in getting
>>> the singular values, it is not from a purely mathematical point of view.
>>> Indeed, the resulting U matrix does no longer hold the right dimension.
>>> Instead of a number of columns equal to the number of columns of the
>>> original matrix, U now has as many columns as non-zero singular values.
>>> The product U*S*V^T yields the original matrix but the wrong size might
>>> put some users into trouble (that is true for the size of S as well).
>>
>> I agree with this analysis. The current behavior is interesting for some
>> use cases but not for all of them. So I suggest we provide both cases.
>> This could be done either by renaming the current implementation as
>> TruncatedSVD and having your implementation named
>> SingularValueDecomposition, or by using some constructor parameters to
>> select the desired behavior.
>
> What do we do here ? If Dimitri can provide a new mathematically
> compliant SVD I would suggest to have it use the name
> SingularValueDecomposition and to rename the existing class
> TruncatedSVD. This class would take arguments to distinguish between
> Compact decomposition (automatic cutoff at zero value) and truncated
> decomposition (cutoff by singular values indices).
>
>>
>>> I propose to compute U ... without taking advantage of V.  That means
>>> calling EigenDecomposition a second time but should work even in case
>>> of singular matrices.  That is the solution I am working on.  However,
>>> doing so, I notice that EigenDecomposition also suffers major problems
>>> in case of singular matrices.  A 3x3 singular matrix where 0 is an
>>> eigen value with multiplicity 2 ... yields only 2 distinct eigen
>>> vectors.  The vectors associated to the null eigen value are equal!!
>>
>> Yes, this is JIRA issue MATH-333. The problem is that in the current
>> implementation, the vector is computed by
>> EigenDecomposition.findEigenVector which takes one eigenvalue as its
>> argument. so eigenvalues with order greater than 1 simply result in the
>> same computation repeated several times ...
>> Perhaps one way to solve this is to reproduce what is done in DLARRV
>> from LAPACK. The routine uses the index of the eigenvalue and not only
>> its value.
>
> Does someone had a look at DLARRV and could explain how it works (the
> vector part only, the singular values part is already well understood I
> think) ?
>
> I think we should wait for this issue to be solved before publishing
> 2.1. It is currently targeted at 2.2 but I would vote to solve it
> earlier. It is really a big drawback of the current implementation. Not
> being able to decompose identity is rather sad ...
>
+1
Having a SVD that doesn't work is really embarrassing.  I'll try and find 
some spare cycles to look at the DLARRV implementation. However, if Dimitri 
can create an alternative patch, all the better.

>>
>>> So, before I can improve SVD, I have to improve EigenDecomposition!
>>>
>>> By the way, going through SVD, EigenDecomposition, I noticed that
>>> BidiagonalTransformer and TridiagonalTransformer both use the
>>> Householder vector computation deeply imbedded in their code.  In
>>> order to make both classes easier to read (and to debug), I wonder
>>> if it might be useful to introduce a class Householder which would
>>> take care of the computation of the vector in a unique place.
>>
>> This would be a nice improvement and would probably be useful for other
>> linear algebra algorithms later on. The rationale for the current
>> implementation was to avoid copying data back and forth between a low
>> level elementary Householder class called n times and a high level
>> transformer like bi-diagonal or tri-diagonal transformer. If we can find
>> a way to have an elementary transform  processing in-place the array
>> provided to it by the  high level transformer, this would be fine.
>
> Perhaps this could wait for 2.2 or 3.0.
>
> Luc
>
>>
>> Once again, we see we lack a way to have partial view of matrices slices.
>>
>> Luc
>>
>>
>>> Regards,
>>>  Dim.
>>> ----------------------------------------------------------------------------
>>>
>>> Dimitri Pourbaix                         *
>>> Institut d'Astronomie et d'Astrophysique *      Don't worry, be happy
>>> CP 226, office 2.N4.211, building NO     *         and CARPE DIEM.
>>> Universite Libre de Bruxelles            *
>>> Boulevard du Triomphe                    *      Tel : +32-2-650.35.71
>>>  B-1050 Bruxelles                        *      Fax : +32-2-650.42.26
>>> http://sb9.astro.ulb.ac.be/~pourbaix     * 
>>> mailto:pourbaix@astro.ulb.ac.be
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] Work going on on SVD

Posted by Dimitri Pourbaix <po...@astro.ulb.ac.be>.
Luc,

> I think we should wait for this issue to be solved before publishing
> 2.1. It is currently targeted at 2.2 but I would vote to solve it
> earlier. It is really a big drawback of the current implementation. Not
> being able to decompose identity is rather sad ...

I agree.

>> This would be a nice improvement and would probably be useful for other
>> linear algebra algorithms later on. The rationale for the current
>> implementation was to avoid copying data back and forth between a low
>> level elementary Householder class called n times and a high level
>> transformer like bi-diagonal or tri-diagonal transformer. If we can find
>> a way to have an elementary transform  processing in-place the array
>> provided to it by the  high level transformer, this would be fine.
>
> Perhaps this could wait for 2.2 or 3.0.

Yes.  It would be an improvement of the code from a developer perspec-
tive only.  No rush!

Regards,
  Dim.
----------------------------------------------------------------------------
Dimitri Pourbaix                         *
Institut d'Astronomie et d'Astrophysique *      Don't worry, be happy
CP 226, office 2.N4.211, building NO     *         and CARPE DIEM.
Universite Libre de Bruxelles            *
Boulevard du Triomphe                    *      Tel : +32-2-650.35.71
  B-1050 Bruxelles                        *      Fax : +32-2-650.42.26
http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] Work going on on SVD

Posted by Phil Steitz <ph...@gmail.com>.
Luc Maisonobe wrote:
> Luc Maisonobe a écrit :
>> Dimitri Pourbaix a écrit :
>>> Hi,
>> Hi Dimitri,
>>
>>> In its present form (CM2.0), SingularValueDecomposition suffers some
>>> problems when the matrix is (numerically) singular.  Luc proposed a
>>> way to improve the situation by limiting the singular values to non
>>> zero ones.  Whereas the result is OK if someone is interested in getting
>>> the singular values, it is not from a purely mathematical point of view.
>>> Indeed, the resulting U matrix does no longer hold the right dimension.
>>> Instead of a number of columns equal to the number of columns of the
>>> original matrix, U now has as many columns as non-zero singular values.
>>> The product U*S*V^T yields the original matrix but the wrong size might
>>> put some users into trouble (that is true for the size of S as well).
>> I agree with this analysis. The current behavior is interesting for some
>> use cases but not for all of them. So I suggest we provide both cases.
>> This could be done either by renaming the current implementation as
>> TruncatedSVD and having your implementation named
>> SingularValueDecomposition, or by using some constructor parameters to
>> select the desired behavior.
> 
> What do we do here ? If Dimitri can provide a new mathematically
> compliant SVD I would suggest to have it use the name
> SingularValueDecomposition and to rename the existing class
> TruncatedSVD.

 This class would take arguments to distinguish between
> Compact decomposition (automatic cutoff at zero value) and truncated
> decomposition (cutoff by singular values indices).

I like the optional constructor arguments approach better - no need
for a new impl class and no need to change the interface.

> 
>>> I propose to compute U ... without taking advantage of V.  That means
>>> calling EigenDecomposition a second time but should work even in case
>>> of singular matrices.  That is the solution I am working on.  However,
>>> doing so, I notice that EigenDecomposition also suffers major problems
>>> in case of singular matrices.  A 3x3 singular matrix where 0 is an
>>> eigen value with multiplicity 2 ... yields only 2 distinct eigen
>>> vectors.  The vectors associated to the null eigen value are equal!!
>> Yes, this is JIRA issue MATH-333. The problem is that in the current
>> implementation, the vector is computed by
>> EigenDecomposition.findEigenVector which takes one eigenvalue as its
>> argument. so eigenvalues with order greater than 1 simply result in the
>> same computation repeated several times ...
>> Perhaps one way to solve this is to reproduce what is done in DLARRV
>> from LAPACK. The routine uses the index of the eigenvalue and not only
>> its value.
> 
> Does someone had a look at DLARRV and could explain how it works (the
> vector part only, the singular values part is already well understood I
> think) ?
> 
> I think we should wait for this issue to be solved before publishing
> 2.1. It is currently targeted at 2.2 but I would vote to solve it
> earlier. It is really a big drawback of the current implementation. Not
> being able to decompose identity is rather sad ...

+1
> 
>>> So, before I can improve SVD, I have to improve EigenDecomposition!
>>>
>>> By the way, going through SVD, EigenDecomposition, I noticed that
>>> BidiagonalTransformer and TridiagonalTransformer both use the
>>> Householder vector computation deeply imbedded in their code.  In
>>> order to make both classes easier to read (and to debug), I wonder
>>> if it might be useful to introduce a class Householder which would
>>> take care of the computation of the vector in a unique place.
>> This would be a nice improvement and would probably be useful for other
>> linear algebra algorithms later on. The rationale for the current
>> implementation was to avoid copying data back and forth between a low
>> level elementary Householder class called n times and a high level
>> transformer like bi-diagonal or tri-diagonal transformer. If we can find
>> a way to have an elementary transform  processing in-place the array
>> provided to it by the  high level transformer, this would be fine.
> 
> Perhaps this could wait for 2.2 or 3.0.

+1 for waiting on this unless Dimitri or whoever ends up patching
MATH-333 finds that pulling out the Householder decomp makes it
easier to complete and test the fix.

Phil
> 
> Luc
> 
>> Once again, we see we lack a way to have partial view of matrices slices.
>>
>> Luc
>>
>>
>>> Regards,
>>>  Dim.
>>> ----------------------------------------------------------------------------
>>>
>>> Dimitri Pourbaix                         *
>>> Institut d'Astronomie et d'Astrophysique *      Don't worry, be happy
>>> CP 226, office 2.N4.211, building NO     *         and CARPE DIEM.
>>> Universite Libre de Bruxelles            *
>>> Boulevard du Triomphe                    *      Tel : +32-2-650.35.71
>>>  B-1050 Bruxelles                        *      Fax : +32-2-650.42.26
>>> http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] Work going on on SVD

Posted by Luc Maisonobe <Lu...@free.fr>.
Luc Maisonobe a écrit :
> Dimitri Pourbaix a écrit :
>> Hi,
> 
> Hi Dimitri,
> 
>> In its present form (CM2.0), SingularValueDecomposition suffers some
>> problems when the matrix is (numerically) singular.  Luc proposed a
>> way to improve the situation by limiting the singular values to non
>> zero ones.  Whereas the result is OK if someone is interested in getting
>> the singular values, it is not from a purely mathematical point of view.
>> Indeed, the resulting U matrix does no longer hold the right dimension.
>> Instead of a number of columns equal to the number of columns of the
>> original matrix, U now has as many columns as non-zero singular values.
>> The product U*S*V^T yields the original matrix but the wrong size might
>> put some users into trouble (that is true for the size of S as well).
> 
> I agree with this analysis. The current behavior is interesting for some
> use cases but not for all of them. So I suggest we provide both cases.
> This could be done either by renaming the current implementation as
> TruncatedSVD and having your implementation named
> SingularValueDecomposition, or by using some constructor parameters to
> select the desired behavior.

What do we do here ? If Dimitri can provide a new mathematically
compliant SVD I would suggest to have it use the name
SingularValueDecomposition and to rename the existing class
TruncatedSVD. This class would take arguments to distinguish between
Compact decomposition (automatic cutoff at zero value) and truncated
decomposition (cutoff by singular values indices).

> 
>> I propose to compute U ... without taking advantage of V.  That means
>> calling EigenDecomposition a second time but should work even in case
>> of singular matrices.  That is the solution I am working on.  However,
>> doing so, I notice that EigenDecomposition also suffers major problems
>> in case of singular matrices.  A 3x3 singular matrix where 0 is an
>> eigen value with multiplicity 2 ... yields only 2 distinct eigen
>> vectors.  The vectors associated to the null eigen value are equal!!
> 
> Yes, this is JIRA issue MATH-333. The problem is that in the current
> implementation, the vector is computed by
> EigenDecomposition.findEigenVector which takes one eigenvalue as its
> argument. so eigenvalues with order greater than 1 simply result in the
> same computation repeated several times ...
> Perhaps one way to solve this is to reproduce what is done in DLARRV
> from LAPACK. The routine uses the index of the eigenvalue and not only
> its value.

Does someone had a look at DLARRV and could explain how it works (the
vector part only, the singular values part is already well understood I
think) ?

I think we should wait for this issue to be solved before publishing
2.1. It is currently targeted at 2.2 but I would vote to solve it
earlier. It is really a big drawback of the current implementation. Not
being able to decompose identity is rather sad ...

> 
>> So, before I can improve SVD, I have to improve EigenDecomposition!
>>
>> By the way, going through SVD, EigenDecomposition, I noticed that
>> BidiagonalTransformer and TridiagonalTransformer both use the
>> Householder vector computation deeply imbedded in their code.  In
>> order to make both classes easier to read (and to debug), I wonder
>> if it might be useful to introduce a class Householder which would
>> take care of the computation of the vector in a unique place.
> 
> This would be a nice improvement and would probably be useful for other
> linear algebra algorithms later on. The rationale for the current
> implementation was to avoid copying data back and forth between a low
> level elementary Householder class called n times and a high level
> transformer like bi-diagonal or tri-diagonal transformer. If we can find
> a way to have an elementary transform  processing in-place the array
> provided to it by the  high level transformer, this would be fine.

Perhaps this could wait for 2.2 or 3.0.

Luc

> 
> Once again, we see we lack a way to have partial view of matrices slices.
> 
> Luc
> 
> 
>> Regards,
>>  Dim.
>> ----------------------------------------------------------------------------
>>
>> Dimitri Pourbaix                         *
>> Institut d'Astronomie et d'Astrophysique *      Don't worry, be happy
>> CP 226, office 2.N4.211, building NO     *         and CARPE DIEM.
>> Universite Libre de Bruxelles            *
>> Boulevard du Triomphe                    *      Tel : +32-2-650.35.71
>>  B-1050 Bruxelles                        *      Fax : +32-2-650.42.26
>> http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] Work going on on SVD

Posted by Luc Maisonobe <Lu...@free.fr>.
Dimitri Pourbaix a écrit :
> Hi,

Hi Dimitri,

> 
> In its present form (CM2.0), SingularValueDecomposition suffers some
> problems when the matrix is (numerically) singular.  Luc proposed a
> way to improve the situation by limiting the singular values to non
> zero ones.  Whereas the result is OK if someone is interested in getting
> the singular values, it is not from a purely mathematical point of view.
> Indeed, the resulting U matrix does no longer hold the right dimension.
> Instead of a number of columns equal to the number of columns of the
> original matrix, U now has as many columns as non-zero singular values.
> The product U*S*V^T yields the original matrix but the wrong size might
> put some users into trouble (that is true for the size of S as well).

I agree with this analysis. The current behavior is interesting for some
use cases but not for all of them. So I suggest we provide both cases.
This could be done either by renaming the current implementation as
TruncatedSVD and having your implementation named
SingularValueDecomposition, or by using some constructor parameters to
select the desired behavior.

> 
> I propose to compute U ... without taking advantage of V.  That means
> calling EigenDecomposition a second time but should work even in case
> of singular matrices.  That is the solution I am working on.  However,
> doing so, I notice that EigenDecomposition also suffers major problems
> in case of singular matrices.  A 3x3 singular matrix where 0 is an
> eigen value with multiplicity 2 ... yields only 2 distinct eigen
> vectors.  The vectors associated to the null eigen value are equal!!

Yes, this is JIRA issue MATH-333. The problem is that in the current
implementation, the vector is computed by
EigenDecomposition.findEigenVector which takes one eigenvalue as its
argument. so eigenvalues with order greater than 1 simply result in the
same computation repeated several times ...
Perhaps one way to solve this is to reproduce what is done in DLARRV
from LAPACK. The routine uses the index of the eigenvalue and not only
its value.

> 
> So, before I can improve SVD, I have to improve EigenDecomposition!
> 
> By the way, going through SVD, EigenDecomposition, I noticed that
> BidiagonalTransformer and TridiagonalTransformer both use the
> Householder vector computation deeply imbedded in their code.  In
> order to make both classes easier to read (and to debug), I wonder
> if it might be useful to introduce a class Householder which would
> take care of the computation of the vector in a unique place.

This would be a nice improvement and would probably be useful for other
linear algebra algorithms later on. The rationale for the current
implementation was to avoid copying data back and forth between a low
level elementary Householder class called n times and a high level
transformer like bi-diagonal or tri-diagonal transformer. If we can find
a way to have an elementary transform  processing in-place the array
provided to it by the  high level transformer, this would be fine.

Once again, we see we lack a way to have partial view of matrices slices.

Luc


> 
> Regards,
>  Dim.
> ----------------------------------------------------------------------------
> 
> Dimitri Pourbaix                         *
> Institut d'Astronomie et d'Astrophysique *      Don't worry, be happy
> CP 226, office 2.N4.211, building NO     *         and CARPE DIEM.
> Universite Libre de Bruxelles            *
> Boulevard du Triomphe                    *      Tel : +32-2-650.35.71
>  B-1050 Bruxelles                        *      Fax : +32-2-650.42.26
> http://sb9.astro.ulb.ac.be/~pourbaix     * mailto:pourbaix@astro.ulb.ac.be
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org