Posted to dev@spark.apache.org by Debasish Das <de...@gmail.com> on 2014/10/16 01:57:04 UTC

Issues with ALS positive definite

Hi,

If I take the MovieLens data and run the default ALS with regularization set
to 0.0, I hit an exception from LAPACK saying that the Gram matrix is not
positive definite. This is on the master branch.

This is how I run it:

./bin/spark-submit --total-executor-cores 1 \
  --master spark://tusca09lmlvt00c.uswin.ad.vzwcorp.com:7077 \
  --jars /Users/v606014/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar \
  --class org.apache.spark.examples.mllib.MovieLensALS \
  ./examples/target/spark-examples_2.10-1.1.0-SNAPSHOT.jar --rank 20 \
  --numIterations 20 --lambda 0.0 --kryo \
  hdfs://localhost:8020/sandbox/movielens/

Error from LAPACK:

WARN TaskSetManager: Lost task 0.0 in stage 11.0 (TID 22,
tusca09lmlvt00c.uswin.ad.vzwcorp.com):
org.jblas.exceptions.LapackArgumentException: LAPACK DPOSV: Leading minor
of order i of A is not positive definite.

From the maths, this is not expected, right?

||r - wi'hj||^{2} has to be positive definite...

I think the tests do not cover 0.0 regularization; otherwise we would have
caught this as well...

For the sparse coding NMF variant that I am running, I have to turn off L2
regularization when I run L1 on products to extract sparse topics...

Thanks.

Deb

Re: Issues with ALS positive definite

Posted by Xiangrui Meng <me...@gmail.com>.
Do not use lambda=0.0. Use a small number instead. Cholesky
factorization doesn't work on positive semidefinite systems with zero
eigenvalues. -Xiangrui
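
A minimal sketch of this point, assuming a textbook pure-Python Cholesky
routine rather than the DPOSV call that jblas makes. The names cholesky,
gram and is_factorizable are hypothetical and do not come from MLlib. The
values are chosen so the arithmetic is exact; in floating point a singular
matrix can also slip through with a tiny positive pivot instead of failing.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L * L^T == a.

    Raises ValueError when a pivot is not strictly positive, i.e. when
    the matrix is not positive definite.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Pivot: diagonal entry minus the squares already accounted for.
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError(
                "leading minor of order %d is not positive definite" % (j + 1))
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l

def gram(rows, lam):
    """Build sum_i x_i x_i^T + lam * I from a list of factor vectors."""
    n = len(rows[0])
    g = [[sum(x[i] * x[j] for x in rows) for j in range(n)] for i in range(n)]
    for i in range(n):
        g[i][i] += lam
    return g

# A single factor vector in two dimensions: the Gram matrix has rank 1,
# so one of its eigenvalues is zero.
rows = [[2.0, 4.0]]

def is_factorizable(lam):
    """True if Cholesky succeeds on the regularized Gram matrix."""
    try:
        cholesky(gram(rows, lam))
        return True
    except ValueError:
        return False
```

With these inputs, is_factorizable(0.0) is False and is_factorizable(0.01)
is True, which is the behaviour described above.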



Re: Issues with ALS positive definite

Posted by Debasish Das <de...@gmail.com>.
Just checked, QR is exposed by netlib: import org.netlib.lapack.Dgeqrf

For the equality and bound version, I will use QR...it will be faster than
the LU that I am using through jblas.solveSymmetric...


Re: Issues with ALS positive definite

Posted by Debasish Das <de...@gmail.com>.
@xiangrui should we add this epsilon inside the ALS code itself? That way,
if a user sets the regularization to 0.0 by mistake, the LAPACK failure does
not show up...
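
A rough sketch of what such a guard could look like. The function name
effective_lambda, the default eps=1e-9 and the strict flag are all
assumptions made for illustration and are not part of the ALS code. Either
the value is replaced by a small epsilon, or it is rejected outright.

```python
def effective_lambda(lam, eps=1e-9, strict=False):
    """Return the regularization value to add to the Gram diagonal.

    strict=False replaces 0.0 with eps (hypothetical default);
    strict=True rejects 0.0 instead of silently changing it.
    """
    if lam < 0.0:
        raise ValueError("regularization must be non-negative, got %r" % lam)
    if lam == 0.0:
        if strict:
            raise ValueError(
                "regularization 0.0 can make the Gram matrix singular")
        return eps
    return lam
```

A very small eps only guards against exact zeros; whether it is large
enough for the factorization to succeed in floating point depends on the
scale of the data.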

@sean For the proximal algorithms I am using Cholesky for L1 and LU for
equality and bound constraints (since the matrix is quasi-definite)...I am
right now experimenting with Nesterov acceleration...I should
definitely use QR in place of LU...I am already using BLAS solves from
netlib, which are not in jblas, so this should be fine as well...

Details are over here:

https://github.com/apache/spark/pull/2705



Re: Issues with ALS positive definite

Posted by Sean Owen <so...@cloudera.com>.
The Gramian is at least positive semidefinite, and it will be definite if
the matrix is nonsingular, yes. That's usually but not always true.

The lambda*I matrix is positive definite, well, when lambda is positive.
Adding that makes it definite.
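
Written out, with A standing for the matrix whose rows are the factor
vectors that enter one least-squares solve (the symbol A is introduced here
only for illustration):

```latex
x^{\top} A^{\top} A \, x = \lVert A x \rVert_2^2 \ge 0
  \quad \text{for all } x, \qquad \text{with equality iff } A x = 0,
\\[4pt]
x^{\top} \left( A^{\top} A + \lambda I \right) x
  = \lVert A x \rVert_2^2 + \lambda \lVert x \rVert_2^2 > 0
  \quad \text{for all } x \ne 0 \text{ when } \lambda > 0 .
```

So the Gramian is definite exactly when A x = 0 has only the trivial
solution, and adding lambda*I with lambda > 0 removes that condition.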

At least, lambda=0 could be rejected as invalid.

But this goes back to using the Cholesky decomposition. Why not use QR? It
doesn't require definiteness. It should be a little more accurate. On these
smallish dense matrices I don't think it is much slower. I have not
benchmarked that, but I opted for QR in a different implementation and it
has worked fine.

Now I have to go hunt for how the QR decomposition is exposed in BLAS...
Looks like it's GEQRF, which JBLAS helpfully exposes. Debasish, you could
try it for fun at least.
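
A small sketch of the idea, under stated assumptions: it uses modified
Gram-Schmidt in pure Python for brevity, whereas LAPACK's GEQRF uses
Householder reflections, and the function name qr_solve is hypothetical.
The example matrix is symmetric but indefinite, so a Cholesky factorization
would reject it, yet the QR-based solve goes through.

```python
import math

def qr_solve(a, b):
    """Solve a x = b for a square nonsingular a via modified Gram-Schmidt QR."""
    n = len(a)
    cols = [[a[i][j] for i in range(n)] for j in range(n)]
    q = []                               # orthonormal columns of Q
    r = [[0.0] * n for _ in range(n)]    # upper-triangular factor R
    for j in range(n):
        v = cols[j][:]
        for k in range(j):
            # Project out the component along each earlier q_k.
            r[k][j] = sum(q[k][i] * v[i] for i in range(n))
            v = [v[i] - r[k][j] * q[k][i] for i in range(n)]
        r[j][j] = math.sqrt(sum(x * x for x in v))
        if r[j][j] == 0.0:
            raise ValueError("matrix is singular")
        q.append([x / r[j][j] for x in v])
    # Solve R x = Q^T b by back substitution.
    y = [sum(q[j][i] * b[i] for i in range(n)) for j in range(n)]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = y[i] - sum(r[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / r[i][i]
    return x

# Symmetric but indefinite: the eigenvalues are 3 and -1.
a = [[1.0, 2.0], [2.0, 1.0]]
x = qr_solve(a, [3.0, 3.0])   # the exact solution is [1, 1]
```

Note that QR drops the definiteness requirement but not the rank
requirement: an exactly singular matrix still has no unique solution.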

Re: Issues with ALS positive definite

Posted by Debasish Das <de...@gmail.com>.
But do you expect the MLlib code to fail if I run with 0.0 regularization?

I think ||r - wi'hj||^{2} is positive definite...It can become positive
semidefinite only if there are dependent rows in the matrix...

@sean is that right? We had this discussion before as well...
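
One way such dependence could arise, offered as an assumption rather than
something established in this thread: if a user or product has fewer
ratings than the rank, the Gram matrix is a sum of fewer than rank outer
products and cannot have full rank. A small sketch with rank 3 and two
ratings (the helper names gram and det3 are hypothetical):

```python
def gram(rows):
    """Sum of outer products x x^T over the given factor vectors."""
    n = len(rows[0])
    return [[sum(x[i] * x[j] for x in rows) for j in range(n)]
            for i in range(n)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along row 0."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Rank-3 factors, but only two observed ratings for this user.
two_ratings = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
det_two = det3(gram(two_ratings))      # singular Gram matrix

# A third, independent factor vector restores full rank.
three_ratings = two_ratings + [[0.0, 0.0, 1.0]]
det_three = det3(gram(three_ratings))  # nonsingular Gram matrix
```

Here det_two is zero while det_three is positive, so only the second
system has a positive definite Gram matrix without regularization.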



Re: Issues with ALS positive definite

Posted by Liquan Pei <li...@gmail.com>.
Hi Debasish,

I think ||r - wi'hj||^{2} is positive semidefinite.

Thanks,
Liquan




-- 
Liquan Pei
Department of Physics
University of Massachusetts Amherst