Posted to user@spark.apache.org by Aureliano Buendia <bu...@gmail.com> on 2014/07/27 20:06:51 UTC

MLlib NNLS implementation is buggy, returning wrong solutions

Hi,

The recently added NNLS implementation in MLlib returns wrong solutions.
This is not data-specific: try any data in R's nnls and then the same data
in MLlib's NNLS, and the results are very different.

Also, the chosen algorithm, Polyak (1969), is not the best one available. The
most widely used is Lawson-Hanson (1974):

http://en.wikipedia.org/wiki/Non-negative_least_squares#Algorithms
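For reference, the Lawson-Hanson active-set method linked above can be sketched in a few lines of NumPy. This is a hedged, minimal sketch for comparison purposes only, not MLlib's implementation (which uses a projected conjugate-gradient approach):

```python
import numpy as np

def lawson_hanson_nnls(A, b, tol=1e-10, max_iter=300):
    """Minimal active-set NNLS sketch (Lawson & Hanson, 1974).

    Solves min ||Ax - b||^2 subject to x >= 0.
    """
    m, n = A.shape
    x = np.zeros(n)
    passive = np.zeros(n, dtype=bool)   # variables allowed to be nonzero
    w = A.T @ (b - A @ x)               # negative gradient of the objective
    it = 0
    while (~passive).any() and (w[~passive] > tol).any():
        # Move the variable with the largest positive gradient into
        # the passive (free) set.
        j = int(np.argmax(np.where(~passive, w, -np.inf)))
        passive[j] = True
        while True:
            # Unconstrained least squares restricted to the passive set.
            s = np.zeros(n)
            s[passive] = np.linalg.lstsq(A[:, passive], b, rcond=None)[0]
            if (s[passive] > tol).all():
                break
            # Step back toward the boundary and drop variables that
            # would go negative.
            neg = passive & (s <= tol)
            alpha = np.min(x[neg] / (x[neg] - s[neg]))
            x += alpha * (s - x)
            passive &= x > tol
        x = s
        w = A.T @ (b - A @ x)
        it += 1
        if it > max_iter:
            break
    return x
```

For example, with A = I and b = [1, -1], the unconstrained solution [1, -1] is infeasible and the sketch returns [1, 0], the nonnegative optimum.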

Re: MLlib NNLS implementation is buggy, returning wrong solutions

Posted by Debasish Das <de...@gmail.com>.
Hi Aureliano,

Would it be possible for you to provide the test case? You could attach it
to the JIRA as well...

I am preparing the PR for the ADMM-based QuadraticMinimizer... In my MATLAB
experiments, scaling the rank to 1000 and beyond (which is too high for ALS
but gives a good idea of solver scalability; ~400 is the max I have seen in
the Sparkler paper), I am seeing results consistent with MOSEK in both
correctness and runtime...

I will post more details on the JIRA this week... it got cleared by our
legal team last week... Please stay tuned...

https://issues.apache.org/jira/browse/SPARK-2426

Thanks.
Deb



Re: MLlib NNLS implementation is buggy, returning wrong solutions

Posted by DB Tsai <db...@dbtsai.com>.
Could you help provide a test case to verify this issue and open a JIRA
to track it? Also, would you be interested in submitting a PR to fix it? Thanks.

Sent from my Google Nexus 5

Re: MLlib NNLS implementation is buggy, returning wrong solutions

Posted by Shuo Xiang <sh...@gmail.com>.
It is possible for the answers (the final solution vectors x) given by two
different algorithms (such as the one in MLlib and the one in R) to differ,
since the problem may not be strictly convex and multiple global optima may
exist. However, those answers should yield the same objective value. Can
you give an example where the objective value of another method is better
(smaller) than MLlib's?
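Following this suggestion, the fair way to compare two NNLS outputs is by their objective values rather than by the raw solution vectors. A minimal sketch (the matrix, vector, and solver outputs below are hypothetical stand-ins; plug in your own data and the solutions from R and MLlib):

```python
import numpy as np

def nnls_objective(A, b, x):
    # Squared-residual objective 0.5 * ||Ax - b||^2 that both solvers minimize.
    r = A @ x - b
    return 0.5 * float(r @ r)

# Hypothetical data and two candidate solutions from different solvers.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
x_from_r = np.array([1.0, 2.0])      # stand-in for R's nnls output
x_from_mllib = np.array([1.0, 2.0])  # stand-in for MLlib's NNLS output

# Different vectors with (near-)equal objectives are both valid optima;
# a strictly smaller objective on one side indicates a genuine bug.
print(nnls_objective(A, b, x_from_r), nnls_objective(A, b, x_from_mllib))
```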

