Posted to dev@commons.apache.org by Evan Ward <ev...@apache.org> on 2014/09/11 20:29:49 UTC

[math] Least Squares Outlier Rejection

Hi,

A while ago I had brought up the idea of adding residual editing (aka data
editing, outlier rejection, robust regression) to our non-linear least
squares implementations.[1] As the name suggests, the idea is to de-weight
observations that don't match the user's model. There are several ways to
do this, including choosing a fixed cutoff, a fixed standard deviation
cutoff, or reducing a residual's weight based on its magnitude.[2]
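For concreteness, the three editing strategies mentioned above might look
like this (a minimal sketch in plain Java; the class and method names are
illustrative, not part of the [math] API):

```java
// Illustrative residual-editing strategies. None of these names are
// Commons Math API; they only sketch the three approaches in the text.
public class ResidualEditing {

    /** Fixed cutoff: weight 0 beyond the cutoff, 1 otherwise. */
    static double fixedCutoff(double residual, double cutoff) {
        return Math.abs(residual) > cutoff ? 0.0 : 1.0;
    }

    /** Sigma cutoff: weight 0 beyond k standard deviations. */
    static double sigmaCutoff(double residual, double sigma, double k) {
        return Math.abs(residual) > k * sigma ? 0.0 : 1.0;
    }

    /** Magnitude-based down-weighting (Huber-style): smoothly reduce
     *  the weight as the residual grows past delta. */
    static double huberWeight(double residual, double delta) {
        double a = Math.abs(residual);
        return a <= delta ? 1.0 : delta / a;
    }

    /** Standard deviation of the residuals (population form, for brevity),
     *  as needed by the sigma-cutoff rule. */
    static double stdDev(double[] r) {
        double mean = 0;
        for (double v : r) mean += v;
        mean /= r.length;
        double var = 0;
        for (double v : r) var += (v - mean) * (v - mean);
        return Math.sqrt(var / r.length);
    }
}
```

The first two rules edit a residual outright (zero weight); the third keeps
every observation but lets large residuals contribute less.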

However we add the data editing feature, I think it will cause backward
incompatibilities with the released API. I've outlined below the two
options I see. I'm open to other ideas as well.

1. Replace edited residuals with 0's in the residual vector and Jacobian
(i.e. apply a 0 weight). This has the advantage of being simple to
implement and that our existing optimizers are already able to handle it.
The downside is evident when the user tries to obtain the number of
residuals that were edited. It is hard to tell the difference between an
edited residual, an a priori zero weight, or a model evaluation where the
residual and gradient are, in fact, zero. We can provide easy access to the
number of edited residuals by adding a method to the Evaluation interface.
(This is what I implemented in the patch in the original thread.) Now that
the code has been released though, this would cause a backward
incompatibility for some advanced users. Most users will likely use the
included factory and builder methods to define their
LeastSquaresProblem (LSP) and these users would not be affected by the
change. Only the users that provide a custom implementation of
LSP.Evaluation would be affected.
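Mechanically, option 1 amounts to zeroing the affected rows; a rough sketch
with plain arrays (for illustration only, not the LSP.Evaluation API):

```java
// Sketch of option 1: apply a zero weight in place by zeroing the edited
// residual and its Jacobian row. Plain arrays stand in for the RealVector/
// RealMatrix returned by an Evaluation; the class name is hypothetical.
public class ZeroWeightEdit {
    static void edit(int row, double[] residuals, double[][] jacobian) {
        residuals[row] = 0.0;                      // edited residual contributes nothing
        java.util.Arrays.fill(jacobian[row], 0.0); // nor does its gradient row
    }
}
```

After this, a zeroed row is indistinguishable from an a priori zero weight
or a genuinely zero residual, which is exactly the ambiguity described
above.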

2. Remove edited residuals from the residual vector and Jacobian, so that
the resulting vector and matrix have fewer rows. The advantage here is that
the user can compare the length of the residual vector in the Optimum to
the number of observations in the LSP to determine the number of edited
residuals. The problem is that returning vectors/matrices with different
sizes from LSP.evaluate() would violate the contract. Additionally we would
have to modify our existing optimizers to deal with the variable lengths.
For GaussNewton the modification would be small, but for LevenbergMarquardt
I would likely have to rewrite it since I don't understand the code (not
for lack of trying :P ). Users that implement LeastSquaresOptimizer would
likely have to modify their code as well.
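Mechanically, option 2 is a row compaction; a rough sketch with plain
arrays (hypothetical helper names, not the LSP.evaluate() API):

```java
// Sketch of option 2: drop edited rows so the returned residual vector
// and Jacobian shrink. The edited count is then simply the difference
// between the original and compacted lengths.
public class RowRemovalEdit {

    /** Return the residuals with the rows flagged in 'edited' removed. */
    static double[] compactResiduals(double[] residuals, boolean[] edited) {
        int kept = 0;
        for (boolean e : edited) if (!e) kept++;
        double[] out = new double[kept];
        int j = 0;
        for (int i = 0; i < residuals.length; i++) {
            if (!edited[i]) out[j++] = residuals[i];
        }
        return out;
    }

    /** The same compaction applied to the rows of the Jacobian. */
    static double[][] compactJacobian(double[][] jacobian, boolean[] edited) {
        int kept = 0;
        for (boolean e : edited) if (!e) kept++;
        double[][] out = new double[kept][];
        int j = 0;
        for (int i = 0; i < jacobian.length; i++) {
            if (!edited[i]) out[j++] = jacobian[i];
        }
        return out;
    }
}
```

The user would recover the number of edited residuals by comparing the
compacted length to the number of observations, at the cost of the
variable-length contract issue described above.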

To summarize, in both cases users that only use the provided [math] classes
would not have to modify their code, while users that provide custom
implementations of [math] interfaces would have to modify their code.

I would like to get this feature wrapped up for the next release. Please
let me know if you have a preference for either implementation and whether
there are any other issues I should consider.

Best Regards,
Evan

[1] http://markmail.org/message/e53nago3swvu3t52
     https://issues.apache.org/jira/browse/MATH-1105
[2] http://www.mathworks.com/help/curvefit/removing-outliers.html
     http://www.mathworks.com/help/curvefit/least-squares-fitting.html

Re: [math] Least Squares Outlier Rejection

Posted by Gilles <gi...@harfang.homelinux.org>.
Hello.

On Tue, 16 Sep 2014 18:34:52 -0400, Evan and Maureen Ward wrote:
> Hi Gilles, Luc,
>
> Thanks for all the comments. I'll try to respond to the more 
> fundamental
> concerns in this email, and the more practical ones in another email, 
> if we
> decide that we want to include data editing in [math].
>

>> [...]

>>
>> I don't see that the editing has to occur during the
>> optimization.
>> It could be independent:
>>  1. Let the optimization run to completion with all the data
>>  2. Compute the residuals
>>  3a. If there are no outliers, stop
>>  3b. If there are outliers, remove them (or modify their weight)
>>  4. Run the optimization with the remaining data
>>  5. Goto 2
>>
>> The advantage I see here is that the weight modification is a user
>> decision (and implementation).
>>
>> However, IIUC the examples from the link provided by Evan, "robust"
>> optimization (i.e. handling outliers during optimization) could lead
>> to a "better" solution.
>>
>> Would it always be "better"? Not sure: significant points could be
>> mistaken as outliers and be discarded before the optimizer could
>> figure out the correct solution...
>> To avoid a "good but wrong" fit, I'm afraid that we'd have to
>> introduce several parameters (like "do not discard outliers before
>> iteration n", "do not discard more than m points", etc.)
>> The tuning of those parameters will probably not be obvious, and
>> they will surely complicate the code (e.g. input data like the
>> "target" and "weight" array won't be "final").
>>
>
> The advantage I see is not correctness (since the algorithm you 
> outline
> will converge correctly), but reducing function evaluations. (I don't 
> have
> data to back up this assertion.) Without inline data editing the
> optimization algorithm would "waste" the evaluations between when the
> outliers became obvious, and when the optimization converges. With 
> the
> "inline" scheme, the outliers are deleted as soon as possible, and 
> the
> remaining evaluations are used to converge towards the correct 
> solution.
>
> Converging to a "good but wrong" solution will always be a risk of any
> algorithm
> that automatically throws away data. As with our other algorithms, 
> I'm
> expecting the user to know when the algorithm is a good fit for their
> problem. The use case I see is when the observations contain mostly 
> real
> data, and a few random numbers. The bad observations can be hard to
> identify apriori, but become obvious during the fitting process.
>
>
>>> [...]
>>
>> What works in one domain might not in another.
>> Thus, the feature should not alter the "standardness", nor decrease
>> the robustness of the implementation. Caring for special cases
>> (for which the feature is useful) may be obtained by e.g. using the
>> standard algorithm as a basic block that is called repeatedly, as
>> hinted above (and tuning the standard parameters and input data
>> appropriately for each call).
>>
>
> I was not expecting the response that [math] may not want this 
> feature. I'm
> o.k. with this result since I can implement it as an addition to 
> [math],
> though the API won't be as clean.
>

IMHO, we cannot assume that fiddling with some of the data points while
the optimization progresses won't alter the correctness of the solution.

I think that when points are deemed "outliers", e.g. using external
knowledge not available to the optimizer, they should be removed, and
the optimization redone on the "real" data.
As I understand it, "robust" optimization is not really a good name (I'd
suggest "fuzzy", or something) because it will indeed assign less weight
to data points solely on the basis that they are less represented among
the currently available data, irrespective of whether they actually
pertain to the phenomenon being measured.
At first sight, this could allow the optimizer to drift farther and
farther away from the correct solution (if those points were not
outliers).

>
>>
>>>> At first sight, I'd avoid modification of the sizes of input data
>>>> (option 2);
>>>> from an API usage viewpoint, I imagine that user code will require
>>>> additional
>>>> "length" tests.
>>>>
>>>> Couldn't the problem you mention in option 1 disappear by having
>>>> different
>>>> methods that return the a-priori weights and the modified weights?
>>>>
>>>
>>> As we are already much too stringent with our compatibility policy, I
>>> would allow the case where only *very* advanced users would have
>>> problems. So it would seem fair to me if we can make some changes
>>> where the users of the factory are unaffected and expert users who
>>> decided the factory was not good for them have to put in new effort.
>>>
>>
>> Before embarking on this, I would like to see examples where the
>> "inline" outlier rejection leads to a (correct) solution impossible
>> to achieve with the approach I've suggested.
>>
>
> As discussed above, I don't see any cases where "inline" outlier 
> rejection
> will result in a less correct solution than the algorithm you 
> outline.

I do see a potential case, in my current work. ;-)

> I do
> think we can save a significant number of function evaluations by 
> using
> "inline" outlier rejection.

We can of course talk about a performance-correctness trade-off, perhaps
useful for cases where the risk is low (lots of data, known expected rate
of outliers).

IIUC the reference you provided, it seems that we only need a hook to
allow "outside" modification of the weights (?).
Could it be provided with an interface like the following:

public interface WeightValidator {
     /**
      * @param weights Current weights.
      * @param residuals Current residuals.
      * @return the adjusted weights.
      */
     RealVector validateWeights(RealVector weights,
                                RealVector residuals);
}

(similar to the suggestion for MATH-1144)?
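As one possible user-side implementation of such a hook, here is a k-sigma
rejection rule sketched with plain double[] in place of RealVector, so the
example is self-contained (the class name and the choice of rule are
illustrative, not part of any proposal):

```java
// Hypothetical implementation of the WeightValidator idea: zero the
// weight of any residual lying more than k standard deviations from the
// mean residual. double[] stands in for RealVector for self-containment.
public class SigmaWeightValidator {
    private final double k;

    public SigmaWeightValidator(double k) { this.k = k; }

    public double[] validateWeights(double[] weights, double[] residuals) {
        int n = residuals.length;
        double mean = 0;
        for (double r : residuals) mean += r;
        mean /= n;
        double var = 0;
        for (double r : residuals) var += (r - mean) * (r - mean);
        double sigma = Math.sqrt(var / n);   // population std dev, for brevity
        double[] adjusted = weights.clone();
        for (int i = 0; i < n; i++) {
            if (Math.abs(residuals[i] - mean) > k * sigma) {
                adjusted[i] = 0.0;           // de-weight the outlier
            }
        }
        return adjusted;
    }
}
```

The optimizer would call the hook between iterations with the current
weights and residuals, leaving the rejection policy entirely to the user.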


Best regards,
Gilles

>
> Best Regards,
> Evan
>
>
>> [...]
>>
>>
>>>>> [1] http://markmail.org/message/e53nago3swvu3t52
>>>>>      https://issues.apache.org/jira/browse/MATH-1105
>>>>> [2] http://www.mathworks.com/help/curvefit/removing-outliers.html
>>>>>      
>>>>> http://www.mathworks.com/help/curvefit/least-squares-fitting.html
>>>>>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] Least Squares Outlier Rejection

Posted by Evan and Maureen Ward <ev...@gmail.com>.
Hi Gilles, Luc,

Thanks for all the comments. I'll try to respond to the more fundamental
concerns in this email, and the more practical ones in another email, if we
decide that we want to include data editing in [math].

On Fri, Sep 12, 2014 at 8:55 AM, Gilles <gi...@harfang.homelinux.org>
wrote:

> Hi.
>
>
> On Fri, 12 Sep 2014 09:16:07 +0200, Luc Maisonobe wrote:
>
>> Le 12/09/2014 01:35, Gilles a écrit :
>>
>>> Hello.
>>>
>>> On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:
>>>
>>>> Hi,
>>>>
>>>> A while ago I had brought up the idea of adding residual editing (aka
>>>> data
>>>> editing, outlier rejection, robust regression) to our non-linear least
>>>> squares implementations.[1] As the name suggests, the idea is to
>>>> de-weight
>>>> observations that don't match the user's model. There are several ways
>>>> to
>>>> do this, including choosing a fixed cutoff, a fixed standard deviation
>>>> cutoff,
>>>> or reducing a residual's weight based on its magnitude.[2]
>>>>
>>>> However we add the data editing feature I think it will cause backward
>>>> incompatibilities with the released API. I've outlined below the two
>>>> options I see. I'm open to other ideas as well.
>>>>
>>>> 1. Replace edited residuals with 0's in the residual vector and Jacobian
>>>> (i.e. apply a 0 weight). This has the advantage of being simple to
>>>> implement and that our existing optimizers are already able to handle
>>>> it.
>>>> The downside is evident when the user tries to obtain the number of
>>>> residuals that were edited. It is hard to tell the difference between an
>>>> edited residual, an apriori zero weight, or a model evaluation where the
>>>> residual and gradient is, in fact, zero. We can provide easy access to
>>>> the
>>>> number of edited residuals by adding a method to the Evaluation
>>>> interface.
>>>> (This is what I implemented in the patch in the original thread.) Now
>>>> that
>>>> the code has been released though, this would cause a backward
>>>> incompatibility for some advanced users. Most users will likely use the
>>>> included factory and builder methods to define their
>>>> LeastSquaresProblem(LSP) and these users would not be affected by the
>>>> change. Only the users that provide a custom implementation of
>>>> LSP.Evaluation would be affected.
>>>>
>>>> 2. Remove edited residuals from the gradient and Jacobian, so that the
>>>> resulting vector and matrix have fewer rows. The advantage here is
>>>> that the
>>>> user can compare the length of the residual vector in the Optimum to the
>>>> number of observations in the LSP to determine the number of edited
>>>> residuals. The problem is that returning vectors/matrices with different
>>>> sizes from LSP.evaluate() would violate the contract. Additionally we
>>>> would
>>>> have to modify our existing optimizers to deal with the variable
>>>> lengths.
>>>> For GaussNewton the modification would be small, but for
>>>> LevenbergMarquardt
>>>> I would likely have to re-write it since I don't understand the code
>>>> (not
>>>> for lack of trying :P ). Users that implement LeastSquaresOptimizers
>>>> would
>>>> likely have to modify their code as well.
>>>>
>>>> To summarize, in both cases users that only use the provided [math]
>>>> classes
>>>> would not have to modify their code, while users that provide custom
>>>> implementations of [math] interfaces would have to modify their code.
>>>>
>>>> I would like to get this feature wrapped in for the next release. Please
>>>> let me know if you have a preference for either implementation and if
>>>> there
>>>> are any other issues I should consider.
>>>>
>>>
>>> Compatibility breaks cannot occur in minor releases.
>>> The next major release should not occur before deprecated classes are all
>>> replaced. [I'm thinking about the optimizers, for which the fluent API
>>> should
>>> be implemented based on your design of NLLS.]
>>>
>>
>> I theoretically agree here, but we are in a sort of chicken and egg
>> problem. Typically, when I proposed to switch the ODE to the new fluent
>> API, we decided we should do it at a major release only. We also said we
>> cannot remove deprecation at minor releases. Now we say major release
>> should wait for deprecation to be removed.
>>
>> So as far as I understand, everything should occur at once for major
>> releases: remove deprecated classes, introduce new classes and
>> interfaces, but see next remark.
>>
>
> No, there is actually no chicken and egg problem in this case.
> The issue is only that we should not delete an API before a replacement
> is provided.
> So
> 1. We want to remove the deprecated optimizers API
>    -> must be done in major release (e.g. 4.0)
> 2. We want to let users upgrade their code
>    -> the new API must exist before 4.0
>
> 1 and 2 are not contradictory as the new API will be written from
> scratch (it is not a modification, say, of existing interfaces).
>
>
>>> It would be nice to recode the whole "LevenbergMarquardtOptimizer" in
>>> full OO
>>> Java. But it should be implemented and tested before any new feature is
>>> added
>>> to the mix.
>>>
>>
>> Here comes another problem: we want to have time to stabilize new APIs,
>> which means we should have a few minor releases with trial APIs.
>> Unfortunately, once these trial APIs have been introduced, we are stuck
>> with them and we end up piling several APIs together (the optimizers are
>> the perfect example).
>>
>
> Piling up, yes, but it is not a problem: all the old APIs are already
> deprecated; "only" the replacements are missing. We must provide them
> in a minor release before 4.0. [This should not need to delay the next
> release (3.4), as we can have as many 3.x as we want...]
>
>
>> So when do we test new API : at minor releases just before a major
>> release, but then we keep all the trials until a big clean up at major
>> release ?
>>
>
> Yes.
>
>
>>
>>> Do I understand correctly that in the "robust" fit, the weights are
>>> modified
>>> during the optimization?
>>> If so, would the algorithms still be "standard"?
>>>
>>
>> I think performing some modifications between iterations is still a
>> standard way to perform some optimizations: we can detect the outliers
>> only once we have computed residuals and some statistics like the
>> standard deviation, so this editing operation should occur somewhere
>> during the optimization process.
>>
>
> I don't see that the editing has to occur during the optimization.
> It could be independent:
>  1. Let the optimization run to completion with all the data
>  2. Compute the residuals
>  3a. If there are no outliers, stop
>  3b. If there are outliers, remove them (or modify their weight)
>  4. Run the optimization with the remaining data
>  5. Goto 2
>
> The advantage I see here is that the weight modification is a user
> decision (and implementation).
>
> However, IIUC the examples from the link provided by Evan, "robust"
> optimization (i.e. handling outliers during optimization) could lead
> to a "better" solution.
>
> Would it always be "better"? Not sure: significant points could be
> mistaken as outliers and be discarded before the optimizer could
> figure out the correct solution...
> To avoid a "good but wrong" fit, I'm afraid that we'd have to
> introduce several parameters (like "do not discard outliers before
> iteration n", "do not discard more than m points", etc.)
> The tuning of those parameters will probably not be obvious, and
> they will surely complicate the code (e.g. input data like the
> "target" and "weight" array won't be "final").
>

The advantage I see is not correctness (since the algorithm you outline
will converge correctly), but reducing function evaluations. (I don't have
data to back up this assertion.) Without inline data editing the
optimization algorithm would "waste" the evaluations between when the
outliers became obvious, and when the optimization converges. With the
"inline" scheme, the outliers are deleted as soon as possible, and the
remaining evaluations are used to converge towards the correct solution.

Converging to a "good but wrong" solution will always be a risk of any
algorithm that automatically throws away data. As with our other
algorithms, I'm expecting the user to know when the algorithm is a good
fit for their problem. The use case I see is when the observations contain
mostly real data, and a few random numbers. The bad observations can be
hard to identify a priori, but become obvious during the fitting process.



>
>  I have already seen this typically in
>> orbit determination so I would consider it a classical optional feature.
>>
>
> What works in one domain might not in another.
> Thus, the feature should not alter the "standardness", nor decrease
> the robustness of the implementation. Caring for special cases
> (for which the feature is useful) may be obtained by e.g. using the
> standard algorithm as a basic block that is called repeatedly, as
> hinted above (and tuning the standard parameters and input data
> appropriately for each call).
>

I was not expecting the response that [math] may not want this feature. I'm
o.k. with this result since I can implement it as an addition to [math],
though the API won't be as clean.


>
>>> At first sight, I'd avoid modification of the sizes of input data
>>> (option 2);
>>> from an API usage viewpoint, I imagine that user code will require
>>> additional
>>> "length" tests.
>>>
>>> Couldn't the problem you mention in option 1 disappear by having
>>> different
>>> methods that return the a-priori weights and the modified weights?
>>>
>>
>> As we are already much too stringent with our compatibility policy, I
>> would allow the case where only *very* advanced users would have
>> problems. So it would seem fair to me if we can make some changes where
>> the users of the factory are unaffected and expert users who decided the
>> factory was not good for them have to put in new effort.
>>
>
> Before embarking on this, I would like to see examples where the "inline"
> outlier rejection leads to a (correct) solution impossible to achieve
> with the approach I've suggested.
>

As discussed above, I don't see any cases where "inline" outlier rejection
will result in a less correct solution than the algorithm you outline. I do
think we can save a significant number of function evaluations by using
"inline" outlier rejection.

Best Regards,
Evan


>
>  Anyway, I would be OK with both solutions, and also think rewriting
>> Levenberg-Marquardt would be a good thing. It was translated from Minpack
>> with only a change in the QR decomposition and a change in the
>> convergence checker, so basically it is Fortran written with Java syntax.
>>
>
> I just want to remind that quite some effort has been put already in the
> modification of the original port (removal of many "work" arrays, "final"
> data). It's not anymore just Fortran in Java. [But the core algorithm is
> certainly not obvious, and I'm all for a clean reimplementation (to exist
> in parallel with the current one, until everybody agrees that nothing is
> lost).]
>
>
> Best,
> Gilles
>
>
>
>>>> [1] http://markmail.org/message/e53nago3swvu3t52
>>>>      https://issues.apache.org/jira/browse/MATH-1105
>>>> [2] http://www.mathworks.com/help/curvefit/removing-outliers.html
>>>>      http://www.mathworks.com/help/curvefit/least-squares-fitting.html
>>>>
>>>
>
>
>
>

Re: [math] Least Squares Outlier Rejection

Posted by Gilles <gi...@harfang.homelinux.org>.
Hi.

On Fri, 12 Sep 2014 09:16:07 +0200, Luc Maisonobe wrote:
> Le 12/09/2014 01:35, Gilles a écrit :
>> Hello.
>>
>> On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:
>>> Hi,
>>>
>>> A while ago I had brought up the idea of adding residual editing
>>> (aka data
>>> editing, outlier rejection, robust regression) to our non-linear 
>>> least
>>> squares implementations.[1] As the name suggests, the idea is to
>>> de-weight
>>> observations that don't match the user's model. There are several 
>>> ways to
>>> do this, including choosing a fixed cutoff, a fixed standard deviation
>>> cutoff,
>>> or reducing a residual's weight based on its magnitude.[2]
>>>
>>> However we add the data editing feature I think it will cause 
>>> backward
>>> incompatibilities with the released API. I've outlined below the 
>>> two
>>> options I see. I'm open to other ideas as well.
>>>
>>> 1. Replace edited residuals with 0's in the residual vector and 
>>> Jacobian
>>> (i.e. apply a 0 weight). This has the advantage of being simple to
>>> implement and that our existing optimizers are already able to 
>>> handle it.
>>> The downside is evident when the user tries to obtain the number of
>>> residuals that were edited. It is hard to tell the difference 
>>> between an
>>> edited residual, an apriori zero weight, or a model evaluation 
>>> where the
>>> residual and gradient is, in fact, zero. We can provide easy access 
>>> to
>>> the
>>> number of edited residuals by adding a method to the Evaluation
>>> interface.
>>> (This is what I implemented in the patch in the original thread.) 
>>> Now
>>> that
>>> the code has been released though, this would cause a backward
>>> incompatibility for some advanced users. Most users will likely use 
>>> the
>>> included factory and builder methods to define their
>>> LeastSquaresProblem(LSP) and these users would not be affected by 
>>> the
>>> change. Only the users that provide a custom implementation of
>>> LSP.Evaluation would be affected.
>>>
>>> 2. Remove edited residuals from the gradient and Jacobian, so that 
>>> the
>>> resulting vector and matrix have fewer rows. The advantage here is
>>> that the
>>> user can compare the length of the residual vector in the Optimum 
>>> to the
>>> number of observations in the LSP to determine the number of edited
>>> residuals. The problem is that returning vectors/matrices with 
>>> different
>>> sizes from LSP.evaluate() would violate the contract. Additionally 
>>> we
>>> would
>>> have to modify our existing optimizers to deal with the variable 
>>> lengths.
>>> For GaussNewton the modification would be small, but for
>>> LevenbergMarquardt
>>> I would likely have to re-write it since I don't understand the 
>>> code (not
>>> for lack of trying :P ). Users that implement 
>>> LeastSquaresOptimizers
>>> would
>>> likely have to modify their code as well.
>>>
>>> To summarize, in both cases users that only use the provided [math]
>>> classes
>>> would not have to modify their code, while users that provide 
>>> custom
>>> implementations of [math] interfaces would have to modify their 
>>> code.
>>>
>>> I would like to get this feature wrapped in for the next release. 
>>> Please
>>> let me know if you have a preference for either implementation and 
>>> if
>>> there
>>> are any other issues I should consider.
>>
>> Compatibility breaks cannot occur in minor releases.
>> The next major release should not occur before deprecated classes 
>> are all
>> replaced. [I'm thinking about the optimizers, for which the fluent 
>> API
>> should
>> be implemented based on your design of NLLS.]
>
> I theoretically agree here, but we are in a sort of chicken and egg
> problem. Typically, when I proposed to switch the ODE to the new 
> fluent
> API, we decided we should do it at a major release only. We also said 
> we
> cannot remove deprecation at minor releases. Now we say major release
> should wait for deprecation to be removed.
>
> So as far as I understand, everything should occur at once for major
> releases: remove deprecated classes, introduce new classes and
> interfaces, but see next remark.

No, there is actually no chicken and egg problem in this case.
The issue is only that we should not delete an API before a replacement
is provided.
So
1. We want to remove the deprecated optimizers API
    -> must be done in major release (e.g. 4.0)
2. We want to let users upgrade their code
    -> the new API must exist before 4.0

1 and 2 are not contradictory as the new API will be written from
scratch (it is not a modification, say, of existing interfaces).

>>
>> It would be nice to recode the whole "LevenbergMarquardtOptimizer" 
>> in
>> full OO
>> Java. But it should be implemented and tested before any new feature 
>> is
>> added
>> to the mix.
>
> Here comes another problem: we want to have time to stabilize new APIs,
> which means we should have a few minor releases with trial APIs.
> Unfortunately, once these trial APIs have been introduced, we are stuck
> with them and we end up piling several APIs together (the optimizers are
> the perfect example).

Piling up, yes, but it is not a problem: all the old APIs are already
deprecated; "only" the replacements are missing. We must provide them
in a minor release before 4.0. [This should not need to delay the next
release (3.4), as we can have as many 3.x as we want...]

>
> So when do we test new API : at minor releases just before a major
> release, but then we keep all the trials until a big clean up at 
> major
> release ?

Yes.

>
>>
>> Do I understand correctly that in the "robust" fit, the weights are
>> modified
>> during the optimization?
>> If so, would the algorithms still be "standard"?
>
> I think performing some modifications between iterations is still a
> standard way to perform some optimizations: we can detect the 
> outliers
> only once we have computed residuals and some statistics like the
> standard deviation, so this editing operation should occur somewhere
> during the optimization process.

I don't see that the editing has to occur during the optimization.
It could be independent:
  1. Let the optimization run to completion with all the data
  2. Compute the residuals
  3a. If there are no outliers, stop
  3b. If there are outliers, remove them (or modify their weight)
  4. Run the optimization with the remaining data
  5. Goto 2

The advantage I see here is that the weight modification is a user
decision (and implementation).
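The independent optimize-then-edit loop sketched above could look roughly
like this (Solver, optimize and residuals are hypothetical stand-ins for
the user's optimizer and model, not [math] API):

```java
// Sketch of the five-step loop: run the optimization to completion,
// compute residuals, drop outliers, and refit on the remaining data.
import java.util.ArrayList;
import java.util.List;

public class IterativeEditing {

    public interface Solver {
        double[] optimize(List<double[]> data);                   // steps 1 and 4
        double[] residuals(double[] params, List<double[]> data); // step 2
    }

    public static double[] fit(Solver solver, List<double[]> data, double cutoff) {
        while (true) {
            double[] params = solver.optimize(data);       // 1./4. full optimization
            double[] res = solver.residuals(params, data); // 2. residuals
            List<double[]> kept = new ArrayList<>();
            for (int i = 0; i < data.size(); i++) {        // 3b. drop outliers
                if (Math.abs(res[i]) <= cutoff) {
                    kept.add(data.get(i));
                }
            }
            if (kept.size() == data.size()) {
                return params;                             // 3a. no outliers: stop
            }
            // A real implementation would guard against removing all points.
            data = kept;                                   // 5. goto 2
        }
    }
}
```

The outlier policy (here a fixed cutoff) stays entirely in user code, and
each pass is a standard, unmodified optimization.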

However, IIUC the examples from the link provided by Evan, "robust"
optimization (i.e. handling outliers during optimization) could lead
to a "better" solution.

Would it always be "better"? Not sure: significant points could be
mistaken as outliers and be discarded before the optimizer could
figure out the correct solution...
To avoid a "good but wrong" fit, I'm afraid that we'd have to
introduce several parameters (like "do not discard outliers before
iteration n", "do not discard more than m points", etc.)
The tuning of those parameters will probably not be obvious, and
they will surely complicate the code (e.g. input data like the
"target" and "weight" array won't be "final").

> I have already seen this typically in
> orbit determination so I would consider it a classical optional 
> feature.

What works in one domain might not in another.
Thus, the feature should not alter the "standardness", nor decrease
the robustness of the implementation. Caring for special cases
(for which the feature is useful) may be obtained by e.g. using the
standard algorithm as a basic block that is called repeatedly, as
hinted above (and tuning the standard parameters and input data
appropriately for each call).

>>
>> At first sight, I'd avoid modification of the sizes of input data
>> (option 2);
>> from an API usage viewpoint, I imagine that user code will require
>> additional
>> "length" tests.
>>
>> Couldn't the problem you mention in option 1 disappear by having 
>> different
>> methods that return the a-priori weights and the modified weights?
>
> As we are already much too stringent with our compatibility policy, I
> would allow the case were only *very* advanced users would have
> problems. So it would seem fair to me if we can do some changes were 
> the
> users of the factory are unaffected and expert users who decided the
> factory was not good for them have to put new efforts.

Before embarking on this, I would like to see examples where the
"inline" outlier rejection leads to a (correct) solution impossible to
achieve with the approach I've suggested.

> Anyway, I would be OK with both solutions, and also think rewriting
> Levenberg-Marquardt would be a good thing. It was translated from
> Minpack
> with only a change in the QR decomposition and a change in the
> convergence checker, so basically it is Fortran written with Java 
> syntax.

I just want to remind that quite some effort has been put already in 
the
modification of the original port (removal of many "work" arrays, 
"final"
data). It's not anymore just Fortran in Java. [But the core algorithm 
is
certainly not obvious, and I'm all for a clean reimplementation (to 
exist
in parallel with the current one, until everybody agrees that nothing 
is
lost).]


Best,
Gilles

>>>
>>> [1] http://markmail.org/message/e53nago3swvu3t52
>>>      https://issues.apache.org/jira/browse/MATH-1105
>>> [2] http://www.mathworks.com/help/curvefit/removing-outliers.html
>>>      
>>> http://www.mathworks.com/help/curvefit/least-squares-fitting.html





Re: [math] Least Squares Outlier Rejection

Posted by Luc Maisonobe <lu...@spaceroots.org>.
Le 12/09/2014 01:35, Gilles a écrit :
> Hello.
> 
> On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:
>> Hi,
>>
>> A while ago I had brought up the idea of adding residual editing (aka data
>> editing, outlier rejection, robust regression) to our non-linear least
>> squares implementations.[1] As the name suggests, the idea is to
>> de-weight
>> observations that don't match the user's model. There are several ways to
>> do this, including choosing a fixed cutoff, a fixed standard deviation
>> cutoff,
>> or reducing a residual's weight based on its magnitude.[2]
>>
>> However we add the data editing feature I think it will cause backward
>> incompatibilities with the released API. I've outlined below the two
>> options I see. I'm open to other ideas as well.
>>
>> 1. Replace edited residuals with 0's in the residual vector and Jacobian
>> (i.e. apply a 0 weight). This has the advantage of being simple to
>> implement and that our existing optimizers are already able to handle it.
>> The downside is evident when the user tries to obtain the number of
>> residuals that were edited. It is hard to tell the difference between an
>> edited residual, an a priori zero weight, or a model evaluation where the
>> residual and gradient are, in fact, zero. We can provide easy access to
>> the
>> number of edited residuals by adding a method to the Evaluation
>> interface.
>> (This is what I implemented in the patch in the original thread.) Now
>> that
>> the code has been released though, this would cause a backward
>> incompatibility for some advanced users. Most users will likely use the
>> included factory and builder methods to define their
>> LeastSquaresProblem(LSP) and these users would not be affected by the
>> change. Only the users that provide a custom implementation of
>> LSP.Evaluation would be affected.
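The zero-weighting in option 1 could be sketched roughly like this with
plain arrays (class and method names are illustrative assumptions, not
the released Commons Math API):

```java
import java.util.Arrays;

// Rough sketch of option 1: apply a zero weight to edited residuals in
// place. Names are illustrative, not the Commons Math API.
class ZeroWeightEditing {

    /** Zero the residual and Jacobian rows whose residual magnitude
     *  exceeds the cutoff; return how many rows were edited. */
    static int editResiduals(double[] residuals, double[][] jacobian,
                             double cutoff) {
        int edited = 0;
        for (int i = 0; i < residuals.length; i++) {
            if (Math.abs(residuals[i]) > cutoff) {
                residuals[i] = 0.0;               // zero-weighted residual
                Arrays.fill(jacobian[i], 0.0);    // matching Jacobian row
                edited++;
            }
        }
        // This count is what a new Evaluation method could expose, since
        // the zeros alone cannot be told apart from a priori zero weights
        // or genuinely zero residuals.
        return edited;
    }
}
```

The existing optimizers would see the zeroed rows as ordinary data, which
is why the count has to be carried separately.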
>>
>> 2. Remove edited residuals from the gradient and Jacobian, so that the
>> resulting vector and matrix have fewer rows. The advantage here is
>> that the
>> user can compare the length of the residual vector in the Optimum to the
>> number of observations in the LSP to determine the number of edited
>> residuals. The problem is that returning vectors/matrices with different
>> sizes from LSP.evaluate() would violate the contract. Additionally we
>> would
>> have to modify our existing optimizers to deal with the variable lengths.
>> For GaussNewton the modification would be small, but for
>> LevenbergMarquardt
>> I would likely have to re-write it since I don't understand the code (not
>> for lack of trying :P ). Users that implement LeastSquaresOptimizers
>> would
>> likely have to modify their code as well.
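Option 2 amounts to compacting the residual vector (and, likewise, the
Jacobian rows); a minimal sketch with plain arrays, again with
illustrative names rather than the released API:

```java
import java.util.Arrays;

// Rough sketch of option 2: drop edited rows entirely, so the returned
// residual vector shrinks. Names are illustrative, not Commons Math API.
class RowRemovalEditing {

    /** Return only the residuals whose magnitude is within the cutoff;
     *  the caller infers the edited count from the shorter length. */
    static double[] compactResiduals(double[] residuals, double cutoff) {
        return Arrays.stream(residuals)
                     .filter(r -> Math.abs(r) <= cutoff)
                     .toArray();
    }
}
```

The variable output length is exactly what would force changes in the
optimizers and in any code that assumes evaluate() preserves sizes.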
>>
>> To summarize, in both cases users that only use the provided [math]
>> classes
>> would not have to modify their code, while users that provide custom
>> implementations of [math] interfaces would have to modify their code.
>>
>> I would like to get this feature wrapped in for the next release. Please
>> let me know if you have a preference for either implementation and if
>> there
>> are any other issues I should consider.
> 
> Compatibility breaks cannot occur in minor releases.
> The next major release should not occur before deprecated classes are all
> replaced. [I'm thinking about the optimizers, for which the fluent API
> should
> be implemented based on your design of NLLS.]

I theoretically agree here, but we are in a sort of chicken-and-egg
problem. Typically, when I proposed to switch the ODE to the new fluent
API, we decided we should do it at a major release only. We also said we
cannot remove deprecations at minor releases. Now we say a major release
should wait for deprecations to be removed.

So as far as I understand, everything should occur at once at major
releases: remove deprecated classes, introduce new classes and
interfaces. But see the next remark.

> 
> It would be nice to recode the whole "LevenbergMarquardtOptimizer" in
> full OO
> Java. But it should be implemented and tested before any new feature is
> added
> to the mix.

Here comes another problem: we want to have time to stabilize new APIs,
which means we should have a few minor releases with trial APIs.
Unfortunately, once these trial APIs have been introduced, we are stuck
with them and we end up piling several APIs together (the optimizers are
the perfect example).

So when do we test new APIs? At minor releases just before a major
release? But then we keep all the trials until a big clean-up at the
major release?

> 
> Do I understand correctly that in the "robust" fit, the weights are
> modified
> during the optimization?
> If so, would the algorithms still be "standard"?

I think performing some modifications between iterations is still a
standard way to perform optimization: we can detect the outliers only
once we have computed residuals and some statistics like the standard
deviation, so this editing operation has to occur somewhere during the
optimization process. I have already seen this in orbit determination,
for example, so I would consider it a classical optional feature.
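A standard-deviation cutoff of the kind mentioned above might look
roughly like this between iterations (a sketch only; the k-sigma rule
and all names are assumptions, not an existing [math] API):

```java
// Sketch of detecting outliers between iterations from residual
// statistics: flag residuals more than k standard deviations from the
// mean. The k-sigma rule and names are illustrative assumptions.
class SigmaEditing {

    static boolean[] flagOutliers(double[] residuals, double k) {
        double mean = 0.0;
        for (double r : residuals) mean += r;
        mean /= residuals.length;

        double var = 0.0;
        for (double r : residuals) var += (r - mean) * (r - mean);
        double sigma = Math.sqrt(var / residuals.length);

        boolean[] out = new boolean[residuals.length];
        for (int i = 0; i < residuals.length; i++) {
            out[i] = Math.abs(residuals[i] - mean) > k * sigma;
        }
        return out;
    }
}
```

The flags would then feed whichever editing scheme is chosen (zero
weights or row removal) before the next iteration.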

> 
> At first sight, I'd avoid modification of the sizes of input data
> (option 2);
> from an API usage viewpoint, I imagine that user code will require
> additional
> "length" tests.
> 
> Couldn't the problem you mention in option 1 disappear by having different
> methods that return the a-priori weights and the modified weights?

As we are already much too stringent with our compatibility policy, I
would allow the case where only *very* advanced users would have
problems. So it would seem fair to me if we can make some changes where
users of the factory are unaffected and expert users who decided the
factory was not good for them have to put in new effort.

Anyway, I would be OK with both solutions, and I also think rewriting
Levenberg-Marquardt would be a good thing. It was translated from Minpack
with only a change in the QR decomposition and a change in the
convergence checker, so basically it is Fortran written with Java syntax.

best regards,
Luc

> 
> 
> Best regards,
> Gilles
> 
>>
>> Best Regards,
>> Evan
>>
>> [1] http://markmail.org/message/e53nago3swvu3t52
>>      https://issues.apache.org/jira/browse/MATH-1105
>> [2] http://www.mathworks.com/help/curvefit/removing-outliers.html
>>      http://www.mathworks.com/help/curvefit/least-squares-fitting.html
> 
> 
> 
> 
> 




Re: [math] Least Squares Outlier Rejection

Posted by Gilles <gi...@harfang.homelinux.org>.
Hello.

On Thu, 11 Sep 2014 14:29:49 -0400, Evan Ward wrote:
> Hi,
>
> A while ago I had brought up the idea of adding residual editing (aka
> data
> data
> editing, outlier rejection, robust regression) to our non-linear 
> least
> squares implementations.[1] As the name suggests, the idea is to 
> de-weight
> observations that don't match the user's model. There are several ways
> to do this, including choosing a fixed cutoff, a fixed standard
> deviation cutoff, or reducing a residual's weight based on its
> magnitude.[2]
>
> However we add the data editing feature, I think it will cause
> backward
> incompatibilities with the released API. I've outlined below the two
> options I see. I'm open to other ideas as well.
>
> 1. Replace edited residuals with 0's in the residual vector and 
> Jacobian
> (i.e. apply a 0 weight). This has the advantage of being simple to
> implement and that our existing optimizers are already able to handle 
> it.
> The downside is evident when the user tries to obtain the number of
> residuals that were edited. It is hard to tell the difference between 
> an
> edited residual, an a priori zero weight, or a model evaluation where
> the residual and gradient are, in fact, zero. We can provide easy access
> to the
> number of edited residuals by adding a method to the Evaluation 
> interface.
> (This is what I implemented in the patch in the original thread.) Now 
> that
> the code has been released though, this would cause a backward
> incompatibility for some advanced users. Most users will likely use 
> the
> included factory and builder methods to define their
> LeastSquaresProblem(LSP) and these users would not be affected by the
> change. Only the users that provide a custom implementation of
> LSP.Evaluation would be affected.
>
> 2. Remove edited residuals from the gradient and Jacobian, so that 
> the
> resulting vector and matrix have fewer rows. The advantage here is 
> that the
> user can compare the length of the residual vector in the Optimum to 
> the
> number of observations in the LSP to determine the number of edited
> residuals. The problem is that returning vectors/matrices with 
> different
> sizes from LSP.evaluate() would violate the contract. Additionally we 
> would
> have to modify our existing optimizers to deal with the variable 
> lengths.
> For GaussNewton the modification would be small, but for 
> LevenbergMarquardt
> I would likely have to re-write it since I don't understand the code 
> (not
> for lack of trying :P ). Users that implement LeastSquaresOptimizers 
> would
> likely have to modify their code as well.
>
> To summarize, in both cases users that only use the provided [math] 
> classes
> would not have to modify their code, while users that provide custom
> implementations of [math] interfaces would have to modify their code.
>
> I would like to get this feature wrapped in for the next release. 
> Please
> let me know if you have a preference for either implementation and if 
> there
> are any other issues I should consider.

Compatibility breaks cannot occur in minor releases.
The next major release should not occur before deprecated classes are
all replaced. [I'm thinking about the optimizers, for which the fluent
API should be implemented based on your design of NLLS.]

It would be nice to recode the whole "LevenbergMarquardtOptimizer" in
full OO Java. But it should be implemented and tested before any new
feature is added to the mix.

Do I understand correctly that in the "robust" fit, the weights are
modified during the optimization?
If so, would the algorithms still be "standard"?

At first sight, I'd avoid modification of the sizes of input data
(option 2); from an API usage viewpoint, I imagine that user code will
require additional "length" tests.

Couldn't the problem you mention in option 1 disappear by having
different methods that return the a-priori weights and the modified
weights?
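One way to read this suggestion, sketched as a hypothetical extension of
the Evaluation interface (names and signatures are assumptions, not the
released API): exposing both weight sets lets a caller tell an edited
residual (modified weight zero, a priori weight non-zero) apart from an
a priori zero weight.

```java
// Hypothetical sketch of exposing both the a priori and the modified
// weights. Interface and method names are assumptions, not Commons Math.
interface WeightedEvaluation {

    double[] getAprioriWeights();    // weights the user supplied

    double[] getModifiedWeights();   // weights after outlier editing

    /** Edited residuals are exactly those whose weight was zeroed. */
    default int getEditedCount() {
        double[] apriori = getAprioriWeights();
        double[] modified = getModifiedWeights();
        int count = 0;
        for (int i = 0; i < apriori.length; i++) {
            if (modified[i] == 0.0 && apriori[i] != 0.0) {
                count++;
            }
        }
        return count;
    }
}
```

With a default method like this, the ambiguity of option 1 disappears
without forcing every custom Evaluation implementation to change.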


Best regards,
Gilles

>
> Best Regards,
> Evan
>
> [1] http://markmail.org/message/e53nago3swvu3t52
>     https://issues.apache.org/jira/browse/MATH-1105
> [2] http://www.mathworks.com/help/curvefit/removing-outliers.html
>     http://www.mathworks.com/help/curvefit/least-squares-fitting.html

