You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Joni Salonen <jo...@gmail.com> on 2006/05/08 11:53:40 UTC

Re: [math] Q-R -decomposition

Sorry about the late reply; I was practically all of April on holidays
and now I find myself occupied by final exams.

> +1 - see below.  The only real question here is do we need a
> subpackage for matrix decompositions.  Since I think it is unlikely
> that we will have more than a handful of these, I am OK putting these
> into the top level, i.e. in .linear.
>
That seems fine to  me.

> > I also had a look at Jama yesterday. There they defer the explicit
> > generation of the Q part of the decomposition until the user calls
> > getQ(), which I guess has a computational advantage over calculating
> > the whole decomp if the user of the API only needs R. This of course
> > implies that the algorithm has a state and it's most natural to
> > implement it as a class of its own.
>
> Again, I think this should be a separate (immutable) class with state,
> with the decomp done in the constructor, which should take a
> RealMatrix (not impl) as argument (using getData to copy if argument
> is not a RealMatrixImp).  I am not sure I understand what you mean
> about the Q and R accessors in Jama.  It looks to me like they are
> just doing transformations to provide Q and R separately.  I think it
> makes sense to provide those accessors (as we should in the LU case
> when we externalize that).
>
The algorithm used there produces the matrix R and an array of
Householder vectors. When the getQ() is called, the Householder
vectors are made into matrices that are multiplied together to yield
the Q matrix. This seems to be the best way to go about things.

> >
> > From the release plan I read that the QR-decomposition will be needed
> > for linear regression. Does that mean that it will be used mainly for
> > least-squares fitting? In that case both Q and R are needed most of
> > the time, so having the algorithm in a separate class is not strictly
> > necessary..
>
> The immediate motivation is for solving the normal equations.  I don't
> think we should include the solve() method that Jama has in this
> class, though.  I think it is more natural to have that in the OLS
> implementation.
>
> Tests are a good start.
>
> Returning to the overall API design, I think it makes sense to follow
> the abstract factory pattern used elsewhere in [math] (e.g. the
> distributions package) to provide for pluggable decomp implementations
> with defaults provided.  So what we would end up with would be an
> abstract DecompositionFactory class with a concrete
> DecompositionFactoryImpl subclass providing default implementations.
> Interfaces for decompositions would be abstracted.  User code with
> look like this:
>
> QRDecomposition qr =
> DecompositionFactory.newInstance().createQRDecomposition(matrix);
>
> where QRDecomposition is the interface and
> DecompositionFactory.newInstance() returns a DecompositionFactoryImpl
> and createQRDecomposition(matrix) invokes the constructor for
> QRDecompositionImpl, which is the default implementation.  This setup
> is used in the distributions and analysis packages to provide
> pluggable implementations.
>
Seems good to me. What do you think is better as a method name,
"createQRDecomposition" or "newQRDecomposition"? Both styles seem to
be in use.

I suppose we won't have a base interface for matrix decompositions?

> To get started, we can just define QRDecomposition,
> QRDecompositionImpl.  If there are no objections / better ideas, we
> can then add the factory impls and do the same for LU decomp (and
> Cholesky, which I think we may also have laying around somewhere).
>
Alright, I'm on it.

Joni

Re: [math] Q-R -decomposition

Posted by Luc Maisonobe <Lu...@free.fr>.
>> > Beyond what is available in the API (Q and R), what exactly does the
>> > QR Decomp "know" that makes solving easier?

foreword :
what is stated below is oriented only for least squares problems, it is 
not a general discussion about decomposition algorithms.

When using QR decomposition in least squares problems, you NEVER really 
compute Q explicitely, and if you don't retrieve Q, you don't retrieve R 
either. The decomposition algorithm keeps some information about Q (the 
Householder vectors, but also some coefficients and permutation indices 
if you want to support rank-deficient problems) and you use this 
information to compute transpose(Q).Q.V for some vector V without 
computing Q itself, and it uses R internally also to provide some higher 
level result, not Q and R to let you compute something with them. Q is a 
huge matrix, much larger than the Householder vectors it can be deduced 
from. This is especially true when the problem as a few parameters but a 
lot of measurements (in orbit determination problems, for example, you 
often have less than 10 parameters in the model and several tens of 
thousands of measurements).

What makes least squares solving easier is not the QR decomposition 
itself, but the way it is used in the surrounding algorithm 
(Levenberg-Marquardt for example). In this case, you do NOT compute the 
normal equations (i.e. transpose(J).J where J is the jacobian matrix) 
and decompose the resulting square matrix like you would do in a 
Gauss-Newton solver. You decompose the jacobian matrix itself (this is 
the reason for the transpose(Q).Q.V part of the solver). Both 
decomposition are therefore not used the same way.

The QR way is more accurate because there are situations where the 
squaring of the jacobian matrix involved in the normal equations 
computation leads to cancellation of some tiny values (epsilon -> 
epsilon^2). For difficult problems, this is really important.

On the other hand, using LU has other benefits. First, you may build the 
normal equations iteratively (i.e. build Jt.J by updating a matrix as 
measurements are considered one after the other) and second, the matrix 
size is small (mXm where m is the number of parameters of the model), 
which is smaller than the nXm matrix appearing in the QR decomposition 
(but beware, nobody really computes the nXn Q matrix). QR decomposition 
involves about twice the number of operations the LU decomposition.

So the choice is size versus accuracy for simple problems, and you may 
only choose accuracy for difficult problems, as the other way 
alternative simply fails. For optimization problems, the 
Levenberg-Marquardt algorithm (which uses a QR decomposition as one of 
the many parts of the algorithm) is the method of choice you will find 
in many applications. It works really well and few people bother 
studying really what is the better alternative.

In any case, for least squares problems, the decomposition used is an 
implementation choice and the user doesn't really need to see the 
intermediate steps (building J or not, decomposing Jt.J using LU or J 
using QR, applying residuals, updating the parameters). He chooses one 
method or the other and get the final result.

Luc


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [math] Q-R -decomposition

Posted by Bani <bo...@gmail.com>.
>
> > Beyond what is available in the API (Q and R), what exactly does the
> > QR Decomp "know" that makes solving easier?
>
> It has Q stored as a list of Householder vectors. Multiplying a matrix
> with Q or Q' that is stored in this format is not much more expensive
> than a normal matrix multiplication, but it's a lot more efficient
> when compared to first calculating an explicit form of Q and then
> multiplying. It might be better numerically too, but I cannot be sure.


Yes, QR is better, although it is more expensive to solve from scratch,
because of the ortogonality of the Q matrix.
It is particulary good for not very well conditioned systems. Sometimes you
can't even solve with LU, but you can with QR.


Vanessa

Re: [math] Q-R -decomposition

Posted by Joni Salonen <jo...@gmail.com>.
> I am reviewing the implementation code and the Householder reflections
> algorithm to figure out why this is the case.  The definitions that I
> have seen (including the ones that you reference in the javadoc) use
> the second approach (R is square).  Generally, m is assumed to be
> greater than or equal to n and I think the matrix has to be full rank
> for the decomp to be unique.  I am not an expert in this area, so will
> defer to others if there are good arguments for using the definition
> that your implementation uses.  The important thing is to document
> exactly what the code does, both in the interface javadoc and the
> impl.  It would be awkward in this case to have the interface
> under-specified regarding the dimensions of the factors, so this
> should probably be settled independently of the algorithm.
>
About the dimensions, it depends on what your requirements are. If you
require that the columns of Q form an orthogonal basis of R^m, Q has
to be m x m. If that condition is relaxed and m >= n, the last rows of
R will be zero. This means that the last columns of Q don't contribute
to the product QR. In this case Q can be m x n. The latter form is
called "the economy-size decomposition" in MATLAB documentation:
http://www.mathworks.com/access/helpdesk/help/techdoc/ref/qr.html
The two forms are mentioned also here:
http://www.cs.utexas.edu/~plapack/icpp98/node4.html
JAMA uses the economy-size decomposition.

(After sending my last email I realised that if m <= n, Q can't be m x
n already for the fact that at most m vectors can be linearly
independent--let alone orthogonal--in R^m.)

IIRC, for the factorization to be unique, the matrix has to have full
rank and the diagonal elements of R have to be positive, but a
factorization exists and can be computed with the Householder
algorithm even if the matrix is singular.

Another thing I'm not sure of is the naming; I've named the classes
without regard to the fact that they work only with real matrices. If
there is to be support for complex and arbitrary precision matrices,
perhaps a rename would be due?

> > It seems that many decompositions are used for solving linear systems.
> > A decomposition object "knows" what the system is like and has access
> > to the raw factorized data that can be packed in some way, so it would
> > make some sense to ask it solve systems, too. Plus: users would be
> > able to switch between different implementations, like with the
> > Collections API.
>
> Beyond what is available in the API (Q and R), what exactly does the
> QR Decomp "know" that makes solving easier?

It has Q stored as a list of Householder vectors. Multiplying a matrix
with Q or Q' that is stored in this format is not much more expensive
than a normal matrix multiplication, but it's a lot more efficient
when compared to first calculating an explicit form of Q and then
multiplying. It might be better numerically too, but I cannot be sure.

Re: [math] Q-R -decomposition

Posted by Phil Steitz <ph...@gmail.com>.
On 5/20/06, Joni Salonen <jo...@gmail.com> wrote:
> > > The algorithm used there produces the matrix R and an array of
> > > Householder vectors. When the getQ() is called, the Householder
> > > vectors are made into matrices that are multiplied together to yield
> > > the Q matrix. This seems to be the best way to go about things.
> > >
> > That seems fine to me, in terms of the state maintained in the decomp
> > class and the API as well - i.e., provide the accessors for Q and R,
> > but maintain state efficiently.
> >
> Done; this is what QRDecompositionImpl does now. Singular and
> rectangular cases are covered. The tests are more extensive, too.
> http://issues.apache.org/jira/browse/MATH-148

Thanks! I will commit the later versions to give us something to start with.
>
> There is one thing I'm not sure about: matrix dimensions. Some sources
> define the QR-factorization of an m x n matrix so that Q is m x m
> (square) and R is m x n, others say that Q is m x n and R is n x n
> (square). The current implementation does the former. Of course it's
> also possible to define Q as m x r and R as r x n, where r is min{m,
> n} or the rank of the original matrix. Do you have any insights on
> what should be done?

I am reviewing the implementation code and the Householder reflections
algorithm to figure out why this is the case.  The definitions that I
have seen (including the ones that you reference in the javadoc) use
the second approach (R is square).  Generally, m is assumed to be
greater than or equal to n and I think the matrix has to be full rank
for the decomp to be unique.  I am not an expert in this area, so will
defer to others if there are good arguments for using the definition
that your implementation uses.  The important thing is to document
exactly what the code does, both in the interface javadoc and the
impl.  It would be awkward in this case to have the interface
under-specified regarding the dimensions of the factors, so this
should probably be settled independently of the algorithm.
>
> > > I suppose we won't have a base interface for matrix decompositions?
> >
> > We can talk about that, but if we go with the design above, there is
> > really no place for it, unless I am missing something.  Now is a good
> > time to discuss these things, though, as once we define the API, we
> > (and our users) are going to have to live with it for some time.
> > Other than a "decompose" method (which the design above does not
> > include), its hard for me to see what a base decomposition interface
> > would include. The accessors are all different depending on the
> > decomp.  Could be I am missing something, though, so if you have ideas
> > about how to better structure this, please speak up.
> >
> It seems that many decompositions are used for solving linear systems.
> A decomposition object "knows" what the system is like and has access
> to the raw factorized data that can be packed in some way, so it would
> make some sense to ask it solve systems, too. Plus: users would be
> able to switch between different implementations, like with the
> Collections API.

Beyond what is available in the API (Q and R), what exactly does the
QR Decomp "know" that makes solving easier?
>
> Then again, solving systems doesn't seem like something inherent to
> the idea of a matrix factorization. Could it be a better idea to have
> an interface like "LinearSystemSolver" which then can be implemented
> by decomposition classes?

Probably a good idea, even if we don't have the decomp classes
implement it. I guess there is nothing wrong with the decomp
implementation classes implementing the linear solver interface.  Need
to think about this some more from both usability and extensibility
standpoint, but I think the interface makes sense in any case.

Thanks again for contributing this stuff.

Phil

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [math] Q-R -decomposition

Posted by Joni Salonen <jo...@gmail.com>.
> > The algorithm used there produces the matrix R and an array of
> > Householder vectors. When the getQ() is called, the Householder
> > vectors are made into matrices that are multiplied together to yield
> > the Q matrix. This seems to be the best way to go about things.
> >
> That seems fine to me, in terms of the state maintained in the decomp
> class and the API as well - i.e., provide the accessors for Q and R,
> but maintain state efficiently.
>
Done; this is what QRDecompositionImpl does now. Singular and
rectangular cases are covered. The tests are more extensive, too.
http://issues.apache.org/jira/browse/MATH-148

There is one thing I'm not sure about: matrix dimensions. Some sources
define the QR-factorization of an m x n matrix so that Q is m x m
(square) and R is m x n, others say that Q is m x n and R is n x n
(square). The current implementation does the former. Of course it's
also possible to define Q as m x r and R as r x n, where r is min{m,
n} or the rank of the original matrix. Do you have any insights on
what should be done?

> > I suppose we won't have a base interface for matrix decompositions?
>
> We can talk about that, but if we go with the design above, there is
> really no place for it, unless I am missing something.  Now is a good
> time to discuss these things, though, as once we define the API, we
> (and our users) are going to have to live with it for some time.
> Other than a "decompose" method (which the design above does not
> include), its hard for me to see what a base decomposition interface
> would include. The accessors are all different depending on the
> decomp.  Could be I am missing something, though, so if you have ideas
> about how to better structure this, please speak up.
>
It seems that many decompositions are used for solving linear systems.
A decomposition object "knows" what the system is like and has access
to the raw factorized data that can be packed in some way, so it would
make some sense to ask it solve systems, too. Plus: users would be
able to switch between different implementations, like with the
Collections API.

Then again, solving systems doesn't seem like something inherent to
the idea of a matrix factorization. Could it be a better idea to have
an interface like "LinearSystemSolver" which then can be implemented
by decomposition classes?

Re: [math] Q-R -decomposition

Posted by Phil Steitz <ph...@gmail.com>.
On 5/8/06, Joni Salonen <jo...@gmail.com> wrote:
> Sorry about the late reply; I was practically all of April on holidays
> and now I find myself occupied by final exams.

No worries.  We are all busy - glad you are still interested in working on this.
>
> > +1 - see below.  The only real question here is do we need a
> > subpackage for matrix decompositions.  Since I think it is unlikely
> > that we will have more than a handful of these, I am OK putting these
> > into the top level, i.e. in .linear.
> >
> That seems fine to  me.

OK.  Lets just keep it in .linear for now.
>
> > > I also had a look at Jama yesterday. There they defer the explicit
> > > generation of the Q part of the decomposition until the user calls
> > > getQ(), which I guess has a computational advantage over calculating
> > > the whole decomp if the user of the API only needs R. This of course
> > > implies that the algorithm has a state and it's most natural to
> > > implement it as a class of its own.
> >
> > Again, I think this should be a separate (immutable) class with state,
> > with the decomp done in the constructor, which should take a
> > RealMatrix (not impl) as argument (using getData to copy if argument
> > is not a RealMatrixImp).  I am not sure I understand what you mean
> > about the Q and R accessors in Jama.  It looks to me like they are
> > just doing transformations to provide Q and R separately.  I think it
> > makes sense to provide those accessors (as we should in the LU case
> > when we externalize that).
> >
> The algorithm used there produces the matrix R and an array of
> Householder vectors. When the getQ() is called, the Householder
> vectors are made into matrices that are multiplied together to yield
> the Q matrix. This seems to be the best way to go about things.
>
That seems fine to me, in terms of the state maintained in the decomp
class and the API as well - i.e., provide the accessors for Q and R,
but maintain state efficiently.
> > >
> > > From the release plan I read that the QR-decomposition will be needed
> > > for linear regression. Does that mean that it will be used mainly for
> > > least-squares fitting? In that case both Q and R are needed most of
> > > the time, so having the algorithm in a separate class is not strictly
> > > necessary..
> >
> > The immediate motivation is for solving the normal equations.  I don't
> > think we should include the solve() method that Jama has in this
> > class, though.  I think it is more natural to have that in the OLS
> > implementation.
> >
> > Tests are a good start.
> >
> > Returning to the overall API design, I think it makes sense to follow
> > the abstract factory pattern used elsewhere in [math] (e.g. the
> > distributions package) to provide for pluggable decomp implementations
> > with defaults provided.  So what we would end up with would be an
> > abstract DecompositionFactory class with a concrete
> > DecompositionFactoryImpl subclass providing default implementations.
> > Interfaces for decompositions would be abstracted.  User code with
> > look like this:
> >
> > QRDecomposition qr =
> > DecompositionFactory.newInstance().createQRDecomposition(matrix);
> >
> > where QRDecomposition is the interface and
> > DecompositionFactory.newInstance() returns a DecompositionFactoryImpl
> > and createQRDecomposition(matrix) invokes the constructor for
> > QRDecompositionImpl, which is the default implementation.  This setup
> > is used in the distributions and analysis packages to provide
> > pluggable implementations.
> >
> Seems good to me. What do you think is better as a method name,
> "createQRDecomposition" or "newQRDecomposition"? Both styles seem to
> be in use.

Yes, we have not been completely consistent on this in [math].  We
used newXxx in the analysis package for solvers, but pretty much
everywhere else use createXxx.  I think it is better to continue with
createXxx.
>
> I suppose we won't have a base interface for matrix decompositions?

We can talk about that, but if we go with the design above, there is
really no place for it, unless I am missing something.  Now is a good
time to discuss these things, though, as once we define the API, we
(and our users) are going to have to live with it for some time.
Other than a "decompose" method (which the design above does not
include), its hard for me to see what a base decomposition interface
would include. The accessors are all different depending on the
decomp.  Could be I am missing something, though, so if you have ideas
about how to better structure this, please speak up.
>
> > To get started, we can just define QRDecomposition,
> > QRDecompositionImpl.  If there are no objections / better ideas, we
> > can then add the factory impls and do the same for LU decomp (and
> > Cholesky, which I think we may also have laying around somewhere).
> >
> Alright, I'm on it.

Thanks!

Phil

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org