Posted to dev@commons.apache.org by Phil Steitz <ph...@steitz.com> on 2004/10/12 13:11:58 UTC

[math] RealMatrix Immutability was: Re: [math] cvs commit: jakarta-commons/math/src/test/org/apache/commons/math/linear RealMatrixImplTest.java

Mark R. Diggory wrote:
> Phil,
> 
> I think we wanted to keep the setEntry/getDataRef API in RealMatrixImpl
> without having it in the RealMatrix interface, at least until we come up
> with a strategy for mutability that makes more sense than these methods.
> This last change removed it from both.

getDataRef is still there in RealMatrixImpl, though I am starting to think 
that we can make it protected.  Either the class is immutable or it is 
not.  We need to decide. All of the use cases that I have can actually be 
accomplished with the immutable version, using the algebraic operations 
exposed in the RealMatrix API.  If others have use cases that require 
mutability, then we can change it back, but I would like to know what they 
are.
> 
> Michael,
> 
> We are attempting to make the implementation immutable so that method
> calls such as getColumnMatrix(i) and getSubMatrix(xmin,xmax,ymin,ymax)
> can return submatrix objects backed by the existing data, without
> duplicating/copying the internal datastore. This gives an efficient way
> to access the stored values without costly array copies of what may be
> very large double[][]'s, or copying objects which may not be in an [].
> So basically, what we are trying to avoid is this: if I do the following
> 
> Matrix a = ...
> Matrix b = a.getColumnMatrix(x);
> 
> and (a) is mutable, then doing something like a.setEntry(x,y,d) will
> also cause (b) to change, which we should try to avoid. We are trying
> to work with these more as mathematical objects and not necessarily as
> "Collection objects".

Yes, but the getSubXxx and getCol, getRow currently make copies. Assuming 
we retain immutability, this makes no functional difference (another 
argument for making the class immutable).  Implementation will be very 
tricky (and likely very inefficient) if we try to support "data sharing" 
as you describe. It would also make getDataRef impossible to implement.
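
A minimal sketch of the aliasing concern discussed above. The classes 
MutableMatrix and ColumnView are hypothetical, written only for this 
illustration; they are not the commons-math API. The sketch assumes the 
parent stores its array by reference and hands out a column "view" that 
reads from the same array:

```java
// Hypothetical classes for illustration only; not the commons-math API.
class MutableMatrix {
    private final double[][] data;

    MutableMatrix(double[][] data) {
        this.data = data; // stored by reference, no defensive copy
    }

    void setEntry(int row, int col, double value) {
        data[row][col] = value;
    }

    // Returns a column "view" that reads straight from the parent's array.
    ColumnView getColumnView(int col) {
        return new ColumnView(data, col);
    }

    // Returns an independent copy of the column.
    double[] getColumnCopy(int col) {
        double[] out = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[i][col];
        }
        return out;
    }
}

class ColumnView {
    private final double[][] parent;
    private final int col;

    ColumnView(double[][] parent, int col) {
        this.parent = parent;
        this.col = col;
    }

    double getEntry(int row) {
        return parent[row][col];
    }
}

class AliasingDemo {
    public static void main(String[] args) {
        MutableMatrix a = new MutableMatrix(new double[][] {{1, 2}, {3, 4}});
        ColumnView view = a.getColumnView(1);
        double[] copy = a.getColumnCopy(1);

        a.setEntry(0, 1, 99.0);

        System.out.println(view.getEntry(0)); // the view sees the change
        System.out.println(copy[0]);          // the copy does not
    }
}
```

If setEntry did not exist, the view and the copy would be 
indistinguishable to callers, which is the "no functional difference" 
point.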

> 
> If you require such mutability, could you take a moment and show an 
> example of the usage you require it for?

I thought at first that I would need mutability in my applications, and I 
think that Kim did as well; but I found that my use of it was just because 
I did not have the right boundary between the double[][] stuff that I was 
doing and the algebraic operations (which is what RealMatrix is for).


>> I also wanted to mention that I feel that this interface and its
>> implementation seem too closely intertwined with double arrays, as if
>> clients of the API are going to be moving back and forth between the
>> two data representations regularly.

Well, many clients will in fact be doing that, though hopefully with clean 
boundaries.  Can you suggest alternatives that do not impose too much 
overhead?
>>
>> With the Collections interfaces, if I want a List, I'm going to use
>> one, not go back and forth between Lists and arrays.  Similarly, if I 
>> want
>> to use a RealMatrix, I'd like to use one, not go back and forth 
>> between it
>> and 1D and 2D double arrays.

That's sort of the point of the immutability argument above.  The double[] 
and double[][] valued accessors are for convenience and speed when 
crossing the boundary back into the client application.  Some applications 
will start with arrays, use RealMatrix to perform algebraic operations and 
then use "output" arrays directly.  That is why the accessors are there.
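
A rough sketch of that "arrays in, algebra, arrays out" boundary. 
ImmutableMatrix is a hypothetical class written for this illustration 
(it is not RealMatrixImpl), and it assumes non-empty rectangular input:

```java
// Hypothetical immutable matrix, for illustration only.
final class ImmutableMatrix {
    private final double[][] data;

    ImmutableMatrix(double[][] input) {
        this.data = deepCopy(input); // defensive copy on the way in
    }

    // Algebraic operation: returns a new matrix, never modifies this one.
    ImmutableMatrix multiply(ImmutableMatrix other) {
        int rows = data.length;
        int inner = other.data.length;
        int cols = other.data[0].length;
        if (data[0].length != inner) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[][] out = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double sum = 0.0;
                for (int k = 0; k < inner; k++) {
                    sum += data[i][k] * other.data[k][j];
                }
                out[i][j] = sum;
            }
        }
        return new ImmutableMatrix(out);
    }

    // Array-valued accessor: a copy, so callers cannot reach the internals.
    double[][] getData() {
        return deepCopy(data);
    }

    private static double[][] deepCopy(double[][] src) {
        double[][] out = new double[src.length][];
        for (int i = 0; i < src.length; i++) {
            out[i] = src[i].clone();
        }
        return out;
    }
}
```

The client keeps working with plain double[][] on either side; only the 
algebra in the middle goes through the matrix type.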

Thanks for the feedback.

Phil



---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [math] RealMatrix Immutability was: Re: [math] cvs commit: jakarta-commons/math/src/test/org/apache/commons/math/linear RealMatrixImplTest.java

Posted by Phil Steitz <ph...@steitz.com>.
Mark R. Diggory wrote:
> 
> 
> Phil Steitz wrote:
> 
>> Mark R. Diggory wrote:
>>
>>> Phil,
>>>
>>> I think we wanted to maintain the existence of setEntry/getDataRef 
>>> API of the RealMatrixImpl without having it in the RealMatrix 
>>> Interface. At least until we come up with a strategy for mutability 
>>> that made more sense then these methods. This last change removed it 
>>> from both.
>>
>>
>>
>> getDataRef is still there in RealMatrixImpl, though I am starting to 
>> think that we can make it protected.  Either the class is immutable or 
>> it is not.  We need to decide. All of the use cases that I have can 
>> actually be accomplished with the immutable version, using the 
>> algebraic operations exposed in the RealMatrix API.  If others have 
>> use cases that require mutability, then we can change it back, but I 
>> would like to know what they are.
>>
> 
> That's what I'm hoping to hear from Michael; I want to see solid examples 
> of the need.
> 
>>>
>>> Michael,
>>>
>>> We are attempting to make the Implementation immutable so that 
>>> methods calls such as getColumnMatrix(i) and 
>>> getSubMatrix(xmin,xmax,ymin,ymax) will be able to return submatrice 
>>> objects of the existing data without duplicating/copying the internal 
>>> datastore to do so, this will provide efficient means to access the 
>>> stored values without performing costly array copies on what may be 
>>> very large double[][]'s or copying objects which may not be in an []. 
>>> So basically, what we are trying to avoid is that if I do the following
>>>
>>> Matrix a = ...
>>> Matrix b = a.getColumnMatrix(x);
>>>
>>> that if (a) is mutable, then doing something like a.setEntry(x,y,d) 
>>> will also cause (b) to change, something we should try to avoid, we 
>>> are trying to work with these more as mathematical objects and not 
>>> necessarily as "Collection objects".
>>
>>
>>
>> Yes, but the getSubXxx and getCol, getRow currently make copies. 
>> Assuming we retain immutability, this makes no functional difference 
>> (another argument for making the class immutable).  
> 
> 
> Yes, like I said, this is a feature I am working on in my checkout. I 
> haven't committed it because it is not complete yet.
> 
>> Implementation will be very tricky (and likely very inefficient) if we 
>> try to support "data sharing" as you describe. It would also make 
>> getDataRef impossible to implement.
> 
> 
> Not inefficient. A little complex, yes, but not inefficient. The only 
> real difference is "where in" the original datastore (in this case the 
> data[][]) you are iterating. The RealMatrix representing the row, column 
> or submatrix simply maintains extra information about the xmin, xmax, 
> ymin, ymax it represents in the original data store. It accesses the 
> original data store using getEntry(x,y); the x and y are mapped to the 
> appropriate location in the datastore using simple arithmetic.

That will work OK for the row and column accessors and for ranges of 
contiguous cells, but not for the general case currently enabled via the 
subMatrix API.  These methods are very flexible and allow you to permute 
rows / columns as well as select ranges.  Keeping track of all of that 
would be intractable.  So, if we do go down this path, the "advanced" 
subMatrix methods will still have to make copies.
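
To make the distinction concrete, here is a sketch of the copying form 
of a general selection. The class and method names are hypothetical, and 
the index-array signature is an assumption made for this illustration. 
Because rows and columns may be listed in any order, and may repeat, a 
single (offset, length) pair cannot describe the result:

```java
// Hypothetical helper, for illustration only.
class SubMatrixSketch {
    // Copies the selected rows and columns, in the order given, into a
    // new array. Indices may be permuted or repeated.
    static double[][] select(double[][] data, int[] rows, int[] cols) {
        double[][] out = new double[rows.length][cols.length];
        for (int i = 0; i < rows.length; i++) {
            for (int j = 0; j < cols.length; j++) {
                out[i][j] = data[rows[i]][cols[j]];
            }
        }
        return out;
    }
}
```

For example, selecting rows {2, 0} and columns {1, 1, 0} yields a 2x3 
result whose rows are reversed and whose second column is duplicated, 
something a contiguous-range view cannot express.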

Assuming we agree on immutability, the semantics are no different, so I 
would prefer to focus on getting the 1.0 changes complete, including the 
changes to reduce copy operations internal to methods. Remember that all 
the changes already made still have to be ported to the BigMatrix impl. 
Changing and testing all the methods to work with offsets will be time 
consuming and error prone, so I think it is better to hold off on this 
until post 1.0.  IIUC, this would not require any change to the public 
API, so it can be postponed.

Phil


> I think we're starting to get this thing going in the right direction.
> 
> -Mark
> 




Re: [math] RealMatrix Immutability

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.

Michael Heuer wrote:

> 
> This is similar to how colt matrix views (viewRow, viewColumn,
> viewSelection, etc.) are implemented (disclaimer:  I use colt all the
> time).
> 
>    michael

Yes, a good reason why we are looking at its implementation to enhance 
the Matrix API. :-)

-Mark
-- 
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu



Re: [math] RealMatrix Immutability

Posted by Michael Heuer <he...@acm.org>.
On Tue, 12 Oct 2004, Mark R. Diggory wrote:

> Phil Steitz wrote:
> > Mark R. Diggory wrote:
> >
> >> Phil,
> >>
> >> I think we wanted to maintain the existence of setEntry/getDataRef API
> >> of the RealMatrixImpl without having it in the RealMatrix Interface.
> >> At least until we come up with a strategy for mutability that made
> >> more sense then these methods. This last change removed it from both.
> >
> >
> > getDataRef is still there in RealMatrixImpl, though I am starting to
> > think that we can make it protected.  Either the class is immutable or
> > it is not.  We need to decide. All of the use cases that I have can
> > actually be accomplished with the immutable version, using the algebraic
> > operations exposed in the RealMatrix API.  If others have use cases that
> > require mutability, then we can change it back, but I would like to know
> > what they are.
> >
>
> That's what I'm hoping to hear from Michael; I want to see solid examples
> of the need.
>
> >>
> >> Michael,
> >>
> >> We are attempting to make the Implementation immutable so that methods
> >> calls such as getColumnMatrix(i) and getSubMatrix(xmin,xmax,ymin,ymax)
> >> will be able to return submatrice objects of the existing data without
> >> duplicating/copying the internal datastore to do so, this will provide
> >> efficient means to access the stored values without performing costly
> >> array copies on what may be very large double[][]'s or copying objects
> >> which may not be in an []. So basically, what we are trying to avoid
> >> is that if I do the following
> >>
> >> Matrix a = ...
> >> Matrix b = a.getColumnMatrix(x);
> >>
> >> that if (a) is mutable, then doing something like a.setEntry(x,y,d)
> >> will also cause (b) to change, something we should try to avoid, we
> >> are trying to work with these more as mathematical objects and not
> >> necessarily as "Collection objects".
> >
> >
> > Yes, but the getSubXxx and getCol, getRow currently make copies.
> > Assuming we retain immutability, this makes no functional difference
> > (another argument for making the class immutable).
>
> Yes, like I said, this is a feature I am working on in my checkout. I
> haven't committed it because it is not complete yet.
>
> > Implementation will
> > be very tricky (and likely very inefficient) if we try to support "data
> > sharing" as you describe. It would also make getDataRef impossible to
> > implement.
>
> Not inefficient. A little complex, yes, but not inefficient. The only
> real difference is "where in" the original datastore (in this case the
> data[][]) you are iterating. The RealMatrix representing the row, column
> or submatrix simply maintains extra information about the xmin, xmax,
> ymin, ymax it represents in the original data store. It accesses the
> original data store using getEntry(x,y); the x and y are mapped to the
> appropriate location in the datastore using simple arithmetic.

This is similar to how colt matrix views (viewRow, viewColumn,
viewSelection, etc.) are implemented (disclaimer:  I use colt all the
time).

   michael


> I think we're starting to get this thing going in the right direction.
>
> -Mark
>
> --
> Mark Diggory
> Software Developer
> Harvard MIT Data Center
> http://www.hmdc.harvard.edu
>
>
>




Re: [math] RealMatrix Immutability was: Re: [math] cvs commit: jakarta-commons/math/src/test/org/apache/commons/math/linear RealMatrixImplTest.java

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.

Phil Steitz wrote:
> Mark R. Diggory wrote:
> 
>> Phil,
>>
>> I think we wanted to maintain the existence of setEntry/getDataRef API 
>> of the RealMatrixImpl without having it in the RealMatrix Interface. 
>> At least until we come up with a strategy for mutability that made 
>> more sense then these methods. This last change removed it from both.
> 
> 
> getDataRef is still there in RealMatrixImpl, though I am starting to 
> think that we can make it protected.  Either the class is immutable or 
> it is not.  We need to decide. All of the use cases that I have can 
> actually be accomplished with the immutable version, using the algebraic 
> operations exposed in the RealMatrix API.  If others have use cases that 
> require mutability, then we can change it back, but I would like to know 
> what they are.
> 

That's what I'm hoping to hear from Michael; I want to see solid examples 
of the need.

>>
>> Michael,
>>
>> We are attempting to make the Implementation immutable so that methods 
>> calls such as getColumnMatrix(i) and getSubMatrix(xmin,xmax,ymin,ymax) 
>> will be able to return submatrice objects of the existing data without 
>> duplicating/copying the internal datastore to do so, this will provide 
>> efficient means to access the stored values without performing costly 
>> array copies on what may be very large double[][]'s or copying objects 
>> which may not be in an []. So basically, what we are trying to avoid 
>> is that if I do the following
>>
>> Matrix a = ...
>> Matrix b = a.getColumnMatrix(x);
>>
>> that if (a) is mutable, then doing something like a.setEntry(x,y,d) 
>> will also cause (b) to change, something we should try to avoid, we 
>> are trying to work with these more as mathematical objects and not 
>> necessarily as "Collection objects".
> 
> 
> Yes, but the getSubXxx and getCol, getRow currently make copies. 
> Assuming we retain immutability, this makes no functional difference 
> (another argument for making the class immutable).  

Yes, like I said, this is a feature I am working on in my checkout. I 
haven't committed it because it is not complete yet.

> Implementation will 
> be very tricky (and likely very inefficient) if we try to support "data 
> sharing" as you describe. It would also make getDataRef impossible to 
> implement.

Not inefficient. A little complex, yes, but not inefficient. The only 
real difference is "where in" the original datastore (in this case the 
data[][]) you are iterating. The RealMatrix representing the row, column 
or submatrix simply maintains extra information about the xmin, xmax, 
ymin, ymax it represents in the original data store. It accesses the 
original data store using getEntry(x,y); the x and y are mapped to the 
appropriate location in the datastore using simple arithmetic.
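
A minimal sketch of the offset idea described above. OffsetView is a 
hypothetical name, not code from any checkout; it assumes inclusive 
bounds and a rectangular double[][] store that is shared, never copied:

```java
// Hypothetical view class, for illustration only.
class OffsetView {
    private final double[][] store; // shared with the parent, never copied
    private final int rowMin, rowMax, colMin, colMax; // inclusive bounds

    OffsetView(double[][] store, int rowMin, int rowMax, int colMin, int colMax) {
        if (rowMin < 0 || rowMax >= store.length || rowMin > rowMax
                || colMin < 0 || colMax >= store[0].length || colMin > colMax) {
            throw new IllegalArgumentException("bounds outside the data store");
        }
        this.store = store;
        this.rowMin = rowMin;
        this.rowMax = rowMax;
        this.colMin = colMin;
        this.colMax = colMax;
    }

    int getRowDimension() {
        return rowMax - rowMin + 1;
    }

    int getColumnDimension() {
        return colMax - colMin + 1;
    }

    // View coordinates are shifted by the stored offsets to locate the
    // entry in the original data store.
    double getEntry(int row, int col) {
        if (row < 0 || row >= getRowDimension()
                || col < 0 || col >= getColumnDimension()) {
            throw new IndexOutOfBoundsException("entry outside the view");
        }
        return store[rowMin + row][colMin + col];
    }
}
```

A column "matrix" is then just a view with colMin == colMax, and a row 
"matrix" one with rowMin == rowMax. This only stays safe if the 
underlying store is never mutated.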


I think we're starting to get this thing going in the right direction.

-Mark

-- 
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu



Re: [math] RealMatrix Immutability

Posted by Phil Steitz <ph...@steitz.com>.
Michael Heuer wrote:
> On Tue, 12 Oct 2004, Phil Steitz wrote:
> 
> 
>>Mark R. Diggory wrote:
>>
>>>Phil,
>>>
>>>I think we wanted to maintain the existence of setEntry/getDataRef API
>>>of the RealMatrixImpl without having it in the RealMatrix Interface. At
>>>least until we come up with a strategy for mutability that made more
>>>sense then these methods. This last change removed it from both.
>>
>>getDataRef is still there in RealMatrixImpl, though I am starting to think
>>that we can make it protected.  Either the class is immutable or it is
>>not.  We need to decide. All of the use cases that I have can actually be
>>accomplished with the immutable version, using the algebraic operations
>>exposed in the RealMatrix API.  If others have use cases that require
>>mutability, then we can change it back, but I would like to know what they
>>are.

Another point in favor of immutability: with a mutable class, the cached 
LU decomposition becomes invalid on mutation, so the luDecompose method 
has to be exposed.  With immutability, this method can be made protected 
or even private.
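
A sketch of the cache-invalidation problem. CachingMatrix is a 
hypothetical class, and it caches a determinant computed by Gaussian 
elimination as a simple stand-in for the cached LU decomposition; the 
bookkeeping issue is the same:

```java
// Hypothetical class, for illustration only. The determinant stands in
// for the cached LU decomposition.
class CachingMatrix {
    private final double[][] data;
    private Double cachedDet; // null means "not computed yet or invalidated"
    int computeCount = 0;     // how many times the expensive step ran

    CachingMatrix(double[][] input) {
        data = new double[input.length][];
        for (int i = 0; i < input.length; i++) {
            data[i] = input[i].clone();
        }
    }

    void setEntry(int row, int col, double value) {
        data[row][col] = value;
        cachedDet = null; // without this line, callers would get a stale result
    }

    double getDeterminant() {
        if (cachedDet == null) {
            cachedDet = computeDeterminant();
            computeCount++;
        }
        return cachedDet;
    }

    // Gaussian elimination with partial pivoting on a scratch copy.
    private double computeDeterminant() {
        int n = data.length;
        double[][] a = new double[n][];
        for (int i = 0; i < n; i++) {
            a[i] = data[i].clone();
        }
        double det = 1.0;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            if (a[pivot][col] == 0.0) {
                return 0.0; // singular
            }
            if (pivot != col) {
                double[] tmp = a[pivot];
                a[pivot] = a[col];
                a[col] = tmp;
                det = -det; // a row swap flips the sign
            }
            det *= a[col][col];
            for (int r = col + 1; r < n; r++) {
                double factor = a[r][col] / a[col][col];
                for (int c = col; c < n; c++) {
                    a[r][c] -= factor * a[col][c];
                }
            }
        }
        return det;
    }
}
```

With no setEntry at all, the invalidation line and the null check on 
every mutation path go away, and the cached value can stay private.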

>>
>>>Michael,
>>>
>>>We are attempting to make the Implementation immutable so that methods
>>>calls such as getColumnMatrix(i) and getSubMatrix(xmin,xmax,ymin,ymax)
>>>will be able to return submatrice objects of the existing data without
>>>duplicating/copying the internal datastore to do so, this will provide
>>>efficient means to access the stored values without performing costly
>>>array copies on what may be very large double[][]'s or copying objects
>>>which may not be in an []. So basically, what we are trying to avoid is
>>>that if I do the following
>>>
>>>Matrix a = ...
>>>Matrix b = a.getColumnMatrix(x);
>>>
>>>that if (a) is mutable, then doing something like a.setEntry(x,y,d) will
>>>also cause (b) to change, something we should try to avoid, we are
>>>trying to work with these more as mathematical objects and not
>>>necessarily as "Collection objects".
>>
>>Yes, but the getSubXxx and getCol, getRow currently make copies. Assuming
>>we retain immutability, this makes no functional difference (another
>>argument for making the class immutable).  Implementation will be very
>>tricky (and likely very inefficient) if we try to support "data sharing"
>>as you describe. It would also make getDataRef impossible to implement.
>>
>>
>>>If you require such mutability, could you take a moment and show an
>>>example of the usage you require it for?
>>
>>I thought at first that I would need mutability in my applications, and I
>>think that Kim did as well; but I found that my use of it was just because
>>I did not have the right boundary between the double[][] stuff that I was
>>doing and the algebraic operations (which is what RealMatrix is for).
> 
> 
> The double[][] constructor for RealMatrixImpl copies the input array, so
> if I want to fill a matrix with values, isn't it more efficient and more
> straightforward to simply fill the matrix?
> 
> RealMatrix rm = new RealMatrixImpl(10000, 10000);
> for (int row = 0, rows = rm.getRowDimension(); row < rows; row++)
> {
>   for (int col = 0, cols = rm.getColumnDimension(); col < cols; col++)
>   {
>     rm.setEntry(row, col, someValue);
>   }
> }
> 

That is a good point, but is not really an argument against immutability. 
This is really just an initialization issue. The constructor makes a copy 
to be "defensive" and to provide a clean separation.  We could provide a 
factory method that does not make a copy in MatrixUtils.  That is probably 
a good idea.  Do you have other use cases where the setEntry method is 
really required?
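
A sketch of what such a non-copying factory could look like. The names 
WrappedMatrix, wrap and copyOf are hypothetical; nothing like this 
exists in MatrixUtils as of this thread. The caller fills a plain 
double[][] and hands it over without a second copy:

```java
// Hypothetical class and factory names, for illustration only.
final class WrappedMatrix {
    private final double[][] data;

    private WrappedMatrix(double[][] data) {
        this.data = data;
    }

    // Defensive factory: copies the input, like the existing constructor.
    static WrappedMatrix copyOf(double[][] input) {
        double[][] copy = new double[input.length][];
        for (int i = 0; i < input.length; i++) {
            copy[i] = input[i].clone();
        }
        return new WrappedMatrix(copy);
    }

    // Non-copying factory: the caller must not modify the array afterwards,
    // otherwise the matrix is no longer effectively immutable.
    static WrappedMatrix wrap(double[][] input) {
        return new WrappedMatrix(input);
    }

    double getEntry(int row, int col) {
        return data[row][col];
    }

    int getRowDimension() {
        return data.length;
    }

    int getColumnDimension() {
        return data[0].length;
    }
}

class WrapDemo {
    public static void main(String[] args) {
        double[][] values = new double[3][4];
        for (int row = 0; row < values.length; row++) {
            for (int col = 0; col < values[row].length; col++) {
                values[row][col] = row * 4 + col; // fill the array directly
            }
        }
        WrappedMatrix m = WrappedMatrix.wrap(values); // no copy made here
        System.out.println(m.getEntry(2, 3));
    }
}
```

The trade-off is that immutability then rests on a documented contract 
with the caller rather than on the class itself.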

> 
> 
>>>>I also wanted to mention that I feel that this interface and its
>>>>implementation seem too closely intertwined with double arrays, as if
>>>>clients of the API are going to be moving back and forth between the
>>>>two data representations regularly.
>>
>>Well, many clients will in fact be doing that, though hopefully with clean
>>boundaries.  Can you suggest alternatives that do not impose too much
>>overhead?
>>
>>>>With the Collections interfaces, if I want a List, I'm going to use
>>>>one, not go back and forth between Lists and arrays.  Similarly, if I
>>>>want
>>>>to use a RealMatrix, I'd like to use one, not go back and forth
>>>>between it
>>>>and 1D and 2D double arrays.
>>
>>That's sort of the point of the immutability argument above.  The double[]
>>and double[][] valued accessors are for convenience and speed when
>>crossing the boundary back into the client application.  Some applications
>>will start with arrays, use RealMatrix to perform algebraic operations and
>>then use "output" arrays directly.  That is why the accessors are there.
>>
>>Thanks for the feedback.
>>
>>Phil
>>
>>
>>
>>
>>
> 
> 
> 
> 




Re: [math] RealMatrix Immutability

Posted by Michael Heuer <he...@acm.org>.
On Tue, 12 Oct 2004, Phil Steitz wrote:

> Mark R. Diggory wrote:
> > Phil,
> >
> > I think we wanted to maintain the existence of setEntry/getDataRef API
> > of the RealMatrixImpl without having it in the RealMatrix Interface. At
> > least until we come up with a strategy for mutability that made more
> > sense then these methods. This last change removed it from both.
>
> getDataRef is still there in RealMatrixImpl, though I am starting to think
> that we can make it protected.  Either the class is immutable or it is
> not.  We need to decide. All of the use cases that I have can actually be
> accomplished with the immutable version, using the algebraic operations
> exposed in the RealMatrix API.  If others have use cases that require
> mutability, then we can change it back, but I would like to know what they
> are.
> >
> > Michael,
> >
> > We are attempting to make the Implementation immutable so that methods
> > calls such as getColumnMatrix(i) and getSubMatrix(xmin,xmax,ymin,ymax)
> > will be able to return submatrice objects of the existing data without
> > duplicating/copying the internal datastore to do so, this will provide
> > efficient means to access the stored values without performing costly
> > array copies on what may be very large double[][]'s or copying objects
> > which may not be in an []. So basically, what we are trying to avoid is
> > that if I do the following
> >
> > Matrix a = ...
> > Matrix b = a.getColumnMatrix(x);
> >
> > that if (a) is mutable, then doing something like a.setEntry(x,y,d) will
> > also cause (b) to change, something we should try to avoid, we are
> > trying to work with these more as mathematical objects and not
> > necessarily as "Collection objects".
>
> Yes, but the getSubXxx and getCol, getRow currently make copies. Assuming
> we retain immutability, this makes no functional difference (another
> argument for making the class immutable).  Implementation will be very
> tricky (and likely very inefficient) if we try to support "data sharing"
> as you describe. It would also make getDataRef impossible to implement.
>
> >
> > If you require such mutability, could you take a moment and show an
> > example of the usage you require it for?
>
> I thought at first that I would need mutability in my applications, and I
> think that Kim did as well; but I found that my use of it was just because
> I did not have the right boundary between the double[][] stuff that I was
> doing and the algebraic operations (which is what RealMatrix is for).

The double[][] constructor for RealMatrixImpl copies the input array, so
if I want to fill a matrix with values, isn't it more efficient and more
straightforward to simply fill the matrix?

RealMatrix rm = new RealMatrixImpl(10000, 10000);
for (int row = 0, rows = rm.getRowDimension(); row < rows; row++)
{
  for (int col = 0, cols = rm.getColumnDimension(); col < cols; col++)
  {
    rm.setEntry(row, col, someValue);
  }
}


> >> I also wanted to mention that I feel that this interface and its
> >> implementation seem too closely intertwined with double arrays, as if
> >> clients of the API are going to be moving back and forth between the
> >> two data representations regularly.
>
> Well, many clients will in fact be doing that, though hopefully with clean
> boundaries.  Can you suggest alternatives that do not impose too much
> overhead?
> >>
> >> With the Collections interfaces, if I want a List, I'm going to use
> >> one, not go back and forth between Lists and arrays.  Similarly, if I
> >> want
> >> to use a RealMatrix, I'd like to use one, not go back and forth
> >> between it
> >> and 1D and 2D double arrays.
>
> That's sort of the point of the immutability argument above.  The double[]
> and double[][] valued accessors are for convenience and speed when
> crossing the boundary back into the client application.  Some applications
> will start with arrays, use RealMatrix to perform algebraic operations and
> then use "output" arrays directly.  That is why the accessors are there.
>
> Thanks for the feedback.
>
> Phil
>
>
>
>
>

