You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mahout.apache.org by "Ted Dunning (JIRA)" <ji...@apache.org> on 2011/08/20 23:28:27 UTC

[jira] [Created] (MAHOUT-790) Redundancy in Matrix API, view or get?

Redundancy in Matrix API, view or get?
--------------------------------------

                 Key: MAHOUT-790
                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
             Project: Mahout
          Issue Type: Improvement
            Reporter: Ted Dunning


We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
{code}
  Matrix viewPart(int[] offset, int[] size);
  Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
  Vector viewRow(int row);
  Vector viewColumn(int column);
{code}

and things that do the same but call refer to getting stuff
{code}
  Vector getColumn(int column);
  Vector getRow(int row);
  double getQuick(int row, int column);
  int[] getNumNondefaultElements();
  Map<String, Integer> getColumnLabelBindings();
  Map<String, Integer> getRowLabelBindings();
  double get(String rowLabel, String columnLabel);
{code}

To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  

This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?

Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?

Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 

In sum, I suggest that:

  - getRow and getColumn go away

  - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn

  - there be a constructor for AbstractMatrix that sets the internal size things correctly.

  - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.

  - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.

I will produce a patch shortly.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] [Resolved] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by Ted Dunning <te...@gmail.com>.
Yes.  Done.

On Tue, Sep 13, 2011 at 9:14 AM, Sean Owen (JIRA) <ji...@apache.org> wrote:

>
>     [
> https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
>
> Sean Owen resolved MAHOUT-790.
> ------------------------------
>
>    Resolution: Fixed
>
> Ted looks like you are done with this one?
>
> > Redundancy in Matrix API, view or get?
> > --------------------------------------
> >
> >                 Key: MAHOUT-790
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
> >             Project: Mahout
> >          Issue Type: Improvement
> >    Affects Versions: 0.5
> >            Reporter: Ted Dunning
> >            Assignee: Ted Dunning
> >             Fix For: 0.6
> >
> >         Attachments: MAHOUT-790.patch
> >
> >
> > We have a bunch of redundant methods in our matrix interface.  These
> include things that return views of parts of the matrix:
> > {code}
> >   Matrix viewPart(int[] offset, int[] size);
> >   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int
> columnsRequested);
> >   Vector viewRow(int row);
> >   Vector viewColumn(int column);
> > {code}
> > and things that do the same but call refer to getting stuff
> > {code}
> >   Vector getColumn(int column);
> >   Vector getRow(int row);
> >   double getQuick(int row, int column);
> >   int[] getNumNondefaultElements();
> >   Map<String, Integer> getColumnLabelBindings();
> >   Map<String, Integer> getRowLabelBindings();
> >   double get(String rowLabel, String columnLabel);
> > {code}
> > To my mind, get implies a get-by-value whereas view implies
> get-by-reference.  As such, I would suggest that getColumn and getRow should
> disappear.  On the other hand, getQuick and get are both correctly named.
> > This raises the question of what getNumNondefaultElements really does.  I
> certainly can't tell just from the signature.  Is it too confusing to keep?
> > Additionally, what do people think that getColumnLabelBindings and
> getRowLabelBindings return?  A mutable map?  Or an immutable one?
> > Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal)
> have default implementations that use MatrixVectorView, but AbstractMatrix
> doesn't have an implementation for getRow and getColumn.
> > In sum, I suggest that:
> >   - getRow and getColumn go away
> >   - the fancy fast implementations fo getRow and getColumn that exist be
> migrated to be over-rides of viewRow and viewColumn
> >   - there be a constructor for AbstractMatrix that sets the internal size
> things correctly.
> >   - that the internal cardinality array in AbstractMatrix goes away to be
> replaced by two integers.
> >   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column)
> and viewDiagonal(int row, column, length) be added.
> > I will produce a patch shortly.
>
> --
> This message is automatically generated by JIRA.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>
>

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095252#comment-13095252 ] 

Hudson commented on MAHOUT-790:
-------------------------------

Integrated in Mahout-Quality #1012 (See [https://builds.apache.org/job/Mahout-Quality/1012/])
    MAHOUT-790 - kill the cardinality array and size() for matrices.  Use rowSize() and columnSize() instead.

MAHOUT-792 - Fix RTM to avoid size() and cardinality array.
MAHOUT-790 - More get/view changes
MAHOUT-790 - collapse get{Row,Column} to view{Row,Column}, kill addTo
MAHOUT-790 - Fixed copyright and license on QRDecomposition and SVDDecomposition
MAHOUT-790 - Copyright format cleanup
MAHOUT-790 - Matrix cleanups.
MAHOUT-790 - Add some vector and matrix types to simplify certain manipulations.
MAHOUT-790 - Add view for diagonal of a matrix.

tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1164016
Files : 
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/als/eval/InMemoryFactorizationEvaluator.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/bayes/InMemoryBayesDatastore.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/discriminative/LinearTrainer.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/naivebayes/NaiveBayesModel.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/naivebayes/training/TrainUtils.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sequencelearning/hmm/HmmUtils.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/MatrixWritable.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/decomposer/EigenVerificationJob.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/UpperTriangular.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/cf/taste/hadoop/als/ParallelALSFactorizationJobTest.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/AbstractMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/DenseMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/DiagonalMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/Matrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/MatrixView.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/PermutedVectorView.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/PivotedMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/RandomAccessSparseVector.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/RandomTrinaryMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/SequentialAccessSparseVector.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/SparseColumnMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/SparseMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/SparseRowMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/hebbian/HebbianSolver.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/AbstractTestVector.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/MatrixTest.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestMatrixView.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestSparseColumnMatrix.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestSparseMatrix.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestSparseRowMatrix.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestVectorView.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/als/AlternateLeastSquaresSolverTest.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/decomposer/SolverTest.java

tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1164015
Files : 
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/lda/LDAInference.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/AbstractMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/DiagonalMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/PivotedMatrix.java

tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1164014
Files : 
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/als/eval/InMemoryFactorizationEvaluator.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/AbstractVectorClassifier.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/discriminative/LinearTrainer.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/naivebayes/NaiveBayesModel.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/naivebayes/training/WeightsMapper.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sgd/AbstractOnlineLogisticRegression.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sgd/GradientMachine.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/AbstractCluster.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/RunningSumsGaussianAccumulator.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/common/mapreduce/VectorSumReducer.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/MatrixMultiplicationJob.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/TimesSquaredJob.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/GivensThinSolver.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDPrototype.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/UJob.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/UpperTriangular.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/VJob.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/vectorizer/common/PartialVectorMergeReducer.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/cf/taste/hadoop/als/ParallelALSFactorizationJobTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/classifier/discriminative/PerceptronTrainerTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/classifier/discriminative/WinnowTrainerTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineBaseTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/TestGaussianAccumulators.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/math/hadoop/decomposer/TestDistributedLanczosSolverCLI.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDPrototypeTest.java
* /mahout/trunk/examples/src/main/java/org/apache/mahout/clustering/display/DisplayMeanShift.java
* /mahout/trunk/integration/src/test/java/org/apache/mahout/clustering/TestClusterDumper.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/AbstractMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/AbstractVector.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/Algebra.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/DenseMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/DenseVector.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/Matrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/MatrixVectorView.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/MatrixView.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/NamedVector.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/RandomAccessSparseVector.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/SparseColumnMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/SparseMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/SparseRowMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/Vector.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/VectorView.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/als/AlternateLeastSquaresSolver.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/hebbian/HebbianSolver.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/hebbian/TrainingState.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/MatrixTest.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestMatrixView.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/VectorTest.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/decomposer/SolverTest.java

tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1164013
Files : 
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/QRDecomposition.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/SingularValueDecomposition.java

tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1164012
Files : 
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestSingularValueDecomposition.java

tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1164011
Files : 
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/MatrixTest.java

tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1164010
Files : 
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/ConstantVector.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/DiagonalMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/PermutedVectorView.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/PivotedMatrix.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/PermutedVectorViewTest.java
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/PivotedMatrixTest.java

tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1164009
Files : 
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/AbstractMatrix.java
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/Matrix.java


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088820#comment-13088820 ] 

Sean Owen commented on MAHOUT-790:
----------------------------------

I may misunderstand: Do you expect m and m.viewPart() to be the same class in an ideal world or not? I don't expect that, myself, and indeed that's not the case. So all the less would I expect that m.like() and m.viewPart().like() ought to be the same class. I thought you were suggesting they ought to be.

I agree with your last point here so I think we must be agreeing.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089789#comment-13089789 ] 

Dmitriy Lyubimov commented on MAHOUT-790:
-----------------------------------------

Thanks, Ted. 

I am working off a frozen snapshot in production (just built my own private release of a suitable functionality snapshot i use), so it's no immediate problem. At some point we will refactor. No problem. But it might help in other places where i have less of idea how in flux the api is. 

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089775#comment-13089775 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

Great idea.  I doubt it would have helped here since I thought this interface was pretty stable.

Maybe it would focus our minds as we add the @stable annotation.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089844#comment-13089844 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

@Dmitriy,

I completely agree.  We need to get the basic API's rock solid as soon as possible.


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088293#comment-13088293 ] 

Lance Norskog edited comment on MAHOUT-790 at 8/21/11 1:03 AM:
---------------------------------------------------------------

Cardinality array: definitely- it is mutable from outside. 
    final int row; 
    final int column;

viewColumn v.s. getColumn: the question here is deep v.s. shallow copy?
I would go with
    getColumn(int column) and getColumn(int column, Vector v)
where getColumn(int column) is assumed to give a shallow copy.

Diagonals: are they really needed now? Should there be triangular or symmetric? They have enough of their own behavior to be a separate subclass rather than some magic thing held by the main class. Example: DiagonalMatrix.invert() is a valid method, because it either blows up if there is a 0 value, or returns 1/values.

getNumNondefault: this requires being able to produce the number, which is a "design load". It is not used much in "real code". I suspect most of those users could track/deduce the number in some other way, rather than expect the object to remember it.

      was (Author: lancenorskog):
    Cardinality array: definitely- it is mutable from outside. 
    final int row; 
    final int column;

viewColumn v.s. getColumn: the question here is deep v.s. shallow copy?
I would go with
    getColumn(int column) and getColumn(int column, Vector v)
where getColumn(int column) is assumed to give a shallow copy.

Diagonals: are they really needed now? Should there be triangular or symmetric?

getNumNondefault: this requires being able to produce the number, which is a "design load". It is not used much in "real code". I suspect most of those users could track/deduce the number in some other way, rather than expect the object to remember it.
  
> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Ted Dunning
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095683#comment-13095683 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

Hmph... that test should have been commented out.

Let me take a look.  My final merges may not have been quite right.




> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning updated MAHOUT-790:
-------------------------------

    Attachment: MAHOUT-790.patch

Here is a monster patch that cleans up the matrix classes as suggested.  The remaining nit is the iterator in SparseColumnMatrix.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Ted Dunning
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088509#comment-13088509 ] 

Lance Norskog commented on MAHOUT-790:
--------------------------------------

addTo v.s. apply(Functions.PLUS)

If performance is really a problem, the apply implementation can do instanceof, but keep the interface clean.



> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095680#comment-13095680 ] 

Dmitriy Lyubimov commented on MAHOUT-790:
-----------------------------------------

Hm. i can't seem to make trunk to build. 

building math produces 


Results :

Failed tests: 
  testIterate(org.apache.mahout.math.TestSparseColumnMatrix): iterator: {0:1.1,1:2.2}, randomAccess: {2:5.5,1:3.3,0:1.1} expected:<{0:1.1,1:2.2}> but was:<{2:5.5,1:3.3,0:1.1}>


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088508#comment-13088508 ] 

Lance Norskog commented on MAHOUT-790:
--------------------------------------

While you've got the scissors out, I would reconsider clone(). 

* clone() requires every subclass to implement its own version
** The "which class do I use for the clone()" problem is handled better by like().


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning updated MAHOUT-790:
-------------------------------

    Resolution: Fixed
      Assignee: Ted Dunning
        Status: Resolved  (was: Patch Available)

Checked in.  We will want to make sure that Jenkins concurs that it works.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088451#comment-13088451 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

I have also found a number of instances where from.addTo(to) is used instead of to.assign(from, Functions.PLUS).  I am changing these usages to the latter form and removing addTo as redundant?  

Any comment on that change?

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Ted Dunning
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088788#comment-13088788 ] 

Sean Owen commented on MAHOUT-790:
----------------------------------

I know the clone() contract well -- why wouldn't, as you say, like() + assign() satisfy the contract? That's why I questioned the objection that, well, every class has to implement it. I don't think so. If like() does the hard part of figuring out what class to return, this is a breeze. It would be nice to have clone() even if it can be accomplished with like() and assign() as a convenience method, to match developer expectations.

I don't necessarily think m.like() and m.viewPart().like() return the same class. I *might* well expect that m and m.viewPart() are of the same class! which would make this true.

erm, in terms of actionable changes, I think I was arguing against more change rather than for more, so proceed and we can sort it later. Don't remove clone() unless it's really painful given the road you've gone down with this patch.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095234#comment-13095234 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

OK.  Just committed this.

When you update, make sure to do [mvn install] in the math module to make sure that you get the update for the rest of Mahout (that threw me for a loop for a while).



> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088506#comment-13088506 ] 

Lance Norskog commented on MAHOUT-790:
--------------------------------------

bq. It really helps some algorithms to be able to pull the primary diagonal of a matrix out as a vector. So yes, that is needed.
This sounds like a utility method? The different Matrix data structures may want to have different implementations of viewing it; I can see a disastrous clash between a 'sequential' Matrix and pulling diagonals in one go. It may be one of those cases where each use of this is somewhat customized, and the surrounding code knows the matrix implementation. That is, an algorithm for sequential matrices is carefully coded around this fact, and so how it uses a diagonal will also have this profile.
So, static utility method and "you know the problem space" are the two uses for this?

I go on about this because I tried to make a generic read-only Matrix and Vector, and then random sub-classes of those. This exercise showed the design tensions so I'm now wary of adding more features which subclasses must consider.
 

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089777#comment-13089777 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

So Dmitriy, how badly are the changes I am pushing going to hit you?

I am about to add a merge of numCols and columnSize as well (same for numRows).


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088512#comment-13088512 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

The diagonal support consists of a viewDiagonal method on Matrix on the one hand and a DiagonalMatrix implementation on the other.  As the name suggests viewDiagonal is a view method so it would be bad to make it read-only.  It does make certain operations like computing the determinant very simple:
{code}
  determinant = new CholeskyDecomposition(matrix).viewDiagonal().aggregate(Functions.TIMES)
{code}
or setting the diagonal of a matrix to all zeros:
{code} 
   m.viewDiagonal().assign(0)
{code}

The diagonal matrix is handy when reconstructing SVD's.  You get this:

{code}
   u.times(new DiagonalMatrix(singularValues)).times(v.transpose())
{code}

Since the DiagonalMatrix is marked as sparse, efficiency is good.


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088775#comment-13088775 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

Github branch MAHOUT-790 available at git://github.com/tdunning/mahout.git

That provides more details on successive changes.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088513#comment-13088513 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

yeah, I hate our use of clone as well.  But I am not going to change it on this pass.  I have already touched 80 files with > 200 changes.  That will be enough to commit cleanly.



> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089790#comment-13089790 ] 

Dmitriy Lyubimov commented on MAHOUT-790:
-----------------------------------------

the thing is, i like Mahout's DRM and Matrix apis very much and use them pervasively. I think they are cornerstone for everything else and for custom pipelining. It would be great if we could make them stable rather sooner than later :) 

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090113#comment-13090113 ] 

Lance Norskog commented on MAHOUT-790:
--------------------------------------

bq. I don't necessarily think m.like() and m.viewPart().like() return the same class. I might well expect that m and m.viewPart() are of the same class! which would make this true.
Ouch- that twisted my head. But it does show that perhaps viewPart should be done by composition. You pick a viewer class and give it the delegate Matrix.


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090113#comment-13090113 ] 

Lance Norskog edited comment on MAHOUT-790 at 8/24/11 9:45 AM:
---------------------------------------------------------------

bq. I don't necessarily think m.like() and m.viewPart().like() return the same class. I might well expect that m and m.viewPart() are of the same class! which would make this true.
Ouch- that twisted my head. But it does show that perhaps viewPart should be done outside of Matrix. You pick a viewer class and give it the delegate Matrix in the constructor. If it's not linear algebra, should it be in Matrix?


      was (Author: lancenorskog):
    bq. I don't necessarily think m.like() and m.viewPart().like() return the same class. I might well expect that m and m.viewPart() are of the same class! which would make this true.
Ouch- that twisted my head. But it does show that perhaps viewPart should be done by composition. You pick a viewer class and give it the delegate Matrix.

  
> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088450#comment-13088450 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

So the getRow/getColumn vs viewRow/viewColumn merge exercise is turning out good.  I have found a number of bugs that relate to the confusion between whether getRow returned a copy or not.

But I am also finding that getRow is much more popular than viewRow.  My tendency is to still change the name to make clear that the result is a view.

Any thoughts?


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Ted Dunning
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning updated MAHOUT-790:
-------------------------------

        Fix Version/s: 0.6
    Affects Version/s: 0.5
               Status: Patch Available  (was: Open)

This build should fail for now.  The fix will be forth coming.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090377#comment-13090377 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

{quote}
But it does show that perhaps viewPart should be done outside of Matrix.
{quote}
I think not.  Getting a view of a submatrix or row or column or diagonal is a fundamental operation in linear algebra. The method may delegate to a view class, but to the user, it should appear as a matrix operation.

Besides, there are are kinds of matrices and vectors where the view *is* the same type as the matrix.  For instance, for dense matrices this is often true because the dense matrix is a one-dimensional storage array combined with an offset plus row and column strides.  Any block-wise view of this keeps the same storage and has different offset and strides.

On the other hand, sparse matrices do better with a view structure.

In any caes, viewPart should be a method on the matrix.  It should return a matrix and preserve sparsity and maybe a few related properties, but not precise type.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Reopened] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning reopened MAHOUT-790:
--------------------------------


Lost a change.  Re-opening to commit the fix.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089799#comment-13089799 ] 

Sebastian Schelter edited comment on MAHOUT-790 at 8/23/11 9:56 PM:
--------------------------------------------------------------------

I like this idea of annotating stuff very much. Maybe it would also make sense to apply that in a broader way to highlight which implementations are mature and production-ready (like most of our recommender code) and which are rather new and experimental (like the graph mining module).

      was (Author: ssc):
    I like this idea of annotating stuff very much. Maybe it would also make sense to apply that in a broader way to highlight implementations are mature and production-ready (like most of our recommender code) and which are rather new experimental (like the graph mining module).
  
> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088634#comment-13088634 ] 

Sean Owen commented on MAHOUT-790:
----------------------------------

On clone() vs like(): there are two logical operations we might want to support here. There's an operation that gives a separate, identical copy with all the same values. There's also an operation that gives a separate, identical copy with no data or values.

The first should certainly be called clone(), since that's what it is, in Java.

Lance, is your objection that we simply should not have the first operation? or that you don't want to use clone() per se for some reason?
I don't see that either of these two has an easier time deciding what class to return, or that one or the other must or must not be implemented in subclasses. These are like any other OO method.

I can imagine both are useful, and so would support keeping both. If someone has a good argument that one or the other really isn't used, that's good too. And certainly if we're finding they're implemented incorrectly, and I did find several instances of that in the past, we should fix it.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095684#comment-13095684 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

Definitely due to last minute regression.  My local history shows that this
changed right as I checked stuff in.




> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088760#comment-13088760 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

Sean,

We definitely need both operations.  The first can be expressed, however, as a.like().assign(a) so it isn't quite as necessary as it might seem.

The problem with clone itself is that there are serious restrictions on how you have to do it based on Java requirements.  That makes it a royal pain some days of the week.  This may be easier after this JIRA gets resolved since the only information at AbstractMatrix level is the number of rows and columns and they are trivial to deal with.

We definitely also have some bugs in our test suite in that it is assumed that like() has to return the same type of object.  That isn't really true.  For instance, m.viewPart(0,3,2,5).like() should return the same thing that m.like() returns.  But viewPart probably returns some kind of view object so these aren't the same.

I can deal with those issues another day.  

If we can get eyes on this monster patch, I would like to commit it shortly.  The only big issues I know about are:

a) getRow and getColumn is a more common name than viewRow and viewColumn.  Does everybody promise not to be confused by the loss of getRow?

b) what should the iterator of a matrix do?  Right now SparseColumnMatrix iterates over columns and everything else by rows.  Unless there is some very clear way to tell what the iteration is doing, I would like to go on record as grumpy about that.


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088814#comment-13088814 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

{quote}
I don't necessarily think m.like() and m.viewPart().like() return the same class. I might well expect that m and m.viewPart() are of the same class! which would make this true.
{quote}
You might expect that, but you would be wrong.  :-)

Seriously, I would expect that mostly like() should preserve isSparse, but not necessarily much else.  Much better is to have like() encode the judgement of the implementor and look below any facades to the truth of the matter.

I am not touching clone for now.


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089799#comment-13089799 ] 

Sebastian Schelter commented on MAHOUT-790:
-------------------------------------------

I like this idea of annotating stuff very much. Maybe it would also make sense to apply that in a broader way to highlight implementations are mature and production-ready (like most of our recommender code) and which are rather new experimental (like the graph mining module).

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089731#comment-13089731 ] 

Dmitriy Lyubimov commented on MAHOUT-790:
-----------------------------------------

It might help to introduce interface maturity annotations (similar to what they do in Hadoop) to indicate our opinion of still-evolving apis to the user. 

I have tons of outside code locked to the Matrix api already. I probably would've used it anyway even if it were marked as evolving. but we definitely have various levels of api maturity. So it might help to indicate it.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088293#comment-13088293 ] 

Lance Norskog commented on MAHOUT-790:
--------------------------------------

Cardinality array: definitely- it is mutable from outside. 
    final int row; 
    final int column;

viewColumn v.s. getColumn: the question here is deep v.s. shallow copy?
I would go with
    getColumn(int column) and getColumn(int column, Vector v)
where getColumn(int column) is assumed to give a shallow copy.

Diagonals: are they really needed now? Should there be triangular or symmetric?

getNumNondefault: this requires being able to produce the number, which is a "design load". It is not used much in "real code". I suspect most of those users could track/deduce the number in some other way, rather than expect the object to remember it.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Ted Dunning
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088453#comment-13088453 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

I have also found an odd function called slice(int).  It seems to be used variously to describe a column or row view.  This non-specificity seems disastrous to me so I am deleting that function and replacing it with viewRow.

Can somebody say what this is for?

Jake?  Were you involved with this?  It seems to appear in the distributed row matrix stuff a fair bit.


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Ted Dunning
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096734#comment-13096734 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

Lance, 

Ideally labels should follow the rows and/or columns as the matrix pivots.  Since the pivoted matrix is just a view, this should be easy to make happen.  It probably doesn't happen correctly now.

Yes, the inverse of a non-zero diagonal is easy.  Commonly a diagonal matrix with zeros is truncated before inverting.  The definition of zero is an application specific thing and thus should not be included in the DM itself.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Reopened] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning reopened MAHOUT-790:
--------------------------------


Found the missing commit.  It had other changes as well.

Applied them.  Should be fixed (finally)

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088308#comment-13088308 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

{quote}
viewColumn v.s. getColumn: the question here is deep v.s. shallow copy?
I would go with
getColumn(int column) and getColumn(int column, Vector v)
where getColumn(int column) is assumed to give a shallow copy.
{quote}

Neither of these produce a copy.  Both return references.  Your reaction is similar to mine that get implies a copy.  I don't plan to add code to create a copy so I plan to just reduce current function to a single entry point.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Ted Dunning
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning resolved MAHOUT-790.
--------------------------------

    Resolution: Fixed

Really fixed now.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved MAHOUT-790.
------------------------------

    Resolution: Fixed

Ted looks like you are done with this one?

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096564#comment-13096564 ] 

Lance Norskog commented on MAHOUT-790:
--------------------------------------

What should happen to labels in PivotedMatrix.java? Should they point to the row number as they do now? Should they track the movements of rows & columns?

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088399#comment-13088399 ] 

Sean Owen commented on MAHOUT-790:
----------------------------------

I think that all sounds great!

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Ted Dunning
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096779#comment-13096779 ] 

Lance Norskog commented on MAHOUT-790:
--------------------------------------

bq. Ideally labels should follow the rows and/or columns as the matrix pivots. Since the pivoted matrix is just a view, this should be easy to make happen. It probably doesn't happen correctly now.

Matrix row&column labels are string->index maps. If you write code that knows it has a permuted matrix, it can pull the labels and the permutation lists and do the indirection. If it does not know it has a permuted matrix, it will get non-tracking outputs.



> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088310#comment-13088310 ] 

Ted Dunning commented on MAHOUT-790:
------------------------------------

{quote}
Diagonals: are they really needed now? 
{quote}
It really helps some algorithms to be able to pull the primary diagonal of a matrix out as a vector.  So yes, that is needed.

Regarding a DiagonalMatrix, I do have a need for that as well and will include that in the SSVD patch.  Since that is a pure addition, I don't think it needs the same level of discussion.

{quote}
Should there be triangular or symmetric? They have enough of their own behavior to be a separate subclass rather than some magic thing held by the main class. 
{quote}
Possibly.  So far, this isn't a big deal although I do have a triangular solver in my CholeskyDecomposition.  If we see some more uses, it might be worth pulling out as a separate class.

I don't see that a SymmetricMatrix is an important thing yet.  Yes, there are important mathematical properties there, but these aren't necessarily something worth reflecting in the type structure.  Good use cases would change my mind.

{quote}
Example: DiagonalMatrix.invert() is a valid method, because it either blows up if there is a 0 value, or returns 1/values.
{quote}
I prefer solve methods rather than invert methods.  It is already much too hard to cure people of trying to invert matrices.  Introducing an invert method anywhere would just make that harder.

> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Ted Dunning
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-790) Redundancy in Matrix API, view or get?

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096567#comment-13096567 ] 

Lance Norskog commented on MAHOUT-790:
--------------------------------------

Inverting a diagonal matrix is 1/the values in the diagonal. This is trivial if all the diagonals are non-zero, but impossible if any are 0. Should DiagonalMatrix track whether any values are 0? 


> Redundancy in Matrix API, view or get?
> --------------------------------------
>
>                 Key: MAHOUT-790
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-790
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.6
>
>         Attachments: MAHOUT-790.patch
>
>
> We have a bunch of redundant methods in our matrix interface.  These include things that return views of parts of the matrix:
> {code}
>   Matrix viewPart(int[] offset, int[] size);
>   Matrix viewPart(int rowOffset, int rowsRequested, int columnOffset, int columnsRequested);
>   Vector viewRow(int row);
>   Vector viewColumn(int column);
> {code}
> and things that do the same but call refer to getting stuff
> {code}
>   Vector getColumn(int column);
>   Vector getRow(int row);
>   double getQuick(int row, int column);
>   int[] getNumNondefaultElements();
>   Map<String, Integer> getColumnLabelBindings();
>   Map<String, Integer> getRowLabelBindings();
>   double get(String rowLabel, String columnLabel);
> {code}
> To my mind, get implies a get-by-value whereas view implies get-by-reference.  As such, I would suggest that getColumn and getRow should disappear.  On the other hand, getQuick and get are both correctly named.  
> This raises the question of what getNumNondefaultElements really does.  I certainly can't tell just from the signature.  Is it too confusing to keep?
> Additionally, what do people think that getColumnLabelBindings and getRowLabelBindings return?  A mutable map?  Or an immutable one?
> Under the covers, viewRow and viewColumn (and the upcoming viewDiagonal) have default implementations that use MatrixVectorView, but AbstractMatrix doesn't have an implementation for getRow and getColumn. 
> In sum, I suggest that:
>   - getRow and getColumn go away
>   - the fancy fast implementations fo getRow and getColumn that exist be migrated to be over-rides of viewRow and viewColumn
>   - there be a constructor for AbstractMatrix that sets the internal size things correctly.
>   - that the internal cardinality array in AbstractMatrix goes away to be replaced by two integers.
>   - viewDiagonal() and viewDiagonal(length) and viewDiagonal(row, column) and viewDiagonal(int row, column, length) be added.
> I will produce a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira