Posted to dev@mahout.apache.org by Sean Owen <sr...@gmail.com> on 2010/08/31 20:29:50 UTC

Removing deprecated and unused code?

I don't want to do this now. I don't even want to do it for 0.4. But I
wanted to poll sentiment: what would happen if we removed any
deprecated, unused code from the math libraries? This is the source
of, well, a lot of unused code and most style / findbugs / pmd
warnings now.

The argument against, of course, is that it could be used at some point.
I might note that code can always be resurrected from SVN, that it's
easier to remove code earlier than later, and that un-deprecating this
code needs a bit of rewrite anyway.

Re: Removing deprecated and unused code?

Posted by Grant Ingersoll <gs...@apache.org>.
FYI, it seems the Carrot2 guys might use some of the classes that are proposed for removal.  Hopefully, Dawid or Staszek will speak up.

-Grant

On Sep 1, 2010, at 8:54 AM, Ted Dunning wrote:

> I opened an issue for promoting exponential and normal distributions.  I am
> traveling the rest of this week though so progress
> won't be super fast (it is half done now).
> 
> On Wed, Sep 1, 2010 at 5:24 AM, Drew Farris <dr...@gmail.com> wrote:
> 
>> +1 to getting this stuff cleaned out. If you're doing it now we might
>> as well get this in for 0.4
>> 
>> On Wed, Sep 1, 2010 at 5:59 AM, Sean Owen <sr...@gmail.com> wrote:
>>> BTW I'm making pretty good progress clearing out the code Ted
>>> mentions. Any comments?
>>> 
>>> On Tue, Aug 31, 2010 at 7:53 PM, Ted Dunning <te...@gmail.com> wrote:
>>>> I think that there are probably three classes of inherited math code now.
>>>> One class is the stuff that we kinda sorta plan to resurrect, another is
>>>> the stuff that we will be replacing, but which is needed to avoid compile
>>>> errors in class I and the final class is code that we really should delete.
>>>> 
>>> 
>> 

--------------------------
Grant Ingersoll
http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8


Re: Removing deprecated and unused code?

Posted by Ted Dunning <te...@gmail.com>.
I opened an issue for promoting exponential and normal distributions.  I am
traveling the rest of this week, though, so progress won't be super fast (it
is half done now).

On Wed, Sep 1, 2010 at 5:24 AM, Drew Farris <dr...@gmail.com> wrote:

> +1 to getting this stuff cleaned out. If you're doing it now we might
> as well get this in for 0.4
>
> On Wed, Sep 1, 2010 at 5:59 AM, Sean Owen <sr...@gmail.com> wrote:
> > BTW I'm making pretty good progress clearing out the code Ted
> > mentions. Any comments?
> >
> > On Tue, Aug 31, 2010 at 7:53 PM, Ted Dunning <te...@gmail.com> wrote:
> >> I think that there are probably three classes of inherited math code now.
> >> One class is the stuff that we kinda sorta plan to resurrect, another is
> >> the stuff that we will be replacing, but which is needed to avoid compile
> >> errors in class I and the final class is code that we really should delete.
> >>
> >
>

Re: Removing deprecated and unused code?

Posted by Drew Farris <dr...@gmail.com>.
+1 to getting this stuff cleaned out. If you're doing it now we might
as well get this in for 0.4

On Wed, Sep 1, 2010 at 5:59 AM, Sean Owen <sr...@gmail.com> wrote:
> BTW I'm making pretty good progress clearing out the code Ted
> mentions. Any comments?
>
> On Tue, Aug 31, 2010 at 7:53 PM, Ted Dunning <te...@gmail.com> wrote:
>> I think that there are probably three classes of inherited math code now.
>>  One class is the stuff
>> that we kinda sorta plan to resurrect, another is the stuff that we will be
>> replacing, but which is
>> needed to avoid compile errors in class I and the final class is code that
>> we really should delete.
>>
>

Re: Removing deprecated and unused code?

Posted by Ted Dunning <te...@gmail.com>.
Would it help you for us to hold back commits for a day?

Or would the opposite strategy help: taking turns on small pieces of this
change, one at a time?

On Wed, Sep 1, 2010 at 2:59 AM, Sean Owen <sr...@gmail.com> wrote:

> BTW I'm making pretty good progress clearing out the code Ted
> mentions. Any comments?
>
> On Tue, Aug 31, 2010 at 7:53 PM, Ted Dunning <te...@gmail.com> wrote:
> > I think that there are probably three classes of inherited math code now.
> > One class is the stuff that we kinda sorta plan to resurrect, another is
> > the stuff that we will be replacing, but which is needed to avoid compile
> > errors in class I and the final class is code that we really should delete.
> >
>

Re: Removing deprecated and unused code?

Posted by Sean Owen <sr...@gmail.com>.
BTW I'm making pretty good progress clearing out the code Ted
mentions. Any comments?

On Tue, Aug 31, 2010 at 7:53 PM, Ted Dunning <te...@gmail.com> wrote:
> I think that there are probably three classes of inherited math code now.
>  One class is the stuff
> that we kinda sorta plan to resurrect, another is the stuff that we will be
> replacing, but which is
> needed to avoid compile errors in class I and the final class is code that
> we really should delete.
>

Re: Removing deprecated and unused code?

Posted by Jake Mannix <ja...@gmail.com>.
On Tue, Sep 7, 2010 at 1:10 PM, Ted Dunning <te...@gmail.com> wrote:

> I am down with that and have already done QRD.  I will take a look at the
> others shortly (but a certain book chapter intervenes today).
>
> The stats stuff we may need is close to converted.  It all uses our random
> number structure and the importantest bits have tests with a few bits left
> to do.
>
> I will make EigenvalueDecomposition my next project.
>

That would be supercool.


>
> On Tue, Sep 7, 2010 at 1:04 PM, Jake Mannix <ja...@gmail.com> wrote:
>
> > ...
> > What would be really really nice is if some kind soul out there ported the
> > following classes in their entirety over to use Mahout Matrix/Vector
> > classes:
> >
> > math/src/main/java/org/apache/mahout/math/matrix/linalg/LUDecomposition.java
> > math/src/main/java/org/apache/mahout/math/matrix/linalg/LUDecompositionQuick.java
> > math/src/main/java/org/apache/mahout/math/matrix/linalg/QRDecomposition.java
> > math/src/main/java/org/apache/mahout/math/matrix/linalg/EigenvalueDecomposition.java
> >
> > Then I'd be fine if we nuked the entire remaining COLT codebase, in fact
> > (although I'm not sure if we really are completely up to par with the jet
> > stats stuff, are we?).
> >
> >
>

Re: Removing deprecated and unused code?

Posted by Ted Dunning <te...@gmail.com>.
I am down with that and have already done QRD.  I will take a look at the
others shortly (but a certain book chapter intervenes today).

The stats stuff we may need is close to converted.  It all uses our random
number structure and the importantest bits have tests with a few bits left
to do.
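
For readers following along, here is a rough sketch of what "uses our random
number structure" could mean for the exponential and normal distributions:
the samplers draw from one injected, seedable generator instead of owning
their own. This is only an illustration under that assumption. It uses
java.util.Random as a stand-in, and the class and method names
(SeededSamplers, nextExponential, nextNormal) are hypothetical, not the
converted Mahout classes.

```java
import java.util.Random;

/** Hypothetical illustration only; not Mahout's or COLT's distribution classes. */
final class SeededSamplers {

  private SeededSamplers() {
  }

  /**
   * Draws from Exp(lambda) by inverse transform: if U is uniform on [0, 1),
   * then -ln(1 - U) / lambda is exponentially distributed with rate lambda.
   */
  static double nextExponential(Random rng, double lambda) {
    // 1 - U lies in (0, 1], so the logarithm is always finite.
    return -Math.log(1.0 - rng.nextDouble()) / lambda;
  }

  /** Draws from a normal distribution by shifting and scaling a standard normal draw. */
  static double nextNormal(Random rng, double mean, double standardDeviation) {
    return mean + standardDeviation * rng.nextGaussian();
  }

  public static void main(String[] args) {
    // One shared, seeded generator makes every run reproducible.
    Random rng = new Random(42L);
    int n = 200000;
    double expSum = 0.0;
    double normalSum = 0.0;
    for (int i = 0; i < n; i++) {
      expSum += nextExponential(rng, 2.0);
      normalSum += nextNormal(rng, 3.0, 2.0);
    }
    System.out.println("sample mean, Exp(rate 2):     " + expSum / n);
    System.out.println("sample mean, Normal(3, sd 2): " + normalSum / n);
  }
}
```

Passing the generator in, rather than constructing one inside each
distribution, is what lets tests fix a seed and compare sample statistics
against the known means (1/lambda and the mean parameter).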

I will make EigenvalueDecomposition my next project.

On Tue, Sep 7, 2010 at 1:04 PM, Jake Mannix <ja...@gmail.com> wrote:

> ...
> What would be really really nice is if some kind soul out there ported the
> following classes in their entirety over to use Mahout Matrix/Vector
> classes:
>
> math/src/main/java/org/apache/mahout/math/matrix/linalg/LUDecomposition.java
> math/src/main/java/org/apache/mahout/math/matrix/linalg/LUDecompositionQuick.java
> math/src/main/java/org/apache/mahout/math/matrix/linalg/QRDecomposition.java
> math/src/main/java/org/apache/mahout/math/matrix/linalg/EigenvalueDecomposition.java
>
> Then I'd be fine if we nuked the entire remaining COLT codebase, in fact
> (although I'm not sure if we really are completely up to par with the jet
> stats stuff, are we?).
>
>

Re: Removing deprecated and unused code?

Posted by Jake Mannix <ja...@gmail.com>.
Sorry to not weigh in on this earlier, but the distributed SVD work we do
currently relies on the still-deprecated old COLT EigenvalueDecomposition.java
code, which in turn relies on DoubleMatrix1D and DoubleMatrix2D.

Now, it turns out that currently, we only use the special case of an
eigenvalue decomposition for small, tri-diagonal matrices, and it is not a
bottleneck in terms of algorithmic complexity by any means. That means if
someone were to write (or port) tri-diagonal eigendecomposition to run on
Mahout standard data structures, we could remove the current dependency on
DoubleMatrix1D and DoubleMatrix2D.
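
To give a feel for how small the tri-diagonal special case can be, here is a
minimal sketch that finds the eigenvalues of a symmetric tri-diagonal matrix
by Sturm-sequence bisection. To be clear about the assumptions: this is a
swapped-in textbook technique, not a port of COLT's
EigenvalueDecomposition; it works on plain double[] arrays rather than
Mahout's Matrix/Vector classes; it returns eigenvalues only, with no
eigenvectors; and the class and method names are made up for illustration.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Mahout or COLT. */
final class TridiagonalEigenvalues {

  private TridiagonalEigenvalues() {
  }

  /**
   * Counts the eigenvalues strictly below x for the symmetric tri-diagonal
   * matrix with main diagonal diag and off-diagonal off
   * (off.length == diag.length - 1). The count is the number of negative
   * pivots in the LDL^T factorization of (T - x I).
   */
  static int countBelow(double[] diag, double[] off, double x) {
    int count = 0;
    double q = 1.0;
    for (int i = 0; i < diag.length; i++) {
      double offSquared = (i == 0) ? 0.0 : off[i - 1] * off[i - 1];
      q = diag[i] - x - offSquared / q;
      if (q == 0.0) {
        q = 1e-300; // nudge an exact zero pivot so the next division is defined
      }
      if (q < 0.0) {
        count++;
      }
    }
    return count;
  }

  /** Returns all eigenvalues in ascending order. */
  static double[] eigenvalues(double[] diag, double[] off) {
    int n = diag.length;
    // Gershgorin discs give an interval that contains every eigenvalue.
    double lower = Double.POSITIVE_INFINITY;
    double upper = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < n; i++) {
      double radius = (i > 0 ? Math.abs(off[i - 1]) : 0.0)
          + (i < n - 1 ? Math.abs(off[i]) : 0.0);
      lower = Math.min(lower, diag[i] - radius);
      upper = Math.max(upper, diag[i] + radius);
    }
    double[] result = new double[n];
    for (int k = 0; k < n; k++) {
      // Bisect for the k-th smallest eigenvalue (k is zero-based).
      double lo = lower;
      double hi = upper;
      for (int iter = 0; iter < 200; iter++) {
        double mid = lo + (hi - lo) / 2.0;
        if (countBelow(diag, off, mid) > k) {
          hi = mid;
        } else {
          lo = mid;
        }
      }
      result[k] = (lo + hi) / 2.0;
    }
    return result;
  }

  public static void main(String[] args) {
    double[] diag = {2.0, 2.0, 2.0};
    double[] off = {1.0, 1.0};
    System.out.println(Arrays.toString(eigenvalues(diag, off)));
  }
}
```

For the 3x3 example in main, the exact eigenvalues are 2 - sqrt(2), 2 and
2 + sqrt(2), which makes it a convenient sanity check. Each eigenvalue costs
a fixed number of O(n) passes, which fits the point above that this step is
not a bottleneck for small matrices.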

On the other hand, some of the approaches to stochastic decomposition which
have been suggested require eigendecomposition of medium-sized, fully general
dense symmetric matrices, and COLT's code is the only form of that we have.
So if we only pulled out COLT's special case of tri-diagonal decomposition,
we'd suddenly be missing a key implementation.
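
For contrast with the tri-diagonal case, the dense symmetric problem can be
sketched with the classical cyclic Jacobi rotation method. This is an
assumption-laden illustration, not what COLT's EigenvalueDecomposition does
internally and not a proposed replacement for it: it uses plain double[][]
arrays instead of Mahout's Matrix type, returns eigenvalues only, and the
class name JacobiEigenvalues is hypothetical.

```java
import java.util.Arrays;

/** Hypothetical illustration; not part of Mahout or COLT. */
final class JacobiEigenvalues {

  private JacobiEigenvalues() {
  }

  /** Returns the eigenvalues of a dense symmetric matrix in ascending order. */
  static double[] eigenvalues(double[][] symmetric) {
    int n = symmetric.length;
    double[][] a = new double[n][];
    for (int i = 0; i < n; i++) {
      a[i] = symmetric[i].clone(); // work on a copy, leave the input untouched
    }
    for (int sweep = 0; sweep < 50; sweep++) {
      // Stop once the off-diagonal part is numerically negligible.
      double offNorm = 0.0;
      for (int p = 0; p < n; p++) {
        for (int q = p + 1; q < n; q++) {
          offNorm += a[p][q] * a[p][q];
        }
      }
      if (offNorm < 1e-24) {
        break;
      }
      for (int p = 0; p < n; p++) {
        for (int q = p + 1; q < n; q++) {
          if (a[p][q] != 0.0) {
            rotate(a, p, q);
          }
        }
      }
    }
    double[] result = new double[n];
    for (int i = 0; i < n; i++) {
      result[i] = a[i][i];
    }
    Arrays.sort(result);
    return result;
  }

  /** Applies the plane rotation J^T A J that zeroes a[p][q] and a[q][p]. */
  private static void rotate(double[][] a, int p, int q) {
    double theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q]);
    // Smaller root of t^2 + 2*theta*t - 1 = 0, which keeps the rotation angle small.
    double t = theta == 0.0
        ? 1.0
        : Math.signum(theta) / (Math.abs(theta) + Math.sqrt(theta * theta + 1.0));
    double c = 1.0 / Math.sqrt(t * t + 1.0);
    double s = t * c;
    for (int k = 0; k < a.length; k++) { // update columns p and q (A * J)
      double akp = a[k][p];
      double akq = a[k][q];
      a[k][p] = c * akp - s * akq;
      a[k][q] = s * akp + c * akq;
    }
    for (int k = 0; k < a.length; k++) { // update rows p and q (J^T * (A * J))
      double apk = a[p][k];
      double aqk = a[q][k];
      a[p][k] = c * apk - s * aqk;
      a[q][k] = s * apk + c * aqk;
    }
  }

  public static void main(String[] args) {
    double[][] ones = {{1.0, 1.0, 1.0}, {1.0, 1.0, 1.0}, {1.0, 1.0, 1.0}};
    System.out.println(Arrays.toString(eigenvalues(ones)));
  }
}
```

The all-ones matrix in main has exact eigenvalues 0, 0 and 3, so it doubles
as a quick check. Each sweep is O(n^3), which is tolerable for the
medium-sized matrices mentioned above but is a reason a tuned
implementation would still be worth porting.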

What would be really really nice is if some kind soul out there ported the
following classes in their entirety over to use Mahout Matrix/Vector
classes:

math/src/main/java/org/apache/mahout/math/matrix/linalg/LUDecomposition.java
math/src/main/java/org/apache/mahout/math/matrix/linalg/LUDecompositionQuick.java
math/src/main/java/org/apache/mahout/math/matrix/linalg/QRDecomposition.java
math/src/main/java/org/apache/mahout/math/matrix/linalg/EigenvalueDecomposition.java

Then I'd be fine if we nuked the entire remaining COLT codebase, in fact
(although I'm not sure if we really are completely up to par with the jet
stats stuff, are we?).

  -jake

On Tue, Aug 31, 2010 at 11:53 AM, Ted Dunning <te...@gmail.com> wrote:
>
> Here is my classification of the remaining deprecated math code:
>
> Code to nuke now unless needed for compile:
>
> math/src/main/java/org/apache/mahout/math/GenericPermuting.java
> math/src/main/java/org/apache/mahout/math/Partitioning.java
> math/src/main/java/org/apache/mahout/math/function/IntIntIntProcedure.java
> math/src/main/java/org/apache/mahout/math/jet/math/IntFunctions.java
> math/src/main/java/org/apache/mahout/math/jet/random/sampling/RandomSampler.java
> math/src/main/java/org/apache/mahout/math/jet/random/sampling/RandomSamplingAssistant.java
> math/src/main/java/org/apache/mahout/math/jet/random/sampling/WeightedRandomSampler.java
> math/src/main/java/org/apache/mahout/math/jet/stat/quantile/DoubleQuantileFinder.java
> math/src/main/java/org/apache/mahout/math/jet/stat/quantile/EquiDepthHistogram.java
> math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileFinderFactory.java
> math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory1D.java
> math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory2D.java
> math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory3D.java
> math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix1D.java
> math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix1DProcedure.java
> math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix2D.java
> math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix2DProcedure.java
> math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix3D.java
> math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix3DProcedure.java
> math/src/main/java/org/apache/mahout/math/matrix/doublealgo/DoubleMatrix1DComparator.java
> math/src/main/java/org/apache/mahout/math/matrix/doublealgo/DoubleMatrix2DComparator.java
> math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Formatter.java
> math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Partitioning.java
> math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Sorting.java
> math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Statistic.java
> math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Stencil.java
> math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Transform.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/AbstractFormatter.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/AbstractMatrix.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/AbstractMatrix1D.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/AbstractMatrix2D.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/AbstractMatrix3D.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/DenseDoubleMatrix1D.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/DenseDoubleMatrix2D.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/DenseDoubleMatrix3D.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/Former.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/FormerFactory.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/RCDoubleMatrix2D.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/SparseDoubleMatrix1D.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/SparseDoubleMatrix2D.java
> math/src/main/java/org/apache/mahout/math/matrix/impl/SparseDoubleMatrix3D.java
> math/src/main/java/org/apache/mahout/math/matrix/linalg/Algebra.java
>
> Code to fix soon:
>
> math/src/main/java/org/apache/mahout/math/jet/random/Beta.java
> math/src/main/java/org/apache/mahout/math/jet/random/Binomial.java
> math/src/main/java/org/apache/mahout/math/jet/random/ChiSquare.java
> math/src/main/java/org/apache/mahout/math/jet/random/Distributions.java
> math/src/main/java/org/apache/mahout/math/jet/random/Exponential.java
> math/src/main/java/org/apache/mahout/math/jet/random/ExponentialPower.java
> math/src/main/java/org/apache/mahout/math/jet/random/HyperGeometric.java
> math/src/main/java/org/apache/mahout/math/jet/random/Hyperbolic.java
> math/src/main/java/org/apache/mahout/math/jet/random/Logarithmic.java
> math/src/main/java/org/apache/mahout/math/jet/random/Normal.java
> math/src/main/java/org/apache/mahout/math/jet/random/PoissonSlow.java
> math/src/main/java/org/apache/mahout/math/jet/random/StudentT.java
> math/src/main/java/org/apache/mahout/math/jet/random/VonMises.java
> math/src/main/java/org/apache/mahout/math/matrix/linalg/LUDecomposition.java
> math/src/main/java/org/apache/mahout/math/matrix/linalg/LUDecompositionQuick.java
> math/src/main/java/org/apache/mahout/math/matrix/linalg/QRDecomposition.java
>
> Code somebody might want:
>
> math/src/main/java/org/apache/mahout/math/jet/math/Bessel.java
> math/src/main/java/org/apache/mahout/math/jet/random/BreitWigner.java
> math/src/main/java/org/apache/mahout/math/jet/random/BreitWignerMeanSquare.java
> math/src/main/java/org/apache/mahout/math/jet/random/Empirical.java
> math/src/main/java/org/apache/mahout/math/jet/random/EmpiricalWalker.java
> math/src/main/java/org/apache/mahout/math/jet/random/Zeta.java
> math/src/main/java/org/apache/mahout/math/jet/stat/Descriptive.java
> math/src/main/java/org/apache/mahout/math/matrix/linalg/EigenvalueDecomposition.java
> math/src/main/java/org/apache/mahout/math/matrix/linalg/Property.java
>
>
> On Tue, Aug 31, 2010 at 11:29 AM, Sean Owen <sr...@gmail.com> wrote:
>
> > I don't want to do this now. I don't even want to do it for 0.4. But
> > wanted to poll sentiment: what would happen if we removed any
> > deprecated, unused code from the math libraries? This is the source
> > of, well, a lot of unused code and most style / findbugs / pmd
> > warnings now.
> >
> > The argument against of course is that it could be used at some point.
> > I might note that code can always be resurrected from SVN, that it's
> > easier to remove code earlier than later, and that un-deprecating this
> > code needs a bit of rewrite anyway.
> >
>

Re: Removing deprecated and unused code?

Posted by Ted Dunning <te...@gmail.com>.
I think that there are probably three classes of inherited math code now.
One class is the stuff that we kinda sorta plan to resurrect, another is the
stuff that we will be replacing, but which is needed to avoid compile errors
in class I, and the final class is code that we really should delete.

Here is my classification of the remaining deprecated math code:

Code to nuke now unless needed for compile:

math/src/main/java/org/apache/mahout/math/GenericPermuting.java
math/src/main/java/org/apache/mahout/math/Partitioning.java
math/src/main/java/org/apache/mahout/math/function/IntIntIntProcedure.java
math/src/main/java/org/apache/mahout/math/jet/math/IntFunctions.java
math/src/main/java/org/apache/mahout/math/jet/random/sampling/RandomSampler.java
math/src/main/java/org/apache/mahout/math/jet/random/sampling/RandomSamplingAssistant.java
math/src/main/java/org/apache/mahout/math/jet/random/sampling/WeightedRandomSampler.java
math/src/main/java/org/apache/mahout/math/jet/stat/quantile/DoubleQuantileFinder.java
math/src/main/java/org/apache/mahout/math/jet/stat/quantile/EquiDepthHistogram.java
math/src/main/java/org/apache/mahout/math/jet/stat/quantile/QuantileFinderFactory.java
math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory1D.java
math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory2D.java
math/src/main/java/org/apache/mahout/math/matrix/DoubleFactory3D.java
math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix1D.java
math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix1DProcedure.java
math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix2D.java
math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix2DProcedure.java
math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix3D.java
math/src/main/java/org/apache/mahout/math/matrix/DoubleMatrix3DProcedure.java
math/src/main/java/org/apache/mahout/math/matrix/doublealgo/DoubleMatrix1DComparator.java
math/src/main/java/org/apache/mahout/math/matrix/doublealgo/DoubleMatrix2DComparator.java
math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Formatter.java
math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Partitioning.java
math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Sorting.java
math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Statistic.java
math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Stencil.java
math/src/main/java/org/apache/mahout/math/matrix/doublealgo/Transform.java
math/src/main/java/org/apache/mahout/math/matrix/impl/AbstractFormatter.java
math/src/main/java/org/apache/mahout/math/matrix/impl/AbstractMatrix.java
math/src/main/java/org/apache/mahout/math/matrix/impl/AbstractMatrix1D.java
math/src/main/java/org/apache/mahout/math/matrix/impl/AbstractMatrix2D.java
math/src/main/java/org/apache/mahout/math/matrix/impl/AbstractMatrix3D.java
math/src/main/java/org/apache/mahout/math/matrix/impl/DenseDoubleMatrix1D.java
math/src/main/java/org/apache/mahout/math/matrix/impl/DenseDoubleMatrix2D.java
math/src/main/java/org/apache/mahout/math/matrix/impl/DenseDoubleMatrix3D.java
math/src/main/java/org/apache/mahout/math/matrix/impl/Former.java
math/src/main/java/org/apache/mahout/math/matrix/impl/FormerFactory.java
math/src/main/java/org/apache/mahout/math/matrix/impl/RCDoubleMatrix2D.java
math/src/main/java/org/apache/mahout/math/matrix/impl/SparseDoubleMatrix1D.java
math/src/main/java/org/apache/mahout/math/matrix/impl/SparseDoubleMatrix2D.java
math/src/main/java/org/apache/mahout/math/matrix/impl/SparseDoubleMatrix3D.java
math/src/main/java/org/apache/mahout/math/matrix/linalg/Algebra.java

Code to fix soon:

math/src/main/java/org/apache/mahout/math/jet/random/Beta.java
math/src/main/java/org/apache/mahout/math/jet/random/Binomial.java
math/src/main/java/org/apache/mahout/math/jet/random/ChiSquare.java
math/src/main/java/org/apache/mahout/math/jet/random/Distributions.java
math/src/main/java/org/apache/mahout/math/jet/random/Exponential.java
math/src/main/java/org/apache/mahout/math/jet/random/ExponentialPower.java
math/src/main/java/org/apache/mahout/math/jet/random/HyperGeometric.java
math/src/main/java/org/apache/mahout/math/jet/random/Hyperbolic.java
math/src/main/java/org/apache/mahout/math/jet/random/Logarithmic.java
math/src/main/java/org/apache/mahout/math/jet/random/Normal.java
math/src/main/java/org/apache/mahout/math/jet/random/PoissonSlow.java
math/src/main/java/org/apache/mahout/math/jet/random/StudentT.java
math/src/main/java/org/apache/mahout/math/jet/random/VonMises.java
math/src/main/java/org/apache/mahout/math/matrix/linalg/LUDecomposition.java
math/src/main/java/org/apache/mahout/math/matrix/linalg/LUDecompositionQuick.java
math/src/main/java/org/apache/mahout/math/matrix/linalg/QRDecomposition.java

Code somebody might want:

math/src/main/java/org/apache/mahout/math/jet/math/Bessel.java
math/src/main/java/org/apache/mahout/math/jet/random/BreitWigner.java
math/src/main/java/org/apache/mahout/math/jet/random/BreitWignerMeanSquare.java
math/src/main/java/org/apache/mahout/math/jet/random/Empirical.java
math/src/main/java/org/apache/mahout/math/jet/random/EmpiricalWalker.java
math/src/main/java/org/apache/mahout/math/jet/random/Zeta.java
math/src/main/java/org/apache/mahout/math/jet/stat/Descriptive.java
math/src/main/java/org/apache/mahout/math/matrix/linalg/EigenvalueDecomposition.java
math/src/main/java/org/apache/mahout/math/matrix/linalg/Property.java


On Tue, Aug 31, 2010 at 11:29 AM, Sean Owen <sr...@gmail.com> wrote:

> I don't want to do this now. I don't even want to do it for 0.4. But
> wanted to poll sentiment: what would happen if we removed any
> deprecated, unused code from the math libraries? This is the source
> of, well, a lot of unused code and most style / findbugs / pmd
> warnings now.
>
> The argument against of course is that it could be used at some point.
> I might note that code can always be resurrected from SVN, that it's
> easier to remove code earlier than later, and that un-deprecating this
> code needs a bit of rewrite anyway.
>