Posted to dev@commons.apache.org by Phil Steitz <ph...@steitz.com> on 2004/03/21 22:14:12 UTC

[math][proposal] stat package structure

I would like to propose the following repackaging of classes in the stat 
and distributions packages.

1. Move the univariate statistical aggregates (DescriptiveStatistics, 
SummaryStatistics et al) into the univariate package.

2. Move distributions back into stat. I did not comment on the initial 
move out, but it seems wrong to me now.  The abstract classes and all 
implementations are probability distributions. To me these fit naturally 
in stat.  I can live with it as is; I just wanted to see if others are 
feeling the same and if so, get it fixed while we still have the chance.

3. Create a "multivariate" subpackage and place the lonely 
BivariateRegression there.  It is odd to have a package with only one 
class, but I think the probability is 1 that we will quickly have more 
multivariate statistical classes (e.g. multiple regression) and I would 
prefer not to have to play the deprecate-move game if we can see it coming.

4. Similarly, I would like to create an "inference" or "test" subpackage 
and put TestStatistic there.

Please indicate your +/-/0 on these items separately. I am viewing each of 
the items subject to lazy consensus. I will wait one week to make any changes.

Input from all parties welcome, committers binding.

Phil


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [math][proposal] stat package structure

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.
On Sun, 2004-03-21 at 16:14, Phil Steitz wrote:
> I would like to propose the following repackaging of classes in the stat 
> and distributions packages.
> 
> 1. Move the univariate statistical aggregates (DescriptiveStatistics, 
> SummaryStatistics et al) into the univariate package.
> 

+1, sensible

> 2. Move distributions back into stat. I did not comment on the initial 
> move out, but it seems wrong to me now.  The abstract classes and all 
> implementations are probability distributions. To me these fit naturally 
> in stat.  I can live with it as is; I just wanted to see if others are 
> feeling the same and if so, get it fixed while we still have the chance.
> 

+1, with moving everything else into stats, this sounds fine too. 

> 3. Create a "multivariate" subpackage and place the lonely 
> BivariateRegression there.  It is odd to have a package with only one 
> class, but I think the probability is 1 that we will quickly have more 
> multivariate statistical classes (e.g. multiple regression) and I would 
> prefer not to have to play the deprecate-move game if we can see it coming.
> 

+1, Would really like to see more multivariate approaches.

> 4. Similarly, I would like to create an "inference" or "test" subpackage 
> and put TestStatistic there.
> 

+1 (test)

> Please indicate your +/-/0 on these items separately. I am viewing each of 
> the items subject to lazy consensus. I will wait one week to make any changes.
> 
> Input from all parties welcome, committers binding.
> 
> Phil
> 
> 
-- 
Mark R. Diggory
Software Developer - VDC Project
Harvard MIT Data Center
http://www.hmdc.harvard.edu


Re: [math][proposal] stat package structure

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.
I figured it out: it slipped through the cracks when the move to 
univariate occurred in the tests directory. I've added it back in.

-Mark

Mark R. Diggory wrote:

> Where do we stand on BeanListUnivariate? Is it supposed to be in the 
> main directory, test directory or nowhere? The build is currently 
> breaking because it is absent from the source now.
>
> -Mark
>
>
> Phil Steitz wrote:
>
>> I would like to propose the following repackaging of classes in the 
>> stat and distributions packages.
>>
>> 1. Move the univariate statistical aggregates (DescriptiveStatistics, 
>> SummaryStatistics et al) into the univariate package.
>>
>> 2. Move distributions back into stat. I did not comment on the 
>> initial move out, but it seems wrong to me now.  The abstract classes 
>> and all implementations are probability distributions. To me these 
>> fit naturally in stat.  I can live with it as is; I just wanted to 
>> see if others are feeling the same and if so, get it fixed while we 
>> still have the chance.
>>
>> 3. Create a "multivariate" subpackage and place the lonely 
>> BivariateRegression there.  It is odd to have a package with only one 
>> class, but I think the probability is 1 that we will quickly have 
>> more multivariate statistical classes (e.g. multiple regression) and 
>> I would prefer not to have to play the deprecate-move game if we can 
>> see it coming.
>>
>> 4. Similarly, I would like to create an "inference" or "test" 
>> subpackage and put TestStatistic there.
>>
>> Please indicate your +/-/0 on these items separately. I am viewing 
>> each of the items subject to lazy consensus. I will wait one week to 
>> make any changes.
>>
>> Input from all parties welcome, committers binding.
>>
>> Phil
>>
>>


Re: [math][proposal] stat package structure

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.
Where do we stand on BeanListUnivariate? Is it supposed to be in the 
main directory, test directory or nowhere? The build is currently 
breaking because it is absent from the source now.

-Mark


Phil Steitz wrote:

> I would like to propose the following repackaging of classes in the 
> stat and distributions packages.
>
> 1. Move the univariate statistical aggregates (DescriptiveStatistics, 
> SummaryStatistics et al) into the univariate package.
>
> 2. Move distributions back into stat. I did not comment on the initial 
> move out, but it seems wrong to me now.  The abstract classes and all 
> implementations are probability distributions. To me these fit 
> naturally in stat.  I can live with it as is; I just wanted to see if 
> others are feeling the same and if so, get it fixed while we still 
> have the chance.
>
> 3. Create a "multivariate" subpackage and place the lonely 
> BivariateRegression there.  It is odd to have a package with only one 
> class, but I think the probability is 1 that we will quickly have more 
> multivariate statistical classes (e.g. multiple regression) and I 
> would prefer not to have to play the deprecate-move game if we can see 
> it coming.
>
> 4. Similarly, I would like to create an "inference" or "test" 
> subpackage and put TestStatistic there.
>
> Please indicate your +/-/0 on these items separately. I am viewing 
> each of the items subject to lazy consensus. I will wait one week to 
> make any changes.
>
> Input from all parties welcome, committers binding.
>
> Phil
>
>


Re: [math][proposal] stat package structure

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.

Phil Steitz wrote:
> 
>>
>>> Maybe it should be organized more along the following lines?
>>>
>>> o.a.c.m.stat.probability...
>>> o.a.c.m.stat.univariate...
>>> o.a.c.m.stat.multivariate...
>>> o.a.c.m.stat.inference...
>>
>>
>>
>> I'm not very picky about these names, either, as I'm not a
>> probability/statistics person, as I said before.  If others like the 
>> above,
>> that's fine by me.
>>
>>
> 
> I like inference better than test, since it is broader (e.g. would make 
> a natural home for confidence intervals) but I don't like probability 
> since it is IMHO too broad for its contents, which are just probability 
> distributions.  I also do not buy the argument that people will think 
> that /src/java/o/a/c/m/stat/distribution contains binaries or some such.
> 
> Phil
> 

Yes, Ok, it is pretty weak, isn't it. ;-)

-- 
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu


Re: [math][proposal] stat package structure

Posted by Phil Steitz <ph...@steitz.com>.
> 
>>Maybe it should be organized more along the following lines?
>>
>>o.a.c.m.stat.probability...
>>o.a.c.m.stat.univariate...
>>o.a.c.m.stat.multivariate...
>>o.a.c.m.stat.inference...
> 
> 
> I'm not very picky about these names, either, as I'm not a
> probability/statistics person, as I said before.  If others like the above,
> that's fine by me.
> 
> 

I like inference better than test, since it is broader (e.g. would make a 
natural home for confidence intervals) but I don't like probability since 
it is IMHO too broad for its contents, which are just probability 
distributions.  I also do not buy the argument that people will think that 
/src/java/o/a/c/m/stat/distribution contains binaries or some such.

Phil

> 
> Al
> 


Re: [math][proposal] stat package structure

Posted by Al Chou <ho...@yahoo.com>.
--- "Mark R. Diggory" <md...@latte.harvard.edu> wrote:
> On Sun, 2004-03-21 at 22:40, Al Chou wrote:
> > --- Phil Steitz <ph...@steitz.com> wrote:
[deletia]
> > > 4. Similarly, I would like to create an "inference" or "test" subpackage 
> > > and put TestStatistic there.
> > 
> > +1  I wonder if there's a better name than those two.  I see Mark voted for
> > "test", but -- perhaps because I've never done much statistics -- that makes me
> > think that I'm looking at a JUnit "tests" directory tree.  A quick skim through
> > _NR_ chapter 14 didn't turn up any better names, though.
> > 
> 
> Good point, although we avoid "test" package names for JUnit tests in
> favor of the original package name of the class being tested, something
> I think is very smart to do. I'm not too picky here; I just picked test
> because it's shorter, though inference is more descriptive. Along the same
> lines, "distributions" could be confused with the generic idea of a
> "distribution" directory of packages; maybe "probability" is better
> there?

Right, though given that Java package names generally map to directory names,
there's room for confusion, especially given that we do have a standard "test"
directory paralleling "java" under "src".
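
As a rough illustration of that overlap, here is a minimal sketch. It assumes
the "src/java" and "src/test" source roots mentioned in this thread and
expands o.a.c.m to org.apache.commons.math; PackagePaths and toSourcePath are
hypothetical names used only for this example, not anything in the code base.

```java
public class PackagePaths {

    // Hypothetical helper: a Java package name maps onto a directory path
    // under a source root by replacing each '.' with '/'.
    static String toSourcePath(String sourceRoot, String packageName) {
        return sourceRoot + "/" + packageName.replace('.', '/');
    }

    public static void main(String[] args) {
        String statTest = "org.apache.commons.math.stat.test";
        String statInference = "org.apache.commons.math.stat.inference";

        // A "test" subpackage puts a "test" directory in the main tree...
        System.out.println(toSourcePath("src/java", statTest));
        // ...and its unit tests, kept in the same package, end up with
        // "test" twice in the path.
        System.out.println(toSourcePath("src/test", statTest));
        // An "inference" subpackage has no such overlap.
        System.out.println(toSourcePath("src/java", statInference));
    }
}
```

With a "test" subpackage, the word "test" would show up both as a source root
and as a package directory, which is the kind of confusion I have in mind.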


> Maybe it should be organized more along the following lines?
> 
> o.a.c.m.stat.probability...
> o.a.c.m.stat.univariate...
> o.a.c.m.stat.multivariate...
> o.a.c.m.stat.inference...

I'm not very picky about these names, either, as I'm not a
probability/statistics person, as I said before.  If others like the above,
that's fine by me.



Al


Re: [math][proposal] stat package structure

Posted by "Mark R. Diggory" <md...@latte.harvard.edu>.
On Sun, 2004-03-21 at 22:40, Al Chou wrote:
> --- Phil Steitz <ph...@steitz.com> wrote:
> > I would like to propose the following repackaging of classes in the stat 
> > and distributions packages.
> > 
> > 1. Move the univariate statistical aggregates (DescriptiveStatistics, 
> > SummaryStatistics et al) into the univariate package.
> 
> +1
> 
> 
> > 2. Move distributions back into stat. I did not comment on the initial 
> > move out, but it seems wrong to me now.  The abstract classes and all 
> > implementations are probability distributions. To me these fit naturally 
> > in stat.  I can live with it as is; I just wanted to see if others are 
> > feeling the same and if so, get it fixed while we still have the chance.
> 
> +1  Here the "stat" package doubles as a "probability" package that doesn't
> exist, but in the absence of a use case for the latter, it makes sense to make
> the move you're proposing.

> 
> > 3. Create a "multivariate" subpackage and place the lonely 
> > BivariateRegression there.  It is odd to have a package with only one 
> > class, but I think the probability is 1 that we will quickly have more 
> > multivariate statistical classes (e.g. multiple regression) and I would 
> > prefer not to have to play the deprecate-move game if we can see it coming.
> 
> +1  It's like a section of a document outline that has only one subsection
> (which my English teachers in secondary schools always warned against <g>),
> except that we can easily envision other subsections coming into existence
> under that section.
> 
> 
> > 4. Similarly, I would like to create an "inference" or "test" subpackage 
> > and put TestStatistic there.
> 
> +1  I wonder if there's a better name than those two.  I see Mark voted for
> "test", but -- perhaps because I've never done much statistics -- that makes me
> think that I'm looking at a JUnit "tests" directory tree.  A quick skim through
> _NR_ chapter 14 didn't turn up any better names, though.
> 

Good point, although we avoid "test" package names for JUnit tests in
favor of the original package name of the class being tested, something
I think is very smart to do. I'm not too picky here; I just picked test
because it's shorter, though inference is more descriptive. Along the same
lines, "distributions" could be confused with the generic idea of a
"distribution" directory of packages; maybe "probability" is better
there?

Maybe it should be organized more along the following lines?

o.a.c.m.stat.probability...
o.a.c.m.stat.univariate...
o.a.c.m.stat.multivariate...
o.a.c.m.stat.inference...

> While you're moving TestStatistic, I noticed two typos in the Javadoc for
> chiSquare( double[], double[] ):  "freqeuncy" and "counds".

-- 
Mark R. Diggory
Software Developer - VDC Project
Harvard MIT Data Center
http://www.hmdc.harvard.edu


Re: [math][proposal] stat package structure

Posted by Al Chou <ho...@yahoo.com>.
--- Phil Steitz <ph...@steitz.com> wrote:
> I would like to propose the following repackaging of classes in the stat 
> and distributions packages.
> 
> 1. Move the univariate statistical aggregates (DescriptiveStatistics, 
> SummaryStatistics et al) into the univariate package.

+1


> 2. Move distributions back into stat. I did not comment on the initial 
> move out, but it seems wrong to me now.  The abstract classes and all 
> implementations are probability distributions. To me these fit naturally 
> in stat.  I can live with it as is; I just wanted to see if others are 
> feeling the same and if so, get it fixed while we still have the chance.

+1  Here the "stat" package doubles as a "probability" package that doesn't
exist, but in the absence of a use case for the latter, it makes sense to make
the move you're proposing.


> 3. Create a "multivariate" subpackage and place the lonely 
> BivariateRegression there.  It is odd to have a package with only one 
> class, but I think the probability is 1 that we will quickly have more 
> multivariate statistical classes (e.g. multiple regression) and I would 
> prefer not to have to play the deprecate-move game if we can see it coming.

+1  It's like a section of a document outline that has only one subsection
(which my English teachers in secondary schools always warned against <g>),
except that we can easily envision other subsections coming into existence
under that section.


> 4. Similarly, I would like to create an "inference" or "test" subpackage 
> and put TestStatistic there.

+1  I wonder if there's a better name than those two.  I see Mark voted for
"test", but -- perhaps because I've never done much statistics -- that makes me
think that I'm looking at a JUnit "tests" directory tree.  A quick skim through
_NR_ chapter 14 didn't turn up any better names, though.

While you're moving TestStatistic, I noticed two typos in the Javadoc for
chiSquare( double[], double[] ):  "freqeuncy" and "counds".



Al
