Posted to user@commons.apache.org by Yannick <ya...@yahoo.com> on 2008/07/17 05:52:41 UTC

Question about TestUtils.chiSquare

Hello,

I'm wondering why the chi-square test has this API:

chiSquareTest(double[] expected, long[] observed) 

-
I would think the type of expected and observed should be the same (for
example, both double, like in the situation I have, where my series
consists of purchase amounts on a website under two different
conditions). 

Am I missing something trivial? Why is there no chiSquareTest(double[] expected, double[] observed) test?

Thanks!

Yannick



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org


Re: [math] Re: Question about TestUtils.chiSquare

Posted by Phil Steitz <ph...@steitz.com>.
Luc Maisonobe wrote:
> First of all, please prefix your message with the name of the commons
> sub-project in brackets when using this list. It is shared by many
> sub-projects and it helps people setting up filters for their topics of
> interest.
>
>
> Yannick wrote:
>   
>> Hello,
>>
>> I'm wondering why the chi-square test has this API:
>>
>> chiSquareTest(double[] expected, long[] observed) 
>>     
>
> As far as I understand, what is computed here is a chi-square
> goodness-of-fit test, also known as Pearson's chi-square test. The
> semantics of these arguments were originally tied to counts of actual
> observations, which can only be integer values.
>   
Correct. See the reference cited in the Javadoc:
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35f.htm
> However, looking at the implementation, it appears that the long values
> are converted to double almost everywhere, so it would be possible, at
> least from a computational point of view, to accept double values
> directly. I am not sure about the "meaning" of this as I am not a
> statistician.
>   
The conversion to double is only there to facilitate computing the
chi-square statistic. I am not sure either what non-integral observed
count data would mean.
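
To illustrate the conversion, here is a minimal sketch of Pearson's
statistic, the sum over bins of (observed - expected)^2 / expected. It is
not the Commons Math source; the class and method names are hypothetical
and only the parameter types mirror the signature discussed in this thread.

```java
// Hypothetical sketch, not the Commons Math implementation.
final class ChiSquareSketch {

    /** Pearson's statistic: sum of (observed - expected)^2 / expected. */
    public static double chiSquare(double[] expected, long[] observed) {
        if (expected.length != observed.length || expected.length < 2) {
            throw new IllegalArgumentException("need two or more bins of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < observed.length; i++) {
            if (expected[i] <= 0.0 || observed[i] < 0) {
                throw new IllegalArgumentException("expected must be > 0, observed >= 0");
            }
            // The integer count is widened to double only for the arithmetic.
            double dev = (double) observed[i] - expected[i];
            sum += dev * dev / expected[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] expected = {10.0, 10.0, 10.0};
        long[] observed = {8, 12, 10};
        System.out.println(chiSquare(expected, observed));
    }
}
```

The observed values stay integer counts at the API level; the cast is an
implementation detail of the arithmetic.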

What is the practical application?

Phil
> If you would like to see this behaviour changed and new signatures added
> to the existing ones, please open a feature enhancement ticket in our
> JIRA issue tracking system at http://issues.apache.org/jira/browse/MATH.
> You'll have to register with the system before being allowed to open a ticket.
>
> Luc
>
>   
>> -
>> I would think the type of expected and observed should be the same (for
>> example, both double, like in the situation I have, where my series
>> consists of purchase amounts on a website under two different
>> conditions). 
>>
>> Am I missing something trivial? Why is there no chiSquareTest(double[] expected, double[] observed) test?
>>
>> Thanks!
>>
>> Yannick


[math] Re: Question about TestUtils.chiSquare

Posted by Luc Maisonobe <Lu...@free.fr>.
First of all, please prefix your message with the name of the commons
sub-project in brackets when using this list. It is shared by many
sub-projects and it helps people setting up filters for their topics of
interest.


Yannick wrote:
> Hello,
> 
> I'm wondering why the chi-square test has this API:
> 
> chiSquareTest(double[] expected, long[] observed) 

As far as I understand, what is computed here is a chi-square
goodness-of-fit test, also known as Pearson's chi-square test. The
semantics of these arguments were originally tied to counts of actual
observations, which can only be integer values.

However, looking at the implementation, it appears that the long values
are converted to double almost everywhere, so it would be possible, at
least from a computational point of view, to accept double values
directly. I am not sure about the "meaning" of this as I am not a
statistician.

If you would like to see this behaviour changed and new signatures added
to the existing ones, please open a feature enhancement ticket in our
JIRA issue tracking system at http://issues.apache.org/jira/browse/MATH.
You'll have to register with the system before being allowed to open a ticket.

Luc

> 
> -
> I would think the type of expected and observed should be the same (for
> example, both double, like in the situation I have, where my series
> consists of purchase amounts on a website under two different
> conditions). 
> 
> Am I missing something trivial? Why is there no chiSquareTest(double[] expected, double[] observed) test?
> 
> Thanks!
> 
> Yannick


Re: Question about TestUtils.chiSquare

Posted by Ted Dunning <te...@gmail.com>.

I doubt that you really want a chi^2 test for this in any case.  You could
use it if you binned the amounts so that you are comparing counts, but you
would only find out if there was a difference, not whether there was an
interesting difference.
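
As a rough sketch of what binning could look like, the helper below turns
raw amounts into per-bin counts. The class name, method name and bin edges
are hypothetical choices for illustration; picking sensible edges for real
purchase data is a separate question.

```java
import java.util.Arrays;

// Hypothetical helper: turns continuous amounts into integer bin counts.
final class BinningSketch {

    /**
     * Counts the amounts falling into each half-open bin [edges[i], edges[i+1]).
     * Amounts outside [edges[0], edges[last]) are rejected rather than dropped.
     */
    public static long[] binCounts(double[] amounts, double[] edges) {
        long[] counts = new long[edges.length - 1];
        for (double amount : amounts) {
            int bin = -1;
            for (int i = 0; i < counts.length; i++) {
                if (amount >= edges[i] && amount < edges[i + 1]) {
                    bin = i;
                    break;
                }
            }
            if (bin < 0) {
                throw new IllegalArgumentException("amount outside bin range: " + amount);
            }
            counts[bin]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] amounts = {5.0, 12.5, 40.0, 19.99, 75.0};
        double[] edges = {0.0, 20.0, 50.0, 100.0}; // assumed edges, for illustration only
        System.out.println(Arrays.toString(binCounts(amounts, edges)));
    }
}
```

The resulting long[] is the kind of count data a chi-square test expects.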

Most likely, what you want is an unpaired, one-sided t-test because the real
question is whether one option or the other will produce higher mean revenue
per visitor.
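
For reference, a minimal sketch of the statistic behind such a test. I am
assuming Welch's unequal-variance form here, which is my choice and only one
way to do an unpaired test; the class and method names are hypothetical.
Turning the statistic into a one-sided p-value additionally needs the
Student t distribution, which this sketch leaves out.

```java
// Hypothetical sketch of Welch's unpaired t statistic (an assumed variant).
final class WelchTSketch {

    public static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** Unbiased sample variance (divides by n - 1). */
    public static double variance(double[] x) {
        double m = mean(x);
        double ss = 0.0;
        for (double v : x) {
            ss += (v - m) * (v - m);
        }
        return ss / (x.length - 1);
    }

    /** t = (mean(a) - mean(b)) / sqrt(var(a)/n_a + var(b)/n_b). */
    public static double tStatistic(double[] a, double[] b) {
        double se = Math.sqrt(variance(a) / a.length + variance(b) / b.length);
        return (mean(a) - mean(b)) / se;
    }

    /** Welch-Satterthwaite approximation of the degrees of freedom. */
    public static double degreesOfFreedom(double[] a, double[] b) {
        double va = variance(a) / a.length;
        double vb = variance(b) / b.length;
        return (va + vb) * (va + vb)
                / (va * va / (a.length - 1) + vb * vb / (b.length - 1));
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4, 5};   // made-up revenue per visitor, option A
        double[] b = {2, 4, 6, 8, 10};  // made-up revenue per visitor, option B
        System.out.println("t = " + tStatistic(a, b));
        System.out.println("df = " + degreesOfFreedom(a, b));
    }
}
```

A clearly negative t here would point toward option B having the higher
mean; the sign convention follows the order of the arguments.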

You might also have some use for a Kolmogorov-Smirnov test for difference in
distribution.  It would be much more appropriate for (nearly) continuous
values such as purchase amounts.
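
A minimal sketch of the two-sample Kolmogorov-Smirnov statistic, the largest
gap between the two empirical distribution functions, is below. The class
and method names are hypothetical, and the sketch computes only the
statistic D, not a p-value.

```java
import java.util.Arrays;

// Hypothetical sketch: two-sample Kolmogorov-Smirnov statistic only.
final class KsSketch {

    /** D = max over v of |F_x(v) - F_y(v)|, with F the empirical CDFs. */
    public static double ksStatistic(double[] x, double[] y) {
        double[] a = x.clone();
        double[] b = y.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        int i = 0;
        int j = 0;
        double d = 0.0;
        while (i < a.length && j < b.length) {
            // Step past every value <= v in both samples so ties are handled together.
            double v = Math.min(a[i], b[j]);
            while (i < a.length && a[i] <= v) {
                i++;
            }
            while (j < b.length && b[j] <= v) {
                j++;
            }
            double gap = Math.abs((double) i / a.length - (double) j / b.length);
            d = Math.max(d, gap);
        }
        return d;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 4, 5, 6};
        System.out.println(ksStatistic(x, y));
    }
}
```

Unlike the chi-square approach, this works on the raw amounts directly, so
no bin edges have to be chosen.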


Yannick-19 wrote:
> 
> Hello,
> 
> I'm wondering why the chi-square test has this API:
> 
> chiSquareTest(double[] expected, long[] observed) 
> 
> -
> I would think the type of expected and observed should be the same (for
> example, both double, like in the situation I have, where my series
> consists of purchase amounts on a website under two different
> conditions). 
> 
> Am I missing something trivial? Why is there no chiSquareTest(double[]
> expected, double[] observed) test?
> 
> Thanks!
> 
> Yannick
