You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by bu...@apache.org on 2005/08/17 01:41:25 UTC

DO NOT REPLY [Bug 36215] New: - [Math] HypergeometricDistributionImpl cumulativeProbability calculation overflown

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG�
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=36215>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND�
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=36215

           Summary: [Math] HypergeometricDistributionImpl
                    cumulativeProbability calculation overflown
           Product: Commons
           Version: Nightly Builds
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Math
        AssignedTo: commons-dev@jakarta.apache.org
        ReportedBy: microarray@gmail.com


Hi, for all that might use this class:

several things I found when using this class to calculate the
cumulative probability. I attached my code FYI. three things:

1. when I used my code to calculate the cumulativeProbability(50) of
5000 200 100 (Population size, number of successes, sample size),
result was greater than 1 (1.0000000000134985);
2. when I calculated cumulativeProbability(50) and
cumulativeProbability(51) for the distribution 5000 200 100, I got the
same results, but it should have been different;
2. the cumulativeProbability(x) returns "for this distribution, X,
P(X<=x)", but most of the time (at least in my case) what I care about
is the upper tail (X>=x). based on the above findings, I can't simply
use 1-cumulativeProbability(x-1) to get what I want.

here's what I think might be related to the problem: since the
cumulativeProbability is calculating the lower tail (X<=x), a
distribution like above often has this probability very close to 1;
thus it's difficult to record a number that = 1-1E-50 'cause all you
can do is record sth like 0.9999..... and further digits will be
rounded. to avoid this, I suggest adding a new function to calculate
upper tail or change this to calculate x in a range like (n<=x<=m), in
addition to fix the overflow of the current function.

thank you for your patience to get here. I'm a newbie but I've asked
Java experts in our lab about this. looking into the source code really
isn't up for me......hope someone can fix it, :) BTW I'm using cygwin under
WinXP pro SP2, with Java SDK 1.4.2_09 build b05, and the commons-math I used is
both the 1.0 and the nightly build of 8-15-05. 

the code:
-------------------
import org.apache.commons.math.distribution.HypergeometricDistributionImpl;

class HyperGeometricProbability {

    public static void main(String args[]) {

	if(args.length != 4) {
	    
	    System.out.println("USAGE: java HyperGeometricProbabilityCalc [population]
[numsuccess] [sample] [overlap]");
	    
	}
	else {
	    
	    String population = args[0];
	    String numsuccess = args[1];
	    String sample = args[2];
	    String overlap = args[3];
	    
	    int populationI = Integer.parseInt(population);
	    int numsuccessI = Integer.parseInt(numsuccess);
	    int sampleI = Integer.parseInt(sample);
	    int overlapI = Integer.parseInt(overlap);
	    
	    HypergeometricDistributionImpl hDist = new
HypergeometricDistributionImpl(populationI, numsuccessI, sampleI);
	    
	    double raw_probability = 1.0;
	    double cumPro = 1.0;
	    double real_cumPro = 1.0;

	    try {
		if (0 < overlapI && 0 < numsuccessI && 0 < sampleI) {
		    raw_probability = hDist.probability(overlapI);
		    cumPro = hDist.cumulativeProbability(overlapI - 1.0);
		    real_cumPro = 1.0 - cumPro;
			System.out.println("cumulative probability=" + cumPro + "\t" + "raw
probability=" + raw_probability + "\t" + "real cumulative probability=" + "\t" +
real_cumPro);
		    }
	    }
	    catch (Exception e) {
		System.err.println("BAD PROBABILITY CALCULATION");
	    }
	    
	}
	
    }
    
}
----------------------------------

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org