Posted to dev@commons.apache.org by Warren Freitag <wa...@placebase.com> on 2006/05/18 07:24:06 UTC

[io] EndianUtils problems dealing with double values

Hi,

I've been using the EndianUtils class in Commons IO to convert an input
file containing little-endian doubles in binary format. However, I've
come across a few situations where swapDouble() returns the wrong value.
The current implementation does the endian swap by first converting the
input value to long using java.lang.Double.doubleToLongBits(). According
to the Javadoc, that method collapses every NaN bit pattern into the
single canonical NaN value (0x7ff8000000000000L), so the original bits
are lost whenever the not-yet-swapped input happens to be a NaN.
Unfortunately, I don't have a specific example handy, but the failure
rate appears to be about 1 in 10,000 for my input file of >2 million
numbers. In the failure cases, I get "NaN" as the result from
swapDouble().

The solution appears to be a simple one: change EndianUtils.java to use
doubleToRawLongBits() instead. Once I did this, my failure rate went to
0. Unlike doubleToLongBits(), doubleToRawLongBits() does not collapse
NaN values into the canonical NaN, according to the Javadoc
(http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Double.html#doubleToRawLongBits(double)).
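
A minimal sketch of the difference described above, not the actual Commons IO code. The class NanSwapDemo and its two methods are hypothetical names, and Long.reverseBytes() (JDK 1.5+) stands in for EndianUtils.swapLong() so the example has no external dependency. The input is a double whose unswapped bytes happen to form a NaN with a non-zero payload.

```java
// Hypothetical demo class; not part of Commons IO.
class NanSwapDemo {

    // Swap that goes through doubleToLongBits(), which collapses any NaN
    // to the canonical pattern 0x7ff8000000000000L before the bytes are swapped.
    static double swapViaLongBits(double value) {
        return Double.longBitsToDouble(Long.reverseBytes(Double.doubleToLongBits(value)));
    }

    // Swap that goes through doubleToRawLongBits(), which does not collapse NaNs.
    static double swapViaRawLongBits(double value) {
        return Double.longBitsToDouble(Long.reverseBytes(Double.doubleToRawLongBits(value)));
    }

    public static void main(String[] args) {
        // Bytes 7f f8 00 00 00 00 12 34, read big-endian, form a NaN with a payload.
        double asRead = Double.longBitsToDouble(0x7ff8000000001234L);

        long viaLongBits = Double.doubleToRawLongBits(swapViaLongBits(asRead));
        long viaRawBits = Double.doubleToRawLongBits(swapViaRawLongBits(asRead));

        System.out.println("doubleToLongBits path:    0x" + Long.toHexString(viaLongBits));
        System.out.println("doubleToRawLongBits path: 0x" + Long.toHexString(viaRawBits));
    }
}
```

The first path always swaps the canonical NaN bits, so the payload bytes (12 34 here) cannot survive it. For inputs that are not NaN, both paths give the same result. Note that the Javadoc for longBitsToDouble() warns that some platforms may not return a NaN with exactly the bit pattern passed in, so the second path is not guaranteed to be bit-exact everywhere.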

I would like to check in my (very simple) change, but I'm unfamiliar 
with the dev process for Commons so I felt I should share my findings 
with the mailing list first.

One other thing (related, but maybe off-topic):
I'm reading my little-endian doubles from a binary file using the 
RandomAccessFile class, but I've found that the following methods 
produce different results in a few (<2%) cases:

RandomAccessFile file = new RandomAccessFile(...);

// Case 1: read double from file, do endian conversion using my change
// to EndianUtils
file.seek(X);
double d1 = EndianUtils.swapDouble(file.readDouble());

// Case 2: read long, do endian conversion, then convert to double
file.seek(X);
double d2 = Double.longBitsToDouble(EndianUtils.swapLong(file.readLong()));

In the vast majority of cases, d1 == d2, but sometimes the precision 
gets messed up. I'm talking about minuscule amounts (e.g., 
d1=2942300.30053693 d2=2942300.3005359764) where digits below 10^-5 may 
differ. Of course, the d1 case is less efficient, because 
file.readDouble() actually reads in the value as a long, then converts 
to double (if I'm not mistaken); finally, swapDouble() converts again to 
long, does the endian conversion, then converts back to double. The d2 
case skips the redundant long-to-double conversion and is thus faster -- 
and, I assume, more precise. Has anybody else encountered loss of 
precision converting back and forth between double and long?
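
The Case 2 approach above can be sketched without Commons IO as follows. This is an illustration under assumptions, not a proposed patch: LittleEndianDoubles and its method names are hypothetical, and Long.reverseBytes() (JDK 1.5+) is used in place of EndianUtils.swapLong(). Because the bytes are swapped while they are still a long, a bit pattern that happens to look like a NaN when read big-endian is never interpreted as a double before the swap.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper; the class and method names are illustrative only.
class LittleEndianDoubles {

    // Interprets a long that was read big-endian as the bits of a little-endian double.
    static double fromBigEndianLong(long bitsAsRead) {
        return Double.longBitsToDouble(Long.reverseBytes(bitsAsRead));
    }

    // Reads 8 bytes at the given offset and decodes them as a little-endian double.
    static double readLittleEndianDouble(RandomAccessFile file, long offset) throws IOException {
        file.seek(offset);
        return fromBigEndianLong(file.readLong());
    }
}
```

With this shape, the only long-to-double conversion happens after the swap, on bits that already represent the intended value.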

Thanks for hearing me out, and I do appreciate any feedback! Keep up the 
great work, all!
-Warren

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [io] EndianUtils problems dealing with double values

Posted by Henri Yandell <fl...@gmail.com>.
Sounds like a well-found bug.

Could you open a Jira issue with this information? Partly so it
doesn't get lost, but also so that we can have a nice definition of
the issues in the next release.

http://issues.apache.org/jira/browse/IO

Thanks,

Hen
