You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@poi.apache.org by bu...@apache.org on 2013/03/05 04:33:37 UTC

[Bug 54636] New: getNumericCellValue return different result for visually equal numbers

https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

            Bug ID: 54636
           Summary: getNumericCellValue return different result for
                    visually equal numbers
           Product: POI
           Version: 3.9
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HSSF
          Assignee: dev@poi.apache.org
          Reporter: yujunliang@hotmail.com
    Classification: Unclassified

Created attachment 30017
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=30017&action=edit
File to illustrate the bug, please read it with POI and debug the Cell.

Please see attached spreadsheet, 

Here is the content of the spreadsheet,


    A    
1    97.916%    
2    97.916%    


Cell A1 was generated by earlier version of jasper library, A2 was typed in by
myself,


when reading it using POI-3.8 3.9, here are the values returned,


Cell A1 : 0.97915999999999

Cell A2 : 0.97916


Any idea?

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

--- Comment #1 from Mark B <ma...@tiscali.co.uk> ---
I could be wrong but do doubt that this is a bug with the API because it is the
format applied to the cells that determines what the user sees when viewing the
workbook using Excel. Please take a look at the HSSFDataFormatter class as it
contains methods that will allow you to use POI to format the cells contents
appropriately.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

Nick Burch <ap...@gagravarr.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #2 from Nick Burch <ap...@gagravarr.org> ---
This is to be expected. Excel stores the values in the file as floating point
numbers, and applies formatting rules to control how they look to the end user.
You should either use something like DataFormatter to get strings based on the
format rules, or specify the number of significant digits when comparing the
numbers.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

--- Comment #10 from Mark B <ma...@tiscali.co.uk> ---
Using Excel to convert the file from Binary to OOXML is not of course
infallible but I do find it useful as it makes it so simple to examine the
files contents. Also, I think that the data will be preserved quite well,
problems and all, as the process relies entirely upon Excel itself.

On a similar note, will you have to support both file formats - binary and
OOXML? If so, it is possible to accomplish this using a single codebase as
there is a common parent class for both the HSSF and XSSF streams. It is
becoming more common for developers to write code that uses the classes in the
ss.usermodel package - Workbook, Sheet, Row, Cell, etc - safe in the knowledge
that it will handle either file format successfully.

Finally, I cannot really comment on including your code into the API. The
better place to ask this question would be the dev list.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

--- Comment #7 from Mark B <ma...@tiscali.co.uk> ---
Now I understand the question completely - I thought you were asking why POI
returned a numeric value for one of these cells rather than a nicely formatted
sting similar to the one the user sees when viewing the workbook using Excel.
Nick's response should, I think, be accorded greater weight. To my mind, this
problem is caused by the difficulties encountered representing a floating point
number and I will explain why I do think this.
The older binary file format is quite hard to interrogate without using a tool
like POI. As a result, I used Excel to convert your test file into OOXML and
saved it with the extension.xlsx. These are simply zipped archives of various
xml files and it is quite trivial to look at their contents directly.
Opening the OOXML based Excel archive and drilling down to the xml markup for
the first sheet reveals this fragment;
<sheetData>
   <row r="1" spans="1:3">
      <c r="A1" s="1"><v>0.97915999999999992</v></c>
      <c r="B1" t="s"><v>0</v></c>
      <c r="C1" s="2"/>
   </row>
   <row r="2" spans="1:3">
      <c r="A2" s="2"><v>0.97916000000000003</v></c>
      <c r="B2" t="s"><v>1</v></c>
      <c r="C2" s="2"/>
   </row>
      <row r="3" spans="1:3">
      <c r="A3" s="2"><v>0.97916234499999999</v></c>
      <c r="B3" t="s"><v>2</v></c>
   </row>
</sheetData>

As you can no dobt intuit, this fragment contains the data for the sheet
arranged into rows. Each row contains child elements that define the markup for
the cells on the row and the key here are the numerical values contained within
the v - value - elements. As you can see, these are all flaoting point values
with a significant number of digits following the decimal point but all
display, when the sheet is viewed using Excel, exactly as you originally
described (I will attach the converted .xlsx workbook for you to see for
yourself).
This suggests - to me at least - that you are seeing some sort of interaction
between the way Excel itself stores floating point values and the way they are
handled/represented internally. Consequently, I do not regard this as being a
bug within POI as it is simply reporting the values it finds within the
workbook files. All in my opinion of course and I could be wrong.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

Yujun Liang <yu...@hotmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |---

--- Comment #5 from Yujun Liang <yu...@hotmail.com> ---
I understand floating point and I understand it return the first value as
0.97915999999999, what puzzled me was why it doesn't do it for cell A2 which is
exactly the same number? Can you explain that part? Have you opened the
attachment?

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

Yujun Liang <yu...@hotmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |INVALID

--- Comment #9 from Yujun Liang <yu...@hotmail.com> ---
(In reply to comment #8)
> Created attachment 30022 [details]
> Test file converted into OOXML format.

Thank you Mark. You did answer my question, as well as taught me to save as
xlsx format to examine the older spreadsheet. If I had known that before I
wouldn't have created this bug report. 

I agree this is not a POI bug. jxl uses native api so it can handle this
situation, I understand POI can't do that.

I am new to POI so I didn't understand the subtle difference in the storage,
your example well illustrates it for me. Thanks.

I wrote some POI wrapper so it allows me to do this,

workbook.sheet(0).cell("A1")
workbook.sheet("sheet1").cell("A2")
workbook.sheet("sheet1").column("AT")
workbook.sheet("sheet1").row("1") // row("1") is getRow(0)

and workbook1.equals(workbook2) will compare sheet by sheet, cell by cell.
That's how I found out the spreadsheet generated by older jasper library is not
same as the one generated by the later jasper which using newer POI. 

Do you want to incorporate this API into POI library?

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

Nick Burch <ap...@gagravarr.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |INVALID

--- Comment #4 from Nick Burch <ap...@gagravarr.org> ---
I'd strongly suggest you go and read the wikipedia page on floating point
numbers and binary representations, and possibly review some tutorials / CS101
talks on the same topic. Come back if after all that you still have queries on
how it works!

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

--- Comment #8 from Mark B <ma...@tiscali.co.uk> ---
Created attachment 30022
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=30022&action=edit
Test file converted into OOXML format.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

Yujun Liang <yu...@hotmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |---

--- Comment #3 from Yujun Liang <yu...@hotmail.com> ---
Can you explain so I understand why it returns different result?

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

--- Comment #6 from Yujun Liang <yu...@hotmail.com> ---
Created attachment 30020
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=30020&action=edit
Modified file.

I added one more cell A3, value = 97.9162345%, format = "#,##0.000%"

So the file contents becomes,

           A             Display value with focus   Reading with POI
1     97.916%             97.916%                   0.97915999999999   
2     97.916%             97.916%                   0.97916 
3     97.916%             97.9162345%               0.979162345

Any idea why cell A2 is not 0.97915999999999 by POI?

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 54636] getNumericCellValue return different result for visually equal numbers

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=54636

Yujun Liang <yu...@hotmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org