Posted to user@poi.apache.org by "Allison, Timothy B." <ta...@mitre.org> on 2016/06/29 12:52:50 UTC
xlsx somewhat recently switched to Scientific notation for long sequences of digits?
All,
On https://issues.apache.org/jira/browse/TIKA-2025, a Tika user noted that, at least for xlsx, what used to be rendered as a long sequence of digits (e.g. 340229177292566) is now being extracted as scientific notation (3.40229E+14). This new behavior mimics Excel more closely, but is there an easy/obvious way for us at the Tika level to revert back to extracting the full sequence of digits or do I have to look into this at the POI level?
Thank you.
Best,
Tim
RE: xlsx somewhat recently switched to Scientific notation for long sequences of digits?
Posted by "Allison, Timothy B." <ta...@mitre.org>.
As Aeham Abushwashi pointed out on TIKA-2025, this was caused by the improvement/closer alignment to Excel's spec in org.apache.poi.ss.usermodel.ExcelGeneralNumberFormat.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58471
and
http://svn.apache.org/viewvc?diff_format=h&view=revision&revision=1706971
Short of redoing custom formatting at the Tika level, any recommendations?
RE: xlsx somewhat recently switched to Scientific notation for long sequences of digits?
Posted by "Allison, Timothy B." <ta...@mitre.org>.
Got it. I realize there's a double under the hood for all numbers in POI and Excel, and I agree that you can't have a 16-digit numeric in Excel; that would have to be stored as a string/text cell.
The question has more to do with a change in formatting for numerics with fewer than 16 digits.
With poi-3.13, FormatTrackingHSSFListener's formatNumberDateCell(number) for these numbers yielded "340229177292566".
With poi-3.15-beta1, we're getting "3.40229E+14". Again, to be fair, "3.40229E+14" is exactly what Excel displays if the columns are of a certain width, so in some ways this is progress.
The question: is there an easy way for us to get the old behavior?
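For illustration only, here is a minimal sketch of what "redoing the formatting" downstream could look like once the raw double of a cell is in hand. It uses only the JDK, not POI or Tika; the class and method names (PlainDigits, toPlainDigits, toScientific) are hypothetical, and the scientific form only approximates what Excel's General format displays.

```java
import java.math.BigDecimal;
import java.util.Locale;

public class PlainDigits {

    // Hypothetical helper (not a POI or Tika API): render a double as a
    // plain digit sequence, with no exponent and no trailing ".0".
    static String toPlainDigits(double value) {
        // BigDecimal.valueOf goes through Double.toString, so it keeps the
        // shortest decimal form that identifies the double.
        return BigDecimal.valueOf(value).stripTrailingZeros().toPlainString();
    }

    // Hypothetical helper: a rough stand-in for the narrow-column display,
    // with five fraction digits in the mantissa. Unlike Excel's General
    // format, this keeps trailing zeros in the mantissa.
    static String toScientific(double value) {
        return String.format(Locale.ROOT, "%.5E", value);
    }

    public static void main(String[] args) {
        double cellValue = 340229177292566d; // the example from TIKA-2025
        System.out.println(toPlainDigits(cellValue)); // 340229177292566
        System.out.println(toScientific(cellValue));  // 3.40229E+14
    }
}
```

Both strings come from the same stored double, so the difference between the 3.13 and 3.15-beta1 output is purely one of rendering.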
-----Original Message-----
From: Javen O'Neal [mailto:javenoneal@gmail.com]
Sent: Wednesday, June 29, 2016 11:13 AM
To: POI Users List <us...@poi.apache.org>
Subject: Re: xlsx somewhat recently switched to Scientific notation for long sequences of digits?
Excel and POI don't make a distinction between double/decimal and int. Does Excel make any guarantees that doubles won't have precision issues?
16-digit credit cards are not storable as 32-bit ints, but require 64-bit longs.
Re: xlsx somewhat recently switched to Scientific notation for long sequences of digits?
Posted by Javen O'Neal <ja...@gmail.com>.
Excel and POI don't make a distinction between double/decimal and int. Does
Excel make any guarantees that doubles won't have precision issues?
16-digit credit cards are not storable as 32-bit ints, but require 64-bit
longs.
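As a rough illustration of the precision point, the sketch below checks whether a given digit sequence survives being stored as a double, and whether it fits in a 32-bit int. It uses only the JDK; the class and method names (DoublePrecision, roundTrips, fitsInInt) are made up for this example. A double has a 53-bit significand, so integers up to 2^53 (9007199254740992) are exact; Excel itself documents 15 significant digits of number precision.

```java
public class DoublePrecision {

    // Hypothetical helper: true if the long value comes back unchanged
    // after being stored as a double. Intended for values well inside the
    // long range; the cast saturates near Long.MAX_VALUE.
    static boolean roundTrips(long digits) {
        return (long) (double) digits == digits;
    }

    // Hypothetical helper: true if the value fits in a 32-bit signed int.
    static boolean fitsInInt(long digits) {
        return digits >= Integer.MIN_VALUE && digits <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long fifteenDigits = 340229177292566L;  // the example from TIKA-2025
        long sixteenNines = 9999999999999999L;  // 16 digits, above 2^53

        System.out.println(roundTrips(fifteenDigits)); // true
        System.out.println(roundTrips(sixteenNines));  // false
        System.out.println(fitsInInt(fifteenDigits));  // false
    }
}
```

The 15-digit value from the report is therefore stored exactly, which is why recovering the full digit sequence from the double is possible at all.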