You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@poi.apache.org by bu...@apache.org on 2010/04/08 18:32:51 UTC

DO NOT REPLY [Bug 49074] New: Creating Too Many Rich Strings Causes Excel 2007 Files to Lose All Formatting

https://issues.apache.org/bugzilla/show_bug.cgi?id=49074

           Summary: Creating Too Many Rich Strings Causes Excel 2007 Files
                    to Lose All Formatting
           Product: POI
           Version: 3.0
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: HSSF
        AssignedTo: dev@poi.apache.org
        ReportedBy: brownicu@us.ibm.com


Created an attachment (id=25245)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=25245)
See the comments at the top of the attached test driver program

Background:
In POI Bug 47543, we discovered that one must apply the base cell font to the
cell itself via the cell style and not to the HSSFRichTextString object. Only
the second and subsequent fonts used in HSSFRichTextStrings should use the
HSSFRichTextString.applyFont() routine.

That approach works well up to a point, after which Excel 2003 and Excel 2007
will fail to read the font data, throwing user errors instead and showing only
the raw sheet data without any formatting.

To reproduce, run the attached Driver.java program and try to open the
generated file in Excel 2007. Excel throws an error stating that it found
"unreadable content", and all cell formatting is then lost.

Note that in the test driver program we are creating 65 rows and 10 columns of
cell data that contains footnote markers (a superscript character at the end of
each cell text). If you change the row constant to 64 rows or lower, then Excel
can read the generated content. If you use 65 rows or higher, the error
described here will occur. 

Our customers typically generate large amounts of output containing many tables
with footnotes in certain cells. When we reach the rich text limit described
here their Excel output becomes useless and we receive technical support
compaints.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 49074] Creating Too Many Rich Strings Causes Excel 2007 Files to Lose All Formatting

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=49074

--- Comment #5 from Curtis Browning <br...@us.ibm.com> 2010-07-01 15:29:20 EDT ---
Created an attachment (id=25681)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=25681)
65 rows created by POI code in the attached Driver.java

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 49074] Creating Too Many Rich Strings Causes Excel 2007 Files to Lose All Formatting

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=49074

--- Comment #3 from Curtis Browning <br...@us.ibm.com> 2010-07-01 15:22:06 EDT ---
Created an attachment (id=25679)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=25679)
Test Driver to Create the Sample Excel Files

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 49074] Creating Too Many Rich Strings Causes Excel 2007 Files to Lose All Formatting

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=49074

Nick Burch <ni...@alfresco.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #8 from Nick Burch <ni...@alfresco.com> 2010-09-21 06:54:33 EDT ---
I've been comparing the BiffViewer dumps from poi, excel and open office, and I
can't spot anything wrong with what POI is doing. There are some differences,
but generally when POI and Excel differ, open office does it the same way as
POI, and when POI and OpenOffice differ, POI has done it similar to excel...

OpenOffice can read the poi 65 row file without error, so we're not doing
anything too crazy

The only thing I can suggest is you ask around all of your clients, until you
find one who have a support contract with Microsoft. Then, get the 65 row POI
file submitted to Microsoft, and see if they can shed any light on what Excel
dislikes about the file

As it stands, I can't figure out what POI is doing wrong to be able to fix it
:/

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 49074] Creating Too Many Rich Strings Causes Excel 2007 Files to Lose All Formatting

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=49074

Nick Burch <ni...@alfresco.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Nick Burch <ni...@alfresco.com> 2010-05-05 12:40:36 EDT ---
Is 65x10 the smallest file you can generate that gives the problem, or can you
find something smaller?

I'd suggest you create four files:
* One via POI, which is the smallest you can that gives the problem
* One via POI, which is like the above, but the biggest without issues
* One via Excel, containing the same data as the POI one that fails
* One via Excel, containing the same data as the POI one that works

Next, use BiffViewer to try to see how the files differ. Especially interesting
will be between the two poi ones, and between the poi one that breaks and the
excel equivalent. Use the poi+excel pair that both work to help identify
harmless things that are done differently.

Hopefully from this analysis, you can identify what's going wrong with the file
that POI generates

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 49074] Creating Too Many Rich Strings Causes Excel 2007 Files to Lose All Formatting

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=49074

Curtis Browning <br...@us.ibm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW

--- Comment #2 from Curtis Browning <br...@us.ibm.com> 2010-07-01 15:20:38 EDT ---
Hi Nick,

In response to your questions:

1. Yes, the 65 row x 10 column file is the smallest file that I can create in
POI that will demonstrate the problem.

2. I have attached 4 sample Excel workbooks, two created via POI
(test_export_64_rows.xls and test_export_65_rows.xls), and two created via the
attached driver.vbs script (vba_64_rows.xls and vba_65_rows.xls). The two
64-row files and the two 65-row files contain the exact same cell content.

3. I tried following the steps you requested I take with the BiffViewer
application, but the two files created via the VBA script cause BiffViewer to
crash with a General Protection Fault. I also tried to manually create the test
file in Excel (slow and cumbersome as it was), but the same GPF occurred when
opening the file in BiffViewer.

Please have a look at the attachments and the relevant POI code. Hopefully it
will help you to get to the root cause of this problem.

Regards,
Curtis

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 49074] Creating Too Many Rich Strings Causes Excel 2007 Files to Lose All Formatting

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=49074

--- Comment #7 from Curtis Browning <br...@us.ibm.com> 2010-07-01 15:30:56 EDT ---
Created an attachment (id=25683)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=25683)
65 rows created by driver.vbs script

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 49074] Creating Too Many Rich Strings Causes Excel 2007 Files to Lose All Formatting

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=49074

--- Comment #6 from Curtis Browning <br...@us.ibm.com> 2010-07-01 15:30:30 EDT ---
Created an attachment (id=25682)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=25682)
64 rows created by driver.vbs script

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 49074] Creating Too Many Rich Strings Causes Excel 2007 Files to Lose All Formatting

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=49074

--- Comment #4 from Curtis Browning <br...@us.ibm.com> 2010-07-01 15:28:53 EDT ---
Created an attachment (id=25680)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=25680)
64 rows created by POI code in the attached Driver.java

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org