You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@poi.apache.org by bu...@apache.org on 2009/07/16 18:44:08 UTC

DO NOT REPLY [Bug 47543] New: Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

https://issues.apache.org/bugzilla/show_bug.cgi?id=47543

           Summary: Using many HSSFRichTextString cells causes Excel 2003
                    and lower to crash
           Product: POI
           Version: 3.5-dev
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: critical
          Priority: P1
         Component: HSSF
        AssignedTo: dev@poi.apache.org
        ReportedBy: curtis@spss.com
                CC: curtis@spss.com


Creating too many HSSFRichTextString cells causes Excel 2003 and lower to crash
when changing the color of the cells and then saving the Excel file.

Run the attached Driver program, which creates a file that demonstrates the
Excel 2003 errors.

 1. Run the program, then open the resulting file in Excel 2003 or lower.
 2. Select all cells by clicking on the corner cell
 3. Press the color toolbar button to change the color of all cells
 4. Press the save button in Excel and it will crash.

Note: The row and column count is only 10 in the attached example, but if you
increse the ROWS and COLS constants to 100 in the makeData()method below, then
Excel 2003 will crash after step 3 above.

This is a critical bug for our product, with very large customers complaining
vociferously. We are running POI 3.1 now but I tested this with 3.5 beta and
get the same results.

We often generate large workbooks that contain tables with many footnotes,
hence the need for HSSFRichTextString. To work around the HSSFRichTextString
problem for other cells, we have been forced to fallback to using the
deprecated version of HSSFCell.setCellValue() that takes a plain String instead
of the recommended HSSFRichTextString.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #12 from Yegor Kozlov <ye...@dinom.ru>  2009-07-28 22:48:05 PST ---
Created an attachment (id=24057)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24057)
vba-100.xls - a 10x10 grid produced by driver.vbs

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543

Yegor Kozlov <ye...@dinom.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #18 from Yegor Kozlov <ye...@dinom.ru> 2009-09-13 07:56:46 PDT ---
Finally, I updated the javadoc for HSSFRichTextString  to reflect my research
on this issue.

Yegor

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #9 from Curtis Browning <cu...@spss.com>  2009-07-28 07:53:30 PST ---
Thanks Nick,

Are you asking me to track down the reason why the file created in Excel 2003
does not open in BiffViewer, or is that something that you will pursue?

My guess is that it has something to do with the CONTINUE record in the SST.
When I was writing Excel export code in C++ many years ago, I found that there
are documentation problems with the way that CONTINUE records are handled by
Excel. It took a long time and much effort to figure out what Excel really does
when rich text strings are continued in the SST, but we finally got it working
correctly.

Regards,
Curtis

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #6 from Curtis Browning <cu...@spss.com>  2009-07-27 09:05:19 PST ---
Created an attachment (id=24043)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24043)
Generated manually in Excel - same content as test program

This file was (painstakingly) created in Excel 2003 by manually editing each
cell. It has the same visual content as the file generated by the POI test
program, but it doesn't crash Excel 2003 when the cell colors are changed and
the file is saved.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543


Curtis Browning <cu...@spss.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW




--- Comment #7 from Curtis Browning <cu...@spss.com>  2009-07-27 09:13:57 PST ---
Hello, 

In response to your first question:
No, this is not a problem with many rich strings in Excel 2003. 

See the two attached Excel documents which contain the same visual information
(a 10x10 cell matrix with superscripts on each cell value). The one named
test_poi_export.xls was generated using the attached test program. When one
highlights all the cells, changes their text color, then presses Save in Excel
2003, Excel 2003 will crash. When one follows the same steps with the second
attached Excel workbook (from_excel.xls), the text color changes correctly and
pressing the Save button in Excel 2003 does not crash the application.

I attempted to open these files in BiffViewer, but whereas the one created by
POI opens fine, the second file ("from_excel.xls", created manually in Excel
2003), crashes the BiffViewer application with a general protection fault. I
don't care that the one created in Excel 2003 crashes BiffViewer, what we need
is for POI to generate a file that does not crash Excel 2003.

Regards,
Curtis

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #14 from Yegor Kozlov <ye...@dinom.ru>  2009-07-31 12:13:03 PST ---
Created an attachment (id=24075)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24075)
Driver2.java - a version of Driver.java that produces output modifiable by
Excel 2003

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #11 from Yegor Kozlov <ye...@dinom.ru>  2009-07-28 22:46:52 PST ---
Created an attachment (id=24056)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24056)
driver.vbs - a VBA script for testing

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #3 from Curtis Browning <cu...@spss.com>  2009-07-20 07:32:13 PST ---
I would like to note also that in the 10x10 example, Excel 2003 will also crash
if one attempts to save the Excel file immediately after changing the cell
colors.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #1 from Curtis Browning <cu...@spss.com>  2009-07-16 09:47:31 PST ---
Created an attachment (id=23995)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=23995)
Demonstrates the problem

I thought this was attached with the first submittal, but it didn't seem to
take it.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #16 from Yegor Kozlov <ye...@dinom.ru>  2009-07-31 12:16:34 PST ---
Curtis,

Can you try the suggested workaround and let me know whether it works for you
or not. 

Yegor

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #15 from Yegor Kozlov <ye...@dinom.ru>  2009-07-31 12:13:50 PST ---
Created an attachment (id=24076)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24076)
driver2-10000.xls - 100x100 grid produced by Driver2.java

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #10 from Yegor Kozlov <ye...@dinom.ru>  2009-07-28 22:45:53 PST ---
Guys,

Creating a text .xls file manually is  not a good idea - Excel creates a lot of
garbage in this case. I created a simple VBA script that does a job similar to
the attached Driver.java. The output is always readable by BiffViewer AND
tolerant to changing font colors, no matter how large the grid is - 10x10,
100x100, etc. For all tests produced by the VBA script Excel allows changing
the font color. 

I compared POI- and VBA- versions and didn't see anything suspicious about
CONTINUE records - all looks sane.  

I won't be able to look into it more earlier than this weekend. If anyone wants
to track this bug, I suggest the following strategy:

 - modify Driver.java and driver.vbs to produce equivalent output
 - for both POI- and VBA- code use a blank template XLS to minimize the
differences. 
 - start with a small grid, say 5x5. 
 - start tweaking POI to produce output as much close to VBA as possible. For
testing purposes you may need to temporary modify record classes.

I won't be surprised if the culprit is a record absolutely irrelevant to SST -
you never know it with Excel :)

Regards,
Yegor

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543


Nick Burch <ni...@torchbox.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO




--- Comment #4 from Nick Burch <ni...@torchbox.com>  2009-07-27 08:04:06 PST ---
Just to check - if you create a file like the one that crashes Excel 2003 using
excel itself, does that open fine in excel 2003? (i.e. is it a problem with
excel and lots of rich text formatting, or just in how poi writes out the rich
text formatting?)

If it is a poi issue, any chance you could create two files:
* the smallest file produced by poi with rich text that crashes excel 2003
* a file with the same contents produced by excel

(If you have time, it'd be great if you could use BiffViewer to try to spot
what excel and poi have done differently between these two files)

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #8 from Nick Burch <ni...@torchbox.com>  2009-07-28 06:32:03 PST ---
Thanks for the info and files Curtis

Step one will be to get BiffViewer to be able to open the excel file without
crashing. 

Once we've sorted that, we can then compare the two files at the record level,
and see what Excel has done differently about encoding the rich text. 

Finally, when we know that, we can make a stab at changing how poi saves large
amounts to rich text to closer match excel

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #2 from Yegor Kozlov <ye...@dinom.ru>  2009-07-19 23:40:27 PST ---
The problem confirmed in current trunk. 

Changing font color for a 10x10 grid works fine but crashes Excel for a 100x100
grid. 

Excel 2007 has no problems with such 100x100 files, the problem is observed in
Excel 2003 and lower. 

Regards,
Yegor

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #5 from Curtis Browning <cu...@spss.com>  2009-07-27 09:03:22 PST ---
Created an attachment (id=24042)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=24042)
Generated by the attached test program using POI 

This file was generated by the test program attached to this report. If one
follows the steps in the report, Excel will crash when the save button is
pressed.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #13 from Yegor Kozlov <ye...@dinom.ru>  2009-07-31 12:11:06 PST ---
Quite an interesting bug.

Here is the problem in a nutshell:

For reach text containing N runs Excel saves font the first run in the cell
style and subsequent N-1 runs override the font in the cell style.

Compare two snippets that produce equivalent output:

1. POI approach: 

 HSSFCell hssfCell = row.createCell(idx);
 //rich text consists of two runs
 HSSFRichTextString richString = new HSSFRichTextString( "Bug 47543" );
 richString.applyFont( 0, 4, font1 );
 richString.applyFont( 4, 8, font2 );
 hssfCell.setCellValue( richString );

2. Excel approach

 //create a cell style and assign the first font to it
 HSSFCellStyle style = workbook.createCellStyle();
 style.setFont(font1);

 HSSFCell hssfCell = row.createCell(idx);
 hssfCell.setCellStyle(style);

 //rich text consists of one run overriding the cell style
 HSSFRichTextString richString = new HSSFRichTextString( cellValue );
 richString.applyFont( 4, 8, font2 );
 hssfCell.setCellValue( richString );

It turns out that the way you create HSSFRichTextString cells is important, (1)
results in Excel crash while (2) always seems to work. 

See Driver2 - a version of the attached Driver program that produces valid
output readable and modifiable by Excel 2003. 

driver2-10000.xls is a sample output of 100x100 grid.

Regards,
Yegor

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 47543] Using many HSSFRichTextString cells causes Excel 2003 and lower to crash

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47543





--- Comment #17 from Curtis Browning <cu...@spss.com>  2009-08-03 08:59:17 PST ---
Thanks very much Yegor, the suggested change did indeed address the problem. It
makes sense also given the way that Excel stores font information for rich
strings in the SST. 

I would suggest also that the POI documentation on using rich text strings be
updated to reflect your findings regarding the correct usage. When I looked
back at some of the internet questions and postings related to problems using
POI with large numbers of HSSFRichTextString objects, I now suspect that the
same issue may be involved.

Thanks again for your help, 
Curtis

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org