Posted to dev@poi.apache.org by bu...@apache.org on 2011/12/25 21:11:36 UTC
DO NOT REPLY [Bug 52385] New: [REGRESSION] HPSF corrupts output when
starting file has unsupported variant props
https://issues.apache.org/bugzilla/show_bug.cgi?id=52385
Bug #: 52385
Summary: [REGRESSION] HPSF corrupts output when starting file
has unsupported variant props
Product: POI
Version: 3.8-dev
Platform: All
OS/Version: All
Status: NEW
Severity: critical
Priority: P2
Component: HPSF
AssignedTo: dev@poi.apache.org
ReportedBy: yegor@dinom.ru
Classification: Unclassified
[REGRESSION] HPSF
It looks like we have a regression caused by recent changes in HPSF: an OLE2
file becomes unreadable after write if it contains a variant property of
unsupported type. In my research the problematic variant types were 4126 and
4108. The log warnings are below:
HPSF does not yet support the variant type 4126 (unknown variant type,
000000000000101E).
HPSF does not yet support the variant type 4108 (unknown variant type,
000000000000100C).
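For reference, the two numbers in the warnings can be decoded as bit flags. The following is a minimal sketch, not HPSF code: the class and method names are hypothetical, and the constants are the standard OLE variant type values (VT_VECTOR = 0x1000, VT_LPSTR = 0x001E, VT_VARIANT = 0x000C).

```java
public class VariantTypeDecode {
    // Standard OLE variant type flag and mask values.
    static final int VT_VECTOR = 0x1000;
    static final int VT_TYPEMASK = 0x0FFF;

    /** Returns a readable name for a variant type number (hypothetical helper). */
    static String describe(int variantType) {
        int base = variantType & VT_TYPEMASK;
        String name;
        switch (base) {
            case 0x0003: name = "VT_I4"; break;
            case 0x000C: name = "VT_VARIANT"; break;
            case 0x001E: name = "VT_LPSTR"; break;
            case 0x001F: name = "VT_LPWSTR"; break;
            default: name = String.format("0x%04X", base);
        }
        // Prefix the base type when the vector flag is set.
        return ((variantType & VT_VECTOR) != 0 ? "VT_VECTOR | " : "") + name;
    }

    public static void main(String[] args) {
        System.out.println(describe(4126)); // VT_VECTOR | VT_LPSTR
        System.out.println(describe(4108)); // VT_VECTOR | VT_VARIANT
    }
}
```

So both unsupported types appear to be vectors: 4126 a vector of strings and 4108 a vector of variants.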
I was working on some improvements in HSSF and noticed Excel couldn't open the
output file. At first I thought it was my changes, but it turned out that even
a simple read-write results in unreadable output:
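// inputFile and outputFile are paths to the source and destination workbooks
FileInputStream is = new FileInputStream(inputFile);
HSSFWorkbook wb = new HSSFWorkbook(is);
is.close();
FileOutputStream os = new FileOutputStream(outputFile);
wb.write(os);
os.close();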
Try the code above against the following files from our collection of test
files and the output will be corrupted.
12843-1.xls        34775.xls              45365.xls    ContinueRecordProblem.xls         OddStyleRecord.xls
13224.xls          37684-2.xls            45365-2.xls  ex42570-20305.xls                 RangePtg.xls
14460.xls          41139.xls              46137.xls    ex44921-21902.xls                 testNames.xls
24207.xls          42464-ExpPtg-bad.xls   47034.xls    ex45978-extraLinkTableSheets.xls  XRefCalc.xls
27852.xls          42464-ExpPtg-ok.xls    47847.xls    ex46548-23133.xls                 XRefCalcData.xls
29982.xls          42844.xls              48026.xls    IndexFunctionTestCaseData.xls
30978-deleted.xls  44010-SingleChart.xls  49185.xls    IrrNpvTestCaseData.xls
32822.xls          44010-TwoCharts.xls    50939.xls    MRExtraLines.xls
Excel 2010 shows a warning when opening such files.
The problem seems to be related to OLE properties and HPSF. If I comment out
line 1218 in HSSFWorkbook then all is fine and Excel is happy to open the
output files:
// Write out our HPFS properties, if we have them
writeProperties(fs, excepts);
This is a must for 3.8-final.
Yegor
--
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org
DO NOT REPLY [Bug 52385] [REGRESSION] HPSF corrupts output when
starting file has unsupported variant props
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52385
Yegor Kozlov <ye...@dinom.ru> changed:
What |Removed |Added
----------------------------------------------------------------------------
Blocks| |52538
DO NOT REPLY [Bug 52385] [REGRESSION] HPSF corrupts output when
starting file has unsupported variant props
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52385
--- Comment #2 from Niklas Rehfeld <ni...@gmail.com> 2012-01-05 02:55:53 UTC ---
I had a look around the code, and the bug seems to be in
TypedPropertyValue.read(byte[], int)
specifically that it automatically pads the result, i.e. returns a 'padded'
offset. This is bad when reading the Heading Pairs vector (and possibly others)
in the DocumentSummaryInformation stream, as they use *unpadded* strings of the
type UnalignedLpstr
(http://msdn.microsoft.com/en-us/library/dd950621%28v=office.12%29.aspx).
I hope that this is the same bug, and not completely unrelated.
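To illustrate why a padded offset is a problem here, the following is a minimal sketch under stated assumptions, not the HPSF implementation. It assumes a simplified vector layout (a 4-byte little-endian element count, then for each string a 4-byte length followed by the bytes including the null terminator, with no padding in between). The class, the method names and the sample data are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class UnalignedVectorDemo {

    /**
     * Reads a vector of length-prefixed strings. When padEachString is true,
     * the offset is rounded up to a multiple of 4 after every string, which
     * mimics a reader that always returns a padded offset.
     */
    static List<String> readVector(byte[] data, boolean padEachString) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int offset = 0;
        int count = buf.getInt(offset);
        offset += 4;
        List<String> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int cch = buf.getInt(offset); // length including the terminator
            offset += 4;
            if (cch < 1 || (long) offset + cch > data.length) {
                throw new IllegalStateException(
                        "implausible length " + cch + " at offset " + (offset - 4));
            }
            out.add(new String(data, offset, cch - 1, StandardCharsets.US_ASCII));
            offset += cch;
            if (padEachString) {
                offset = (offset + 3) & ~3; // round up to a 4-byte boundary
            }
        }
        return out;
    }

    /** Builds a sample vector holding two unpadded strings. */
    static byte[] sample() {
        ByteBuffer buf = ByteBuffer.allocate(4 + (4 + 7) + (4 + 7))
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(2);
        for (String s : new String[] {"Sheet1", "Sheet2"}) {
            byte[] b = s.getBytes(StandardCharsets.US_ASCII);
            buf.putInt(b.length + 1);
            buf.put(b);
            buf.put((byte) 0);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        System.out.println(readVector(sample(), false));
        try {
            readVector(sample(), true);
        } catch (IllegalStateException e) {
            System.out.println("padded read failed: " + e.getMessage());
        }
    }
}
```

In this sketch the first string ends at offset 15. The padded reader skips to 16 and so reads the second length field one byte late, which is the kind of misparse that could explain the corrupted property sets.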
Nik
DO NOT REPLY [Bug 52385] [REGRESSION] HPSF corrupts output when
starting file has unsupported variant props
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52385
Yegor Kozlov <ye...@dinom.ru> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
--- Comment #4 from Yegor Kozlov <ye...@dinom.ru> 2012-02-15 07:53:08 UTC ---
Your hypothesis seems to be correct. I changed TypedPropertyValue.read(byte[],
int) to return the unpadded offset and it fixed the problem.
The fix has been committed in 1244388
Regards,
Yegor
(In reply to comment #2)
> I had a look around the code, the bug seems to be in
>
> TypedPropertyValue.read(byte[], int)
>
> in the fact that it automatically pads the result, i.e. returns a 'padded'
> offset. This is bad when reading the Heading Pairs vector (and possibly others)
> in the DocumentSummaryInformation stream, as they use *unpadded* strings of the
> type UnalignedLpstr
> (http://msdn.microsoft.com/en-us/library/dd950621%28v=office.12%29.aspx).
> I hope that this is the same bug, and not completely unrelated.
>
> Nik
DO NOT REPLY [Bug 52385] [REGRESSION] HPSF corrupts output when
starting file has unsupported variant props
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52385
--- Comment #1 from Niklas Rehfeld <ni...@gmail.com> 2012-01-03 21:47:30 UTC ---
I think this is related to (or rather, causes) bug #52337, as the returned
structure should be of type VT_VECTOR | VT_VARIANT (0x100C).
So it seems to me that the problem is in the code that reads the property sets,
rather than the writing.
Nik
DO NOT REPLY [Bug 52385] [REGRESSION] HPSF corrupts output when
starting file has unsupported variant props
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52385
Yegor Kozlov <ye...@dinom.ru> changed:
What |Removed |Added
----------------------------------------------------------------------------
Blocks| |52337
DO NOT REPLY [Bug 52385] [REGRESSION] HPSF corrupts output when
starting file has unsupported variant props
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52385
--- Comment #3 from Niklas Rehfeld <ni...@gmail.com> 2012-01-11 01:39:30 UTC ---
Created attachment 28134
--> https://issues.apache.org/bugzilla/attachment.cgi?id=28134
Diagram of the HeadingPair/DocParts TypedProperty structures
Just thought this might be useful for this bug, it shows some of the structure
of the docparts and headingpair properties, which as far as I have been able to
find, are the only ones that use unaligned strings in property sets.
All the info comes straight from MS-OSHARED (and maybe a little bit from
MS-OLEPS)
Ignore the green stuff on the left, that was from a project that I'm working
on.
Nik