Posted to dev@poi.apache.org by bu...@apache.org on 2009/10/07 00:48:52 UTC
DO NOT REPLY [Bug 47950] New: No case insensitivity handling for OLE2 entry names
https://issues.apache.org/bugzilla/show_bug.cgi?id=47950
Summary: No case insensitivity handling for OLE2 entry names
Product: POI
Version: 3.5-FINAL
Platform: PC
OS/Version: Windows NT
Status: NEW
Severity: normal
Priority: P2
Component: POIFS
AssignedTo: dev@poi.apache.org
ReportedBy: trejkaz@trypticon.org
I created some test cases to test case sensitivity in OLE2 files.
@Test
public void testPoiCaseInsensitivityInMemory() throws Exception
{
    // Create directory "A" containing document "B" in an in-memory filesystem.
    POIFSFileSystem fs = new POIFSFileSystem();
    DirectoryEntry dir = fs.getRoot().createDirectory("A");
    dir.createDocument("B", new ByteArrayInputStream(new byte[] { 0, 1, 2, 3, 4, 5 }));

    // Look both entries up again using lower-case names.
    DirectoryEntry dir2 = (DirectoryEntry) fs.getRoot().getEntry("a");
    DocumentEntry doc2 = (DocumentEntry) dir2.getEntry("b");
    assertArrayEquals("Wrong data read back", new byte[] { 0, 1, 2, 3, 4, 5 },
            IOUtils.toByteArray(new DocumentInputStream(doc2)));
}
@Test
public void testPoiCaseInsensitivityAfterReadingFromStorage() throws Exception
{
    // Create directory "A" containing document "B".
    POIFSFileSystem fs = new POIFSFileSystem();
    DirectoryEntry dir = fs.getRoot().createDirectory("A");
    dir.createDocument("B", new ByteArrayInputStream(new byte[] { 0, 1, 2, 3, 4, 5 }));

    // Write the filesystem out and read it back in.
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    fs.writeFilesystem(baos);
    POIFSFileSystem fs2 = new POIFSFileSystem(new ByteArrayInputStream(baos.toByteArray()));

    // Look both entries up in the re-read filesystem using lower-case names.
    DirectoryEntry dir2 = (DirectoryEntry) fs2.getRoot().getEntry("a");
    DocumentEntry doc2 = (DocumentEntry) dir2.getEntry("b");
    assertArrayEquals("Wrong data read back", new byte[] { 0, 1, 2, 3, 4, 5 },
            IOUtils.toByteArray(new DocumentInputStream(doc2)));
}
Both of these fail when looking up "a", which POI reports as not existing, but
the comparison is supposed to be case insensitive according to available
documentation.
Specifically, [MS-CFB] has the following to say about how entries in an OLE2
directory should be compared:
(2.6.1 pg 23)
When locating an object in the compound file except for the root storage, the
directory entry name is compared using a special case-insensitive upper-case
mapping, described in Red-Black Tree.
(2.6.4 "Red-Black Tree" pg 26)
* For each UTF-16 code point, convert to upper-case with the Unicode Default
  Case Conversion Algorithm, simple case conversion variant (simple case
  foldings), with the following notes.<2>
* Unicode surrogate characters are never upper-cased, since they are
  represented by two UTF-16 code points, while the sorting relationship
  upper-cases a single UTF-16 code point at a time.
* Lowercase characters defined in a newer, later version of the Unicode
  standard can be upper-cased by an implementation that conforms to that
  later Unicode standard.
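To make the per-code-unit rule above concrete, here is a minimal sketch of how
a name could be normalised before comparison. CfbNames, toCfbUpperCase and
sameEntryName are hypothetical names, not part of POI. The sketch also assumes
that Java's Character.toUpperCase(char) is an acceptable stand-in for the
mapping the spec describes. In reality it follows whatever Unicode version the
running JVM implements and knows nothing about the exception tables mentioned
in note <2>, so treat it as an approximation only.

```java
/** Hypothetical helper for illustration; not part of POI. */
public final class CfbNames {
    private CfbNames() {}

    /**
     * Upper-cases a name one UTF-16 code unit at a time.
     * Surrogates are left untouched, as the quoted text requires.
     * Assumption: Character.toUpperCase(char) approximates the spec's mapping.
     */
    public static String toCfbUpperCase(String name) {
        char[] units = name.toCharArray();
        for (int i = 0; i < units.length; i++) {
            if (!Character.isSurrogate(units[i])) {
                units[i] = Character.toUpperCase(units[i]);
            }
        }
        return new String(units);
    }

    /** Two entry names match if their upper-cased forms are identical. */
    public static boolean sameEntryName(String a, String b) {
        return toCfbUpperCase(a).equals(toCfbUpperCase(b));
    }
}
```

With a helper along these lines, "a" and "A" would be treated as the same
entry name, which is the behaviour the two tests above expect from getEntry().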
Note <2> goes into further detail on which version of Unicode is used to
perform the folding:
(pg 39)
For Windows XP and Windows Server 2003: The compound file implementation
conforms to the Unicode 3.0.1 Default Case Conversion Algorithm, simple case
folding (http://www.unicode.org/Public/3.1-Update1/CaseFolding-4.txt) with the
following exceptions.
(table omitted for now)
For Windows Vista and Windows Server 2008: The compound files implementation
conforms to the Unicode 5.0 Default Case Conversion Algorithm, simple case
folding (http://www.unicode.org/Public/5.0.0/ucd/CaseFolding.txt) with the
following exceptions.
(table omitted for now)
References:
[MS-CFB]: Compound File Binary File Format, Revision 0.01 (Wednesday, June 18,
2008)