Posted to dev@poi.apache.org by bu...@apache.org on 2009/06/03 07:32:11 UTC
DO NOT REPLY [Bug 47304] New: WordDocument uses platform default encoding
https://issues.apache.org/bugzilla/show_bug.cgi?id=47304
Summary: WordDocument uses platform default encoding
Product: POI
Version: 3.5-dev
Platform: PC
OS/Version: Mac OS X 10.4
Status: NEW
Severity: normal
Priority: P2
Component: HDF
AssignedTo: dev@poi.apache.org
ReportedBy: jelmer@jteam.nl
When the following code is used to read the attached Word document, the text is
not read correctly on Mac OS X:
WordDocument wordDoc = new WordDocument(new FileInputStream("test.doc"));
StringWriter docTextWriter = new StringWriter();
PrintWriter printWriter = new PrintWriter(docTextWriter);
wordDoc.writeAllText(printWriter);
// flush so that all extracted text reaches the StringWriter before it is read
printWriter.flush();
System.out.println(docTextWriter.toString());
The cause is that the platform default encoding is used to decode the document
text when that text is not Unicode, whereas windows-1252 should be used.
Here is the offending code:
if(unicode)
{
    ....
}
else
{
    String sText = new String(_header, start, end-start);
    out.write(sText);
}
On Windows the platform default encoding is typically windows-1252; on OS X it
is MacRoman. To fix this,
String sText = new String(_header, start, end-start);
should be changed to
String sText = new String(_header, start, end-start, "windows-1252");
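The difference can be reproduced outside POI with a small standalone sketch. The class EncodingDemo and the helper decodeCp1252 below are hypothetical names made up for illustration and are not part of POI; the byte values are arbitrary sample data, not taken from the attached document. The sketch decodes the same bytes once with an explicit windows-1252 charset and once with the platform default, and reports whether the two results agree.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of POI.
class EncodingDemo {
    // Explicit charset, so the result does not depend on the platform default.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Decodes data[start, end) as windows-1252, mirroring the
    // (start, end - start) arguments used in the snippet above.
    static String decodeCp1252(byte[] data, int start, int end) {
        return new String(data, start, end - start, CP1252);
    }

    public static void main(String[] args) {
        // Sample bytes: in windows-1252, 0x93 and 0x94 are curly double quotes
        // and 0xE9 is a lowercase e with acute accent.
        byte[] sample = { (byte) 0x93, 'c', 'a', 'f', (byte) 0xE9, (byte) 0x94 };

        String explicit = decodeCp1252(sample, 0, sample.length);
        // This constructor uses the platform default charset, as the offending code does.
        String platform = new String(sample, 0, sample.length);

        System.out.println("platform default charset: " + Charset.defaultCharset());
        System.out.println("decodings agree: " + explicit.equals(platform));
    }
}
```

Passing a Charset object instead of the charset name is only a convenience in this sketch: that String constructor (available since Java 6) does not declare UnsupportedEncodingException, whereas the String-name overload proposed above does, so the surrounding method would need to handle or declare it.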
--
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org
DO NOT REPLY [Bug 47304] WordDocument uses platform default encoding
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=47304
--- Comment #1 from Jelmer Kuperus <je...@jteam.nl> 2009-06-02 22:33:23 PST ---
Created an attachment (id=23746)
--> (https://issues.apache.org/bugzilla/attachment.cgi?id=23746)
example