Posted to dev@poi.apache.org by bu...@apache.org on 2016/01/14 10:15:35 UTC
[Bug 58858] New: hidden characters not removed
https://bz.apache.org/bugzilla/show_bug.cgi?id=58858
Bug ID: 58858
Summary: hidden characters not removed
Product: POI
Version: unspecified
Hardware: PC
Status: NEW
Severity: critical
Priority: P2
Component: HWPF
Assignee: dev@poi.apache.org
Reporter: sebastian.a.aguirre@gmail.com
Created attachment 33442
--> https://bz.apache.org/bugzilla/attachment.cgi?id=33442&action=edit
sample doc file to test
After reading the file and converting it to a String, the hidden characters are
still present in the extracted text; they are not removed.
This happens in XWPF as well.
To read the file I'm using a very simple method:
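import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
// Open the .doc file and extract its full text with HWPF
File file = new File("file.doc");
FileInputStream fis = new FileInputStream(file);
HWPFDocument doc = new HWPFDocument(fis);
WordExtractor ex = new WordExtractor(doc);
String toReturn = ex.getText();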
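The same thing happens when using XWPF, with equally simple code:
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
// fis is a FileInputStream opened as above
XWPFDocument doc = new XWPFDocument(fis);
XWPFWordExtractor ex = new XWPFWordExtractor(doc);
String toReturn = ex.getText();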
I'm attaching a file you can use as a sample.
In Word, you can show or hide the hidden characters with Ctrl+Shift+8.
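To make the expected behavior concrete, here is a minimal sketch that does not use POI at all. The Run class and the visibleText method are hypothetical names made up for illustration, and it assumes each piece of text carries a "hidden" flag. I would expect the extractor output to look like what visibleText returns, that is, without the hidden parts.

```java
import java.util.Arrays;
import java.util.List;

class HiddenTextFilter {
    /** Hypothetical stand-in for a formatted piece of text; not a POI class. */
    static final class Run {
        final String text;
        final boolean hidden;

        Run(String text, boolean hidden) {
            this.text = text;
            this.hidden = hidden;
        }
    }

    /** Concatenates only the runs that are not flagged as hidden. */
    static String visibleText(List<Run> runs) {
        StringBuilder sb = new StringBuilder();
        for (Run run : runs) {
            if (!run.hidden) {
                sb.append(run.text);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Run> runs = Arrays.asList(
                new Run("Hello ", false),
                new Run("secret ", true),   // hidden, should be skipped
                new Run("world", false));
        System.out.println(visibleText(runs)); // prints "Hello world"
    }
}
```

With the attached file, getText() currently behaves as if the hidden flag were ignored.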
Thanks.
--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org
[Bug 58858] hidden characters not removed
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58858
Dominik Stadler <do...@gmx.at> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All
[Bug 58858] hidden characters not removed
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58858
Hamza Gobir <hg...@googlemail.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hgobir@googlemail.com