Posted to dev@tika.apache.org by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2015/03/11 19:15:38 UTC
[jira] [Created] (TIKA-1574) Frames in header/footer in doc files aren't extracted
Konstantin Gribov created TIKA-1574:
---------------------------------------
Summary: Frames in header/footer in doc files aren't extracted
Key: TIKA-1574
URL: https://issues.apache.org/jira/browse/TIKA-1574
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.7
Environment: linux, openjdk7/openjdk8
Reporter: Konstantin Gribov
Assignee: Konstantin Gribov
Text from frames in the header/footer is omitted by WordParser. Text from frames in the document body is extracted fine.
The same document converted to docx is extracted fully.
This may be an upstream bug; I'll dig into it and file a ticket in the POI bug tracker if that is the case.
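
One way to check the doc vs. docx difference described above is to compare the plain-text output of both extractions line by line. The following is only a minimal sketch: the class name ExtractionDiff, the method missingLines, and the sample strings are hypothetical, and it assumes the two texts were already obtained from Tika (for example with the tika-app --text option) rather than calling the parser itself.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Tika: reports text that appears in the
// docx extraction but is absent from the doc extraction.
class ExtractionDiff {

    // Returns the trimmed, non-blank lines of docxText that do not occur in docText.
    static List<String> missingLines(String docText, String docxText) {
        Set<String> seen = new HashSet<>();
        for (String line : docText.split("\\R")) {
            seen.add(line.trim());
        }
        List<String> missing = new ArrayList<>();
        for (String line : docxText.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !seen.contains(trimmed)) {
                missing.add(trimmed);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Made-up sample strings standing in for real extraction output.
        String docText = "Body frame text\nBody paragraph\n";
        String docxText = "Header frame text\nBody frame text\nBody paragraph\nFooter frame text\n";
        System.out.println(missingLines(docText, docxText));
    }
}
```

With the made-up sample strings above, the lines reported as missing are the header and footer frame texts, which is the pattern this issue describes.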
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)