Posted to dev@tika.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2017/01/11 16:54:49 UTC
[jira] [Commented] (TIKA-2192) Extract embedded files from headers, footers, footnotes, etc from docx/m
[ https://issues.apache.org/jira/browse/TIKA-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15818813#comment-15818813 ]
Hudson commented on TIKA-2192:
------------------------------
SUCCESS: Integrated in Jenkins build tika-2.x #194 (See [https://builds.apache.org/job/tika-2.x/194/])
TIKA-2192 (tallison: rev e02084cc64c5a825dae6e16853c5dac3cbb55f46)
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/XWPFWordExtractorDecorator.java
> Extract embedded files from headers, footers, footnotes, etc from docx/m
> ------------------------------------------------------------------------
>
> Key: TIKA-2192
> URL: https://issues.apache.org/jira/browse/TIKA-2192
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
> Fix For: 2.0, 1.15
>
>
> While working on an alternate SAX parser for docx/docm, I found that we're not currently extracting embedded documents from headers, footers, footnotes, endnotes or comments. We should fix this in our classic DOM parser.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)