Posted to dev@tika.apache.org by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/10/18 21:44:07 UTC

[jira] [Updated] (TIKA-819) Make Option to Exclude Embedded Files' Text for Text Content

     [ https://issues.apache.org/jira/browse/TIKA-819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-819:
-----------------------------------
    Fix Version/s:     (was: 1.11)
                   1.12

> Make Option to Exclude Embedded Files' Text for Text Content
> ------------------------------------------------------------
>
>                 Key: TIKA-819
>                 URL: https://issues.apache.org/jira/browse/TIKA-819
>             Project: Tika
>          Issue Type: New Feature
>          Components: general
>    Affects Versions: 1.0
>         Environment: Windows-7 + JDK 1.6 u26
>            Reporter: Albert L.
>             Fix For: 1.12
>
>
> It would be nice to be able to exclude the text of embedded files from a document's text content.
> For example, if I have a DOCX with an embedded PPTX, I would like an option that keeps the PPTX text out of the result when I ask for the text content of the DOCX. In other words, it would be nice to have the option to get text content *only* from the DOCX instead of from the DOCX+PPTX.
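
The requested behaviour can be illustrated with a small, self-contained sketch. This is not Tika code and does not use Tika's API: the `Doc` class, the `EmbeddedTextSketch` class, and the `includeEmbedded` flag are hypothetical names invented here to model a container document (the DOCX) holding an embedded document (the PPTX). It only shows what the option would mean for the extracted text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model (not Tika's API): a document has its own text
// plus zero or more embedded documents.
final class Doc {
    final String name;
    final String ownText;
    final List<Doc> embedded;

    Doc(String name, String ownText, Doc... embedded) {
        this.name = name;
        this.ownText = ownText;
        this.embedded = Arrays.asList(embedded);
    }
}

final class EmbeddedTextSketch {
    // Returns the text of doc; text of embedded documents is appended
    // only when includeEmbedded is true (the option the issue asks for).
    static String extractText(Doc doc, boolean includeEmbedded) {
        List<String> parts = new ArrayList<>();
        collect(doc, includeEmbedded, parts);
        return String.join("\n", parts);
    }

    private static void collect(Doc doc, boolean includeEmbedded, List<String> parts) {
        parts.add(doc.ownText);
        if (!includeEmbedded) {
            return; // container text only: skip everything embedded
        }
        for (Doc child : doc.embedded) {
            collect(child, true, parts); // recurse into nested embedded files
        }
    }

    public static void main(String[] args) {
        Doc pptx = new Doc("slides.pptx", "Slide text");
        Doc docx = new Doc("report.docx", "Report body", pptx);

        System.out.println(extractText(docx, true));  // DOCX + PPTX text
        System.out.println(extractText(docx, false)); // DOCX text only
    }
}
```

With the flag set to `false`, only the container's own text is returned, which is the DOCX-only result described in the issue. How such an option would be exposed in Tika itself is left open by the issue text.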



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)