Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2016/02/23 16:27:18 UTC

[jira] [Commented] (TIKA-1867) Tika external parsers cannot be turned off without patching the tika-app-XX.jar

    [ https://issues.apache.org/jira/browse/TIKA-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159042#comment-15159042 ] 

Nick Burch commented on TIKA-1867:
----------------------------------

You should be able to exclude the CompositeExternalParser with a ~5 line Tika Config file, which requires no patching and no extra jars. Just use the default parser, but with a parser exclude for that one parser.

See http://tika.apache.org/1.12/configuring.html for more on how to configure Tika, including an example of how to disable just one parser in config.
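
As a rough sketch of what such a config could look like (not tested against this exact setup, and assuming the parser lives in the org.apache.tika.parser.external package, as the resource path quoted in the issue suggests):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <!-- Load every parser that DefaultParser would normally discover... -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <!-- ...except the external one (package name is an assumption) -->
      <parser-exclude class="org.apache.tika.parser.external.CompositeExternalParser"/>
    </parser>
  </parsers>
</properties>
```

Saved under a name of your choosing (tika-config.xml here is only a placeholder), it should be possible to pass the file to the app with --config=tika-config.xml ahead of the other options, which leaves the tika-app jar itself untouched.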

> Tika external parsers cannot be turned off without patching the tika-app-XX.jar
> -------------------------------------------------------------------------------
>
>                 Key: TIKA-1867
>                 URL: https://issues.apache.org/jira/browse/TIKA-1867
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.11
>            Reporter: Roman Kratochvil
>
> The CompositeExternalParser calls ExternalParsersFactory.create(), which always uses the configuration from org/apache/tika/parser/external/tika-external-parsers.xml. This introduces a performance regression, because parser initialization checks for the presence of external commands (ffmpeg, exiftool), and that takes time.
> Unfortunately, there is no way to turn off this functionality without patching the tika-app JAR: one has to either change tika-external-parsers.xml or remove the whole CompositeExternalParser from the list of services in /META-INF/services/org.apache.tika.parser.Parser.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)