Posted to dev@tika.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/11/03 15:44:00 UTC

[jira] [Updated] (TIKA-2491) Cannot use TikaConfig

     [ https://issues.apache.org/jira/browse/TIKA-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated TIKA-2491:
--------------------------------
    Attachment: tika-config.xml

> Cannot use TikaConfig
> ---------------------
>
>                 Key: TIKA-2491
>                 URL: https://issues.apache.org/jira/browse/TIKA-2491
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.16
>            Reporter: Markus Jelsma
>             Fix For: 1.17
>
>         Attachments: tika-config.xml
>
>
> I need to use a custom tika-config.xml in Nutch, which supports it, but I can't get it to work.
> This is how Nutch gets the parser:
> Parser parser = tikaConfig.getParser(MediaType.parse(mimeType));
> When no custom config is specified, the config is:
> new TikaConfig(this.getClass().getClassLoader());
> When I specify a custom config, it is:
> tikaConfig = new TikaConfig(conf.getResource(customConfFile));
> getParser always returns null with a custom config file, and there are no errors or exceptions. The config itself is fine: it fixed the encoding problem in a parser outside of Nutch (thanks again, Timothy), but I need to get it to work in Nutch too.
> Our external project does:
> AutoDetectParser parser = new AutoDetectParser(tikaConfig); parser.parse(..);
> and it just works! If I do this in Nutch, however, nothing is passed through the content handlers and the parser result is completely empty.
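
Since the custom-config path above fails without any error, one way to narrow it down independently of Tika is to confirm that the URL handed to TikaConfig actually resolves and to list the parser classes the file declares. The following is only a minimal sketch using the JDK's XML parser; the class name ConfigCheck, the method declaredParsers, and the sample XML are hypothetical and are not part of Nutch or Tika. It assumes the usual tika-config.xml layout of <parser class="..."/> elements inside <parsers>.

```java
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch or Tika.
class ConfigCheck {

    /** Returns the "class" attribute of every <parser> element in the config at the given URL. */
    public static List<String> declaredParsers(URL configUrl) throws Exception {
        // A resource lookup that finds nothing typically yields null rather than an exception.
        if (configUrl == null) {
            throw new IllegalArgumentException("config resource did not resolve to a URL");
        }
        try (InputStream in = configUrl.openStream()) {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            // Matches <parser> elements only; the enclosing <parsers> element has a different tag name.
            NodeList nodes = doc.getElementsByTagName("parser");
            List<String> classes = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                classes.add(((Element) nodes.item(i)).getAttribute("class"));
            }
            return classes;
        }
    }

    public static void main(String[] args) throws Exception {
        // Sample config written to a temp file purely for illustration.
        String xml = "<properties><parsers>"
                + "<parser class=\"org.apache.tika.parser.DefaultParser\"/>"
                + "</parsers></properties>";
        Path tmp = Files.createTempFile("tika-config", ".xml");
        Files.write(tmp, xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(declaredParsers(tmp.toUri().toURL()));
    }
}
```

In Nutch, the URL passed in would be the value of conf.getResource(customConfFile) from the description above. A null URL or an empty list would point at the config lookup rather than at Tika's parser selection.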


