Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/11/03 18:58:00 UTC
[jira] [Comment Edited] (TIKA-2491) Cannot use TikaConfig
[ https://issues.apache.org/jira/browse/TIKA-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238177#comment-16238177 ]
Tim Allison edited comment on TIKA-2491 at 11/3/17 6:57 PM:
------------------------------------------------------------
[~gagravarr] solved this:
bq. I think you need to give both the classloader and the config file for your setup
bq. Can you try this constructor: https://tika.apache.org/1.16/api/org/apache/tika/config/TikaConfig.html#TikaConfig-java.net.URL-java.lang.ClassLoader-
bq. With something like new TikaConfig(conf.getResource(customConfFile), this.getClass().getClassLoader());
Nick, it seems strange that we allow omitting the classloader with the regular TikaConfig(), but require it when a config file is specified. Should we do something like this:
{noformat}
private static ServiceLoader serviceLoaderFromDomElement(Element element, ClassLoader loader) throws TikaConfigException {
    if (serviceLoaderElement != null) {
        ...some stuff...
+       if (loader == null) {
+           loader = ServiceLoader.getContextClassLoader();
+       }
        serviceLoader = new ServiceLoader(loader, loadErrorHandler, initializableProblemHandler, dynamic);
    } else if (loader != null) {
        serviceLoader = new ServiceLoader(loader);
    } else {
        serviceLoader = new ServiceLoader();
    }
{noformat}
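For illustration only, here is a minimal JDK-only sketch of the same null-fallback idea. The class name LoaderFallbackSketch and the helper resolveLoader are hypothetical and not part of Tika. It assumes the thread context classloader is a reasonable default, with the system classloader as a last resort.

```java
public class LoaderFallbackSketch {

    // Hypothetical helper (not a Tika API): choose a usable ClassLoader
    // when the caller did not supply one.
    public static ClassLoader resolveLoader(ClassLoader requested) {
        if (requested != null) {
            // An explicitly supplied loader always wins.
            return requested;
        }
        // Assumption: the thread context classloader is the preferred default.
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        if (context != null) {
            return context;
        }
        // Last resort: the system (application) classloader.
        return ClassLoader.getSystemClassLoader();
    }

    public static void main(String[] args) {
        ClassLoader explicit = LoaderFallbackSketch.class.getClassLoader();
        System.out.println("explicit loader kept: " + (resolveLoader(explicit) == explicit));
        System.out.println("fallback found: " + (resolveLoader(null) != null));
    }
}
```

With a fallback like this, a caller that passes only a config file would still get a loader instead of silently proceeding without one.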
> Cannot use TikaConfig
> ---------------------
>
> Key: TIKA-2491
> URL: https://issues.apache.org/jira/browse/TIKA-2491
> Project: Tika
> Issue Type: Bug
> Affects Versions: 1.16
> Reporter: Markus Jelsma
> Fix For: 1.17
>
> Attachments: tika-config.xml
>
>
> I need to use a custom tika-config.xml in Nutch, which has support for it, but I can't get it to work.
> This is how Nutch gets the parser:
> Parser parser = tikaConfig.getParser(MediaType.parse(mimeType));
> When no custom config is specified, the config is:
> new TikaConfig(this.getClass().getClassLoader());
> When I specify a custom config, it is:
> tikaConfig = new TikaConfig(conf.getResource(customConfFile));
> getParser always returns null with a custom config file. There are no errors or exceptions. The config itself is fine: it fixed the encoding problem in a parser outside of Nutch (thanks again Timothy), but I need to get it to work in Nutch too.
> Our external project does:
> AutoDetectParser parser = new AutoDetectParser(tikaConfig); parser.parse(..);
> and it just works! If I do this in Nutch, however, nothing is passed through the content handlers and the parser result is completely empty.