Posted to user@tika.apache.org by ola nowak <ol...@gmail.com> on 2012/01/05 13:46:49 UTC

External parser in a jar file

Hi,
I've created an external parser. I added some classes to the
org.apache.tika.parser.xml package, listed my parser in the
tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser
file, and added my custom mime type to custom-mimetypes.xml. I built
everything and it works :)
Now I have a question: is there a possibility to tell Tika to use this
parser without modifying its sources? I'll be using it in Solr, so I would
like to use just the original Tika jars plus a jar with my classes.
Is this even possible?
Regards,
Alex

Re: External parser in a jar file

Posted by Nick Burch <ni...@alfresco.com>.
On Thu, 5 Jan 2012, ola nowak wrote:
> Should java -jar tika-app.jar -list-parsers list it?

Nope. The service loading isn't magic - it won't go and find random jars 
that you haven't told it about!

You'll instead need something like:

java -classpath MyParser.jar:tika-app-1.1-SNAPSHOT.jar \
    org.apache.tika.cli.TikaCLI --list-parsers

Nick
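
A quick way to check the point above, that a provider file is only seen if its jar is on the classpath, is to ask the class loader which copies of the file it can see. This is a minimal sketch using only the JDK; the class name ProviderFileCheck is made up for illustration and is not part of Tika.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Tika: lists every copy of a resource
// that the current class loader can see.
class ProviderFileCheck {

    /** Returns the URL of each visible copy of the given classpath resource. */
    static List<URL> locate(String resource) throws IOException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        // Provider file name taken from the thread: it is named after the Parser interface.
        String resource = "META-INF/services/org.apache.tika.parser.Parser";
        List<URL> hits = locate(resource);
        if (hits.isEmpty()) {
            System.out.println("No provider file visible for " + resource);
        }
        for (URL hit : hits) {
            System.out.println(hit);
        }
    }
}
```

Assuming the compiled class sits in the current directory, running it with the same classpath as the command above (plus "." for the class itself) should print one URL per jar that contributes a provider file. If MyParser.jar is missing from that list, the service loader will not see the parser either.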

Re: External parser in a jar file

Posted by ola nowak <ol...@gmail.com>.
Thanks,
I've tried that with no success :( I added a file named
org.apache.tika.parser.Parser to META-INF/services, containing only
org.apache.tika.parser.xml.myParser, and built the jar. Should java
-jar tika-app.jar -list-parsers list it?

What about my custom-mimetypes.xml?

On Thu, Jan 5, 2012 at 1:54 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

> Hi,
>
> Add a service provider file in META-INF/services in your own jar. This
> file would be named like Tika’s own, but list only your parser. SPI (the
> technique behind the whole META-INF/services mechanism) collects all those
> files on the classpath and makes all parsers listed in them available.
>
> The same happens in Lucene 4.0 (not yet released): it will be
> possible to plug in any indexing codec just by providing a JAR file with
> the correct SPI metadata.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de

RE: External parser in a jar file

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

Add a service provider file in META-INF/services in your own jar. This file
would be named like Tika's own, but list only your parser. SPI (the
technique behind the whole META-INF/services mechanism) collects all those
files on the classpath and makes all parsers listed in them available.
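
The mechanism described above can be sketched with the JDK's java.util.ServiceLoader alone. This is only an illustration under stated assumptions: DemoParser and MyDemoParser are hypothetical stand-ins for org.apache.tika.parser.Parser and a custom parser, and a temporary directory plays the role of the extra jar on the classpath. It does not show how Tika itself performs the lookup.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.ServiceLoader;

class SpiSketch {

    /** Hypothetical stand-in for org.apache.tika.parser.Parser. */
    public interface DemoParser {
        String name();
    }

    /** Hypothetical stand-in for a custom parser shipped in its own jar. */
    public static class MyDemoParser implements DemoParser {
        public MyDemoParser() { }

        @Override
        public String name() {
            return "my-demo-parser";
        }
    }

    /** Creates a classpath entry that holds only a provider file, then lets ServiceLoader discover it. */
    static List<String> discover() throws IOException {
        Path root = Files.createTempDirectory("spi-sketch");
        Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));

        // The provider file is named after the service interface and
        // lists one implementation class name per line.
        Files.write(services.resolve(DemoParser.class.getName()),
                Collections.singletonList(MyDemoParser.class.getName()),
                StandardCharsets.UTF_8);

        List<String> found = new ArrayList<>();
        // The temporary directory stands in for the extra jar on the classpath.
        try (URLClassLoader loader = new URLClassLoader(
                new URL[] { root.toUri().toURL() }, SpiSketch.class.getClassLoader())) {
            for (DemoParser parser : ServiceLoader.load(DemoParser.class, loader)) {
                found.add(parser.name());
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(discover());
    }
}
```

The interface is never told about its implementation in code. The only link between the two is the provider file, which is why a separate jar carrying that file and the parser classes can be enough once it is on the classpath.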

The same happens in Lucene 4.0 (not yet released): it will be possible
to plug in any indexing codec just by providing a JAR file with the
correct SPI metadata.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de