Posted to pylucene-dev@lucene.apache.org by Atsuo Ishimoto <is...@gembook.org> on 2009/12/03 18:00:56 UTC

Building Apache Tika module

Hello,

I'm trying to build an Apache Tika 0.5 module with JCC 2.5. The module
built successfully with the following command.

    python.exe -m jcc.__main__ --shared --jar ./tika-core-0.5.jar\
               --package java.lang java.lang.System  java.io.File\
                 java.io.FileInputStream java.io.InputStreamReader\
                 java.lang.Runtime\
               --python tika --build --install

But the following Java method is not exported to Python.

class: org.apache.tika.parser.AutoDetectParser

    public void parse(
            InputStream stream, ContentHandler handler,
            Metadata metadata, ParseContext context)
            throws IOException, SAXException, TikaException {
    ...
    }

How can I generate the Tika library so that this method is included?

Regards,
-- 
Atsuo Ishimoto
Mail: ishimoto@gembook.org
Blog: http://d.hatena.ne.jp/atsuoishimoto/
Twitter: atsuoishimoto

Re: Building Apache Tika module

Posted by Andi Vajda <va...@apache.org>.
On Fri, 4 Dec 2009, Atsuo Ishimoto wrote:

> On Fri, Dec 4, 2009 at 2:54 AM, Andi Vajda <va...@apache.org> wrote:
>> In your example above, it looks like you're missing one or more --package
>> statements to let JCC generate wrappers for SAXException and ContentHandler.
>
> Thank you, Andi. Now it works.
>
> In my example, org.xml.sax.SAXException was generated and I stupidly assumed
> JCC automagically generated org.xml.sax.ContentHandler for me too.
>
> Thanks for your help.

Excellent!
You're welcome.

Andi..

Re: Building Apache Tika module

Posted by Atsuo Ishimoto <at...@gmail.com>.
On Fri, Dec 4, 2009 at 2:54 AM, Andi Vajda <va...@apache.org> wrote:
> In your example above, it looks like you're missing one or more --package
> statements to let JCC generate wrappers for SAXException and ContentHandler.

Thank you, Andi. Now it works.

In my example, org.xml.sax.SAXException was generated and I stupidly assumed
JCC automagically generated org.xml.sax.ContentHandler for me too.

Thanks for your help.
-- 
Atsuo Ishimoto
Mail: ishimoto@gembook.org
Blog: http://d.hatena.ne.jp/atsuoishimoto/
Twitter: atsuoishimoto

Re: Building Apache Tika module

Posted by Andi Vajda <va...@apache.org>.
On Fri, 4 Dec 2009, Atsuo Ishimoto wrote:

> I'm trying to build an Apache Tika 0.5 module with JCC 2.5. The module
> built successfully with the following command.
>
>    python.exe -m jcc.__main__ --shared --jar ./tika-core-0.5.jar\
>               --package java.lang java.lang.System  java.io.File\
>                 java.io.FileInputStream java.io.InputStreamReader\
>                 java.lang.Runtime\
>               --python tika --build --install
>
> But the following Java method is not exported to Python.
>
> class: org.apache.tika.parser.AutoDetectParser
>
>    public void parse(
>            InputStream stream, ContentHandler handler,
>            Metadata metadata, ParseContext context)
>            throws IOException, SAXException, TikaException {
>    ...
>    }
>
> How can I generate the Tika library so that this method is included?

See http://lucene.apache.org/pylucene/jcc/documentation/readme.html#use

JCC will generate wrappers for public methods if all classes involved 
(return type, parameters, exceptions) are in the set of classes that JCC 
could be generating wrappers for.

In your example above, it looks like you're missing one or more --package 
statements to let JCC generate wrappers for SAXException and ContentHandler.

Without including these packages, JCC will skip any method referring to 
classes in them. Letting JCC generate wrappers for these classes does not 
mean that it will; a --package statement only tells JCC that it can include 
classes in this package in the transitive closure of all dependencies on the 
classes you actually requested be wrapped by listing them individually or 
via a --jar statement.

Andi..
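
The rule Andi describes can be illustrated with a small toy model in
Python. This is only a sketch and not JCC's actual code: the function
names are hypothetical, and the REQUESTED set is an assumption made up
for illustration (it stands in for classes requested by name or via
--jar, and does not reflect what JCC really saw in this build). The
point is that a method is exported only when every type in its
signature is either a requested class or lives in a package that was
allowed with --package.

```python
# Toy model of the rule described in this thread; NOT JCC's real implementation.

def package_of(class_name):
    """Return the package part of a fully qualified Java class name."""
    return class_name.rpartition(".")[0]


def is_wrappable(class_name, requested, packages):
    """A class may be wrapped if it was requested by name (or via a jar)
    or if its package was allowed with --package."""
    return class_name in requested or package_of(class_name) in packages


def method_is_exported(signature_types, requested, packages):
    """A method is exported only if every parameter, return and
    exception type in its signature may be wrapped."""
    return all(is_wrappable(t, requested, packages) for t in signature_types)


# Types in the parse(...) signature quoted earlier (the return type is void).
PARSE_TYPES = [
    "java.io.InputStream",
    "org.xml.sax.ContentHandler",
    "org.apache.tika.metadata.Metadata",
    "org.apache.tika.parser.ParseContext",
    "java.io.IOException",
    "org.xml.sax.SAXException",
    "org.apache.tika.exception.TikaException",
]

# ASSUMPTION: an illustrative set of classes treated as already requested.
REQUESTED = {
    "java.io.InputStream",
    "java.io.IOException",
    "org.apache.tika.metadata.Metadata",
    "org.apache.tika.parser.ParseContext",
    "org.apache.tika.exception.TikaException",
}

# Without org.xml.sax the method is skipped; allowing the package fixes that.
exported_before = method_is_exported(PARSE_TYPES, REQUESTED, {"java.lang"})
exported_after = method_is_exported(
    PARSE_TYPES, REQUESTED, {"java.lang", "org.xml.sax"}
)
```

In this model, adding org.xml.sax to the allowed packages is what flips
parse(...) from skipped to exported, which matches the advice above to
add a --package statement covering SAXException and ContentHandler.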