Posted to user@nutch.apache.org by Bai Shen <ba...@gmail.com> on 2011/09/26 14:49:42 UTC

Can't retrieve Tika Parser for mime-type

When I run the parse command, I keep getting the following error, "Can't
retrieve Tika Parser for mime-type"

How do I add all of the Tika parsers to Nutch?  I thought they were already
part of it.

Re: Can't retrieve Tika Parser for mime-type

Posted by lewis john mcgibbney <le...@gmail.com>.
Fred,

Please start another thread for this discussion; that way you are more likely
to get your question addressed.

On Mon, Sep 26, 2011 at 6:25 PM, Fred Zimmerman <wf...@nimblebooks.com> wrote:

> Basic question:
>
> I have Nutch crawling and sending documents to Solr for indexing.  Now when
> I get the Solr answer set, I want to go get all the documents at once and
> append them into a single HTMLish document.  Does Nutch have the full text
> of the documents stored somewhere already? Can I just "fetch" the docs from
> the Nutch local store if I have the URLs provided by Solr? How?
>
> Fred
>



-- 
*Lewis*

Re: Can't retrieve Tika Parser for mime-type

Posted by Fred Zimmerman <wf...@nimblebooks.com>.
Basic question:

I have Nutch crawling and sending documents to Solr for indexing.  Now when
I get the Solr answer set, I want to go get all the documents at once and
append them into a single HTMLish document.  Does Nutch have the full text
of the documents stored somewhere already? Can I just "fetch" the docs from
the Nutch local store if I have the URLs provided by Solr? How?

Fred

Re: Can't retrieve Tika Parser for mime-type

Posted by Bai Shen <ba...@gmail.com>.
Currently I'm using the Nutch defaults.  I haven't changed anything.

AFAIK, Tika can parse JS, so why would the parser not be found?

Also, I believe the defaults have parse-js configured to parse the JS as
well, correct?

On Tue, Sep 27, 2011 at 9:11 AM, lewis john mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> Try using parse-js for your JavaScript files.
>
> I can't help if I don't know what other files you're having an error parsing,
> sorry.
>
> Lewis
>

Re: Can't retrieve Tika Parser for mime-type

Posted by lewis john mcgibbney <le...@gmail.com>.
Try using parse-js for your JavaScript files.

I can't help if I don't know what other files you're having an error parsing,
sorry.

Lewis

On Tue, Sep 27, 2011 at 2:09 PM, Bai Shen <ba...@gmail.com> wrote:

> The main ones I'm seeing this with are JavaScript files.  There have been
> others, but I don't remember which.
>



-- 
*Lewis*

Re: Can't retrieve Tika Parser for mime-type

Posted by Bai Shen <ba...@gmail.com>.
The main ones I'm seeing this with are JavaScript files.  There have been
others, but I don't remember which.

On Mon, Sep 26, 2011 at 12:51 PM, lewis john mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> What types of files are you trying to parse?
>
> Have you configured Nutch to use parse-tika in nutch-site.xml?
>

Re: Can't retrieve Tika Parser for mime-type

Posted by lewis john mcgibbney <le...@gmail.com>.
What types of files are you trying to parse?

Have you configured Nutch to use parse-tika in nutch-site.xml?
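
For anyone checking the same thing: Nutch chooses which plugins to load from
the plugin.includes property, a regular expression naming the plugins to
include, so a parser such as parse-tika or parse-js is only used if that
expression matches its id. The Python sketch below is one way to check this
outside Nutch. The sample XML and its plugin.includes value are illustrative
assumptions (not taken from this thread), the helper names are hypothetical,
and Python's re module is assumed to behave like Java's regex for simple
alternations like these.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative nutch-site.xml content. The plugin.includes value below is an
# assumption made for this example, not a value quoted from the thread.
SAMPLE_SITE_XML = """<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>
</configuration>
"""


def plugin_includes(xml_text):
    """Return the plugin.includes value from Hadoop-style config XML, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == "plugin.includes":
            return prop.findtext("value", default="").strip()
    return None


def enabled(pattern, plugin_ids):
    """Map each plugin id to whether the whole id matches the pattern."""
    regex = re.compile(pattern)
    return {pid: regex.fullmatch(pid) is not None for pid in plugin_ids}


if __name__ == "__main__":
    pattern = plugin_includes(SAMPLE_SITE_XML)
    for pid, ok in enabled(pattern, ["parse-tika", "parse-js", "parse-html"]).items():
        print(pid, "enabled" if ok else "not enabled")
```

If parse-js shows up as not enabled, extending the expression (for example
parse-(html|tika|js)) in nutch-site.xml is the usual change. Which plugins the
defaults include depends on the Nutch version, so check nutch-default.xml
rather than relying on this sample.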

On Mon, Sep 26, 2011 at 1:49 PM, Bai Shen <ba...@gmail.com> wrote:

> When I run the parse command, I keep getting the following error, "Can't
> retrieve Tika Parser for mime-type"
>
> How do I add all of the Tika parsers to Nutch?  I thought they were already
> part of it.
>



-- 
*Lewis*