Posted to user@nutch.apache.org by Bai Shen <ba...@gmail.com> on 2011/09/26 14:49:42 UTC
Can't retrieve Tika Parser for mime-type
When I run the parse command, I keep getting the following error, "Can't
retrieve Tika Parser for mime-type"
How do I add all of the Tika parsers to Nutch? I thought they were already
part of it.
Re: Can't retrieve Tika Parser for mime-type
Posted by lewis john mcgibbney <le...@gmail.com>.
Fred,
Please start another thread for this discussion; that way you are more likely
to get your question addressed.
On Mon, Sep 26, 2011 at 6:25 PM, Fred Zimmerman <wf...@nimblebooks.com> wrote:
> Basic question:
>
> I have Nutch crawling and sending documents to Solr for indexing. Now when
> I get the Solr answer set, I want to go get all the documents at once and
> append them into a single HTMLish document. Does Nutch have the full text
> of the documents stored somewhere already? Can I just "fetch" the docs from
> the Nutch local store if I have the URLs provided by Solr? How?
>
> Fred
>
--
*Lewis*
Re: Can't retrieve Tika Parser for mime-type
Posted by Fred Zimmerman <wf...@nimblebooks.com>.
Basic question:
I have Nutch crawling and sending documents to Solr for indexing. Now when
I get the Solr answer set, I want to go get all the documents at once and
append them into a single HTMLish document. Does Nutch have the full text
of the documents stored somewhere already? Can I just "fetch" the docs from
the Nutch local store if I have the URLs provided by Solr? How?
Fred
Re: Can't retrieve Tika Parser for mime-type
Posted by Bai Shen <ba...@gmail.com>.
Currently I'm using the Nutch defaults. I haven't changed anything.
AFAIK, Tika can parse JS, so why would the parser not be found?
Also, I believe the defaults have parse-js configured to parse the JS as
well, correct?
On Tue, Sep 27, 2011 at 9:11 AM, lewis john mcgibbney <
lewis.mcgibbney@gmail.com> wrote:
> Try using parse-js for your JavaScript files.
>
> I can't help if I don't know what other files you're having an error parsing,
> sorry.
>
> Lewis
Re: Can't retrieve Tika Parser for mime-type
Posted by lewis john mcgibbney <le...@gmail.com>.
Try using parse-js for your JavaScript files.
I can't help if I don't know what other files you're having an error parsing,
sorry.
Lewis
On Tue, Sep 27, 2011 at 2:09 PM, Bai Shen <ba...@gmail.com> wrote:
> The main ones I'm seeing this with are JavaScript files. There have been
> others, but I don't remember which.
--
*Lewis*
Re: Can't retrieve Tika Parser for mime-type
Posted by Bai Shen <ba...@gmail.com>.
The main ones I'm seeing this with are JavaScript files. There have been
others, but I don't remember which.
On Mon, Sep 26, 2011 at 12:51 PM, lewis john mcgibbney <
lewis.mcgibbney@gmail.com> wrote:
> What type of files are you trying to parse?
>
> Have you configured Nutch to use parse-tika in nutch-site.xml?
Re: Can't retrieve Tika Parser for mime-type
Posted by lewis john mcgibbney <le...@gmail.com>.
What type of files are you trying to parse?
Have you configured Nutch to use parse-tika in nutch-site.xml?
On Mon, Sep 26, 2011 at 1:49 PM, Bai Shen <ba...@gmail.com> wrote:
> When I run the parse command, I keep getting the following error, "Can't
> retrieve Tika Parser for mime-type"
>
> How do I add all of the Tika parsers to Nutch? I thought they were already
> part of it.
>
--
*Lewis*
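For readers who want to check the nutch-site.xml question raised above, here is a minimal sketch in Python. It is an illustration under stated assumptions and does not come from anyone in this thread. It assumes that the parser plugins are enabled through the plugin.includes property and that its value is a regular expression matched against the whole plugin id. The XML content, the plugin list, and the helper names plugin_includes and is_enabled are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical nutch-site.xml content. The plugin list is an illustrative
# assumption, not the configuration used by anyone in this thread.
NUTCH_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(html|tika|js)|index-(basic|anchor)</value>
  </property>
</configuration>
"""


def plugin_includes(xml_text):
    """Return the plugin.includes value, or None if the property is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "plugin.includes":
            return prop.findtext("value", "").strip()
    return None


def is_enabled(plugin_id, includes):
    """True if plugin_id fully matches the plugin.includes expression.

    Assumption: the expression is applied to the whole plugin id.
    """
    return includes is not None and re.fullmatch(includes, plugin_id) is not None


if __name__ == "__main__":
    includes = plugin_includes(NUTCH_SITE_XML)
    for plugin in ("parse-tika", "parse-js"):
        state = "enabled" if is_enabled(plugin, includes) else "NOT enabled"
        print(plugin, state)
```

To run the same check against a real installation, read the contents of conf/nutch-site.xml into a string and pass it to plugin_includes. If the property is missing there, the value would normally come from nutch-default.xml instead, which this sketch does not read.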