Posted to user@nutch.apache.org by Noora <no...@gmail.com> on 2014/05/06 13:59:55 UTC

Tika can't retrieve any parser

Hi list,

I am trying to crawl with Nutch 1.7, but I have a problem with Tika: it
can't retrieve a parser for any MIME type.

I have also read the archive and tried the following suggestions, but it
still does not work.

1. Adding tika-mimetypes.xml manually and including its property in
nutch-site.xml.
2. Replacing deprecated function calls, following Nutch 2.x or by other means.
3. Editing parse-plugins.xml to test different types and plugins.

How do you run Tika? Does it need any specific settings to run?

Any help would be much appreciated.

Noora

Re: Tika can't retrieve any parser

Posted by Noora <no...@gmail.com>.
I solved the problem in Eclipse: the Tika libraries were created but not
added to the build path, so they had to be configured manually.

But does anyone know how to solve this problem when running Nutch from the
shell?


On Tue, May 6, 2014 at 11:35 PM, Chear Huang <ch...@neurosky.com> wrote:

> Yes, I tried to use Nutch 2.0 to crawl web pages, and it does not work.

Re: Tika can't retrieve any parser

Posted by Chear Huang <ch...@neurosky.com>.
Yes, I tried to use Nutch 2.0 to crawl web pages, and it does not work.
