Posted to user@nutch.apache.org by James Ford <si...@gmail.com> on 2012/03/26 13:35:54 UTC

Bottleneck of my crawls: NativeCodeLoader

Hello,

I am trying to optimize my crawls as much as possible. The current
bottleneck is the step after adding segments to the linkdb, where Nutch is
trying to load the native-hadoop library:

2012-03-26 13:20:59,089 WARN  util.NativeCodeLoader - Unable to load
native-hadoop library for your platform... using builtin-java classes where
applicable

This step takes about 15 minutes, while all other steps together take
about 25 minutes. How can I make this step faster?

Thanks,
James Ford


--
View this message in context: http://lucene.472066.n3.nabble.com/Bottleneck-of-my-crawls-NativeCodeLoader-tp3857929p3857929.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: Bottleneck of my crawls: NativeCodeLoader

Posted by Sebastian Nagel <wa...@googlemail.com>.
Hi James,

There is a description of how to install the native libraries in:
   lib/native/README.txt
If they are installed appropriately, the native libs are loaded and the
warning will disappear.

But are you sure that it's really the library loading that
takes the time, and not a step that runs afterwards without
logging an initial message? LinkDb may take some time because it
reads all (or all newly created) segments.
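
One way to check this is to look at the time gaps between consecutive
log lines: a long gap right after the NativeCodeLoader warning only says
that the next logged event came late, not that the loading itself was slow.
The following is a minimal sketch in Python, assuming the log uses the same
timestamp prefix as the warning quoted above. The function names
(parse_ts, largest_gaps) are made up for illustration and are not part
of Nutch or Hadoop.

```python
import re
import sys
from datetime import datetime

# Timestamp prefix as seen in the quoted warning, e.g.
# "2012-03-26 13:20:59,089 WARN  util.NativeCodeLoader - ..."
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) ")


def parse_ts(line):
    """Return the line's timestamp, or None if it has no timestamp prefix."""
    m = TS_RE.match(line)
    if m is None:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    return base.replace(microsecond=int(m.group(2)) * 1000)


def largest_gaps(lines, top=5):
    """Return (seconds, line_before_gap, line_after_gap), largest gap first."""
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        line = line.rstrip("\n")
        ts = parse_ts(line)
        if ts is None:
            continue  # wrapped message or stack trace, no timestamp
        if prev_ts is not None:
            gaps.append(((ts - prev_ts).total_seconds(), prev_line, line))
        prev_ts, prev_line = ts, line
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]


# Hypothetical usage: python log_gaps.py <path to the log file>
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        for secs, before, after in largest_gaps(f):
            print(f"{secs:10.1f}s")
            print(f"  last line before the gap: {before}")
            print(f"  first line after the gap: {after}")
```

If the largest gap starts at the NativeCodeLoader line, the line that ends
the gap hints at which job was actually running during that time.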

Sebastian
