Posted to user@nutch.apache.org by Emmanuel <jo...@gmail.com> on 2007/09/10 15:22:23 UTC
Fetcher2 politeness?
I decided to use Fetcher2 instead of Fetcher, and I noticed that Fetcher2
does not behave politely: it does not wait fetcher.server.delay before
making another request to the same server.
In Fetcher2 (in the latest version of trunk), someone has set these options:
// set non-blocking & no-robots mode for HTTP protocol plugins.
getConf().setBoolean(Protocol.CHECK_BLOCKING, false);
getConf().setBoolean(Protocol.CHECK_ROBOTS, false);
In this case, the HTTP protocol plugin does not wait for crawlDelay before
making another request.
May I know exactly why?
Is this intended, or is it a bug?
Re: Fetcher2 politeness?
Posted by Andrzej Bialecki <ab...@getopt.org>.
Emmanuel wrote:
> I decided to use Fetcher2 instead of Fetcher, and I noticed that Fetcher2
> does not behave politely: it does not wait fetcher.server.delay before
> making another request to the same server.
>
> In Fetcher2 (in the latest version of trunk), someone has set these options:
> // set non-blocking & no-robots mode for HTTP protocol plugins.
> getConf().setBoolean(Protocol.CHECK_BLOCKING, false);
> getConf().setBoolean(Protocol.CHECK_ROBOTS, false);
>
> In this case, the HTTP protocol plugin does not wait for crawlDelay before
> making another request.
> May I know exactly why?
> Is this intended, or is it a bug?
>
Have you actually observed this wrong behavior during fetching? Fetcher2
performs blocking in a different way than Fetcher - it controls the
blocking itself, instead of delegating it to the protocol plugin. These
two properties are set to false on purpose.
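
To illustrate the idea of the fetcher handling the per-host delay itself, here is a minimal sketch. It is not the Fetcher2 code: the class HostDelayTracker and its methods are hypothetical names made up for this example, and it counts the delay from the start of the previous request, which is a simplification. The caller passes in the current time so the behavior is easy to check.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not a Nutch class: remembers, per host, the earliest
// time at which the next request may be sent.
class HostDelayTracker {
    // Plays the role of fetcher.server.delay, expressed in milliseconds.
    private final long delayMs;
    private final Map<String, Long> nextAllowed = new HashMap<>();

    HostDelayTracker(long delayMs) {
        this.delayMs = delayMs;
    }

    // Returns true and records the request if the host may be fetched at
    // time nowMs; returns false if the delay has not elapsed yet.
    synchronized boolean tryAcquire(String host, long nowMs) {
        Long earliest = nextAllowed.get(host);
        if (earliest != null && nowMs < earliest) {
            return false;
        }
        nextAllowed.put(host, nowMs + delayMs);
        return true;
    }

    // Milliseconds the caller still has to wait for this host (0 if free).
    synchronized long remainingWait(String host, long nowMs) {
        Long earliest = nextAllowed.get(host);
        return (earliest == null) ? 0L : Math.max(0L, earliest - nowMs);
    }
}
```

With a scheme like this, a fetcher thread would only hand a URL to the protocol plugin once tryAcquire succeeds for that URL's host, so the plugin has no need to block on its own.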
--
Best regards,
Andrzej Bialecki <><
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com