Posted to dev@nutch.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2010/08/09 16:26:16 UTC

[jira] Updated: (NUTCH-876) Remove remaining robots/IP blocking code in lib-http

     [ https://issues.apache.org/jira/browse/NUTCH-876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  updated NUTCH-876:
------------------------------------

    Attachment: NUTCH-876.patch

Patch to fix the issue. If there are no objections, I'll commit this shortly.

> Remove remaining robots/IP blocking code in lib-http
> ----------------------------------------------------
>
>                 Key: NUTCH-876
>                 URL: https://issues.apache.org/jira/browse/NUTCH-876
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 2.0
>            Reporter: Andrzej Bialecki 
>            Assignee: Andrzej Bialecki 
>         Attachments: NUTCH-876.patch
>
>
> Remnants of the (very old) blocking code are still present in lib-http/.../HttpBase.java. This code was used with the OldFetcher to manage politeness limits. Current trunk no longer has OldFetcher, so this code is useless. Furthermore, there is an actual bug here: FetcherJob forgets to set Protocol.CHECK_BLOCKING and Protocol.CHECK_ROBOTS to false, and the defaults in lib-http are set to true, so the obsolete checks remain enabled.
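
To illustrate the default-value problem described above, here is a minimal, self-contained sketch. It is not the actual Nutch code: java.util.Properties stands in for the Hadoop Configuration object, the property name strings are assumed values for Protocol.CHECK_BLOCKING and Protocol.CHECK_ROBOTS, and legacyChecksEnabled is a hypothetical helper. It only shows why a caller that never sets the flags ends up with the legacy checks switched on when the reader defaults to true.

```java
import java.util.Properties;

final class BlockingDefaultsSketch {
    // Assumed property names; stand-ins for Protocol.CHECK_BLOCKING and Protocol.CHECK_ROBOTS.
    static final String CHECK_BLOCKING = "protocol.plugin.check.blocking";
    static final String CHECK_ROBOTS = "protocol.plugin.check.robots";

    // Mimics a Configuration-style getBoolean(key, default) lookup.
    static boolean getBoolean(Properties conf, String key, boolean defaultValue) {
        String value = conf.getProperty(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }

    // Hypothetical helper: true if either legacy check would run,
    // given that the reading side defaults both flags to true.
    static boolean legacyChecksEnabled(Properties conf) {
        boolean checkBlocking = getBoolean(conf, CHECK_BLOCKING, true);
        boolean checkRobots = getBoolean(conf, CHECK_ROBOTS, true);
        return checkBlocking || checkRobots;
    }

    public static void main(String[] args) {
        // Caller that forgets to set the flags: the defaults win.
        Properties forgotten = new Properties();
        System.out.println("flags unset    -> checks enabled: " + legacyChecksEnabled(forgotten));

        // Caller that explicitly disables both flags.
        Properties explicit = new Properties();
        explicit.setProperty(CHECK_BLOCKING, "false");
        explicit.setProperty(CHECK_ROBOTS, "false");
        System.out.println("flags disabled -> checks enabled: " + legacyChecksEnabled(explicit));
    }
}
```

Removing the blocking code, as the issue proposes, makes the question moot: with no reader left for these flags, there is no default for a caller to forget to override.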

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.