Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2013/01/12 21:02:12 UTC
[jira] [Resolved] (NUTCH-813) Repetitive crawl 403 status page
[ https://issues.apache.org/jira/browse/NUTCH-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian Nagel resolved NUTCH-813.
-----------------------------------
Resolution: Duplicate
The described problem is identical to that of NUTCH-578. The provided patch (call setPageGoneSchedule when retry counter hits db.fetch.retry.max) is included in all patches of NUTCH-578.
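For illustration only, the idea behind that patch can be sketched as a small standalone Java model. This is not the Nutch source or the attached patch: the class RetrySketch, the Page type, the method onRetryableFailure, and the constants RETRY_MAX, RETRY_DELAY_MS and GONE_DELAY_MS are all hypothetical names and values chosen for the example. It only shows the decision being described: once the retry counter reaches the configured maximum, the page is marked gone and rescheduled far in the future instead of being retried on the default schedule.

```java
import java.time.Duration;

public class RetrySketch {
    enum Status { UNFETCHED, GONE }

    /** Hypothetical stand-in for a crawl db entry; not Nutch's CrawlDatum. */
    static final class Page {
        Status status = Status.UNFETCHED;
        int retries = 0;
        long nextFetchTime = 0L; // epoch milliseconds
    }

    // Assumed values for the example; the real limit comes from db.fetch.retry.max.
    static final int RETRY_MAX = 3;
    static final long RETRY_DELAY_MS = Duration.ofDays(1).toMillis();
    static final long GONE_DELAY_MS = Duration.ofDays(45).toMillis();

    /** Updates a page after a fetch attempt that failed with a retryable status. */
    static void onRetryableFailure(Page page, long now) {
        page.retries++;
        if (page.retries < RETRY_MAX) {
            // Below the limit: keep the page eligible and retry after the short delay.
            page.status = Status.UNFETCHED;
            page.nextFetchTime = now + RETRY_DELAY_MS;
        } else {
            // Limit reached: mark the page gone and push the next fetch far out,
            // which is the role the "page gone" schedule plays in the patch.
            page.status = Status.GONE;
            page.nextFetchTime = now + GONE_DELAY_MS;
        }
    }

    public static void main(String[] args) {
        Page page = new Page();
        long now = 0L;
        for (int attempt = 1; attempt <= RETRY_MAX; attempt++) {
            onRetryableFailure(page, now);
            System.out.println("attempt " + attempt + ": status=" + page.status
                    + ", nextFetchTime=" + page.nextFetchTime);
            now = page.nextFetchTime;
        }
    }
}
```

Without the else branch, a page that keeps failing stays on the short retry delay indefinitely, which matches the repeated daily crawl reported in the issue.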
> Repetitive crawl 403 status page
> --------------------------------
>
> Key: NUTCH-813
> URL: https://issues.apache.org/jira/browse/NUTCH-813
> Project: Nutch
> Issue Type: Bug
> Affects Versions: 1.1
> Reporter: Nguyen Manh Tien
> Priority: Minor
> Fix For: 1.7
>
> Attachments: ASF.LICENSE.NOT.GRANTED--Patch
>
>
> When we crawl a page that returns a 403 status, it is recrawled every day under the default schedule,
> even when we restrict retries with the parameter db.fetch.retry.max.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira