Posted to dev@nutch.apache.org by "Nguyen Manh Tien (JIRA)" <ji...@apache.org> on 2010/04/18 06:09:25 UTC

[jira] Created: (NUTCH-813) Repetitive crawl 403 status page

Repetitive crawl 403 status page
--------------------------------

                 Key: NUTCH-813
                 URL: https://issues.apache.org/jira/browse/NUTCH-813
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 1.1
            Reporter: Nguyen Manh Tien
         Attachments: Patch

When we crawl a page that returns a 403 status, it is crawled again every day under the default schedule,
even when we limit retries with the parameter db.fetch.retry.max.


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (NUTCH-813) Repetitive crawl 403 status page

Posted by "Nguyen Manh Tien (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nguyen Manh Tien updated NUTCH-813:
-----------------------------------

    Attachment: Patch

> Repetitive crawl 403 status page
> --------------------------------
>
>                 Key: NUTCH-813
>                 URL: https://issues.apache.org/jira/browse/NUTCH-813
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.1
>            Reporter: Nguyen Manh Tien
>         Attachments: Patch
>
>
> When we crawl a page that returns a 403 status, it is crawled again every day under the default schedule,
> even when we limit retries with the parameter db.fetch.retry.max.


        

[jira] Updated: (NUTCH-813) Repetitive crawl 403 status page

Posted by "Nguyen Manh Tien (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nguyen Manh Tien updated NUTCH-813:
-----------------------------------

    Priority: Minor  (was: Major)

> Repetitive crawl 403 status page
> --------------------------------
>
>                 Key: NUTCH-813
>                 URL: https://issues.apache.org/jira/browse/NUTCH-813
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.1
>            Reporter: Nguyen Manh Tien
>            Priority: Minor
>         Attachments: Patch
>
>
> When we crawl a page that returns a 403 status, it is crawled again every day under the default schedule,
> even when we limit retries with the parameter db.fetch.retry.max.
