Posted to dev@nutch.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/12/01 16:16:20 UTC
[jira] Closed: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions
[ https://issues.apache.org/jira/browse/NUTCH-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki closed NUTCH-769.
-----------------------------------
Resolution: Fixed
Fix Version/s: 1.1
Assignee: Andrzej Bialecki
> Fetcher to skip queues for URLS getting repeated exceptions
> -------------------------------------------------------------
>
> Key: NUTCH-769
> URL: https://issues.apache.org/jira/browse/NUTCH-769
> Project: Nutch
> Issue Type: Improvement
> Components: fetcher
> Reporter: Julien Nioche
> Assignee: Andrzej Bialecki
> Priority: Minor
> Fix For: 1.1
>
> Attachments: NUTCH-769-2.patch, NUTCH-769.patch
>
>
> As discussed on the mailing list (see http://www.mail-archive.com/nutch-user@lucene.apache.org/msg15360.html), this patch allows the Fetcher to clear a URL queue once more than a set number of exceptions have been encountered in a row. This can speed up fetching substantially when target hosts are unresponsive (a TimeoutException would be thrown for each URL) and limits cases where a whole fetch step is slowed down by a few queues.
> By default, the parameter fetcher.max.exceptions.per.queue has a value of -1, which deactivates the feature.
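
The behaviour described above could be sketched roughly as follows. This is a minimal illustration only, not the code from the attached patches: the class SketchFetchQueue and its methods are hypothetical names, and the exact threshold comparison and the reset on success are assumptions drawn from the wording "more than a set number of exceptions ... in a row". Only the -1 "disabled" convention comes from the description of fetcher.max.exceptions.per.queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical per-host URL queue; NOT Nutch's actual Fetcher code. */
class SketchFetchQueue {
    private final Deque<String> urls = new ArrayDeque<>();
    // Mirrors the idea of fetcher.max.exceptions.per.queue; -1 disables the check.
    private final int maxExceptions;
    // Assumption: exceptions are counted "in a row", so a success resets this.
    private int exceptionsInARow = 0;

    SketchFetchQueue(int maxExceptions) {
        this.maxExceptions = maxExceptions;
    }

    void add(String url) {
        urls.addLast(url);
    }

    int size() {
        return urls.size();
    }

    /** A successful fetch breaks the run of consecutive exceptions. */
    void recordSuccess() {
        exceptionsInARow = 0;
    }

    /**
     * Records one failed fetch. If the feature is enabled and the run of
     * exceptions exceeds the limit, the remaining URLs are dropped.
     *
     * @return the number of URLs removed from the queue (0 if none)
     */
    int recordException() {
        exceptionsInARow++;
        if (maxExceptions == -1 || exceptionsInARow <= maxExceptions) {
            return 0;
        }
        int dropped = urls.size();
        urls.clear();
        exceptionsInARow = 0;
        return dropped;
    }

    public static void main(String[] args) {
        SketchFetchQueue queue = new SketchFetchQueue(2);
        queue.add("http://unresponsive.example/a");
        queue.add("http://unresponsive.example/b");
        for (int i = 0; i < 3; i++) {
            int dropped = queue.recordException();
            System.out.println("exception " + (i + 1) + ", dropped " + dropped);
        }
    }
}
```

With a limit of -1 the sketch never clears anything, matching the stated default; with a positive limit, one unresponsive host stops holding up the rest of the fetch after a bounded number of timeouts.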
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.