Posted to dev@nutch.apache.org by "King Kong (JIRA)" <ji...@apache.org> on 2007/09/17 08:34:32 UTC
[jira] Created: (NUTCH-556) automatic adjust the
CrawlDatum.fetchInterval according to the number of newly outlinks
automatic adjust the CrawlDatum.fetchInterval according to the number of newly outlinks
---------------------------------------------------------------------------------------
Key: NUTCH-556
URL: https://issues.apache.org/jira/browse/NUTCH-556
Project: Nutch
Issue Type: New Feature
Components: fetcher
Reporter: King Kong
Usually, the spider must be able to find new URLs in time,
but the score of a URL cannot reflect this adequately.
Could we adjust CrawlDatum.fetchInterval according to the number of new outlinks?
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Closed: (NUTCH-556) automatic adjust the
CrawlDatum.fetchInterval according to the number of newly outlinks
Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki closed NUTCH-556.
-----------------------------------
Resolution: Won't Fix
> automatic adjust the CrawlDatum.fetchInterval according to the number of newly outlinks
> ---------------------------------------------------------------------------------------
>
> Key: NUTCH-556
> URL: https://issues.apache.org/jira/browse/NUTCH-556
> Project: Nutch
> Issue Type: New Feature
> Components: fetcher
> Reporter: King Kong
>
> The spider must be able to find new URLs in time, and new URLs are usually linked from pages such as index pages and list pages,
> but the score of a URL cannot reflect this adequately.
> Could we adjust CrawlDatum.fetchInterval according to the number of new outlinks?
[jira] Commented: (NUTCH-556) automatic adjust the
CrawlDatum.fetchInterval according to the number of newly outlinks
Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578960#action_12578960 ]
Andrzej Bialecki commented on NUTCH-556:
-----------------------------------------
Unless I'm missing something, this can be implemented as a custom FetchSchedule. If there are no objections I'd like to close this issue with Won't Fix.
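To make the idea concrete, here is a minimal, standalone sketch of the interval arithmetic such a custom schedule might apply. It does not use or extend any Nutch class: the class name OutlinkIntervalSketch, the method nextInterval, the bounds, and the shrink/grow rule are all assumptions for illustration, and how a count of new outlinks would reach a real FetchSchedule implementation is not covered here.

```java
// Hypothetical, standalone sketch: not part of Nutch and not the Nutch API.
public final class OutlinkIntervalSketch {
    // Assumed bounds, in seconds (1 hour and 30 days); not taken from Nutch.
    static final long MIN_INTERVAL_SEC = 3600L;
    static final long MAX_INTERVAL_SEC = 30L * 24 * 3600;

    /**
     * Returns the next fetch interval in seconds.
     * A page that yielded new outlinks is refetched sooner, in proportion
     * to how many it yielded; a page with none is refetched later.
     */
    static long nextInterval(long currentIntervalSec, int newOutlinks) {
        if (newOutlinks < 0) {
            throw new IllegalArgumentException("newOutlinks must be >= 0");
        }
        long next;
        if (newOutlinks > 0) {
            // Shrink the interval: more new outlinks means a shorter interval.
            next = currentIntervalSec / (1 + newOutlinks);
        } else {
            // No new outlinks found: back off by doubling the interval.
            next = currentIntervalSec * 2;
        }
        // Clamp to the assumed bounds.
        return Math.max(MIN_INTERVAL_SEC, Math.min(MAX_INTERVAL_SEC, next));
    }

    public static void main(String[] args) {
        // One-day interval, 5 new outlinks found on the last fetch.
        System.out.println(nextInterval(86400L, 5));
        // One-day interval, no new outlinks found.
        System.out.println(nextInterval(86400L, 0));
    }
}
```

With this rule, index and list pages that keep producing new links drift toward the minimum interval, while static pages drift toward the maximum, independently of their score.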
[jira] Updated: (NUTCH-556) automatic adjust the
CrawlDatum.fetchInterval according to the number of newly outlinks
Posted by "King Kong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
King Kong updated NUTCH-556:
----------------------------
Description:
The spider must be able to find new URLs in time, and new URLs are usually linked from pages such as index pages and list pages,
but the score of a URL cannot reflect this adequately.
Could we adjust CrawlDatum.fetchInterval according to the number of new outlinks?
was:
Usually, the spider must be able to find new URLs in time,
but the score of a URL cannot reflect this adequately.
Could we adjust CrawlDatum.fetchInterval according to the number of new outlinks?