
Posted to dev@nutch.apache.org by "King Kong (JIRA)" <ji...@apache.org> on 2007/09/17 08:34:32 UTC

[jira] Created: (NUTCH-556) automatic adjust the CrawlDatum.fetchInterval according to the number of newly outlinks

automatic adjust the CrawlDatum.fetchInterval according to the number of newly outlinks
---------------------------------------------------------------------------------------

                 Key: NUTCH-556
                 URL: https://issues.apache.org/jira/browse/NUTCH-556
             Project: Nutch
          Issue Type: New Feature
          Components: fetcher
            Reporter: King Kong


Usually, the spider must be able to find new URLs in a timely manner.

However, a URL's score does not adequately reflect this.

Could we adjust CrawlDatum.fetchInterval according to the number of newly discovered outlinks?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Closed: (NUTCH-556) automatic adjust the CrawlDatum.fetchInterval according to the number of newly outlinks

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/NUTCH-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  closed NUTCH-556.
-----------------------------------

    Resolution: Won't Fix

> automatic adjust the CrawlDatum.fetchInterval according to the number of newly outlinks
> ---------------------------------------------------------------------------------------
>
>                 Key: NUTCH-556
>                 URL: https://issues.apache.org/jira/browse/NUTCH-556
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>            Reporter: King Kong
>
> The spider must be able to find new URLs in a timely manner, and new URLs usually appear on pages such as index pages and list pages.
> However, a URL's score does not adequately reflect this.
> Could we adjust CrawlDatum.fetchInterval according to the number of newly discovered outlinks?


[jira] Commented: (NUTCH-556) automatic adjust the CrawlDatum.fetchInterval according to the number of newly outlinks

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/NUTCH-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578960#action_12578960 ] 

Andrzej Bialecki  commented on NUTCH-556:
-----------------------------------------

Unless I'm missing something, this can be implemented as a custom FetchSchedule. If there are no objections I'd like to close this issue with Won't Fix.
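
To illustrate the kind of logic such a custom FetchSchedule might carry, here is a minimal standalone sketch. It is not wired into Nutch and does not use the FetchSchedule interface; the class name, method name, bounds, and growth factor are all hypothetical and chosen only for illustration. The idea is to shrink the re-fetch interval in proportion to the number of newly discovered outlinks, and to lengthen it when a fetch finds none.

```java
// Hypothetical, standalone sketch. None of these names come from Nutch,
// and this class does NOT implement Nutch's FetchSchedule interface.
final class OutlinkIntervalSketch {
    // Assumed bounds for the interval, in seconds (illustrative values).
    static final int MIN_INTERVAL_SEC = 60 * 60;            // 1 hour
    static final int MAX_INTERVAL_SEC = 30 * 24 * 60 * 60;  // 30 days
    // Assumed factor by which the interval grows when no new outlinks appear.
    static final double GROW_FACTOR = 1.5;

    /**
     * Returns a new fetch interval in seconds.
     * If the last fetch found new outlinks, the interval is divided by
     * (1 + newOutlinks); otherwise it is multiplied by GROW_FACTOR.
     * The result is clamped to [MIN_INTERVAL_SEC, MAX_INTERVAL_SEC].
     */
    static int adjustInterval(int currentIntervalSec, int newOutlinks) {
        long adjusted;
        if (newOutlinks > 0) {
            adjusted = currentIntervalSec / (1L + newOutlinks);
        } else {
            adjusted = Math.round(currentIntervalSec * GROW_FACTOR);
        }
        return (int) Math.max(MIN_INTERVAL_SEC, Math.min(MAX_INTERVAL_SEC, adjusted));
    }

    public static void main(String[] args) {
        int oneDay = 24 * 60 * 60;
        // An index page that gained one new outlink is revisited sooner.
        System.out.println(adjustInterval(oneDay, 1));
        // A page with no new outlinks is revisited later.
        System.out.println(adjustInterval(oneDay, 0));
    }
}
```

In a real plugin, the count of new outlinks would have to be made available to the schedule by some other part of the crawl (for example, stored alongside the page's metadata); how to do that is left open here.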



[jira] Updated: (NUTCH-556) automatic adjust the CrawlDatum.fetchInterval according to the number of newly outlinks

Posted by "King Kong (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/NUTCH-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

King Kong updated NUTCH-556:
----------------------------

    Description: 
The spider must be able to find new URLs in a timely manner, and new URLs usually appear on pages such as index pages and list pages.

However, a URL's score does not adequately reflect this.

Could we adjust CrawlDatum.fetchInterval according to the number of newly discovered outlinks?

  was:
Usually, the spider must be able to find new URLs in a timely manner.

However, a URL's score does not adequately reflect this.

Could we adjust CrawlDatum.fetchInterval according to the number of newly discovered outlinks?


