Posted to dev@nutch.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2007/05/30 20:37:16 UTC

[jira] Resolved: (NUTCH-61) Adaptive re-fetch interval. Detecting unmodified content

     [ https://issues.apache.org/jira/browse/NUTCH-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  resolved NUTCH-61.
------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.0.0

Applied with some modifications in rev. 542903.

> Adaptive re-fetch interval. Detecting unmodified content
> --------------------------------------------------------
>
>                 Key: NUTCH-61
>                 URL: https://issues.apache.org/jira/browse/NUTCH-61
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>            Reporter: Andrzej Bialecki 
>            Assignee: Andrzej Bialecki 
>             Fix For: 1.0.0
>
>         Attachments: 20050606.diff, 20051230.txt, 20060227.txt, nutch-61-417287.patch, nutch-61-492176.patch
>
>
> Currently Nutch does not automatically adjust its re-fetch period, regardless of whether individual pages change seldom or frequently. The goal of these changes is to extend the current codebase to support various adjustments to re-fetch times and intervals, and specifically to add a re-fetch schedule that tries to adapt the period between consecutive fetches to the period of content changes.
> These patches also implement a check for whether the content has changed since the last fetch; protocol plugins are changed to use this information, so that unmodified content does not have to be fetched and processed again.
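
To make the idea in the description concrete, here is a minimal sketch of an adaptive re-fetch interval combined with a content-change check. It is not the code committed for NUTCH-61: the class name, method names, rates, and interval bounds are all hypothetical and chosen only for illustration, and the change check here is a plain digest comparison rather than anything the protocol plugins actually do.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Illustrative sketch only; every name and constant here is an assumption. */
public class AdaptiveRefetchSketch {
    // Hypothetical tuning values, not taken from the patch.
    static final double INC_RATE = 0.5;   // grow the interval by 50% when content is unchanged
    static final double DEC_RATE = 0.25;  // shrink the interval by 25% when content has changed
    static final long MIN_INTERVAL_SEC = 60L * 60;            // lower bound: 1 hour
    static final long MAX_INTERVAL_SEC = 60L * 60 * 24 * 30;  // upper bound: 30 days

    /** Computes a digest of the page content, used to detect unmodified content. */
    static byte[] signature(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return md.digest(content.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** A page counts as modified if it has no previous signature or the digests differ. */
    static boolean isModified(byte[] oldSignature, byte[] newSignature) {
        return oldSignature == null || !Arrays.equals(oldSignature, newSignature);
    }

    /** Shortens the interval for changed pages and lengthens it for unchanged ones, within bounds. */
    static long nextInterval(long currentIntervalSec, boolean modified) {
        double factor = modified ? (1.0 - DEC_RATE) : (1.0 + INC_RATE);
        long next = Math.round(currentIntervalSec * factor);
        return Math.max(MIN_INTERVAL_SEC, Math.min(MAX_INTERVAL_SEC, next));
    }

    public static void main(String[] args) {
        long interval = 60L * 60 * 24; // start with a one-day interval
        byte[] previous = signature("<html>version 1</html>");
        byte[] current = signature("<html>version 1</html>");
        // The content did not change, so the page is re-fetched less often next time.
        interval = nextInterval(interval, isModified(previous, current));
        System.out.println("next interval (s): " + interval);
    }
}
```

The clamping to a minimum and maximum is a design choice in this sketch: without it, a page that never changes would drift toward never being re-fetched, and a constantly changing page would be fetched in an ever-tighter loop.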

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Resolved: (NUTCH-61) Adaptive re-fetch interval. Detecting unmodified content

Posted by Andrzej Bialecki <ab...@getopt.org>.
rubdabadub wrote:
> MANY MANY Super Thanks! I can't thank you enough for this Patch :-)
> This is so cool!!!

You're welcome :) I would appreciate it if you could give it some 
testing and provide feedback ...


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: [jira] Resolved: (NUTCH-61) Adaptive re-fetch interval. Detecting unmodified content

Posted by rubdabadub <ru...@gmail.com>.
MANY MANY Super Thanks! I can't thank you enough for this Patch :-)
This is so cool!!!

Regards
Rajesh
