Posted to dev@nutch.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2007/05/30 20:37:16 UTC
[jira] Resolved: (NUTCH-61) Adaptive re-fetch interval. Detecting unmodified content
[ https://issues.apache.org/jira/browse/NUTCH-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki resolved NUTCH-61.
------------------------------------
Resolution: Fixed
Fix Version/s: 1.0.0
Applied with some modifications in rev. 542903.
> Adaptive re-fetch interval. Detecting unmodified content
> --------------------------------------------------------
>
> Key: NUTCH-61
> URL: https://issues.apache.org/jira/browse/NUTCH-61
> Project: Nutch
> Issue Type: New Feature
> Components: fetcher
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Fix For: 1.0.0
>
> Attachments: 20050606.diff, 20051230.txt, 20060227.txt, nutch-61-417287.patch, nutch-61-492176.patch
>
>
> Currently Nutch doesn't automatically adjust its re-fetch period, regardless of whether individual pages change seldom or frequently. The goal of these changes is to extend the current codebase to support various adjustments to re-fetch times and intervals, and specifically a re-fetch schedule that tries to adapt the period between consecutive fetches to the period of content changes.
> Also, these patches implement a check for whether the content has changed since the last fetch; protocol plugins are also changed to make use of this information, so that unmodified content doesn't have to be fetched and processed again.
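To make the idea described above concrete, here is a minimal Java sketch of an adaptive schedule. It is an illustration only, not the code applied in rev. 542903: the class name, the method names, the rate constants, and the interval bounds are all assumptions, and comparing stored content signatures is just one possible way to detect unmodified content.

```java
/** Hypothetical sketch of an adaptive re-fetch schedule; not the NUTCH-61 patch itself. */
public class AdaptiveIntervalSketch {
    // All constants below are illustrative assumptions, not Nutch defaults.
    static final double INC_RATE = 0.4;                    // grow interval by 40% when unmodified
    static final double DEC_RATE = 0.2;                    // shrink interval by 20% when modified
    static final long MIN_INTERVAL_SEC = 60;               // lower bound: one minute
    static final long MAX_INTERVAL_SEC = 30L * 24 * 3600;  // upper bound: thirty days

    /** Returns the next re-fetch interval in seconds, clamped to the bounds above. */
    static long nextInterval(long currentSec, boolean modified) {
        double next = modified
                ? currentSec * (1.0 - DEC_RATE)   // page changed: come back sooner
                : currentSec * (1.0 + INC_RATE);  // page unchanged: wait longer
        long rounded = Math.round(next);
        return Math.max(MIN_INTERVAL_SEC, Math.min(MAX_INTERVAL_SEC, rounded));
    }

    /** Treats a page as modified when no earlier signature exists or the signatures differ. */
    static boolean isModified(String previousSignature, String currentSignature) {
        return previousSignature == null || !previousSignature.equals(currentSignature);
    }

    public static void main(String[] args) {
        long interval = 24 * 3600; // start with a one-day interval
        // Two fetches in a row find the same content signature, so the interval grows.
        interval = nextInterval(interval, isModified("abc", "abc"));
        interval = nextInterval(interval, isModified("abc", "abc"));
        // The third fetch finds a new signature, so the interval shrinks again.
        interval = nextInterval(interval, isModified("abc", "def"));
        System.out.println("Next re-fetch in " + interval + " seconds");
    }
}
```

Under these assumptions, pages that rarely change drift toward the maximum interval, while frequently changing pages settle near the minimum, which is the adaptation the issue description asks for.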
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Resolved: (NUTCH-61) Adaptive re-fetch interval. Detecting unmodified content
Posted by Andrzej Bialecki <ab...@getopt.org>.
rubdabadub wrote:
> MANY MANY Super Thanks! I can't thank you enough for this Patch :-)
> This is so cool!!!
You're welcome :) I would appreciate it if you could give it some
testing and provide feedback ...
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: [jira] Resolved: (NUTCH-61) Adaptive re-fetch interval. Detecting unmodified content
Posted by rubdabadub <ru...@gmail.com>.
MANY MANY Super Thanks! I can't thank you enough for this Patch :-)
This is so cool!!!
Regards
Rajesh