Posted to user@nutch.apache.org by 韩驰 <ha...@gmail.com> on 2014/05/19 08:52:29 UTC

Fwd: Nutch2.x modifiedTime and prevmodifiedTime?

Hi everyone!


After reading the issue
https://issues.apache.org/jira/browse/NUTCH-1651, I have some questions.
What are modifiedTime and prevmodifiedTime? And is their purpose to
avoid fetching the same URLs again on a second crawl?


Thank you in advance!

Re: Nutch2.x modifiedTime and prevmodifiedTime?

Posted by Hanchi <ha...@gmail.com>.
Thank you very much, feng lu!
I have the same problem as the one described at
https://www.mail-archive.com/user@nutch.apache.org/msg12111.html. Can it be
solved?

Regards.

2014-05-19 15:37 GMT+08:00 feng lu <am...@gmail.com>:

> As far as I can see, modifiedTime is used in the protocol plugins: if the
> web page has not changed since that time, there is no need to download it
> again. It is also used in the FetchSchedule implementations, which are used
> to continuously monitor a site and crawl updates.

Re: Nutch2.x modifiedTime and prevmodifiedTime?

Posted by feng lu <am...@gmail.com>.
As far as I can see, modifiedTime is used in the protocol plugins: if the
web page has not changed since that time, there is no need to download it
again. It is also used in the FetchSchedule implementations, which are used
to continuously monitor a site and crawl updates.
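
To make the two uses above more concrete, here is a rough sketch in plain
Java. It is not Nutch's actual code: the class ModifiedTimeSketch, its
methods, and the rate and interval constants are all hypothetical names and
values picked for illustration. The only real API calls are the standard
java.net ones (URLConnection.setIfModifiedSince and
HttpURLConnection.getResponseCode). The idea is that the stored modified
time is sent along with the request, a 304 answer means the page has not
changed, and a schedule can then stretch or shrink the re-fetch interval
depending on that outcome.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Illustrative sketch only; every name and constant here is hypothetical, not a Nutch API. */
class ModifiedTimeSketch {

    // Hypothetical tuning values, chosen only to make the example concrete.
    static final double INC_RATE = 0.5;                    // grow interval by 50% if unchanged
    static final double DEC_RATE = 0.5;                    // shrink interval by 50% if changed
    static final long MIN_INTERVAL_SEC = 60;               // never re-fetch more often than this
    static final long MAX_INTERVAL_SEC = 30L * 24 * 3600;  // never wait longer than 30 days

    /** Protocol side: send the stored modified time and return the HTTP status code. */
    static int conditionalGet(URL url, long modifiedTimeMs) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setIfModifiedSince(modifiedTimeMs); // adds an If-Modified-Since request header
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    /** A 304 response means the server considers the stored copy still current. */
    static boolean isUnchanged(int httpStatus) {
        return httpStatus == HttpURLConnection.HTTP_NOT_MODIFIED;
    }

    /** Schedule side: wait longer for pages that did not change, shorter for pages that did. */
    static long nextInterval(long currentIntervalSec, boolean unchanged) {
        double factor = unchanged ? 1.0 + INC_RATE : 1.0 - DEC_RATE;
        long next = Math.round(currentIntervalSec * factor);
        return Math.max(MIN_INTERVAL_SEC, Math.min(MAX_INTERVAL_SEC, next));
    }

    public static void main(String[] args) {
        // No network access here: just show how the interval reacts to each outcome.
        System.out.println(nextInterval(1000, isUnchanged(304)));
        System.out.println(nextInterval(1000, isUnchanged(200)));
    }
}
```

In this sketch a page that keeps returning 304 is visited less and less
often, up to the maximum interval, while a page that changes is visited
more often, down to the minimum. The real FetchSchedule implementations in
Nutch may work differently in detail.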





-- 
Don't Grow Old, Grow Up... :-)