Posted to dev@nutch.apache.org by "hanchi (JIRA)" <ji...@apache.org> on 2014/05/17 13:53:14 UTC
[jira] [Created] (NUTCH-1784) CLONE - modifiedTime and prevmodifiedTime never set
hanchi created NUTCH-1784:
-----------------------------
Summary: CLONE - modifiedTime and prevmodifiedTime never set
Key: NUTCH-1784
URL: https://issues.apache.org/jira/browse/NUTCH-1784
Project: Nutch
Issue Type: Bug
Affects Versions: 2.2.1
Reporter: hanchi
Fix For: 2.3
Attachments: NUTCH-1651.patch
modifiedTime is never set. With DefaultFetchScheduler, modifiedTime always stays at its default value of zero. With AdaptiveFetchScheduler, modifiedTime is set only once, at the beginning, by the scheduler's check for a zero value.
This is not sufficient, because modifiedTime needs to be updated whenever a last-modified time is available. We corrected this with a patch.
We also noticed that prevModifiedTime is not written to the database, and we corrected that too.
With this patch, whenever lastModifiedTime is available, we do two things. First, we copy the current modifiedTime of the Page object into prevModifiedTime. Then we set modifiedTime to lastModifiedTime.
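
The two-step update described above can be sketched roughly as follows. This is only an illustration, not the code from the attached patch: PageSketch and recordLastModified are hypothetical stand-ins for the Page object and the update logic, and treating a value of zero or less as "not available" is an assumption.

```java
// Hypothetical stand-in for the Page object; not the real Nutch class.
final class PageSketch {
    long modifiedTime;      // defaults to 0, meaning "never set"
    long prevModifiedTime;  // defaults to 0, meaning "never set"

    /**
     * Applies the two-step update when a last-modified time is available.
     * Assumption: a value <= 0 means the time is not available.
     */
    void recordLastModified(long lastModifiedTime) {
        if (lastModifiedTime <= 0) {
            return; // nothing known, leave both fields untouched
        }
        prevModifiedTime = modifiedTime; // step 1: keep the old value
        modifiedTime = lastModifiedTime; // step 2: store the new value
    }
}

final class ModifiedTimeSketch {
    public static void main(String[] args) {
        PageSketch page = new PageSketch();
        page.recordLastModified(1_000L); // first fetch with a known time
        page.recordLastModified(2_000L); // later fetch, page has changed
        // prints "1000 2000"
        System.out.println(page.prevModifiedTime + " " + page.modifiedTime);
    }
}
```

The order of the two assignments matters: if modifiedTime were overwritten first, the previous value would be lost before it could be copied into prevModifiedTime.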
--
This message was sent by Atlassian JIRA
(v6.2#6252)