Posted to dev@nutch.apache.org by "hanchi (JIRA)" <ji...@apache.org> on 2014/05/17 13:53:14 UTC

[jira] [Created] (NUTCH-1784) CLONE - modifiedTime and prevmodifiedTime never set

hanchi created NUTCH-1784:
-----------------------------

             Summary: CLONE - modifiedTime and prevmodifiedTime never set 
                 Key: NUTCH-1784
                 URL: https://issues.apache.org/jira/browse/NUTCH-1784
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 2.2.1
            Reporter: hanchi
             Fix For: 2.3
         Attachments: NUTCH-1651.patch

modifiedTime is never updated. With DefaultFetchSchedule, modifiedTime always stays at its default value of zero. With AdaptiveFetchSchedule, modifiedTime is set only once, at the beginning, by the schedule's check for a zero value.
This is not sufficient, because modifiedTime needs to be updated whenever a last-modified time is available. We corrected this with a patch.

We also noticed that prevModifiedTime is not written to the database, and we corrected that too.

With this patch, whenever lastModifiedTime is available, we do two things. First, we copy the current modifiedTime of the Page object into prevModifiedTime. Then we set modifiedTime to lastModifiedTime.
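
For illustration only, the two-step update could look roughly like the sketch below. This is not the code from the attached patch: the Page class here is a hypothetical stand-in holding just the two fields, applyLastModified is an invented helper name, and treating a non-positive value as "lastModifiedTime not available" is an assumption.

```java
public class ModifiedTimeSketch {

    // Hypothetical stand-in for the Nutch page record; only the two
    // fields discussed in this issue are modelled.
    static class Page {
        long modifiedTime;      // defaults to 0, i.e. "never set"
        long prevModifiedTime;  // defaults to 0
    }

    // Invented helper: applies the two steps described above.
    // Assumption: a value <= 0 means no last-modified time is available.
    static void applyLastModified(Page page, long lastModifiedTime) {
        if (lastModifiedTime <= 0) {
            return; // nothing to update
        }
        // Step 1: remember the old value before overwriting it.
        page.prevModifiedTime = page.modifiedTime;
        // Step 2: store the newly observed last-modified time.
        page.modifiedTime = lastModifiedTime;
    }

    public static void main(String[] args) {
        Page page = new Page();
        applyLastModified(page, 1000L);
        // prints "1000 0"
        System.out.println(page.modifiedTime + " " + page.prevModifiedTime);
        applyLastModified(page, 2000L);
        // prints "2000 1000"
        System.out.println(page.modifiedTime + " " + page.prevModifiedTime);
    }
}
```

The order of the two assignments matters: if modifiedTime were overwritten first, prevModifiedTime would end up holding the new value instead of the previous one.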





--
This message was sent by Atlassian JIRA
(v6.2#6252)