Posted to dev@nutch.apache.org by Paul Tomblin <pt...@xcski.com> on 2009/08/13 17:26:06 UTC

My mistake

The patch I sent a few days ago doesn't work right, because when it's
fetching something that it's never seen before, datum.getFetchTime()
returns the *current* fetch time instead of the last fetch time.  When
it's fetching something that was fetched before, it returns the *last*
fetch time.  Obviously if you ask the web server for anything that's
been modified since *right*now*, it will say nothing has changed and
won't return any content.
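
To make the failure case concrete, here is a rough sketch of the guard I
think is needed.  It is not Nutch code: PageRecord and its fetchedBefore
flag are hypothetical stand-ins for whatever the CrawlDatum can actually
tell us, and falling back to the fetch time when the modified time is 0
is only my assumption about a sensible default.

```java
import java.util.OptionalLong;

public class ConditionalFetchSketch {

    /** Hypothetical stand-in for the CrawlDatum fields discussed above; not the Nutch class. */
    public static final class PageRecord {
        final boolean fetchedBefore; // assumption: the caller can tell whether the URL was ever fetched
        final long fetchTime;        // epoch millis; for a never-seen URL this is the current fetch time
        final long modifiedTime;     // epoch millis; 0 means "unknown"

        public PageRecord(boolean fetchedBefore, long fetchTime, long modifiedTime) {
            this.fetchedBefore = fetchedBefore;
            this.fetchTime = fetchTime;
            this.modifiedTime = modifiedTime;
        }
    }

    /**
     * Picks the timestamp to send as a "modified since" condition,
     * or returns empty when the request should be unconditional.
     */
    public static OptionalLong ifModifiedSince(PageRecord page) {
        if (!page.fetchedBefore) {
            // Never fetched: the stored fetch time is "now", so a condition
            // built from it would make the server return nothing.
            return OptionalLong.empty();
        }
        if (page.modifiedTime > 0) {
            // Prefer the real modification time when it is available.
            return OptionalLong.of(page.modifiedTime);
        }
        // Fall back to the time of the previous fetch.
        return OptionalLong.of(page.fetchTime);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        PageRecord neverSeen = new PageRecord(false, now, 0L);
        PageRecord seenYesterday = new PageRecord(true, now - 86_400_000L, 0L);

        System.out.println(ifModifiedSince(neverSeen).isPresent());     // false
        System.out.println(ifModifiedSince(seenYesterday).isPresent()); // true
    }
}
```

The point is only that a never-seen URL must be requested without any
condition; how to detect "never seen" from a real CrawlDatum is left open
here.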

This whole problem would go away if datum.getModifiedTime() worked.
When I dump the CrawlDatum out of the segment file, the modified time
is definitely in there, but datum.getModifiedTime() seems to always
return 0.  If I find out why that's happening, I'll send a patch.


-- 
http://www.linkedin.com/in/paultomblin