Posted to user@nutch.apache.org by cesar voulgaris <ce...@gmail.com> on 2007/04/03 05:14:15 UTC

Problem with dates on fetched pages?

Hi, I'm using nutch-0.8.1. The readdb dump of a crawl I'm running gives (just
a part):

http://aaa.org.uy/canopus/agosto_05/articulo_agosto.htm Version: 4
Status: 1 (DB_unfetched)
Fetch time: Tue Feb 13 21:47:26 EST 2007
Modified time: Wed Dec 31 19:00:00 EST 1969
Retries since fetch: 0
Retry interval: 90.0 days
Score: 0.0069444445
Signature: null
Metadata: null

http://aaa.org.uy/canopus/agosto_05/canopus_agosto.htm  Version: 4
Status: 2 (DB_fetched)
Fetch time: Mon May 14 05:40:28 EDT 2007
Modified time: Wed Dec 31 19:00:00 EST 1969
Retries since fetch: 0
Retry interval: 90.0 days
Score: 0.0069444445
Signature: 0b390d46d129cd1c729208bee2d6fbb4
Metadata: null

I don't understand the meaning of Fetch time and Modified time. Can anyone
explain what each one really means? Is Fetch time the next fetch time? If it
is, why does one page show May 2007 (in the future) and the other Feb 2007?
And why does Modified time show the year 1969? Thanks.
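
For reference, here is a small date-arithmetic sketch on the values in the dump. It is plain Python, not Nutch code, and the helper name readdb_style is made up for illustration. It checks two things: how a timestamp of 0 (the Unix epoch) prints in EST, and what date you get by subtracting the 90-day retry interval from the Fetch time of the DB_fetched entry. That an unset Modified time is stored as 0, and that Fetch time is the last fetch plus the retry interval, are assumptions suggested by the numbers, not something the dump itself confirms.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zones, so the sketch does not depend on a system tz database.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")


def readdb_style(dt):
    """Format a datetime in the same layout as the dump above (hypothetical helper)."""
    return dt.strftime("%a %b %d %H:%M:%S %Z %Y")


# A timestamp of 0 (the Unix epoch), viewed in EST.
epoch_zero = datetime.fromtimestamp(0, tz=EST)

# Fetch time of the DB_fetched entry, minus its 90-day retry interval.
fetched_entry = datetime(2007, 5, 14, 5, 40, 28, tzinfo=EDT)
ninety_days_earlier = (fetched_entry - timedelta(days=90)).astimezone(EST)

print(readdb_style(epoch_zero))           # Wed Dec 31 19:00:00 EST 1969
print(readdb_style(ninety_days_earlier))  # Tue Feb 13 04:40:28 EST 2007
```

The first line matches the Modified time in both entries exactly, which is what a zero timestamp would look like. The second falls on the same day as the Fetch time of the DB_unfetched entry, which would fit Fetch time meaning "when the page is next due" if the assumption above holds.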