Posted to user@nutch.apache.org by Patricio Galeas <pg...@yahoo.de> on 2011/03/22 02:57:44 UTC

strange results by nutch readdb crawl/crawldb -stats

Hi,
I have observed strange results when running "bin/nutch readdb crawl/crawldb -stats"
after I moved the index to HDFS.
Over the last three crawls the number of fetched URLs (db_fetched) has decreased.
Is there an explanation for these values?

Thanks
Pat 

TOTAL urls: 13874470
status 1 (db_unfetched): 11670618
status 2 (db_fetched): 1679676
status 3 (db_gone): 323230
status 4 (db_redir_temp): 92372
status 5 (db_redir_perm): 79895
status 6 (db_notmodified): 28679

TOTAL urls: 13877076
status 1 (db_unfetched): 11672609
status 2 (db_fetched): 1675528
status 3 (db_gone): 323408
status 4 (db_redir_temp): 92451
status 5 (db_redir_perm): 79905
status 6 (db_notmodified): 33175

TOTAL urls: 13882160
status 1 (db_unfetched): 11682215
status 2 (db_fetched): 1665858
status 3 (db_gone): 323961
status 4 (db_redir_temp): 92486
status 5 (db_redir_perm): 79826
status 6 (db_notmodified): 37814
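
To make the changes easier to see, here is a small Python sketch (not part of Nutch) that diffs the counters between consecutive runs. It assumes the three outputs above are listed in chronological order; the names RUNS, parse_stats and deltas are only illustrative.

```python
import re

# The three "readdb -stats" outputs from above, assumed to be in
# chronological order (oldest first).
RUNS = [
    """TOTAL urls: 13874470
status 1 (db_unfetched): 11670618
status 2 (db_fetched): 1679676
status 3 (db_gone): 323230
status 4 (db_redir_temp): 92372
status 5 (db_redir_perm): 79895
status 6 (db_notmodified): 28679""",
    """TOTAL urls: 13877076
status 1 (db_unfetched): 11672609
status 2 (db_fetched): 1675528
status 3 (db_gone): 323408
status 4 (db_redir_temp): 92451
status 5 (db_redir_perm): 79905
status 6 (db_notmodified): 33175""",
    """TOTAL urls: 13882160
status 1 (db_unfetched): 11682215
status 2 (db_fetched): 1665858
status 3 (db_gone): 323961
status 4 (db_redir_temp): 92486
status 5 (db_redir_perm): 79826
status 6 (db_notmodified): 37814""",
]

# Matches "TOTAL urls: N" or "status K (name): N".
LINE = re.compile(r"^(TOTAL urls|status \d+ \((\w+)\)):\s*(\d+)$")


def parse_stats(text):
    """Turn one stats block into a dict such as {'total': ..., 'db_fetched': ...}."""
    counts = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            key = m.group(2) or "total"  # group 2 is None for the TOTAL line
            counts[key] = int(m.group(3))
    return counts


def deltas(runs):
    """Per-counter difference between each run and the one before it."""
    parsed = [parse_stats(r) for r in runs]
    return [
        {key: cur[key] - prev[key] for key in cur}
        for prev, cur in zip(parsed, parsed[1:])
    ]


if __name__ == "__main__":
    for i, change in enumerate(deltas(RUNS), start=2):
        print(f"run {i - 1} -> run {i}:")
        for name, diff in change.items():
            print(f"  {name:>16}: {diff:+d}")
```

With the numbers above, db_fetched drops by 4148 and then by 9670, while db_notmodified grows by 4496 and then by 4639, and the total grows by 2606 and then by 5084.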