Posted to user@nutch.apache.org by Axel Schöner <ax...@hs-kl.de> on 2015/10/16 14:13:52 UTC

after 404 -> status switches directly to db_gone (db.fetch.retry.max does not work)

Hello,

We use Apache Nutch 1.10 in combination with Solr 5.3.0.
Our problem is that when a page returns a 404 status during a recrawl, the document's status switches directly to "db_gone" and the document is deleted from Solr.
Should it not be possible to use "db.fetch.retry.max" so that the document is temporarily set to "not_fetched" and only changes to "db_gone" after a few retries?
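
For reference, the property in question would be overridden in conf/nutch-site.xml roughly as sketched below. The value 3 is only an illustrative assumption and not necessarily what our installation uses:

```xml
<!-- conf/nutch-site.xml (sketch; the value is an assumed example) -->
<configuration>
  <property>
    <name>db.fetch.retry.max</name>
    <!-- number of retries we would expect before a URL is given up on -->
    <value>3</value>
  </property>
</configuration>
```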


Thanks,
Axel

-- 
M. Sc. Axel Schöner
Hochschule Kaiserslautern in Zweibrücken
FB Informatik / MST
Amerikastraße 1
D-66482 Zweibrücken
phone: 0631-3724 5544
email: axel.schoener@hs-kl.de
http://hs-kl.de/~axel.schoener/