commits@nutch.apache.org, 2019-12

- [nutch] branch master updated: NUTCH-2748 Fetch status gone (redirect exceeded) not to overwrite existing items in CrawlDb
  - new configuration property `http.redirect.max.exceeded.skip`:
    * if true: skip redirect targets if http.redirect.max is exceeded
    * if false (default): store the redirect targets with status "linked"
  - log whether exceeded redirects are "skipped" or "linked"
  posted by sn...@apache.org on 2019/12/02 11:45:45 UTC, 0 replies.
- [nutch] branch master updated: NUTCH-2745 Solr schema.xml not shipped in binary release
  - copy Solr schema.xml to runtime to include it into binary release packages
  posted by sn...@apache.org on 2019/12/20 12:37:43 UTC, 0 replies.
- [nutch] branch master updated: NUTCH-2754 fetcher.max.crawl.delay ignored if exceeding 5 min. / 300 sec.
  - initialize crawler-commons's SimpleRobotRulesParser with the longest possible internal maxDelay
  posted by sn...@apache.org on 2019/12/23 10:57:26 UTC, 0 replies.
- [nutch] branch master updated: Fix for NUTCH-1863: Add JSON format dump output to readdb command (#490) - posted by sn...@apache.org on 2019/12/27 16:42:19 UTC, 0 replies.