- [nutch] branch master updated: NUTCH-2434 Add methods to reset parameters HTMLMetaTags (apply patch contributed by Markus) - posted by sn...@apache.org on 2020/05/05 09:29:19 UTC, 0 replies.
- [nutch] branch master updated: NUTCH-1194 Generator: CrawlDB lock should be released earlier - release CrawlDb lock after select step in case generated items are not marked in CrawlDb (generate.update.crawldb is false) - posted by sn...@apache.org on 2020/05/05 12:10:34 UTC, 0 replies.
- [nutch] branch master updated: NUTCH-2785 FreeGenerator: command-line option to define number of generated fetch lists - add command-line option `-numFetchers` to FreeGenerator - in local mode: generate one single fetch list - posted by sn...@apache.org on 2020/05/05 13:56:24 UTC, 0 replies.
- [nutch] branch master updated: NUTCH-2002 parse and index checkers to check robots.txt - applied Julien's patch to recent code base - also check redirects whether they are allowed - add command-line parameter `-checkRobotsTxt` enabling this check - posted by sn...@apache.org on 2020/05/05 13:59:12 UTC, 0 replies.
- [nutch] branch master updated: NUTCH-2753 Add -listen option to command-line help of CrawlDbReader and LinkDbReader - posted by sn...@apache.org on 2020/05/05 14:00:08 UTC, 0 replies.
- [nutch] branch master updated: NUTCH-2758 Add plugin READMEs to binary release packages - posted by sn...@apache.org on 2020/05/05 14:01:14 UTC, 0 replies.
- [nutch] branch master updated: NUTCH-1945 Test for XLSX parser - add Tika unit test for XLSX files - bundle instance variables and utility methods in class TikaParserTest - clean up javadoc comments - posted by sn...@apache.org on 2020/05/12 13:35:21 UTC, 0 replies.
- [nutch] branch master updated (e61a8a3 -> 9139d6e) - posted by sn...@apache.org on 2020/05/14 15:43:24 UTC, 0 replies.