- Build failed in Hudson: Nutch-Nightly #134 - posted by hu...@lucene.zones.apache.org on 2007/07/01 09:00:56 UTC, 0 replies.
- Hudson build is back to normal: Nutch-Nightly #135 - posted by hu...@lucene.zones.apache.org on 2007/07/02 06:50:17 UTC, 0 replies.
- Nutch nightly build and NUTCH-505 draft patch - posted by Kai_testing Middleton <ka...@yahoo.com> on 2007/07/02 08:59:46 UTC, 1 replies.
- Build failed in Hudson: Nutch-Nightly #136 - posted by hu...@lucene.zones.apache.org on 2007/07/02 09:00:06 UTC, 0 replies.
- Build failed in Hudson: Nutch-Nightly #137 - posted by hu...@lucene.zones.apache.org on 2007/07/03 09:00:26 UTC, 0 replies.
- Plans on releasing another bug fix release? - posted by Briggs <ac...@gmail.com> on 2007/07/03 16:12:48 UTC, 8 replies.
- Patch to skip hidden plugin directories - posted by David Fuhry <df...@cs.kent.edu> on 2007/07/03 19:33:16 UTC, 0 replies.
- Hudson build is back to normal: Nutch-Nightly #138 - posted by hu...@lucene.zones.apache.org on 2007/07/03 20:07:38 UTC, 0 replies.
- Re[2]: Plans on releasing another bug fix release? - posted by Nuther <nu...@proservice.ge> on 2007/07/04 08:12:56 UTC, 0 replies.
- Build failed in Hudson: Nutch-Nightly #139 - posted by hu...@lucene.zones.apache.org on 2007/07/04 09:00:17 UTC, 0 replies.
- Hudson build is back to normal: Nutch-Nightly #140 - posted by hu...@lucene.zones.apache.org on 2007/07/04 09:29:27 UTC, 0 replies.
- URL Injection with another source than text files - posted by Epo Jemba <ta...@gmail.com> on 2007/07/04 12:44:32 UTC, 1 replies.
- [jira] Created: (NUTCH-507) lib-lucene-analyzers jar defintion is wrong in plugin.xml - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/07 19:18:04 UTC, 0 replies.
- [jira] Created: (NUTCH-508) ${hadoop.log.dir} and ${hadoop.log.file} are not propagated to the tasktracker - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/07 19:28:04 UTC, 0 replies.
- mozdex as a backend search engine. - posted by Tsengtan A Shuy <tt...@sbcglobal.net> on 2007/07/07 19:42:21 UTC, 0 replies.
- [jira] Created: (NUTCH-509) Update Crawldb: avoid to start a job if there is no valid segment - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/08 10:04:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-509) Update Crawldb: avoid to start a job if there is no valid segment - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/08 10:04:04 UTC, 0 replies.
- OPIC scoring differences - posted by Carl Cerecke <ca...@nzs.com> on 2007/07/09 00:38:08 UTC, 4 replies.
- [jira] Commented: (NUTCH-509) Update Crawldb: avoid to start a job if there is no valid segment - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/09 08:05:04 UTC, 1 replies.
- [jira] Resolved: (NUTCH-507) lib-lucene-analyzers jar defintion is wrong in plugin.xml - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/09 08:18:04 UTC, 0 replies.
- [jira] Closed: (NUTCH-509) Update Crawldb: avoid to start a job if there is no valid segment - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/09 08:18:04 UTC, 0 replies.
- [jira] Closed: (NUTCH-507) lib-lucene-analyzers jar defintion is wrong in plugin.xml - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/09 08:18:05 UTC, 0 replies.
- [jira] Created: (NUTCH-510) IndexMerger delete working dir - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/07/09 08:37:04 UTC, 0 replies.
- [jira] Resolved: (NUTCH-503) Generator exits incorrectly for small fetchlists - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/09 08:48:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-510) IndexMerger delete working dir - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/07/09 08:52:04 UTC, 0 replies.
- spam detect - posted by anton <an...@orbita1.ru> on 2007/07/09 11:33:56 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-510) IndexMerger delete working dir - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/07/09 14:34:04 UTC, 0 replies.
- [jira] Commented: (NUTCH-508) ${hadoop.log.dir} and ${hadoop.log.file} are not propagated to the tasktracker - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/07/09 15:48:04 UTC, 0 replies.
- Not renewing CrawlDatum on Inject - posted by Robert Young <bu...@gmail.com> on 2007/07/09 19:27:46 UTC, 3 replies.
- [jira] Commented: (NUTCH-507) lib-lucene-analyzers jar defintion is wrong in plugin.xml - posted by "Hudson (JIRA)" <ji...@apache.org> on 2007/07/10 06:21:05 UTC, 0 replies.
- [jira] Commented: (NUTCH-503) Generator exits incorrectly for small fetchlists - posted by "Hudson (JIRA)" <ji...@apache.org> on 2007/07/10 06:21:05 UTC, 0 replies.
- [jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/07/10 09:51:04 UTC, 5 replies.
- [jira] Commented: (NUTCH-439) Top Level Domains Indexing / Scoring - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2007/07/10 11:24:04 UTC, 5 replies.
- [jira] Updated: (NUTCH-505) Outlink urls should be validated - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/10 14:42:05 UTC, 3 replies.
- Fwd: [Collex] application#index (ActionController::RoutingError) "no route found to match \"/nines/ escape(document.title) u,\" with {:method=>:get}" - posted by Erik Hatcher <er...@ehatchersolutions.com> on 2007/07/10 14:46:53 UTC, 1 replies.
- [jira] Commented: (NUTCH-505) Outlink urls should be validated - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2007/07/10 15:51:04 UTC, 5 replies.
- [jira] Issue Comment Edited: (NUTCH-505) Outlink urls should be validated - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/11 08:30:04 UTC, 0 replies.
- [jira] Resolved: (NUTCH-505) Outlink urls should be validated - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/11 12:56:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-506) Nutch should delegate compression to Hadoop - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/11 14:04:07 UTC, 0 replies.
- [jira] Resolved: (NUTCH-510) IndexMerger delete working dir - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/11 17:32:04 UTC, 0 replies.
- [jira] Closed: (NUTCH-510) IndexMerger delete working dir - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/11 17:32:05 UTC, 0 replies.
- [jira] Commented: (NUTCH-510) IndexMerger delete working dir - posted by "Hudson (JIRA)" <ji...@apache.org> on 2007/07/12 08:50:04 UTC, 0 replies.
- how can i fetch a site manual - posted by Cuongnhc <cu...@gmail.com> on 2007/07/12 08:56:24 UTC, 0 replies.
- [jira] Commented: (NUTCH-506) Nutch should delegate compression to Hadoop - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/12 10:49:04 UTC, 2 replies.
- [jira] Created: (NUTCH-511) Recrawling - posted by "anuradha (JIRA)" <ji...@apache.org> on 2007/07/12 13:40:04 UTC, 0 replies.
- [jira] Created: (NUTCH-512) Search on date range - posted by "anuradha (JIRA)" <ji...@apache.org> on 2007/07/12 13:40:05 UTC, 0 replies.
- [jira] Closed: (NUTCH-511) Recrawling - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2007/07/12 17:25:05 UTC, 0 replies.
- [jira] Closed: (NUTCH-512) Search on date range - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2007/07/12 17:33:05 UTC, 0 replies.
- [jira] Created: (NUTCH-513) suffix-urlfilter.txt does not have a template - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/12 19:13:05 UTC, 0 replies.
- [jira] Closed: (NUTCH-505) Outlink urls should be validated - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/13 14:28:04 UTC, 0 replies.
- [jira] Commented: (NUTCH-513) suffix-urlfilter.txt does not have a template - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/13 14:35:05 UTC, 0 replies.
- NUTCH CONSULTANT NEEDED - posted by Luca Rondanini <lu...@translated.net> on 2007/07/13 17:18:12 UTC, 0 replies.
- running nutch of nfs - posted by prem kumar <pr...@gmail.com> on 2007/07/13 18:04:54 UTC, 0 replies.
- [jira] Resolved: (NUTCH-513) suffix-urlfilter.txt does not have a template - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/13 19:21:04 UTC, 0 replies.
- [jira] Closed: (NUTCH-513) suffix-urlfilter.txt does not have a template - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/13 19:23:04 UTC, 0 replies.
- [jira] Reopened: (NUTCH-471) Fix synchronization in NutchBean creation - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/07/13 22:58:06 UTC, 0 replies.
- Build failed in Hudson: Nutch-Nightly #149 - posted by hu...@lucene.zones.apache.org on 2007/07/14 06:08:16 UTC, 0 replies.
- [jira] Commented: (NUTCH-471) Fix synchronization in NutchBean creation - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/14 11:32:04 UTC, 1 replies.
- [jira] Created: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/14 14:10:05 UTC, 0 replies.
- [jira] Updated: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/14 14:12:04 UTC, 0 replies.
- [jira] Closed: (NUTCH-471) Fix synchronization in NutchBean creation - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/07/14 15:05:04 UTC, 0 replies.
- inject command fail on whole-web run - posted by Tsengtan A Shuy <tt...@sbcglobal.net> on 2007/07/14 21:10:47 UTC, 1 replies.
- Build failed in Hudson: Nutch-Nightly #150 - posted by hu...@lucene.zones.apache.org on 2007/07/15 06:05:41 UTC, 0 replies.
- Hudson build is back to normal: Nutch-Nightly #151 - posted by hu...@lucene.zones.apache.org on 2007/07/16 06:17:51 UTC, 0 replies.
- OOM error during parsing with nekohtml - posted by Shailendra Mudgal <mu...@gmail.com> on 2007/07/16 12:04:39 UTC, 6 replies.
- [jira] Created: (NUTCH-515) Next fetch time is set incorrectly - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/16 14:15:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-515) Next fetch time is set incorrectly - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/16 14:17:04 UTC, 0 replies.
- [jira] Commented: (NUTCH-515) Next fetch time is set incorrectly - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2007/07/16 22:34:05 UTC, 2 replies.
- [jira] Resolved: (NUTCH-515) Next fetch time is set incorrectly - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/17 08:21:04 UTC, 0 replies.
- [jira] Created: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/17 14:08:23 UTC, 0 replies.
- [jira] Commented: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/17 15:53:05 UTC, 4 replies.
- [jira] Updated: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/17 16:19:04 UTC, 1 replies.
- [jira] Resolved: (NUTCH-506) Nutch should delegate compression to Hadoop - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/17 17:18:06 UTC, 0 replies.
- [jira] Closed: (NUTCH-506) Nutch should delegate compression to Hadoop - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/17 17:20:04 UTC, 0 replies.
- no nutch script file under bin directory - posted by Tsengtan A Shuy <tt...@sbcglobal.net> on 2007/07/17 21:22:48 UTC, 7 replies.
- [jira] Created: (NUTCH-517) build encoding should be UTF-8 - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/07/18 10:09:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-517) build encoding should be UTF-8 - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/07/18 10:11:04 UTC, 0 replies.
- [jira] Created: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/07/18 10:16:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/07/18 10:18:04 UTC, 0 replies.
- [jira] Resolved: (NUTCH-517) build encoding should be UTF-8 - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/18 20:00:56 UTC, 0 replies.
- [jira] Closed: (NUTCH-517) build encoding should be UTF-8 - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/18 20:01:04 UTC, 0 replies.
- [jira] Resolved: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/18 20:05:05 UTC, 0 replies.
- [jira] Closed: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/18 20:05:05 UTC, 0 replies.
- [jira] Reopened: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2007/07/18 20:32:05 UTC, 0 replies.
- [jira] Commented: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/18 20:40:04 UTC, 8 replies.
- [jira] Created: (NUTCH-519)   prased incorrectly - posted by "Chris Hane (JIRA)" <ji...@apache.org> on 2007/07/18 23:54:06 UTC, 0 replies.
- ready for the first assignment - posted by Tsengtan A Shuy <tt...@sbcglobal.net> on 2007/07/19 00:12:04 UTC, 0 replies.
- [jira] Commented: (NUTCH-517) build encoding should be UTF-8 - posted by "Hudson (JIRA)" <ji...@apache.org> on 2007/07/19 06:27:04 UTC, 0 replies.
- resending this query on running nutch on nfs - posted by prem kumar <pr...@gmail.com> on 2007/07/19 09:31:04 UTC, 0 replies.
- [jira] Created: (NUTCH-520) A common infrastructure for different index backends - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/19 10:49:05 UTC, 0 replies.
- [jira] Updated: (NUTCH-520) A common infrastructure for different index backends - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/19 11:26:04 UTC, 2 replies.
- [jira] Created: (NUTCH-521) Modified injector to allow newly injected CrawlDatum to overwrite original - posted by "Rob Young (JIRA)" <ji...@apache.org> on 2007/07/19 11:51:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-521) Modified injector to allow newly injected CrawlDatum to overwrite original - posted by "Rob Young (JIRA)" <ji...@apache.org> on 2007/07/19 11:51:05 UTC, 0 replies.
- Looking to fix relative path issue in linkdb - posted by Robert Young <bu...@gmail.com> on 2007/07/19 12:06:31 UTC, 5 replies.
- [jira] Commented: (NUTCH-521) Modified injector to allow newly injected CrawlDatum to overwrite original - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/19 12:47:04 UTC, 0 replies.
- [jira] Created: (NUTCH-522) Use URLValidator in the Injector - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/19 13:45:15 UTC, 0 replies.
- [jira] Updated: (NUTCH-522) Use URLValidator in the Injector - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/19 13:45:16 UTC, 4 replies.
- [jira] Commented: (NUTCH-522) Use URLValidator in the Injector - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/19 15:55:08 UTC, 6 replies.
- [jira] Commented: (NUTCH-25) needs 'character encoding' detector - posted by "Doug Cook (JIRA)" <ji...@apache.org> on 2007/07/21 01:59:06 UTC, 10 replies.
- [jira] Updated: (NUTCH-25) needs 'character encoding' detector - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/21 18:02:06 UTC, 4 replies.
- [jira] Issue Comment Edited: (NUTCH-25) needs 'character encoding' detector - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/21 21:11:06 UTC, 0 replies.
- [jira] Created: (NUTCH-523) web2 searchform problems with patch - posted by "Hal Finkel (JIRA)" <ji...@apache.org> on 2007/07/22 01:58:06 UTC, 0 replies.
- [jira] Updated: (NUTCH-523) web2 searchform problems with patch - posted by "Hal Finkel (JIRA)" <ji...@apache.org> on 2007/07/22 01:58:06 UTC, 0 replies.
- [jira] Assigned: (NUTCH-439) Top Level Domains Indexing / Scoring - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/23 12:41:31 UTC, 0 replies.
- [jira] Assigned: (NUTCH-522) Use URLValidator in the Injector - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/23 12:41:31 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-520) A common infrastructure for different index backends - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/23 17:00:38 UTC, 0 replies.
- [jira] Created: (NUTCH-524) Generate Problem with Single Node - posted by "Daniel Clark (JIRA)" <ji...@apache.org> on 2007/07/23 23:27:31 UTC, 0 replies.
- [jira] Updated: (NUTCH-524) Generate Problem with Single Node - posted by "Daniel Clark (JIRA)" <ji...@apache.org> on 2007/07/23 23:29:31 UTC, 0 replies.
- searchserver failover problem - posted by Nathan Wilkinson <ex...@ausvision.com> on 2007/07/24 04:44:19 UTC, 0 replies.
- [jira] Commented: (NUTCH-524) Generate Problem with Single Node - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/24 08:49:31 UTC, 2 replies.
- [jira] Updated: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment - posted by "Vishal Shah (JIRA)" <ji...@apache.org> on 2007/07/24 09:43:31 UTC, 1 replies.
- [jira] Created: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment - posted by "Vishal Shah (JIRA)" <ji...@apache.org> on 2007/07/24 09:43:31 UTC, 0 replies.
- [jira] Commented: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/24 09:57:31 UTC, 5 replies.
- [jira] Updated: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/25 04:24:31 UTC, 0 replies.
- [jira] Created: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/25 04:24:31 UTC, 0 replies.
- [jira] Commented: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/25 08:32:31 UTC, 3 replies.
- [jira] Created: (NUTCH-527) MapWritable doesn't support all hadoops writable types - posted by "Rob Young (JIRA)" <ji...@apache.org> on 2007/07/25 13:03:31 UTC, 0 replies.
- [jira] Updated: (NUTCH-527) MapWritable doesn't support all hadoops writable types - posted by "Rob Young (JIRA)" <ji...@apache.org> on 2007/07/25 13:07:31 UTC, 1 replies.
- CrawlDbReader TopN - posted by Emmanuel <jo...@gmail.com> on 2007/07/25 13:50:36 UTC, 0 replies.
- [jira] Commented: (NUTCH-527) MapWritable doesn't support all hadoops writable types - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/25 14:39:31 UTC, 3 replies.
- [jira] Created: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/26 09:55:31 UTC, 0 replies.
- [jira] Updated: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/26 09:57:31 UTC, 0 replies.
- [jira] Resolved: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/26 10:37:32 UTC, 0 replies.
- [jira] Resolved: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/26 10:54:34 UTC, 0 replies.
- [jira] Closed: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/26 14:55:41 UTC, 0 replies.
- [jira] Closed: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/26 14:55:41 UTC, 0 replies.
- [jira] Created: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/27 05:30:03 UTC, 0 replies.
- [jira] Updated: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/27 05:30:09 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-522) Use URLValidator in the Injector - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/27 15:10:18 UTC, 0 replies.
- [jira] Created: (NUTCH-530) Add a combiner to improve performance on updatedb - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/29 10:32:52 UTC, 0 replies.
- [jira] Updated: (NUTCH-530) Add a combiner to improve performance on updatedb - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/29 10:35:52 UTC, 0 replies.
- Error indexer - posted by Le Quoc Anh <qu...@gmail.com> on 2007/07/29 11:13:54 UTC, 0 replies.
- [jira] Created: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception - posted by "Carl Cerecke (JIRA)" <ji...@apache.org> on 2007/07/29 22:55:52 UTC, 0 replies.
- [jira] Created: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/30 10:59:53 UTC, 0 replies.
- [jira] Updated: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/30 11:01:53 UTC, 2 replies.
- [jira] Created: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/30 11:53:52 UTC, 0 replies.
- [jira] Updated: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/30 11:53:53 UTC, 0 replies.
- [jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 12:48:53 UTC, 4 replies.
- [jira] Updated: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 12:54:53 UTC, 0 replies.
- [jira] Commented: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 12:56:53 UTC, 1 replies.
- [jira] Commented: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 13:00:56 UTC, 2 replies.
- [jira] Commented: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 13:02:52 UTC, 2 replies.
- [jira] Commented: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 13:06:52 UTC, 0 replies.
- [jira] Commented: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 13:08:52 UTC, 0 replies.
- [jira] Updated: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child. - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 20:59:52 UTC, 0 replies.
- [jira] Updated: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 20:59:53 UTC, 1 replies.
- [jira] Resolved: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 21:03:52 UTC, 0 replies.
- [jira] Closed: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/30 21:03:53 UTC, 0 replies.
- [jira] Created: (NUTCH-534) SegmentMerger: add -normalize option - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/31 12:35:53 UTC, 0 replies.
- [jira] Updated: (NUTCH-534) SegmentMerger: add -normalize option - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/07/31 12:35:53 UTC, 0 replies.
- Pages in UTF-16 - posted by Blaž Smolnikar <bl...@vizija.si> on 2007/07/31 13:23:26 UTC, 0 replies.
- [jira] Resolved: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/31 14:07:56 UTC, 0 replies.
- [jira] Closed: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/31 14:12:53 UTC, 0 replies.
- [jira] Updated: (NUTCH-442) Integrate Solr/Nutch - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/31 15:19:53 UTC, 1 replies.
- [jira] Closed: (NUTCH-520) A common infrastructure for different index backends - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/07/31 15:21:53 UTC, 0 replies.