- How to create patch? - posted by Manoharam Reddy <ma...@gmail.com> on 2007/06/01 08:12:47 UTC, 2 replies.
- [jira] Commented: (NUTCH-392) OutputFormat implementations should pass on Progressable - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/01 09:53:15 UTC, 13 replies.
- Plugins and Thread Safety - posted by Briggs <ac...@gmail.com> on 2007/06/01 18:16:27 UTC, 7 replies.
- [PATCH] Moving HitDetails construction to a HitDetails constructor (v2). - posted by Nicolás Lichtmaier <ni...@reloco.com.ar> on 2007/06/01 22:38:22 UTC, 2 replies.
- [jira] Created: (NUTCH-496) ConcurrentModificationException can be thrown when getSorted() is called. - posted by "Marc Miller (JIRA)" <ji...@apache.org> on 2007/06/04 18:33:35 UTC, 0 replies.
- [jira] Updated: (NUTCH-496) ConcurrentModificationException can be thrown when getSorted() is called. - posted by "Marc Miller (JIRA)" <ji...@apache.org> on 2007/06/04 18:37:35 UTC, 2 replies.
- [jira] Commented: (NUTCH-496) ConcurrentModificationException can be thrown when getSorted() is called. - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2007/06/04 18:56:36 UTC, 6 replies.
- [Fwd: Nutch 0.9 and Crawl-Delay] - posted by Doug Cutting <cu...@apache.org> on 2007/06/04 22:25:03 UTC, 1 replies.
- Build failed in Hudson: Nutch-Nightly #108 - posted by hu...@lucene.zones.apache.org on 2007/06/06 08:54:34 UTC, 0 replies.
- [jira] Commented: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object - posted by "Gal Nitzan (JIRA)" <ji...@apache.org> on 2007/06/06 14:04:26 UTC, 3 replies.
- [jira] Commented: (NUTCH-466) Flexible segment format - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/06 14:30:26 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-466) Flexible segment format - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/06 15:08:25 UTC, 0 replies.
- [jira] Created: (NUTCH-497) Extreme Nested Tags causes StackOverflowException in DomContentUtils...Spider Trap - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/06/07 01:34:26 UTC, 0 replies.
- [jira] Updated: (NUTCH-497) Extreme Nested Tags causes StackOverflowException in DomContentUtils...Spider Trap - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/06/07 01:36:26 UTC, 9 replies.
- Lock file problems... - posted by Briggs <ac...@gmail.com> on 2007/06/07 17:20:36 UTC, 1 replies.
- Hudson build is back to normal: Nutch-Nightly #109 - posted by hu...@lucene.zones.apache.org on 2007/06/07 18:44:16 UTC, 0 replies.
- Re: Plugins initialized all the time! - posted by Doğacan Güney <do...@gmail.com> on 2007/06/08 17:30:16 UTC, 2 replies.
- [jira] Commented: (NUTCH-356) Plugin repository cache can lead to memory leak - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/08 17:37:26 UTC, 1 replies.
- Re: Loading mechanism of plugin classes and singleton objects - posted by Enzo Michelangeli <en...@gmail.com> on 2007/06/09 09:53:36 UTC, 0 replies.
- Nutch search algoritm - posted by wilt <vh...@lohika.com> on 2007/06/11 13:02:23 UTC, 0 replies.
- Welcome Doğacan as Nutch committer - posted by Andrzej Bialecki <ab...@getopt.org> on 2007/06/11 22:33:06 UTC, 4 replies.
- [jira] Created: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation - posted by "Espen Amble Kolstad (JIRA)" <ji...@apache.org> on 2007/06/14 10:07:25 UTC, 0 replies.
- [jira] Updated: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation - posted by "Espen Amble Kolstad (JIRA)" <ji...@apache.org> on 2007/06/14 10:28:26 UTC, 2 replies.
- [jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility - posted by "nutch.newbie (JIRA)" <ji...@apache.org> on 2007/06/15 08:16:26 UTC, 8 replies.
- [jira] Commented: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/06/15 09:58:26 UTC, 9 replies.
- [jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/16 11:36:26 UTC, 1 replies.
- [jira] Resolved: (NUTCH-495) Unnecessary delays in Fetcher2 - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/16 12:36:25 UTC, 0 replies.
- [jira] Created: (NUTCH-499) Refactor LinkDb and LinkDbMerger to reuse code - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/16 13:01:25 UTC, 0 replies.
- [jira] Updated: (NUTCH-499) Refactor LinkDb and LinkDbMerger to reuse code - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/16 13:07:25 UTC, 0 replies.
- [jira] Assigned: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/16 13:15:26 UTC, 0 replies.
- [jira] Updated: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/17 11:04:26 UTC, 0 replies.
- [jira] Closed: (NUTCH-270) Apply just the applicable portions of the patch to protocol.httpclient.Http.java - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/17 11:09:26 UTC, 0 replies.
- [jira] Closed: (NUTCH-476) Would like to add a field to the document class for its MD5 signature - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/17 11:18:26 UTC, 0 replies.
- [jira] Resolved: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/06/17 19:21:26 UTC, 0 replies.
- [jira] Closed: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/06/17 19:21:26 UTC, 0 replies.
- [jira] Work started: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/06/17 21:15:26 UTC, 0 replies.
- [jira] Updated: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/06/17 22:07:26 UTC, 0 replies.
- [jira] Resolved: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/17 22:29:26 UTC, 1 replies.
- [jira] Created: (NUTCH-500) Add hadoop masters configuration file into conf folder - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/06/18 08:50:26 UTC, 0 replies.
- upgrade to hadoop-0.13? - posted by Doğacan Güney <do...@gmail.com> on 2007/06/18 10:20:17 UTC, 2 replies.
- [jira] Closed: (NUTCH-492) java.lang.OutOfMemoryError while indexing. - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/18 10:57:26 UTC, 0 replies.
- [jira] Closed: (NUTCH-493) contentType parse not correctly,,,,got empty content using readseg -get - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/18 11:01:32 UTC, 0 replies.
- Build failed in Hudson: Nutch-Nightly #120 - posted by hu...@lucene.zones.apache.org on 2007/06/18 11:48:32 UTC, 0 replies.
- [jira] Created: (NUTCH-501) implementing a different caching mechanism for objects - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/18 14:04:26 UTC, 0 replies.
- [jira] Updated: (NUTCH-501) implementing a different caching mechanism for objects - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/18 14:07:27 UTC, 0 replies.
- [jira] Commented: (NUTCH-501) implementing a different caching mechanism for objects - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/06/18 14:21:26 UTC, 1 replies.
- [jira] Updated: (NUTCH-501) Implement a different caching mechanism for objects cached in configuration - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/18 15:35:25 UTC, 2 replies.
- [jira] Commented: (NUTCH-501) Implement a different caching mechanism for objects cached in configuration - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/18 15:52:25 UTC, 4 replies.
- Hudson build is back to normal: Nutch-Nightly #121 - posted by hu...@lucene.zones.apache.org on 2007/06/18 18:09:02 UTC, 0 replies.
- [jira] Resolved: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/18 20:15:26 UTC, 0 replies.
- [jira] Created: (NUTCH-502) Bug in SegmentReader causes infinite loop - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/19 08:01:41 UTC, 0 replies.
- [jira] Updated: (NUTCH-502) Bug in SegmentReader causes infinite loop - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/19 08:03:26 UTC, 0 replies.
- Build failed in Hudson: Nutch-Nightly #122 - posted by hu...@lucene.zones.apache.org on 2007/06/19 09:00:15 UTC, 0 replies.
- [jira] Resolved: (NUTCH-502) Bug in SegmentReader causes infinite loop - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/19 11:22:26 UTC, 0 replies.
- [jira] Resolved: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/06/19 16:03:27 UTC, 0 replies.
- [jira] Closed: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/06/19 16:03:27 UTC, 0 replies.
- Build failed in Hudson: Nutch-Nightly #123 - posted by hu...@lucene.zones.apache.org on 2007/06/20 09:00:20 UTC, 10 replies.
- [jira] Commented: (NUTCH-497) Extreme Nested Tags causes StackOverflowException in DomContentUtils...Spider Trap - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/06/20 18:45:26 UTC, 5 replies.
- Found the bug in Generator when number of URLs is small - posted by Vishal Shah <vi...@rediff.co.in> on 2007/06/21 08:43:30 UTC, 2 replies.
- Hudson build is back to normal: Nutch-Nightly #124 - posted by hu...@lucene.zones.apache.org on 2007/06/21 09:07:08 UTC, 0 replies.
- [jira] Created: (NUTCH-503) Generator exits incorrectly for small fetchlists - posted by "Vishal Shah (JIRA)" <ji...@apache.org> on 2007/06/21 09:39:25 UTC, 0 replies.
- [jira] Updated: (NUTCH-503) Generator exits incorrectly for small fetchlists - posted by "Vishal Shah (JIRA)" <ji...@apache.org> on 2007/06/21 10:07:26 UTC, 1 replies.
- http.content.limit not respected when the Content-Type header has charset attributes - posted by Vishal Shah <vi...@rediff.co.in> on 2007/06/21 12:06:27 UTC, 0 replies.
- [jira] Commented: (NUTCH-471) Fix synchronization in NutchBean creation - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/21 14:29:26 UTC, 1 replies.
- [jira] Commented: (NUTCH-503) Generator exits incorrectly for small fetchlists - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/06/21 16:44:28 UTC, 7 replies.
- [jira] Resolved: (NUTCH-471) Fix synchronization in NutchBean creation - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/21 17:18:26 UTC, 0 replies.
- where to put hadoop native lib in tomcat? - posted by qi wu <ch...@gmail.com> on 2007/06/22 10:11:39 UTC, 0 replies.
- [jira] Created: (NUTCH-504) NUTCH-443 broke parsing during fetching - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/22 10:30:25 UTC, 0 replies.
- [jira] Updated: (NUTCH-504) NUTCH-443 broke parsing during fetching - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/22 10:32:25 UTC, 1 replies.
- [jira] Commented: (NUTCH-504) NUTCH-443 broke parsing during fetching - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/22 10:34:25 UTC, 2 replies.
- [jira] Commented: (NUTCH-465) I download nutch 0.9 used tar zxvf nutch-0.9.tar.gz at last A lone zero block - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/22 10:44:26 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-503) Generator exits incorrectly for small fetchlists - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/22 10:59:25 UTC, 0 replies.
- [jira] Commented: (NUTCH-479) Support for OR queries - posted by "Rob Young (JIRA)" <ji...@apache.org> on 2007/06/22 14:22:26 UTC, 2 replies.
- [jira] Commented: (NUTCH-468) Scoring filter should distribute score to all outlinks at once - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/22 16:30:26 UTC, 1 replies.
- Build failed in Hudson: Nutch-Nightly #126 - posted by hu...@lucene.zones.apache.org on 2007/06/23 09:00:16 UTC, 0 replies.
- [jira] Commented: (NUTCH-25) needs 'character encoding' detector - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/23 13:06:26 UTC, 0 replies.
- [jira] Created: (NUTCH-505) Outlink urls should be validated - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/23 22:15:25 UTC, 0 replies.
- [jira] Updated: (NUTCH-505) Outlink urls should be validated - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/23 22:21:25 UTC, 1 replies.
- Hudson build is back to normal: Nutch-Nightly #127 - posted by hu...@lucene.zones.apache.org on 2007/06/24 09:03:34 UTC, 0 replies.
- [jira] Resolved: (NUTCH-468) Scoring filter should distribute score to all outlinks at once - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/24 11:30:26 UTC, 0 replies.
- [jira] Resolved: (NUTCH-504) NUTCH-443 broke parsing during fetching - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/24 12:05:26 UTC, 0 replies.
- [jira] Updated: (NUTCH-356) Plugin repository cache can lead to memory leak - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/24 21:05:26 UTC, 0 replies.
- [jira] Commented: (NUTCH-505) Outlink urls should be validated - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/25 10:09:26 UTC, 1 replies.
- [jira] Resolved: (NUTCH-497) Extreme Nested Tags causes StackOverflowException in DomContentUtils...Spider Trap - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/06/26 05:35:26 UTC, 0 replies.
- [jira] Closed: (NUTCH-497) Extreme Nested Tags causes StackOverflowException in DomContentUtils...Spider Trap - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/06/26 05:35:26 UTC, 0 replies.
- Re: svn commit: r550669 - in /lucene/nutch/trunk/src: java/org/apache/nutch/util/ plugin/languageidentifier/src/java/org/apache/nutch/analysis/lang/ plugin/parse-html/src/java/org/apache/nutch/parse/html/ test/org/apache/nutch/fetcher/ testresources/fetch-... - posted by Chris Mattmann <ch...@jpl.nasa.gov> on 2007/06/26 06:40:24 UTC, 2 replies.
- [jira] Commented: (NUTCH-499) Refactor LinkDb and LinkDbMerger to reuse code - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/26 14:48:25 UTC, 2 replies.
- [jira] Updated: (NUTCH-434) Replace usage of ObjectWritable with something based on GenericWritable - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/26 15:26:26 UTC, 1 replies.
- Re-crawling Problem - posted by Luca Rondanini <lu...@translated.net> on 2007/06/26 17:37:08 UTC, 0 replies.
- [jira] Commented: (NUTCH-434) Replace usage of ObjectWritable with something based on GenericWritable - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2007/06/26 17:51:25 UTC, 2 replies.
- NUTCH-119 :: how hard to fix - posted by Kai_testing Middleton <ka...@yahoo.com> on 2007/06/27 02:49:44 UTC, 3 replies.
- [jira] Commented: (NUTCH-289) CrawlDatum should store IP address - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/27 08:39:27 UTC, 0 replies.
- JIRA email question - posted by Doğacan Güney <do...@gmail.com> on 2007/06/27 09:02:32 UTC, 1 replies.
- [jira] Closed: (NUTCH-434) Replace usage of ObjectWritable with something based on GenericWritable - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/27 09:07:29 UTC, 0 replies.
- [jira] Resolved: (NUTCH-434) Replace usage of ObjectWritable with something based on GenericWritable - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/27 09:07:29 UTC, 0 replies.
- [jira] Resolved: (NUTCH-499) Refactor LinkDb and LinkDbMerger to reuse code - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/27 10:40:25 UTC, 0 replies.
- [jira] Closed: (NUTCH-499) Refactor LinkDb and LinkDbMerger to reuse code - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/27 10:40:26 UTC, 0 replies.
- [jira] Updated: (NUTCH-479) Support for OR queries - posted by "Rob Young (JIRA)" <ji...@apache.org> on 2007/06/27 12:58:26 UTC, 0 replies.
- [jira] Closed: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/27 14:47:26 UTC, 0 replies.
- [jira] Resolved: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/27 14:47:26 UTC, 0 replies.
- [jira] Commented: (NUTCH-474) Fetcher2 sets server-delay and blocking checks incorrectly - posted by "Hudson (JIRA)" <ji...@apache.org> on 2007/06/28 09:03:47 UTC, 1 replies.
- [jira] Updated: (NUTCH-392) OutputFormat implementations should pass on Progressable - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/28 17:59:26 UTC, 0 replies.
- problem with nutch 0.8.1 compile - posted by Tsengtan A Shuy <tt...@sbcglobal.net> on 2007/06/28 18:37:24 UTC, 2 replies.
- [jira] Created: (NUTCH-506) Nutch should delegate compression to Hadoop - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/29 14:46:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-506) Nutch should delegate compression to Hadoop - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/29 14:48:04 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-506) Nutch should delegate compression to Hadoop - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/06/29 14:51:04 UTC, 0 replies.
- problem running "bin/nutch crawl urls -dir crawl -depth 3 -topN 50" command - posted by Tsengtan A Shuy <tt...@sbcglobal.net> on 2007/06/29 18:47:14 UTC, 1 replies.
- Fwd: failed to subscribe 'nutch-user' maillist - posted by Oscar <ro...@gmail.com> on 2007/06/30 12:58:39 UTC, 1 replies.