- Re: Release 1.0? - posted by dealmaker <vi...@gmail.com> on 2009/03/01 07:22:00 UTC, 2 replies.
- [jira] Commented: (NUTCH-419) unavailable robots.txt kills fetch - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/03/01 08:32:13 UTC, 1 replies.
- [jira] Created: (NUTCH-708) NutchBean: OOM due to searcher.max.hits and dedup. - posted by "Aaron Binns (JIRA)" <ji...@apache.org> on 2009/03/01 21:13:12 UTC, 0 replies.
- [jira] Commented: (NUTCH-705) parse-rtf plugin - posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org> on 2009/03/02 04:05:12 UTC, 1 replies.
- How to make parse-xml plugin (NUTCH-185) compatible with the latest trunk ? - posted by Gopikrishnan Kookkal <go...@gmail.com> on 2009/03/02 08:23:15 UTC, 0 replies.
- [jira] Updated: (NUTCH-700) Neko1.9.11 goes into a loop - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/02 09:28:12 UTC, 0 replies.
- Re: planning for nutch-1.0-rc1 - posted by Sami Siren <ss...@gmail.com> on 2009/03/02 09:32:52 UTC, 10 replies.
- [jira] Closed: (NUTCH-419) unavailable robots.txt kills fetch - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/03/02 10:12:12 UTC, 0 replies.
- [jira] Resolved: (NUTCH-700) Neko1.9.11 goes into a loop - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/02 11:22:13 UTC, 0 replies.
- [jira] Resolved: (NUTCH-669) Consolidate code for Fetcher and Fetcher2 - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/02 13:32:17 UTC, 3 replies.
- Job offer for Nutch-Lucene Programmer - posted by Wolfgang Sander-Beuermann <ws...@rrzn.uni-hannover.de> on 2009/03/02 17:45:26 UTC, 0 replies.
- Re: [jira] Resolved: (NUTCH-669) Consolidate code for Fetcher and Fetcher2 - posted by Todd Lipcon <tl...@gmail.com> on 2009/03/02 18:33:55 UTC, 0 replies.
- Parsing, Indexing multiple values (of same type) per document - Nutch-0.9 - posted by Stefan Dlugolinsky <s....@gmail.com> on 2009/03/03 05:08:38 UTC, 0 replies.
- Build failed in Hudson: Nutch-trunk #741 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2009/03/03 05:18:06 UTC, 0 replies.
- Re: Is there the functions of "More Like This" and "Spell Checking"? - posted by dealmaker <vi...@gmail.com> on 2009/03/03 09:08:35 UTC, 4 replies.
- [jira] Updated: (NUTCH-650) Hbase Integration - posted by "Andrew McCall (JIRA)" <ji...@apache.org> on 2009/03/03 12:42:56 UTC, 7 replies.
- [jira] Created: (NUTCH-709) JSParseFilter gets into an infinate loop and ets all the stack - posted by "Tim Hawkins (JIRA)" <ji...@apache.org> on 2009/03/03 14:26:56 UTC, 0 replies.
- [jira] Commented: (NUTCH-709) JSParseFilter gets into an infinate loop and ets all the stack - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/03/03 14:30:56 UTC, 7 replies.
- [jira] Updated: (NUTCH-709) JSParseFilter gets into an infinate loop and ets all the stack - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/03/03 15:30:57 UTC, 0 replies.
- site: operator with no query term - posted by Frank McCown <fm...@harding.edu> on 2009/03/03 15:39:24 UTC, 2 replies.
- [jira] Created: (NUTCH-710) Support for rel="canonical" attribute - posted by "Frank McCown (JIRA)" <ji...@apache.org> on 2009/03/03 15:50:56 UTC, 0 replies.
- Hudson build is back to normal: Nutch-trunk #742 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2009/03/04 05:18:41 UTC, 0 replies.
- [jira] Commented: (NUTCH-669) Consolidate code for Fetcher and Fetcher2 - posted by "Hudson (JIRA)" <ji...@apache.org> on 2009/03/04 05:19:56 UTC, 0 replies.
- [jira] Commented: (NUTCH-700) Neko1.9.11 goes into a loop - posted by "Hudson (JIRA)" <ji...@apache.org> on 2009/03/04 05:19:56 UTC, 0 replies.
- [jira] Created: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1 - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/03/04 11:53:56 UTC, 0 replies.
- [jira] Updated: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1 - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/03/04 11:53:56 UTC, 2 replies.
- [jira] Commented: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1 - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/04 12:09:56 UTC, 1 replies.
- [jira] Resolved: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1 - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/03/04 16:11:57 UTC, 1 replies.
- [jira] Created: (NUTCH-712) ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/03/06 12:57:56 UTC, 0 replies.
- [jira] Updated: (NUTCH-712) ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/03/06 12:59:56 UTC, 2 replies.
- [jira] Commented: (NUTCH-712) ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/03/06 13:53:56 UTC, 0 replies.
- [Nutch Wiki] Update of "NewScoringIndexingExample" by DennisKubes - posted by Apache Wiki <wi...@apache.org> on 2009/03/06 18:07:00 UTC, 2 replies.
- [Nutch Wiki] Update of "FrontPage" by DennisKubes - posted by Apache Wiki <wi...@apache.org> on 2009/03/06 18:09:34 UTC, 0 replies.
- [jira] Updated: (NUTCH-684) Dedup support for Solr - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2009/03/08 01:02:56 UTC, 0 replies.
- [VOTE] Release Apache Nutch 1.0 - posted by Sami Siren <ss...@gmail.com> on 2009/03/08 19:25:37 UTC, 29 replies.
- NUTCH-684 [was: Re: [VOTE] Release Apache Nutch 1.0] - posted by Sami Siren <ss...@gmail.com> on 2009/03/09 10:05:17 UTC, 3 replies.
- [jira] Commented: (NUTCH-684) Dedup support for Solr - posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org> on 2009/03/09 16:26:51 UTC, 2 replies.
- [jira] Created: (NUTCH-713) Config options for webgraph Scoring not documented - posted by "Eric J. Christeson (JIRA)" <ji...@apache.org> on 2009/03/09 16:42:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-713) Config options for webgraph Scoring not documented - posted by "Eric J. Christeson (JIRA)" <ji...@apache.org> on 2009/03/09 16:52:50 UTC, 0 replies.
- [jira] Closed: (NUTCH-684) Dedup support for Solr - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2009/03/09 18:35:50 UTC, 0 replies.
- Nutch ML cleanup - posted by Otis Gospodnetic <og...@yahoo.com> on 2009/03/09 22:07:07 UTC, 5 replies.
- [jira] Created: (NUTCH-714) Need a SFTP and SCP Protocol Handler - posted by "Sanjoy Ghosh (JIRA)" <ji...@apache.org> on 2009/03/10 01:41:50 UTC, 0 replies.
- [jira] Assigned: (NUTCH-714) Need a SFTP and SCP Protocol Handler - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2009/03/10 02:05:50 UTC, 0 replies.
- [jira] Commented: (NUTCH-714) Need a SFTP and SCP Protocol Handler - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2009/03/10 02:05:50 UTC, 0 replies.
- [jira] Created: (NUTCH-715) Subcollection plugin doesn't work with default subcollections.xml file - posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org> on 2009/03/10 06:54:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-715) Subcollection plugin doesn't work with default subcollections.xml file - posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org> on 2009/03/10 07:00:50 UTC, 3 replies.
- [jira] Assigned: (NUTCH-715) Subcollection plugin doesn't work with default subcollections.xml file - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/10 07:40:50 UTC, 0 replies.
- [jira] Resolved: (NUTCH-715) Subcollection plugin doesn't work with default subcollections.xml file - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/10 08:08:50 UTC, 0 replies.
- [jira] Created: (NUTCH-716) Make subcollection index filed multivalued - posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org> on 2009/03/10 09:42:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-716) Make subcollection index filed multivalued - posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org> on 2009/03/10 09:44:50 UTC, 0 replies.
- Moving Nutch parsers to Tika - posted by Andrzej Bialecki <ab...@getopt.org> on 2009/03/10 10:57:36 UTC, 2 replies.
- [jira] Created: (NUTCH-717) Make Nutch Solr integration easier - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/10 10:58:50 UTC, 0 replies.
- Use of general@l.a.o for... - posted by Grant Ingersoll <gs...@apache.org> on 2009/03/10 14:56:01 UTC, 0 replies.
- (Unknown) - posted by Agnieszka Zbrzezny <ag...@gmail.com> on 2009/03/10 16:15:56 UTC, 0 replies.
- [jira] Commented: (NUTCH-715) Subcollection plugin doesn't work with default subcollections.xml file - posted by "Hudson (JIRA)" <ji...@apache.org> on 2009/03/11 05:20:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-479) Support for OR queries - posted by "Robert Buccigrossi (JIRA)" <ji...@apache.org> on 2009/03/11 22:18:51 UTC, 0 replies.
- PowerPoint Parsing Exception - posted by "Bullard, Luke" <Lu...@pfizer.com> on 2009/03/12 10:36:11 UTC, 0 replies.
- [jira] Created: (NUTCH-718) urlfilter-subnets plugin - posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org> on 2009/03/12 10:50:53 UTC, 0 replies.
- [jira] Updated: (NUTCH-718) urlfilter-subnets plugin - posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org> on 2009/03/12 10:54:50 UTC, 1 replies.
- [jira] Created: (NUTCH-719) fetchQueues.totalSize incorrect in Fetcher2 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/03/12 15:42:50 UTC, 0 replies.
- [jira] Created: (NUTCH-720) site: search operator with no query term - posted by "Frank McCown (JIRA)" <ji...@apache.org> on 2009/03/12 19:54:50 UTC, 0 replies.
- [Nutch Wiki] Update of "NutchTutorial" by FrankMcCown - posted by Apache Wiki <wi...@apache.org> on 2009/03/12 20:23:18 UTC, 0 replies.
- [jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration - posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org> on 2009/03/16 06:17:50 UTC, 0 replies.
- robots.txt redirect (NUTCH-124) - posted by Mathijs Homminga <ma...@gmail.com> on 2009/03/16 14:13:01 UTC, 1 replies.
- [jira] Created: (NUTCH-721) Fetcher2 Slow - posted by "Roger Dunk (JIRA)" <ji...@apache.org> on 2009/03/17 21:29:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-721) Fetcher2 Slow - posted by "Roger Dunk (JIRA)" <ji...@apache.org> on 2009/03/17 21:35:50 UTC, 0 replies.
- MergeSegments Error. - posted by Armando Gonçalves <ma...@gmail.com> on 2009/03/19 02:57:46 UTC, 0 replies.
- [jira] Created: (NUTCH-722) Nutch contains jars that we cannot redistribute - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 14:19:50 UTC, 0 replies.
- [jira] Created: (NUTCH-723) LICENCE.txt is lacking info that should be there - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 14:23:50 UTC, 0 replies.
- [jira] Created: (NUTCH-725) NOTICE.txt is lacking info that should be there - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 14:25:50 UTC, 0 replies.
- [jira] Created: (NUTCH-724) Drop the JAI libraries - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2009/03/19 14:25:50 UTC, 0 replies.
- [jira] Commented: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment - posted by "minhthucpham (JIRA)" <ji...@apache.org> on 2009/03/19 14:25:50 UTC, 0 replies.
- [jira] Created: (NUTCH-726) README.txt is lacking info that should be there - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 14:27:51 UTC, 0 replies.
- [jira] Commented: (NUTCH-722) Nutch contains jars that we cannot redistribute - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/03/19 14:31:56 UTC, 6 replies.
- [jira] Created: (NUTCH-727) Add KEYS file to release artifact - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 14:37:50 UTC, 0 replies.
- [DISCUSS] contents of nutch release artifact - posted by Sami Siren <ss...@gmail.com> on 2009/03/19 14:48:21 UTC, 16 replies.
- [jira] Resolved: (NUTCH-726) README.txt is lacking info that should be there - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 14:49:50 UTC, 0 replies.
- [jira] Resolved: (NUTCH-724) Drop the JAI libraries - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 14:51:50 UTC, 0 replies.
- [jira] Created: (NUTCH-728) Improve nutch release packaging - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 18:14:50 UTC, 0 replies.
- [jira] Resolved: (NUTCH-725) NOTICE.txt is lacking info that should be there - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 20:50:50 UTC, 0 replies.
- [jira] Resolved: (NUTCH-723) LICENCE.txt is lacking info that should be there - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 22:11:50 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-723) LICENCE.txt is lacking info that should be there - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 22:13:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-728) Improve nutch release packaging - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 22:23:50 UTC, 0 replies.
- [jira] Resolved: (NUTCH-727) Add KEYS file to release artifact - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/19 22:35:50 UTC, 0 replies.
- [jira] Commented: (NUTCH-725) NOTICE.txt is lacking info that should be there - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2009/03/19 23:13:50 UTC, 1 replies.
- [jira] Commented: (NUTCH-723) LICENCE.txt is lacking info that should be there - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2009/03/19 23:17:50 UTC, 2 replies.
- [jira] Commented: (NUTCH-727) Add KEYS file to release artifact - posted by "Hudson (JIRA)" <ji...@apache.org> on 2009/03/20 05:15:50 UTC, 0 replies.
- [jira] Commented: (NUTCH-726) README.txt is lacking info that should be there - posted by "Hudson (JIRA)" <ji...@apache.org> on 2009/03/20 05:15:50 UTC, 0 replies.
- [jira] Commented: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum - posted by "Edwin Chu (JIRA)" <ji...@apache.org> on 2009/03/20 08:00:50 UTC, 0 replies.
- [jira] Commented: (NUTCH-728) Improve nutch release packaging - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2009/03/20 10:32:57 UTC, 2 replies.
- Nutch on Eclipse How To? - posted by Sherjeel Niazi <sh...@softmatics.com> on 2009/03/20 13:27:21 UTC, 3 replies.
- [Nutch Wiki] Update of "RunNutchInEclipse0.9" by BartoszGadzimski - posted by Apache Wiki <wi...@apache.org> on 2009/03/20 15:11:14 UTC, 0 replies.
- Problems compiling Nutch in Eclipse - posted by "Rodrigo Reyes C." <rr...@corbitecso.com> on 2009/03/21 02:02:03 UTC, 6 replies.
- [jira] Resolved: (NUTCH-722) Nutch contains jars that we cannot redistribute - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2009/03/23 07:42:50 UTC, 0 replies.
- NUTCH-722 is resolved - posted by Sami Siren <ss...@gmail.com> on 2009/03/23 07:50:07 UTC, 1 replies.
- How do I prioritise URLs to be fetched? - posted by "Rodrigo Reyes C." <ro...@avity.com> on 2009/03/23 20:11:27 UTC, 0 replies.
- Nutch and Lucene payload - posted by Юрий Михеев <yu...@gmail.com> on 2009/03/24 15:45:53 UTC, 0 replies.
- [Nutch Wiki] Update of "HardwareRequirements" by NycoNyco - posted by Apache Wiki <wi...@apache.org> on 2009/03/24 17:33:41 UTC, 0 replies.
- [Nutch Wiki] Update of "Features" by NycoNyco - posted by Apache Wiki <wi...@apache.org> on 2009/03/24 17:52:32 UTC, 0 replies.
- Problems writing QueryFilter plugin - posted by Tomas Ukkonen <to...@helsinki.fi> on 2009/03/24 18:04:35 UTC, 0 replies.
- [jira] Updated: (NUTCH-714) Need a SFTP and SCP Protocol Handler - posted by "Sanjoy Ghosh (JIRA)" <ji...@apache.org> on 2009/03/24 18:27:53 UTC, 1 replies.
- Announce: New PMC member Dennis Kubes - posted by Andrzej Bialecki <ab...@getopt.org> on 2009/03/25 11:24:55 UTC, 6 replies.
- [jira] Resolved: (NUTCH-720) site: search operator with no query term - posted by "Frank McCown (JIRA)" <ji...@apache.org> on 2009/03/25 14:20:59 UTC, 0 replies.
- [jira] Closed: (NUTCH-291) OpenSearchServlet should return "date" as well as "lastModified" - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/03/25 15:50:56 UTC, 0 replies.
- [jira] Created: (NUTCH-729) NPE in FieldIndexer when BasicFields url doesn't exist - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/03/25 15:54:51 UTC, 0 replies.
- [jira] Updated: (NUTCH-729) NPE in FieldIndexer when BasicFields url doesn't exist - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/03/25 16:00:59 UTC, 0 replies.
- [jira] Created: (NUTCH-730) NPE in LinkRank if no nodes with which to create the WebGraph - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/03/26 04:14:51 UTC, 0 replies.
- [jira] Updated: (NUTCH-730) NPE in LinkRank if no nodes with which to create the WebGraph - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/03/26 04:16:51 UTC, 1 replies.
- [jira] Commented: (NUTCH-706) Url regex normalizer - posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org> on 2009/03/26 08:35:50 UTC, 0 replies.
- LinkRank why 10 iterations? - posted by Bartosz Gadzimski <ba...@o2.pl> on 2009/03/27 15:15:56 UTC, 1 replies.
- [Nutch Wiki] Update of "PublicServers" by KevinReader - posted by Apache Wiki <wi...@apache.org> on 2009/03/28 18:08:27 UTC, 0 replies.
- [ANNOUNCE] Apache Nutch 1.0 - posted by Sami Siren <ss...@gmail.com> on 2009/03/28 20:53:52 UTC, 0 replies.
- Nutch Topical / Focused Crawl - posted by MyD <My...@googlemail.com> on 2009/03/29 11:39:39 UTC, 1 replies.
- Running Invertlinks twice - posted by krishsoumyacom <kr...@soumya.com> on 2009/03/31 01:49:02 UTC, 0 replies.
- Where to find Lucene Source code?? - posted by Sherjeel Niazi <sh...@softmatics.com> on 2009/03/31 09:37:16 UTC, 0 replies.
- [jira] Updated: (NUTCH-578) URL fetched with 403 is generated over and over again - posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org> on 2009/03/31 11:43:51 UTC, 0 replies.
- [Nutch Wiki] Update of "HttpAuthenticationSchemes" by susam - posted by Apache Wiki <wi...@apache.org> on 2009/03/31 18:54:09 UTC, 0 replies.