- Bug in index-more plugin? - posted by "yoursoft@freemail.hu" <yo...@freemail.hu> on 2005/07/01 10:59:05 UTC, 4 replies.
- [jira] Created: (NUTCH-65) index-more plugin can't parse large set of modification-date - posted by "Lutischán Ferenc (JIRA)" <ji...@apache.org> on 2005/07/01 11:55:59 UTC, 0 replies.
- [jira] Commented: (NUTCH-65) index-more plugin can't parse large set of modification-date - posted by "Jerome Charron (JIRA)" <ji...@apache.org> on 2005/07/01 12:50:57 UTC, 3 replies.
- [jira] Closed: (NUTCH-60) Bad language identifier plugin performances - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/02 21:32:10 UTC, 0 replies.
- [jira] Closed: (NUTCH-57) text and html files unrecognized - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/02 21:43:10 UTC, 0 replies.
- [jira] Closed: (NUTCH-27) Patch to get a status of running Fetcher - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/02 21:54:09 UTC, 0 replies.
- [jira] Closed: (NUTCH-32) Nutch Webapp could only be deployed on root namespace - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/02 22:26:10 UTC, 0 replies.
- [jira] Created: (NUTCH-66) Cookies are not being read properly - posted by "CC Chaman (JIRA)" <ji...@apache.org> on 2005/07/02 22:37:09 UTC, 0 replies.
- [jira] Closed: (NUTCH-56) Crawling sites with 403 Forbidden robots.txt - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/02 22:48:10 UTC, 0 replies.
- both html parser have bug with javascript - posted by "Ilia S. Yatsenko" <sh...@yandex.ru> on 2005/07/03 17:05:57 UTC, 8 replies.
- Re: Why Crawl failed to fetch so many pages? - posted by Nutch开发邮件 <pr...@gmail.com> on 2005/07/04 05:18:03 UTC, 0 replies.
- [jira] Created: (NUTCH-67) I want crawl the websites including news.yahoo.com,game.yahoo.com,blog.yahoo.com,etc! - posted by "zhangjin (JIRA)" <ji...@apache.org> on 2005/07/04 05:42:10 UTC, 1 reply.
- hits.getTotal() - posted by "Ilia S. Yatsenko" <sh...@yandex.ru> on 2005/07/04 11:54:24 UTC, 1 reply.
- Problems with Fetcher threads? - posted by Jakob Heidebrecht <Ja...@gmx.de> on 2005/07/04 13:36:06 UTC, 1 reply.
- Re: [jira] Created: (NUTCH-67) I want crawl the websites including news.yahoo.com,game.yahoo.com,blog.yahoo.com,etc! - posted by Nutch开发邮件 <pr...@gmail.com> on 2005/07/04 18:00:43 UTC, 1 reply.
- [jira] Commented: (NUTCH-66) Cookies are not being read properly - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/04 18:57:13 UTC, 2 replies.
- [jira] Updated: (NUTCH-68) A tool to generate arbitrary fetchlists - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/05 10:07:12 UTC, 0 replies.
- [jira] Created: (NUTCH-68) A tool to generate arbitrary fetchlists - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/05 10:07:12 UTC, 0 replies.
- Iterating spidered pages - posted by Fredrik Andersson <fi...@gmail.com> on 2005/07/05 10:58:33 UTC, 2 replies.
- Re: LanguageIdentifier refactoring - posted by Andrzej Bialecki <ab...@getopt.org> on 2005/07/05 15:02:40 UTC, 3 replies.
- Bad URLs causing SEVERE exception - posted by Chirag Chaman <cc...@mxinteractive.com> on 2005/07/05 22:47:34 UTC, 1 reply.
- max fetcher threads per host, buggy behaviour. - posted by Emilijan Mirceski <em...@cpuedge.com> on 2005/07/08 00:52:50 UTC, 0 replies.
- [jira] Closed: (NUTCH-58) NullPointerException while coping NDFS file - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/08 12:38:09 UTC, 1 reply.
- nutch server performance - posted by Michael Nebel <mi...@nebel.de> on 2005/07/08 14:55:07 UTC, 0 replies.
- [jira] Created: (NUTCH-69) fetcher.threads.per.host ignored - posted by "Matthias Jaekle (JIRA)" <ji...@apache.org> on 2005/07/08 16:28:09 UTC, 0 replies.
- [jira] Resolved: (NUTCH-69) fetcher.threads.per.host ignored - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/08 16:39:10 UTC, 0 replies.
- [jira] Closed: (NUTCH-63) the distributed search client generate too much logging statements - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/08 17:45:13 UTC, 1 reply.
- ESP - Ethics search protocol for internet search engines. - posted by Bernhard Fastenrath <Be...@arcor.de> on 2005/07/09 14:22:54 UTC, 4 replies.
- Re: [Nutch-dev] Re: ESP - Ethics search protocol for internet search engines. - posted by Erik Hatcher <er...@ehatchersolutions.com> on 2005/07/11 02:43:27 UTC, 0 replies.
- [jira] Created: (NUTCH-70) duplicate pages - virtual hosts in db. - posted by "Lutischán Ferenc (JIRA)" <ji...@apache.org> on 2005/07/11 11:13:10 UTC, 2 replies.
- Possible race condition while loading plugins - posted by Diego Basch <db...@gmail.com> on 2005/07/11 15:18:21 UTC, 0 replies.
- Website Visualization Questions - posted by Nils Hoeller <ni...@web.de> on 2005/07/11 16:36:09 UTC, 3 replies.
- hi all - posted by Bin Shi <sh...@gmail.com> on 2005/07/12 00:56:46 UTC, 1 reply.
- Fwd: links in db and pagerank calculation - posted by Orkunt Sabuncu <or...@agmlab.com> on 2005/07/12 13:43:19 UTC, 0 replies.
- [jira] Created: (NUTCH-71) Search web page doesn't not focus on query input - posted by "Christophe Noel (JIRA)" <ji...@apache.org> on 2005/07/12 14:19:11 UTC, 0 replies.
- [jira] Updated: (NUTCH-71) Search web page doesn't not focus on query input - posted by "Christophe Noel (JIRA)" <ji...@apache.org> on 2005/07/12 14:19:12 UTC, 0 replies.
- [jira] Commented: (NUTCH-71) Search web page doesn't not focus on query input - posted by "Christophe Noel (JIRA)" <ji...@apache.org> on 2005/07/12 14:30:26 UTC, 0 replies.
- Re: [Nutch-dev] Exception "Could not obtain new output block" - posted by "yoursoft@freemail.hu" <yo...@freemail.hu> on 2005/07/13 08:52:40 UTC, 0 replies.
- Amin GH's invitation - posted by Am...@invitation.sms.ac on 2005/07/14 15:57:52 UTC, 0 replies.
- [jira] Closed: (NUTCH-46) the NDFS problem(Could not obtain new output block for file) - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/14 23:01:13 UTC, 0 replies.
- NutchAnalysis and CJK - posted by Jack Tang <hi...@gmail.com> on 2005/07/15 04:49:01 UTC, 7 replies.
- [jira] Created: (NUTCH-72) Query basic filter with correction feature - posted by "Christophe Noel (JIRA)" <ji...@apache.org> on 2005/07/15 13:27:10 UTC, 0 replies.
- [jira] Updated: (NUTCH-72) Query basic filter with correction feature - posted by "Christophe Noel (JIRA)" <ji...@apache.org> on 2005/07/15 14:00:11 UTC, 0 replies.
- [jira] Created: (NUTCH-73) A page for CSV results - posted by "Christophe Noel (JIRA)" <ji...@apache.org> on 2005/07/15 14:11:09 UTC, 0 replies.
- [jira] Updated: (NUTCH-73) A page for CSV results - posted by "Christophe Noel (JIRA)" <ji...@apache.org> on 2005/07/15 14:11:10 UTC, 0 replies.
- a silly question - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/16 05:27:44 UTC, 6 replies.
- Re: [Nutch-dev] Re: a silly question - posted by yoursoft <yo...@freemail.hu> on 2005/07/16 20:08:55 UTC, 4 replies.
- Nutch Compiling - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/16 21:22:58 UTC, 1 reply.
- Nutch and cluster search result - posted by Jack Tang <hi...@gmail.com> on 2005/07/17 08:19:11 UTC, 3 replies.
- indexed records in segments - posted by yoursoft <yo...@freemail.hu> on 2005/07/17 09:09:51 UTC, 0 replies.
- search result NULL - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/17 22:46:13 UTC, 0 replies.
- Nutch Result NULL - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/17 22:48:02 UTC, 0 replies.
- Nutch returning NULL Result - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/17 23:31:01 UTC, 0 replies.
- Deploying crawl-only development version of Nutch - posted by Ken Krugler <kk...@transpac.com> on 2005/07/18 02:13:19 UTC, 1 reply.
- image search - posted by "yoursoft@freemail.hu" <yo...@freemail.hu> on 2005/07/18 09:52:22 UTC, 0 replies.
- Prerequisites for searching - posted by Fredrik Andersson <fi...@gmail.com> on 2005/07/18 22:55:36 UTC, 3 replies.
- [jira] Created: (NUTCH-74) French Analyzer Plugin - posted by "Christophe Noel (JIRA)" <ji...@apache.org> on 2005/07/19 14:38:45 UTC, 0 replies.
- [jira] Updated: (NUTCH-74) French Analyzer Plugin - posted by "Christophe Noel (JIRA)" <ji...@apache.org> on 2005/07/19 14:38:46 UTC, 1 reply.
- [crawl] Response content length is not known - posted by Christophe Noel <ch...@cetic.be> on 2005/07/19 15:01:49 UTC, 0 replies.
- [jira] Commented: (NUTCH-74) French Analyzer Plugin - posted by "Jerome Charron (JIRA)" <ji...@apache.org> on 2005/07/19 15:33:53 UTC, 0 replies.
- bin/nutch issue - on Mac OS X - posted by Erik Hatcher <er...@ehatchersolutions.com> on 2005/07/19 21:36:52 UTC, 2 replies.
- Re: Classnotfoundexception in https plugin - posted by Feng Ji <fe...@gmail.com> on 2005/07/20 02:45:58 UTC, 0 replies.
- Log Error Stack - Re: Nutch Fetch - HttpException : Connect Exception : Invalid Argument - posted by Jon Shoberg <jo...@shoberg.net> on 2005/07/20 14:39:23 UTC, 0 replies.
- API misspelling? - posted by Erik Hatcher <er...@ehatchersolutions.com> on 2005/07/20 16:37:48 UTC, 1 reply.
- [jira] Created: (NUTCH-75) Patch for WebDBReader to get more detailed information about WebDBs - posted by "Matthias Jaekle (JIRA)" <ji...@apache.org> on 2005/07/20 17:25:52 UTC, 0 replies.
- [jira] Updated: (NUTCH-75) Patch for WebDBReader to get more detailed information about WebDBs - posted by "Matthias Jaekle (JIRA)" <ji...@apache.org> on 2005/07/20 17:25:54 UTC, 0 replies.
- [jira] Closed: (NUTCH-66) Cookies are not being read properly - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/07/20 23:40:51 UTC, 0 replies.
- SVN repo, Where Art Thou? (Re: [jira] Closed: (NUTCH-66) Cookies are not being read properly) - posted by Andrzej Bialecki <ab...@getopt.org> on 2005/07/21 00:12:29 UTC, 2 replies.
- NDFS Requests - posted by webmaster <we...@www.poundwebhosting.com> on 2005/07/21 05:36:22 UTC, 0 replies.
- Searching certain fields - posted by Fredrik Andersson <fi...@gmail.com> on 2005/07/21 11:53:54 UTC, 0 replies.
- mulitple website crawling - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/21 14:05:16 UTC, 2 replies.
- Fwd: svn commit: r220056 - /lucene/nutch/trunk/src/test/org/apache/nutch/plugin/TestPluginSystem.java - posted by Erik Hatcher <er...@ehatchersolutions.com> on 2005/07/21 15:07:03 UTC, 5 replies.
- parser plugin lifecycle - posted by Erik Hatcher <er...@ehatchersolutions.com> on 2005/07/21 17:36:06 UTC, 3 replies.
- getDiscriptor - posted by Erik Hatcher <er...@ehatchersolutions.com> on 2005/07/21 19:56:18 UTC, 0 replies.
- Re: [Nutch-dev] getDiscriptor - posted by Erik Hatcher <er...@ehatchersolutions.com> on 2005/07/21 20:07:30 UTC, 3 replies.
- segments can be added to searchservers without restarting the frontend - posted by Stefan Groschupf <sg...@media-style.com> on 2005/07/22 13:47:00 UTC, 0 replies.
- IndexOptimizer bug? - posted by "yoursoft@freemail.hu" <yo...@freemail.hu> on 2005/07/22 14:58:41 UTC, 4 replies.
- Nutch on Windows - posted by Cuong Viet Hoang <cl...@gmail.com> on 2005/07/22 20:42:10 UTC, 2 replies.
- search result - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/23 05:18:36 UTC, 6 replies.
- access to old versions of nutch? - posted by Luke Baker <lu...@gospelcom.net> on 2005/07/23 15:47:07 UTC, 2 replies.
- March 9 svn - posted by Paul Harrison <pr...@swbell.net> on 2005/07/23 16:51:00 UTC, 0 replies.
- search depth - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/23 19:57:17 UTC, 0 replies.
- html parsers and windows-1251 (ukrainian) - posted by "Ilia S. Yatsenko" <sh...@yandex.ru> on 2005/07/24 10:23:59 UTC, 4 replies.
- fetching behavior of Nutch - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/24 16:03:33 UTC, 2 replies.
- [jira] Created: (NUTCH-76) NDFS DataNode advertises localhost as it's address - posted by "Peter Sandström (JIRA)" <ji...@apache.org> on 2005/07/24 16:55:45 UTC, 0 replies.
- [jira] Updated: (NUTCH-76) NDFS DataNode advertises localhost as it's address - posted by "Peter Sandström (JIRA)" <ji...@apache.org> on 2005/07/24 16:55:46 UTC, 0 replies.
- Nutch's intranet VS internet crawling - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/24 17:52:33 UTC, 1 reply.
- fetcher blocked - posted by EM <em...@cpuedge.com> on 2005/07/24 19:33:11 UTC, 0 replies.
- Reader and Writer at the same DB - posted by Jakob Heidebrecht <Ja...@gmx.de> on 2005/07/25 14:49:39 UTC, 0 replies.
- whats used from the segments dir when searching - posted by EM <em...@cpuedge.com> on 2005/07/25 21:49:14 UTC, 1 reply.
- Re: Information extraction - posted by Jack Tang <hi...@gmail.com> on 2005/07/26 10:15:53 UTC, 5 replies.
- [jira] Updated: (NUTCH-64) no results after a restart of a search--server (without tomcat restart) - posted by "Michael Nebel (JIRA)" <ji...@apache.org> on 2005/07/26 16:20:45 UTC, 0 replies.
- Vacation... - posted by Andrzej Bialecki <ab...@getopt.org> on 2005/07/27 00:20:41 UTC, 0 replies.
- Http Max Delays - posted by Christophe Noel <ch...@cetic.be> on 2005/07/27 16:46:54 UTC, 2 replies.
- [jira] Commented: (NUTCH-30) rss feed parser - posted by "Michael Nebel (JIRA)" <ji...@apache.org> on 2005/07/27 17:42:22 UTC, 1 reply.
- Corrections to README.txt - posted by Hasan Diwan <ha...@gmail.com> on 2005/07/27 20:59:03 UTC, 1 reply.
- http.max.delays - posted by "Feng (Michael) Ji" <fj...@yahoo.com> on 2005/07/28 02:21:51 UTC, 0 replies.
- ranking algorithm - posted by EM <em...@cpuedge.com> on 2005/07/28 13:20:53 UTC, 5 replies.
- 0.7-dev, the search scoring - posted by Fredrik Andersson <fi...@gmail.com> on 2005/07/28 14:28:59 UTC, 3 replies.
- Re: [Nutch-dev] Re: ranking algorithm - posted by Massimo Miccoli <mm...@iltrovatore.it> on 2005/07/28 17:07:22 UTC, 0 replies.
- object not parsed but indexed? - posted by Michael Nebel <mi...@nebel.de> on 2005/07/28 17:21:41 UTC, 0 replies.
- NDFS Bug, Mapred from SVN - Tokenizer and New Line Error - posted by Jon Shoberg <jo...@shoberg.net> on 2005/07/29 04:44:24 UTC, 0 replies.
- NDFS and Fedora Core 3 - posted by Jon Shoberg <jo...@shoberg.net> on 2005/07/29 04:52:18 UTC, 0 replies.
- recursion: see recursion - posted by EM <em...@cpuedge.com> on 2005/07/30 02:05:03 UTC, 1 reply.