- RE: [jira] Commented: (NUTCH-7) analyze tool takes up all the disk space when there are circular links - posted by Jay Yu <jy...@looksmart.net> on 2005/04/01 01:26:02 UTC, 0 replies.
- Google patent application March 31, 2005 - posted by Mike Peterson <mi...@mail.ru> on 2005/04/01 04:34:11 UTC, 0 replies.
- Re: hits page list - posted by Roger Dunk <ro...@at.com.au> on 2005/04/01 08:05:27 UTC, 1 replies.
- [jira] Updated: (NUTCH-21) parser plugin for MS PowerPoint slides - posted by "Stephan Strittmatter (JIRA)" <ji...@apache.org> on 2005/04/01 10:22:43 UTC, 0 replies.
- [jira] Created: (NUTCH-34) Parsing different content formats - posted by "Stephan Strittmatter (JIRA)" <ji...@apache.org> on 2005/04/01 10:33:23 UTC, 0 replies.
- [jira] Updated: (NUTCH-34) Parsing different content formats - posted by "Stephan Strittmatter (JIRA)" <ji...@apache.org> on 2005/04/01 10:33:24 UTC, 0 replies.
- Re: [Nutch-dev] RE: A problem about Chinese word segment - posted by Jack Tang <hi...@gmail.com> on 2005/04/01 12:12:54 UTC, 1 replies.
- Page ranking by Nutch - posted by Kannan Sundaramoorthy <ka...@cognizant.com> on 2005/04/01 12:52:15 UTC, 0 replies.
- Re: Needing more protocols - posted by Konstantin Ott <ot...@netropol.de> on 2005/04/01 16:07:57 UTC, 1 replies.
- updatedb ioexception - posted by Luke Baker <lu...@gospelcom.net> on 2005/04/01 16:17:10 UTC, 2 replies.
- Re: PDF Parsing Revisited - posted by Andy Liu <an...@gmail.com> on 2005/04/01 17:20:05 UTC, 0 replies.
- [jira] Created: (NUTCH-35) modify XML parsing code in Nutch to use single API - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2005/04/01 19:16:17 UTC, 0 replies.
- Nutch documentation - posted by Siva Bandhamravuri <sb...@umich.edu> on 2005/04/01 21:18:38 UTC, 1 replies.
- [jira] Updated: (NUTCH-32) Nutch Webapp could only be deployed on root namespace - posted by "Jerome Charron (JIRA)" <ji...@apache.org> on 2005/04/02 00:35:17 UTC, 1 replies.
- Re: Date range and url search - posted by John X <jo...@neasys.com> on 2005/04/02 03:03:10 UTC, 0 replies.
- term document matrix - posted by Siva Bandhamravuri <sb...@umich.edu> on 2005/04/02 18:56:07 UTC, 0 replies.
- Re: Licenses - posted by Hari Kodungallur <ha...@gmail.com> on 2005/04/03 12:59:54 UTC, 4 replies.
- junit reporting.. - posted by Hari Kodungallur <ha...@gmail.com> on 2005/04/03 13:03:17 UTC, 0 replies.
- [jira] Commented: (NUTCH-10) extension points are defined multiple times - posted by "Stefan Grroschupf (JIRA)" <ji...@apache.org> on 2005/04/03 14:07:16 UTC, 0 replies.
- Converted Wiki - posted by Chirag Chaman <de...@filangy.com> on 2005/04/03 20:21:46 UTC, 0 replies.
- [jira] Assigned: (NUTCH-35) modify XML parsing code in Nutch to use single API - posted by "Stefan Grroschupf (JIRA)" <ji...@apache.org> on 2005/04/03 21:54:16 UTC, 0 replies.
- NUTCH-35 (xml api) - posted by Stefan Groschupf <sg...@media-style.com> on 2005/04/03 23:17:18 UTC, 1 replies.
- Distributed WebDB - posted by Byron Miller <by...@yahoo.com> on 2005/04/03 23:27:49 UTC, 1 replies.
- term frequency - posted by Siva Bandhamravuri <sb...@umich.edu> on 2005/04/04 03:32:14 UTC, 1 replies.
- How to add Analyzer? - posted by Jack Tang <hi...@gmail.com> on 2005/04/04 04:07:58 UTC, 2 replies.
- does nutch have these features ? - posted by Rohit Kulkarni <ro...@gmail.com> on 2005/04/04 05:36:49 UTC, 0 replies.
- [jira] Commented: (NUTCH-26) New Http Authentication mechanism - posted by "Jack Tang (JIRA)" <ji...@apache.org> on 2005/04/04 06:07:22 UTC, 0 replies.
- [jira] Updated: (NUTCH-28) No support for https - posted by "Konstantin Ignatyev (JIRA)" <ji...@apache.org> on 2005/04/04 07:02:21 UTC, 0 replies.
- Nutch And Chinese - posted by Jack Tang <hi...@gmail.com> on 2005/04/04 08:59:51 UTC, 0 replies.
- [jira] Updated: (NUTCH-30) rss feed parser - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2005/04/04 19:06:38 UTC, 3 replies.
- [jira] Commented: (NUTCH-30) rss feed parser - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2005/04/04 19:17:16 UTC, 8 replies.
- RSS Parser Plugin based on commons-feedparser submitted - posted by Chris Mattmann <ch...@jpl.nasa.gov> on 2005/04/04 19:27:13 UTC, 5 replies.
- [jira] Closed: (NUTCH-11) Link.java needs a tag so javadoc renders - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/04 20:12:17 UTC, 0 replies.
- [jira] Resolved: (NUTCH-11) Link.java needs a tag so javadoc renders - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/04 20:12:17 UTC, 0 replies.
- [jira] Commented: (NUTCH-32) Nutch Webapp could only be deployed on root namespace - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/04 20:23:20 UTC, 2 replies.
- Re: [Nutch-dev] Re: RSS Parser Plugin based on commons-feedparser submitted - posted by "Kevin A. Burton" <bu...@rojo.com> on 2005/04/04 20:38:08 UTC, 6 replies.
- [jira] Resolved: (NUTCH-15) ipc client timeout should be configurable - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/04 20:45:20 UTC, 0 replies.
- [jira] Commented: (NUTCH-28) No support for https - posted by "Doug Bakewell (JIRA)" <ji...@apache.org> on 2005/04/04 20:56:26 UTC, 0 replies.
- [jira] Resolved: (NUTCH-32) Nutch Webapp could only be deployed on root namespace - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/05 00:35:24 UTC, 0 replies.
- Re: [Nutch-dev] Re: Distributed WebDB - posted by Byron Miller <by...@yahoo.com> on 2005/04/05 03:12:40 UTC, 0 replies.
- [jira] Created: (NUTCH-36) Chinese in Nutch - posted by "Jack Tang (JIRA)" <ji...@apache.org> on 2005/04/05 04:24:16 UTC, 0 replies.
- [jira] Commented: (NUTCH-36) Chinese in Nutch - posted by "Jack Tang (JIRA)" <ji...@apache.org> on 2005/04/05 04:24:17 UTC, 2 replies.
- [jira] Updated: (NUTCH-36) Chinese in Nutch - posted by "Jack Tang (JIRA)" <ji...@apache.org> on 2005/04/05 05:07:18 UTC, 0 replies.
- protocol-file plugin requires activation framework? - posted by Chris Mattmann <ch...@jpl.nasa.gov> on 2005/04/05 06:21:58 UTC, 6 replies.
- parse-mp3 plugin - posted by ch...@jpl.nasa.gov on 2005/04/05 06:53:43 UTC, 0 replies.
- Vertical Search Opportunity - posted by AJ Archibald <aj...@yahoo.com> on 2005/04/05 06:59:44 UTC, 0 replies.
- [jira] Created: (NUTCH-37) Javadoc Warnings - posted by "Jerome Charron (JIRA)" <ji...@apache.org> on 2005/04/05 13:15:18 UTC, 0 replies.
- [jira] Updated: (NUTCH-33) MIME content type detector (using magic char sequences) - posted by "Jerome Charron (JIRA)" <ji...@apache.org> on 2005/04/05 15:04:31 UTC, 2 replies.
- Exceeded http.max.delays - posted by Fabrice Estiévenart <fe...@cetic.be> on 2005/04/05 15:22:06 UTC, 0 replies.
- [jira] Commented: (NUTCH-33) MIME content type detector (using magic char sequences) - posted by "John Xing (JIRA)" <ji...@apache.org> on 2005/04/05 20:43:03 UTC, 12 replies.
- [jira] Assigned: (NUTCH-33) MIME content type detector (using magic char sequences) - posted by "John Xing (JIRA)" <ji...@apache.org> on 2005/04/05 20:43:04 UTC, 0 replies.
- RE: [Nutch-dev] Converted Wiki - posted by Chirag Chaman <de...@filangy.com> on 2005/04/05 22:09:25 UTC, 4 replies.
- advanced search query syntax - posted by Rohit Kulkarni <ro...@gmail.com> on 2005/04/06 03:40:23 UTC, 0 replies.
- [jira] Assigned: (NUTCH-4) Serious bug: OutOfMemoryError: Java heap space - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/06 07:37:16 UTC, 0 replies.
- [jira] Updated: (NUTCH-4) Serious bug: OutOfMemoryError: Java heap space - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/06 15:25:16 UTC, 1 replies.
- [jira] Commented: (NUTCH-4) Serious bug: OutOfMemoryError: Java heap space - posted by "Piotr Kosiorowski (JIRA)" <ji...@apache.org> on 2005/04/06 16:19:21 UTC, 2 replies.
- [jira] Created: (NUTCH-38) distributed search improvement - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/06 21:17:12 UTC, 0 replies.
- [jira] Updated: (NUTCH-38) distributed search improvement - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/06 21:17:13 UTC, 2 replies.
- [jira] Commented: (NUTCH-38) distributed search improvement - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/06 21:49:23 UTC, 2 replies.
- getTermFreqVector - posted by Siva Bandhamravuri <sb...@umich.edu> on 2005/04/07 07:42:37 UTC, 1 replies.
- Re: Highlighting query words in cached html - posted by Jack Tang <hi...@gmail.com> on 2005/04/07 09:20:37 UTC, 1 replies.
- [jira] Created: (NUTCH-39) pagination in search result - posted by "Jack Tang (JIRA)" <ji...@apache.org> on 2005/04/07 11:46:22 UTC, 0 replies.
- [jira] Commented: (NUTCH-39) pagination in search result - posted by "Jack Tang (JIRA)" <ji...@apache.org> on 2005/04/07 11:57:18 UTC, 21 replies.
- RE: [Nutch-dev] [jira] Commented: (NUTCH-39) pagination in search result - posted by Chirag Chaman <de...@filangy.com> on 2005/04/07 19:57:46 UTC, 0 replies.
- Appending with SegmentWriter - posted by Daniel Russo <ru...@gmail.com> on 2005/04/07 22:41:00 UTC, 1 replies.
- [jira] Resolved: (NUTCH-4) Serious bug: OutOfMemoryError: Java heap space - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/07 22:44:16 UTC, 0 replies.
- [jira] Resolved: (NUTCH-38) distributed search improvement - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/07 22:44:17 UTC, 0 replies.
- when compile nutch-0.6,there is a problem - posted by Zhou LiBing <zh...@gmail.com> on 2005/04/08 03:42:43 UTC, 0 replies.
- nutch engines - posted by Siva Bandhamravuri <sb...@umich.edu> on 2005/04/08 20:31:29 UTC, 2 replies.
- Image and Video Search - posted by lu...@uol.com.br on 2005/04/09 02:20:28 UTC, 2 replies.
- Wiki has been moved.... - posted by Chirag Chaman <de...@filangy.com> on 2005/04/09 03:36:44 UTC, 0 replies.
- nutch search - posted by Siva Bandhamravuri <sb...@umich.edu> on 2005/04/09 07:30:58 UTC, 0 replies.
- Re: tools cleanup - posted by Sami Siren <s....@sonera.inet.fi> on 2005/04/09 09:15:32 UTC, 2 replies.
- Re: [Nutch-dev] nutch search - posted by Stefan Groschupf <sg...@media-style.com> on 2005/04/10 15:22:17 UTC, 0 replies.
- XML OUTPUT - posted by lu...@uol.com.br on 2005/04/10 17:06:29 UTC, 3 replies.
- How to do OR search in Nutch? - posted by Kannan Sundaramoorthy <ka...@cognizant.com> on 2005/04/11 06:08:31 UTC, 3 replies.
- AW: [Nutch-dev] Re: tools cleanup - posted by "Strittmatter, Stephan" <St...@sybit.de> on 2005/04/11 10:26:47 UTC, 1 replies.
- Re: [Nutch-dev] Supported web server platform & version - posted by Stefan Groschupf <sg...@media-style.com> on 2005/04/11 12:41:52 UTC, 0 replies.
- rank of hits - posted by Siva Bandhamravuri <sb...@umich.edu> on 2005/04/11 15:48:08 UTC, 0 replies.
- Re: [Nutch-dev] Re: Image and Video Search - posted by Hasan Diwan <ha...@gmail.com> on 2005/04/11 15:56:56 UTC, 0 replies.
- sorting search results - posted by Doug Cutting <cu...@nutch.org> on 2005/04/11 23:29:37 UTC, 1 replies.
- [jira] Updated: (NUTCH-35) modify XML parsing code in Nutch to use single API - posted by "Stefan Grroschupf (JIRA)" <ji...@apache.org> on 2005/04/12 00:31:17 UTC, 2 replies.
- [jira] Closed: (NUTCH-15) ipc client timeout should be configurable - posted by "Stefan Grroschupf (JIRA)" <ji...@apache.org> on 2005/04/12 00:31:18 UTC, 0 replies.
- resolve or close bugs? - posted by Stefan Groschupf <sg...@media-style.com> on 2005/04/12 00:31:36 UTC, 3 replies.
- Bot information within server log - posted by Michael Wechner <mi...@wyona.com> on 2005/04/12 00:49:44 UTC, 0 replies.
- Re: [Nutch-dev] Feature request - pluggable Analyzer - posted by Jason Tang <ja...@commcentral.com> on 2005/04/12 07:39:27 UTC, 1 replies.
- Chinese in Nutch:My solution - posted by cao yuzhong <ca...@hotmail.com> on 2005/04/12 08:37:02 UTC, 1 replies.
- WebDBInjector and DMOZ separation - posted by David Spencer <da...@tropo.com> on 2005/04/12 20:38:56 UTC, 1 replies.
- Re: [Nutch-dev] resolve or close bugs? - posted by og...@yahoo.com on 2005/04/12 20:49:39 UTC, 0 replies.
- [jira] Commented: (NUTCH-35) modify XML parsing code in Nutch to use single API - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/12 20:52:24 UTC, 2 replies.
- action apis (NUTCH-27) - posted by Stefan Groschupf <sg...@media-style.com> on 2005/04/12 22:10:46 UTC, 4 replies.
- Re: [Nutch-dev] Re: How to do OR search in Nutch? - posted by Hasan Diwan <ha...@gmail.com> on 2005/04/12 22:19:35 UTC, 0 replies.
- [jira] Updated: (NUTCH-5) Hit limiter off-by-one bug - posted by "Andy Liu (JIRA)" <ji...@apache.org> on 2005/04/12 23:59:17 UTC, 0 replies.
- Re: [Nutch-dev] Re: nutch engines - posted by Zhou LiBing <zh...@gmail.com> on 2005/04/13 03:11:17 UTC, 1 replies.
- Why Crawl failed to fetch so many pages? - posted by cao yuzhong <ca...@hotmail.com> on 2005/04/13 12:56:07 UTC, 3 replies.
- Optimal segment size? - posted by Luke Baker <lu...@gospelcom.net> on 2005/04/13 15:37:40 UTC, 3 replies.
- [jira] Resolved: (NUTCH-5) Hit limiter off-by-one bug - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/13 19:24:19 UTC, 0 replies.
- [jira] Closed: (NUTCH-5) Hit limiter off-by-one bug - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/13 19:24:20 UTC, 0 replies.
- MapFile.Reader bug (Re: Optimal segment size?) - posted by Andrzej Bialecki <ab...@getopt.org> on 2005/04/13 19:41:32 UTC, 3 replies.
- [jira] Created: (NUTCH-40) TestSegmentMergeTool fail - posted by "Stefan Grroschupf (JIRA)" <ji...@apache.org> on 2005/04/13 22:55:31 UTC, 0 replies.
- retrieving Websites using docId - posted by Siva Bandhamravuri <sb...@umich.edu> on 2005/04/13 23:14:34 UTC, 0 replies.
- filesystem indexing - posted by Boris Kröger <bo...@cip.wiwi.uni-karlsruhe.de> on 2005/04/13 23:16:47 UTC, 1 replies.
- Wiki Up! - posted by Chirag Chaman <de...@filangy.com> on 2005/04/13 23:31:07 UTC, 0 replies.
- Crawl-urlfilter cann't deals with relative urls appropriately ?? - posted by cao yuzhong <ca...@hotmail.com> on 2005/04/14 05:31:29 UTC, 0 replies.
- fetcher failling on urlnormalizer - posted by Byron Miller <by...@yahoo.com> on 2005/04/14 07:48:12 UTC, 1 replies.
- [jira] Resolved: (NUTCH-35) modify XML parsing code in Nutch to use single API - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/14 19:57:16 UTC, 0 replies.
- [jira] Closed: (NUTCH-35) modify XML parsing code in Nutch to use single API - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/14 19:57:17 UTC, 0 replies.
- dedup and redirect handling - posted by Luke Baker <lu...@gospelcom.net> on 2005/04/14 20:21:11 UTC, 1 replies.
- Re: [Nutch-dev] Crawl-urlfilter cann't deals with relative urls appropriately ?? - posted by David Wallace <da...@nzqa.govt.nz> on 2005/04/14 22:13:06 UTC, 0 replies.
- [jira] Created: (NUTCH-41) Replace CVS by SVN within tutorial of Documentation - posted by "Michael Wechner (JIRA)" <ji...@apache.org> on 2005/04/14 23:26:19 UTC, 0 replies.
- [jira] Updated: (NUTCH-41) Replace CVS by SVN within tutorial of Documentation - posted by "Michael Wechner (JIRA)" <ji...@apache.org> on 2005/04/14 23:26:19 UTC, 0 replies.
- [jira] Created: (NUTCH-42) enhance search.jsp such that it can also returns XML - posted by "Michael Wechner (JIRA)" <ji...@apache.org> on 2005/04/15 00:32:17 UTC, 0 replies.
- [jira] Updated: (NUTCH-42) enhance search.jsp such that it can also returns XML - posted by "Michael Wechner (JIRA)" <ji...@apache.org> on 2005/04/15 00:32:18 UTC, 3 replies.
- Nutch and Maven? - posted by ch...@jpl.nasa.gov on 2005/04/15 03:11:40 UTC, 0 replies.
- Re: [Nutch-dev] Re: fetcher failling on urlnormalizer - posted by Byron Miller <by...@yahoo.com> on 2005/04/15 03:25:37 UTC, 0 replies.
- Re: [Nutch-dev] Crawl-urlfilter cann't deals with relativeurls appropriately ?? - posted by cao yuzhong <ca...@hotmail.com> on 2005/04/15 03:43:44 UTC, 0 replies.
- summaries - posted by Byron Miller <by...@yahoo.com> on 2005/04/15 04:00:17 UTC, 1 replies.
- [jira] Created: (NUTCH-43) replace / by request.getContextPath()+/ - posted by "Joost Baaij (JIRA)" <ji...@apache.org> on 2005/04/15 12:17:23 UTC, 0 replies.
- Questions about distributed search servers - posted by Andy Liu <an...@gmail.com> on 2005/04/15 16:58:07 UTC, 1 replies.
- [jira] Commented: (NUTCH-42) enhance search.jsp such that it can also returns XML - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/15 18:31:38 UTC, 4 replies.
- Re: [Nutch-dev] [jira] Commented: (NUTCH-42) enhance search.jsp such that it can also returns XML - posted by Michael Wechner <mi...@wyona.com> on 2005/04/16 00:44:37 UTC, 4 replies.
- Starting the webapp and finding the segments - posted by Michael Wechner <mi...@wyona.com> on 2005/04/16 11:32:25 UTC, 2 replies.
- filename problem during local filesystem crawl - posted by Boris Kroeger <bo...@cip.wiwi.uni-karlsruhe.de> on 2005/04/16 13:21:51 UTC, 0 replies.
- WebDBWriter & NutchFileSystem - posted by Ben <ne...@gmail.com> on 2005/04/16 14:29:54 UTC, 1 replies.
- [jira] Commented: (NUTCH-43) replace / by request.getContextPath()+/ - posted by "Jerome Charron (JIRA)" <ji...@apache.org> on 2005/04/16 15:50:58 UTC, 0 replies.
- language identifier - posted by Stefan Groschupf <sg...@media-style.com> on 2005/04/16 22:52:02 UTC, 14 replies.
- Someone working on NUTCH-34? - posted by Jérôme Charron <je...@gmail.com> on 2005/04/16 23:33:28 UTC, 1 replies.
- [jira] Commented: (NUTCH-34) Parsing different content formats - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/04/17 12:06:59 UTC, 4 replies.
- [jira] Closed: (NUTCH-22) ontology supported query refinement - posted by "John Xing (JIRA)" <ji...@apache.org> on 2005/04/17 21:22:58 UTC, 0 replies.
- [jira] Closed: (NUTCH-19) Space in Java.exe path chokes bin/nutch - posted by "John Xing (JIRA)" <ji...@apache.org> on 2005/04/17 21:34:00 UTC, 0 replies.
- [jira] Assigned: (NUTCH-30) rss feed parser - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2005/04/17 21:45:05 UTC, 0 replies.
- [jira] Created: (NUTCH-44) too many search results - posted by "Emilijan Mirceski (JIRA)" <ji...@apache.org> on 2005/04/17 23:43:57 UTC, 0 replies.
- going backwards? svn getting deprecated errors - posted by Byron Miller <by...@yahoo.com> on 2005/04/18 04:36:46 UTC, 1 replies.
- [jira] Closed: (NUTCH-33) MIME content type detector (using magic char sequences) - posted by "John Xing (JIRA)" <ji...@apache.org> on 2005/04/18 06:21:00 UTC, 0 replies.
- [jira] Created: (NUTCH-45) Log corrupt segments in SegmentMergeTool - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2005/04/18 06:43:05 UTC, 0 replies.
- [jira] Updated: (NUTCH-45) Log corrupt segments in SegmentMergeTool - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2005/04/18 06:43:08 UTC, 0 replies.
- [jira] Commented: (NUTCH-34) Parsing different content formats - posted by "Stephan Strittmatter (JIRA)" <ji...@apache.org> on 2005/04/18 10:59:45 UTC, 4 replies.
- new parse-html - posted by Marco Pereira <nu...@hotmail.com> on 2005/04/18 11:11:50 UTC, 1 replies.
- [jira] Commented: (NUTCH-21) parser plugin for MS PowerPoint slides - posted by "Stephan Strittmatter (JIRA)" <ji...@apache.org> on 2005/04/18 12:36:44 UTC, 0 replies.
- [jira] Commented: (NUTCH-44) too many search results - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/18 18:44:52 UTC, 1 replies.
- HashMap - linkParams - posted by Marco PV <nu...@hotmail.com> on 2005/04/18 22:08:18 UTC, 1 replies.
- Re: [Nutch-dev] Re: going backwards? svn getting deprecated errors - posted by Byron Miller <by...@yahoo.com> on 2005/04/19 00:57:35 UTC, 0 replies.
- Parse Rss Compile errors - posted by Marco PV <nu...@hotmail.com> on 2005/04/19 06:30:50 UTC, 1 replies.
- AWS OpenSearch on unto.net - posted by Jack Tang <hi...@gmail.com> on 2005/04/19 07:26:05 UTC, 0 replies.
- Killed crawl process and corrupted segment - posted by Egor Chernodarov <eg...@zarinsk.dem.ru> on 2005/04/19 09:01:42 UTC, 0 replies.
- [jira] Created: (NUTCH-46) the NDFS problem(Could not obtain new output block for file) - posted by "zhangjin (JIRA)" <ji...@apache.org> on 2005/04/19 11:42:18 UTC, 0 replies.
- NUTCH-7 - analyze tool takes up all the disk space when there are circular links - posted by "yoursoft@freemail.hu" <yo...@freemail.hu> on 2005/04/19 12:25:18 UTC, 1 replies.
- Re: Incremental Crawling - posted by Kannan Sundaramoorthy <ka...@cognizant.com> on 2005/04/19 12:44:25 UTC, 1 replies.
- indexing more fields - posted by Konstantin Ott <ot...@netropol.de> on 2005/04/19 17:22:07 UTC, 1 replies.
- How to manage fetching? - posted by Tim Martin <ma...@gmail.com> on 2005/04/19 17:36:33 UTC, 3 replies.
- Re: [Nutch-dev] Re: NUTCH-7 - analyze tool takes up all the disk space when there are circular links - posted by Massimo Miccoli <mm...@iltrovatore.it> on 2005/04/19 18:28:15 UTC, 6 replies.
- [jira] Commented: (NUTCH-7) analyze tool takes up all the disk space when there are circular links - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/19 18:28:23 UTC, 0 replies.
- link analysis - posted by Doug Cutting <cu...@nutch.org> on 2005/04/19 19:43:47 UTC, 2 replies.
- Configurable boost - posted by Piotr Kosiorowski <pk...@gmail.com> on 2005/04/19 23:23:43 UTC, 2 replies.
- [jira] Resolved: (NUTCH-41) Replace CVS by SVN within tutorial of Documentation - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/19 23:57:26 UTC, 0 replies.
- [jira] Closed: (NUTCH-41) Replace CVS by SVN within tutorial of Documentation - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/19 23:57:26 UTC, 0 replies.
- [jira] Commented: (NUTCH-40) TestSegmentMergeTool fail - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/04/20 00:51:26 UTC, 0 replies.
- RSS Updates -- Best strategy - posted by Hasan Diwan <ha...@gmail.com> on 2005/04/20 03:20:02 UTC, 1 replies.
- Sort does not work properly - posted by Alan Wang <sf...@gmail.com> on 2005/04/20 04:01:56 UTC, 2 replies.
- Re: [Nutch-dev] filesystem indexing - posted by Jason Tang <ja...@commcentral.com> on 2005/04/20 04:16:41 UTC, 3 replies.
- dev@nutch.org Mailinglist - posted by Michael Wechner <mi...@wyona.com> on 2005/04/20 09:28:11 UTC, 4 replies.
- Re: [Nutch-dev] [jira] Commented: (NUTCH-7) analyze tool takes up all the disk space when there are circular links - posted by "yoursoft@freemail.hu" <yo...@freemail.hu> on 2005/04/20 15:36:01 UTC, 2 replies.
- How to make stopwords configurable? - posted by Massimo Miccoli <mm...@iltrovatore.it> on 2005/04/20 17:55:25 UTC, 0 replies.
- [jira] Created: (NUTCH-47) Configure host filter to do wildcard prefixes - *.redhat.com - posted by "byron miller (JIRA)" <ji...@apache.org> on 2005/04/20 19:13:34 UTC, 0 replies.
- [jira] Commented: (NUTCH-13) If dns points to 127.0.0.1, the url is also crawled - posted by "byron miller (JIRA)" <ji...@apache.org> on 2005/04/20 20:08:23 UTC, 6 replies.
- [jira] Commented: (NUTCH-46) the NDFS problem(Could not obtain new output block for file) - posted by "byron miller (JIRA)" <ji...@apache.org> on 2005/04/20 20:08:24 UTC, 5 replies.
- "link:" feature - posted by Marco PV <nu...@hotmail.com> on 2005/04/20 20:16:44 UTC, 0 replies.
- [jira] Commented: (NUTCH-47) Configure host filter to do wildcard prefixes - *.redhat.com - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/20 21:14:24 UTC, 1 replies.
- Re: [Nutch-dev] [jira] Commented: (NUTCH-7) analyze tool tak - posted by YourSoft <yo...@freemail.hu> on 2005/04/20 21:17:14 UTC, 0 replies.
- Nutch Distributed File System - posted by Piotr Kosiorowski <pk...@gmail.com> on 2005/04/20 22:31:56 UTC, 0 replies.
- Re: [Nutch-dev] filename problem during local filesystem crawl - posted by Kragen Sitaker <ks...@commerce.net> on 2005/04/20 23:06:17 UTC, 0 replies.
- parse-mp3 dependency missing - posted by Hasan Diwan <ha...@gmail.com> on 2005/04/20 23:20:39 UTC, 1 replies.
- [jira] Created: (NUTCH-48) "Did you mean" query enhancement/refignment feature request - posted by "byron miller (JIRA)" <ji...@apache.org> on 2005/04/20 23:26:27 UTC, 0 replies.
- [jira] Updated: (NUTCH-48) "Did you mean" query enhancement/refignment feature request - posted by "byron miller (JIRA)" <ji...@apache.org> on 2005/04/20 23:26:27 UTC, 1 replies.
- parse-rss fetch problems - posted by Marco PV <nu...@hotmail.com> on 2005/04/21 04:24:29 UTC, 2 replies.
- [nutch-dev] Sort does not work properly - posted by Alan Wang <sf...@gmail.com> on 2005/04/21 06:05:40 UTC, 0 replies.
- Re: [Nutch-dev] Re: Sort does not work properly - posted by Alan Wang <sf...@gmail.com> on 2005/04/21 06:37:36 UTC, 3 replies.
- Re: [Nutch-dev] [jira] Commented: (NUTCH-7) please update it with the svn - posted by "yoursoft@freemail.hu" <yo...@freemail.hu> on 2005/04/21 12:58:17 UTC, 7 replies.
- [jira] Created: (NUTCH-49) Flag for generate to fetch only new pages to complement the -refetchonly flag - posted by "Luke Baker (JIRA)" <ji...@apache.org> on 2005/04/21 15:38:26 UTC, 0 replies.
- [jira] Updated: (NUTCH-49) Flag for generate to fetch only new pages to complement the -refetchonly flag - posted by "Luke Baker (JIRA)" <ji...@apache.org> on 2005/04/21 15:38:27 UTC, 0 replies.
- [jira] Commented: (NUTCH-48) "Did you mean" query enhancement/refignment feature request - posted by "Andy Liu (JIRA)" <ji...@apache.org> on 2005/04/21 15:38:28 UTC, 0 replies.
- Re: [Nutch-dev] Re: parse-mp3 dependency missing - posted by Hasan Diwan <ha...@gmail.com> on 2005/04/21 21:18:31 UTC, 4 replies.
- RE: [Nutch-dev] Re: dev@nutch.org Mailinglist - posted by Chirag Chaman <de...@filangy.com> on 2005/04/22 04:15:26 UTC, 3 replies.
- Looking for crawler - posted by rajat swarup <ra...@gmail.com> on 2005/04/22 06:17:44 UTC, 0 replies.
- [jira] Updated: (NUTCH-20) Extract urls from plain texts - posted by "Stephan Strittmatter (JIRA)" <ji...@apache.org> on 2005/04/22 14:25:23 UTC, 0 replies.
- [jira] Created: (NUTCH-50) Benchmarks & Performance goals - posted by "byron miller (JIRA)" <ji...@apache.org> on 2005/04/22 15:42:24 UTC, 3 replies.
- [jira] Closed: (NUTCH-4) Serious bug: OutOfMemoryError: Java heap space - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/22 19:54:23 UTC, 0 replies.
- [jira] Closed: (NUTCH-38) distributed search improvement - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/22 19:54:24 UTC, 0 replies.
- [jira] Commented: (NUTCH-49) Flag for generate to fetch only new pages to complement the -refetchonly flag - posted by "Doug Cutting (JIRA)" <ji...@apache.org> on 2005/04/22 20:38:23 UTC, 0 replies.
- IlTrovatore check: e' SPAM? Re: [Nutch-dev] [jira] Commented: (NUTCH-7) please update it with the svn - posted by massimo miccoli <mm...@iltrovatore.it> on 2005/04/22 21:52:32 UTC, 0 replies.
- Re: [Nutch-dev] Re: How to manage fetching? - posted by Bill Goffe <go...@Oswego.EDU> on 2005/04/22 23:19:31 UTC, 0 replies.
- Possible bug in HttpResponse.java in protocol-http plugin - posted by Rohit Kulkarni <ro...@gmail.com> on 2005/04/24 00:24:43 UTC, 0 replies.
- Getting HTML source - posted by rajat swarup <ra...@gmail.com> on 2005/04/24 02:56:25 UTC, 0 replies.
- getLinks - posted by Marco PV <nu...@hotmail.com> on 2005/04/24 19:31:04 UTC, 0 replies.
- [jira] Created: (NUTCH-51) Removing a plugin after fetch but before indexing causes errors - posted by "byron miller (JIRA)" <ji...@apache.org> on 2005/04/25 00:14:26 UTC, 0 replies.
- Bug: Nutch indexer crashed - posted by John Doe <df...@lycos.com> on 2005/04/25 04:12:30 UTC, 0 replies.
- To get Nutch to print debug messages - posted by rajat swarup <ra...@gmail.com> on 2005/04/25 13:51:31 UTC, 2 replies.
- [jira] Commented: (NUTCH-51) Removing a plugin after fetch but before indexing causes errors - posted by "Piotr Kosiorowski (JIRA)" <ji...@apache.org> on 2005/04/25 16:20:28 UTC, 2 replies.
- Re: [Nutch-dev] Getting HTML source - posted by Hasan Diwan <ha...@gmail.com> on 2005/04/25 18:10:52 UTC, 1 replies.
- [PATCH] - NDFS TestClient command line handling - posted by Piotr Kosiorowski <pk...@gmail.com> on 2005/04/25 21:57:51 UTC, 0 replies.
- [PATCH] - Datanode command line handling - posted by Piotr Kosiorowski <pk...@gmail.com> on 2005/04/25 22:21:24 UTC, 0 replies.
- [jira] Created: (NUTCH-52) Parser plugin for MS Excel files - posted by "Rohit Kulkarni (JIRA)" <ji...@apache.org> on 2005/04/26 08:10:43 UTC, 0 replies.
- [jira] Updated: (NUTCH-52) Parser plugin for MS Excel files - posted by "Rohit Kulkarni (JIRA)" <ji...@apache.org> on 2005/04/26 08:10:44 UTC, 0 replies.
- [jira] Created: (NUTCH-53) Parser plugin for Zip files - posted by "Rohit Kulkarni (JIRA)" <ji...@apache.org> on 2005/04/26 08:21:23 UTC, 0 replies.
- [jira] Updated: (NUTCH-53) Parser plugin for Zip files - posted by "Rohit Kulkarni (JIRA)" <ji...@apache.org> on 2005/04/26 08:32:26 UTC, 0 replies.
- Error at building nutch with ant. - posted by Jakob Heidebrecht <Ja...@gmx.de> on 2005/04/26 17:35:33 UTC, 3 replies.
- Where are the nutch experts? - posted by Marco PV <nu...@hotmail.com> on 2005/04/26 19:32:59 UTC, 3 replies.
- problems running crawl tool - posted by Chris Mattmann <ch...@jpl.nasa.gov> on 2005/04/27 07:20:45 UTC, 0 replies.
- Re: [Nutch-dev] Re: Error at building nutch with ant. - posted by Zhou LiBing <zh...@gmail.com> on 2005/04/27 10:40:11 UTC, 4 replies.
- Fetching tool - posted by YourSoft <yo...@freemail.hu> on 2005/04/27 18:08:36 UTC, 0 replies.
- Creating an index (as in books!) from TermFreqVector - posted by praveen pathiyil <pa...@gmail.com> on 2005/04/27 20:42:08 UTC, 0 replies.
- [jira] Updated: (NUTCH-46) the NDFS problem(Could not obtain new output block for file) - posted by "Piotr Kosiorowski (JIRA)" <ji...@apache.org> on 2005/04/27 22:21:33 UTC, 0 replies.
- Re: Nutch Distributed File System - posted by Piotr Kosiorowski <pk...@gmail.com> on 2005/04/27 22:25:52 UTC, 0 replies.
- JSP's - posted by Hasan Diwan <ha...@gmail.com> on 2005/04/28 02:34:29 UTC, 1 replies.
- incoming anchor text and referer page url - posted by Marco PV <nu...@hotmail.com> on 2005/04/28 07:52:39 UTC, 0 replies.
- Bug? Couldn't compile. - posted by Jakob Heidebrecht <Ja...@gmx.de> on 2005/04/28 12:27:11 UTC, 2 replies.
- [PATCH] NullPointerException while coping NDFS file - posted by Piotr Kosiorowski <pk...@gmail.com> on 2005/04/28 21:31:02 UTC, 0 replies.
- Upcoming work on Fetcher - posted by Andrzej Bialecki <ab...@getopt.org> on 2005/04/29 00:13:49 UTC, 13 replies.
- Inject URL|SUMMARY|CATEGORY - posted by Marco PV <nu...@hotmail.com> on 2005/04/29 04:05:25 UTC, 0 replies.
- nutch and linux box - posted by Jack Tang <hi...@gmail.com> on 2005/04/29 10:15:51 UTC, 3 replies.
- Caching DNS for Nutch installation (Re: nutch and linux box) - posted by Andrzej Bialecki <ab...@getopt.org> on 2005/04/29 11:59:16 UTC, 1 replies.
- [jira] Created: (NUTCH-54) Fetcher improvements - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/04/30 09:04:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-54) Fetcher improvements - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2005/04/30 09:04:05 UTC, 0 replies.