- [jira] Commented: (NUTCH-335) Pdf summary corrupt issue - posted by "Siddharudh nadgeri (JIRA)" <ji...@apache.org> on 2006/08/01 15:16:14 UTC, 0 replies.
- fetcher improvements (was: Re: 0.8 much slower than 0.7) - posted by Sami Siren <ss...@gmail.com> on 2006/08/01 18:04:37 UTC, 0 replies.
- [jira] Resolved: (NUTCH-318) log4j not proper configured, readdb doesnt give any information - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/01 18:26:17 UTC, 0 replies.
- [jira] Commented: (NUTCH-266) hadoop bug when doing updatedb - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/01 18:57:15 UTC, 3 replies.
- [jira] Created: (NUTCH-336) Harvested links shouldn't get db.score.injected in addition to inbound contributions - posted by "Chris Schneider (JIRA)" <ji...@apache.org> on 2006/08/01 19:22:13 UTC, 0 replies.
- [jira] Created: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file - posted by "Jeremy Huylebroeck (JIRA)" <ji...@apache.org> on 2006/08/02 02:06:13 UTC, 0 replies.
- nutch - posted by an...@orbita1.ru on 2006/08/02 09:37:50 UTC, 3 replies.
- [jira] Updated: (NUTCH-266) hadoop bug when doing updatedb - posted by "Renaud Richardet (JIRA)" <ji...@apache.org> on 2006/08/02 15:29:15 UTC, 2 replies.
- nutch/lucene question.. - posted by bruce <be...@earthlink.net> on 2006/08/02 15:41:46 UTC, 0 replies.
- [jira] Updated: (NUTCH-336) Harvested links shouldn't get db.score.injected in addition to inbound contributions - posted by "Chris Schneider (JIRA)" <ji...@apache.org> on 2006/08/02 20:34:18 UTC, 1 replies.
- .classpath for Ecplise - posted by Uroš Gruber <ur...@sir-mag.com> on 2006/08/03 10:00:23 UTC, 2 replies.
- parse-plugins.xml - posted by Marko Bauhardt <mb...@media-style.com> on 2006/08/03 13:04:23 UTC, 7 replies.
- [jira] Created: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2006/08/03 17:33:13 UTC, 0 replies.
- [jira] Updated: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2006/08/03 17:35:15 UTC, 1 replies.
- [Fwd: Re: 0.8 Recrawl script updated] - posted by Matthew Holt <mh...@redhat.com> on 2006/08/04 16:11:42 UTC, 3 replies.
- [jira] Created: (NUTCH-339) Refactor nutch to allow fetcher improvements - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/04 16:17:14 UTC, 0 replies.
- [jira] Commented: (NUTCH-339) Refactor nutch to allow fetcher improvements - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/08/04 16:54:14 UTC, 2 replies.
- [jira] Updated: (NUTCH-339) Refactor nutch to allow fetcher improvements - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/08/04 17:00:14 UTC, 2 replies.
- Re: (NUTCH-339) Refactor nutch to allow fetcher improvements - posted by Andrzej Bialecki <ab...@getopt.org> on 2006/08/04 18:44:15 UTC, 7 replies.
- [jira] Created: (NUTCH-340) Bug(s) in 0.8 tutorial - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/04 18:51:13 UTC, 0 replies.
- [jira] Updated: (NUTCH-340) Bug(s) in 0.8 tutorial - posted by "Uros Gruber (JIRA)" <ji...@apache.org> on 2006/08/04 20:06:15 UTC, 1 replies.
- [jira] Commented: (NUTCH-340) Bug(s) in 0.8 tutorial - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/04 21:02:15 UTC, 0 replies.
- [jira] Updated: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2006/08/05 02:24:15 UTC, 0 replies.
- [jira] Created: (NUTCH-341) IndexMerger now deletes entire after completing - posted by "Chris Schneider (JIRA)" <ji...@apache.org> on 2006/08/05 06:18:14 UTC, 0 replies.
- Terminating slashes in URL normalization - posted by Chris Schneider <Sc...@TransPac.com> on 2006/08/05 06:23:33 UTC, 4 replies.
- [jira] Resolved: (NUTCH-340) Bug(s) in 0.8 tutorial - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/05 15:52:14 UTC, 0 replies.
- [jira] Created: (NUTCH-342) Nutch commands log to nutch/logs/hadoop.logs by default - posted by "Chris Schneider (JIRA)" <ji...@apache.org> on 2006/08/05 17:04:13 UTC, 0 replies.
- [jira] Closed: (NUTCH-334) I am using the search technique - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/05 18:12:15 UTC, 0 replies.
- [jira] Updated: (NUTCH-342) Nutch commands log to nutch/logs/hadoop.logs by default - posted by "Chris Schneider (JIRA)" <ji...@apache.org> on 2006/08/05 23:19:14 UTC, 0 replies.
- [jira] Commented: (NUTCH-342) Nutch commands log to nutch/logs/hadoop.logs by default - posted by "Chris Schneider (JIRA)" <ji...@apache.org> on 2006/08/06 10:09:14 UTC, 1 replies.
- [jira] Created: (NUTCH-343) Index MP3 SHA1 hashes - posted by "Hasan Diwan (JIRA)" <ji...@apache.org> on 2006/08/06 22:58:13 UTC, 0 replies.
- Patch: deflate encoding - posted by Pascal Beis <pa...@gmail.com> on 2006/08/07 10:17:33 UTC, 5 replies.
- How do I write a nutch query. - posted by Fred Tyre <fr...@hlipublishing.com> on 2006/08/07 22:00:35 UTC, 0 replies.
- [jira] Created: (NUTCH-344) Fetcher threads blocked on synchronized block in cleanExpiredServerBlocks - posted by "Greg Kim (JIRA)" <ji...@apache.org> on 2006/08/08 01:56:15 UTC, 0 replies.
- [jira] Updated: (NUTCH-344) Fetcher threads blocked on synchronized block in cleanExpiredServerBlocks - posted by "Greg Kim (JIRA)" <ji...@apache.org> on 2006/08/08 07:23:14 UTC, 1 replies.
- [jira] Created: (NUTCH-345) Add support for Content-Encoding: deflated - posted by "Pascal Beis (JIRA)" <ji...@apache.org> on 2006/08/08 13:04:13 UTC, 0 replies.
- [jira] Commented: (NUTCH-330) command line tool to search a Lucene index - posted by "Renaud Richardet (JIRA)" <ji...@apache.org> on 2006/08/08 18:43:16 UTC, 0 replies.
- [jira] Resolved: (NUTCH-344) Fetcher threads blocked on synchronized block in cleanExpiredServerBlocks - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/08 21:10:16 UTC, 0 replies.
- [jira] Resolved: (NUTCH-266) hadoop bug when doing updatedb - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/08 21:33:15 UTC, 0 replies.
- parse-oo plugin - posted by Matthew Holt <mh...@redhat.com> on 2006/08/08 22:54:58 UTC, 0 replies.
- "Could not obtain block" Error - posted by Uygar Yüzsüren <uy...@gmail.com> on 2006/08/09 10:06:39 UTC, 0 replies.
- Error in 0.8 regex-urlfilter.txt - posted by Matthew Holt <mh...@redhat.com> on 2006/08/09 15:51:19 UTC, 1 replies.
- [jira] Created: (NUTCH-346) Improve readability of logs/hadoop.log - posted by "Renaud Richardet (JIRA)" <ji...@apache.org> on 2006/08/09 22:13:13 UTC, 0 replies.
- [jira] Commented: (NUTCH-344) Fetcher threads blocked on synchronized block in cleanExpiredServerBlocks - posted by "Jacob Brunson (JIRA)" <ji...@apache.org> on 2006/08/10 06:22:16 UTC, 2 replies.
- Logs for unit test - posted by HUYLEBROECK Jeremy RD-ILAB-SSF <je...@orange-ft.com> on 2006/08/10 23:40:12 UTC, 1 replies.
- nutch-0.8. indexer issue - posted by Feng Ji <fe...@gmail.com> on 2006/08/11 16:54:04 UTC, 1 replies.
- Neko parsing fix inadvertently reverted? - posted by Benjamin Higgins <bh...@gmail.com> on 2006/08/11 19:51:41 UTC, 2 replies.
- turn on debug log on nutch-0.8. - posted by Feng Ji <fe...@gmail.com> on 2006/08/12 00:41:41 UTC, 0 replies.
- [jira] Created: (NUTCH-347) Build: plugins' Jars not found - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2006/08/12 05:49:14 UTC, 0 replies.
- [jira] Commented: (NUTCH-233) wrong regular expression hang reduce process for ever - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2006/08/12 07:04:14 UTC, 1 replies.
- [jira] Commented: (NUTCH-347) Build: plugins' Jars not found - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/13 01:15:15 UTC, 1 replies.
- [jira] Updated: (NUTCH-347) Build: plugins' Jars not found - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/13 01:17:14 UTC, 0 replies.
- OPICScoringFilter - posted by Marko Bauhardt <mb...@media-style.com> on 2006/08/13 23:27:26 UTC, 1 replies.
- Allowing search from command line - posted by Michael Wechner <mi...@wyona.com> on 2006/08/14 14:13:36 UTC, 1 replies.
- Patch Available status? - posted by Chris Mattmann <ch...@jpl.nasa.gov> on 2006/08/15 22:18:28 UTC, 7 replies.
- [jira] Commented: (NUTCH-48) "Did you mean" query enhancement/refignment feature request - posted by "Daniel Drozdovich (JIRA)" <ji...@apache.org> on 2006/08/16 01:53:17 UTC, 0 replies.
- [jira] Created: (NUTCH-348) Generator is building fetch list using *lowest* scoring URLs - posted by "Chris Schneider (JIRA)" <ji...@apache.org> on 2006/08/16 09:27:13 UTC, 0 replies.
- Tika update - posted by Jukka Zitting <ju...@gmail.com> on 2006/08/16 13:06:13 UTC, 4 replies.
- [jira] Created: (NUTCH-349) Port Nutch to use Hadoop Text instead of UTF8 - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/08/16 13:35:14 UTC, 0 replies.
- Thoughts on Parser design and dependencies - posted by Jukka Zitting <ju...@gmail.com> on 2006/08/16 13:59:00 UTC, 10 replies.
- Webinterface ignores hidden language field - posted by David Podunavac <da...@wyona.com> on 2006/08/16 16:16:44 UTC, 0 replies.
- [jira] Commented: (NUTCH-349) Port Nutch to use Hadoop Text instead of UTF8 - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/16 16:19:14 UTC, 1 replies.
- Re: Any plans to move to build Nutch using Maven? - posted by steven shingler <sh...@gmail.com> on 2006/08/16 16:36:17 UTC, 7 replies.
- Nutch, samba and urls... - posted by René Treffer <tr...@in.tum.de> on 2006/08/16 19:25:37 UTC, 1 replies.
- HTTP Accept Header seems to be missing - posted by Michael Wechner <mi...@wyona.com> on 2006/08/16 23:12:02 UTC, 2 replies.
- [jira] Updated: (NUTCH-348) Generator is building fetch list using *lowest* scoring URLs - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/17 01:23:14 UTC, 0 replies.
- [jira] Closed: (NUTCH-348) Generator is building fetch list using *lowest* scoring URLs - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/08/17 18:34:15 UTC, 0 replies.
- [jira] Created: (NUTCH-350) urls blocked db.fetch.retry.max * http.max.delays times during fetching are marked as STATUS_DB_GONE - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/17 21:29:16 UTC, 0 replies.
- [jira] Updated: (NUTCH-350) urls blocked db.fetch.retry.max * http.max.delays times during fetching are marked as STATUS_DB_GONE - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/17 21:31:17 UTC, 0 replies.
- [jira] Created: (NUTCH-351) Protocol forward proxy - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/17 21:39:13 UTC, 0 replies.
- [jira] Updated: (NUTCH-351) Protocol forward proxy - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/17 21:41:15 UTC, 0 replies.
- 0.8 not loading plugins - posted by Chris Stephens <ch...@liveoakinteractive.com> on 2006/08/17 23:14:13 UTC, 11 replies.
- [jira] Created: (NUTCH-352) Add jar command to bin/nutch to allow launching hadoop job jars - posted by "David Cathcart (JIRA)" <ji...@apache.org> on 2006/08/17 23:52:13 UTC, 0 replies.
- [jira] Updated: (NUTCH-352) Add jar command to bin/nutch to allow launching hadoop job jars - posted by "David Cathcart (JIRA)" <ji...@apache.org> on 2006/08/17 23:52:14 UTC, 0 replies.
- [jira] Commented: (NUTCH-322) Fetcher discards ProtocolStatus, doesn't store redirected pages - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/18 03:47:14 UTC, 0 replies.
- [jira] Created: (NUTCH-353) pages that serverside forwards will be refetched every time - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/18 06:51:17 UTC, 0 replies.
- [jira] Updated: (NUTCH-353) pages that serverside forwards will be refetched every time - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/18 06:51:21 UTC, 0 replies.
- [jira] Resolved: (NUTCH-322) Fetcher discards ProtocolStatus, doesn't store redirected pages - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/18 06:51:23 UTC, 1 replies.
- [jira] Commented: (NUTCH-346) Improve readability of logs/hadoop.log - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/18 07:45:17 UTC, 0 replies.
- [jira] Commented: (NUTCH-345) Add support for Content-Encoding: deflated - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/18 07:54:14 UTC, 2 replies.
- [jira] Commented: (NUTCH-343) Index MP3 SHA1 hashes - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/18 08:00:15 UTC, 0 replies.
- [jira] Updated: (NUTCH-341) IndexMerger now deletes entire after completing - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/18 08:21:14 UTC, 1 replies.
- some questions - posted by an...@orbita1.ru on 2006/08/18 08:22:51 UTC, 0 replies.
- [jira] Updated: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/18 08:32:14 UTC, 1 replies.
- [jira] Reopened: (NUTCH-322) Fetcher discards ProtocolStatus, doesn't store redirected pages - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/08/18 11:46:16 UTC, 0 replies.
- Adding Database Field - posted by Levent Ulutas <Le...@web.de> on 2006/08/18 13:46:00 UTC, 0 replies.
- [jira] Commented: (NUTCH-341) IndexMerger now deletes entire after completing - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/18 16:40:14 UTC, 0 replies.
- [jira] Commented: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2006/08/18 16:56:14 UTC, 2 replies.
- [jira] Resolved: (NUTCH-347) Build: plugins' Jars not found - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/18 16:56:16 UTC, 0 replies.
- [jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2006/08/18 16:56:18 UTC, 0 replies.
- [jira] Resolved: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/08/18 17:13:14 UTC, 0 replies.
- [jira] Closed: (NUTCH-341) IndexMerger now deletes entire after completing - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/08/18 20:47:15 UTC, 0 replies.
- [jira] Updated: (NUTCH-105) Network error during robots.txt fetch causes file to be ignored - posted by "Greg Kim (JIRA)" <ji...@apache.org> on 2006/08/19 02:37:14 UTC, 2 replies.
- architecture question/thoughts - posted by bruce <be...@earthlink.net> on 2006/08/19 03:48:24 UTC, 0 replies.
- show new data in search result page - posted by Feng Ji <fe...@gmail.com> on 2006/08/19 23:37:48 UTC, 0 replies.
- [jira] Created: (NUTCH-354) MapWritable, nextEntry is not reset when Entries are recycled - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/20 00:49:13 UTC, 0 replies.
- [jira] Updated: (NUTCH-354) MapWritable, nextEntry is not reset when Entries are recycled - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/20 00:51:15 UTC, 0 replies.
- [jira] Closed: (NUTCH-354) MapWritable, nextEntry is not reset when Entries are recycled - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/08/20 01:30:14 UTC, 0 replies.
- the implementation code of explanation.jsp in Search Page - posted by Feng Ji <fe...@gmail.com> on 2006/08/20 15:04:55 UTC, 1 replies.
- [jira] Created: (NUTCH-355) The title of query result could like the summary have the highlight?? - posted by "King Kong (JIRA)" <ji...@apache.org> on 2006/08/20 16:14:14 UTC, 0 replies.
- [jira] Created: (NUTCH-356) Plugin repository cache can lead to memory leak - posted by "Enrico Triolo (JIRA)" <ji...@apache.org> on 2006/08/21 13:59:13 UTC, 0 replies.
- [jira] Updated: (NUTCH-346) Improve readability of logs/hadoop.log - posted by "Renaud Richardet (JIRA)" <ji...@apache.org> on 2006/08/21 16:10:14 UTC, 0 replies.
- [jira] Commented: (NUTCH-355) The title of query result could like the summary have the highlight?? - posted by "King Kong (JIRA)" <ji...@apache.org> on 2006/08/21 18:41:14 UTC, 0 replies.
- [jira] Commented: (NUTCH-354) MapWritable, nextEntry is not reset when Entries are recycled - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/21 21:09:17 UTC, 0 replies.
- Fwd: [webspam-announces] Web Spam Collection Announced - posted by Stefan Groschupf <sg...@101tec.com> on 2006/08/21 21:16:12 UTC, 0 replies.
- [jira] Commented: (NUTCH-356) Plugin repository cache can lead to memory leak - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/21 23:09:14 UTC, 3 replies.
- [jira] Created: (NUTCH-357) crawling simulation - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/22 07:40:13 UTC, 0 replies.
- [jira] Updated: (NUTCH-357) crawling simulation - posted by "Stefan Groschupf (JIRA)" <ji...@apache.org> on 2006/08/22 07:54:14 UTC, 0 replies.
- [jira] Created: (NUTCH-358) Language Switching - posted by "David Podunavac (JIRA)" <ji...@apache.org> on 2006/08/22 15:04:13 UTC, 0 replies.
- Ontology compile bug - posted by Michael Wechner <mi...@wyona.com> on 2006/08/22 15:07:12 UTC, 0 replies.
- Junit testing, was: Re: [jira] Updated: (NUTCH-357) crawling simulation - posted by Sami Siren <ss...@gmail.com> on 2006/08/22 17:27:49 UTC, 1 replies.
- differ search in filesystem or webpages - posted by David Podunavac <da...@wyona.com> on 2006/08/22 17:41:07 UTC, 0 replies.
- Injector calls Map with blank lines - posted by Dennis Kubes <nu...@dragonflymc.com> on 2006/08/22 19:13:47 UTC, 0 replies.
- Use CrawlDb as a metadata Db? - posted by HUYLEBROECK Jeremy RD-ILAB-SSF <je...@orange-ft.com> on 2006/08/22 19:38:31 UTC, 2 replies.
- Nutch as caching web proxy - posted by Neil Ireson <ne...@gmail.com> on 2006/08/23 12:49:53 UTC, 0 replies.
- problem with nutch - posted by an...@orbita1.ru on 2006/08/23 14:02:06 UTC, 4 replies.
- How to debug War/Tomcat? - posted by Chris Stephens <ch...@liveoakinteractive.com> on 2006/08/23 18:48:16 UTC, 0 replies.
- Re: [Nutch Wiki] Update of "RenaudRichardet" by RenaudRichardet - posted by Stefan Groschupf <sg...@101tec.com> on 2006/08/23 21:00:22 UTC, 0 replies.
- [jira] Commented: (NUTCH-273) When a page is redirected, the original url is NOT updated. - posted by "Chris Schneider (JIRA)" <ji...@apache.org> on 2006/08/24 08:49:16 UTC, 0 replies.
- [jira] Created: (NUTCH-359) extraction of links will fail for whole page if one single link cannot be parsed - posted by "Renaud Richardet (JIRA)" <ji...@apache.org> on 2006/08/24 08:49:46 UTC, 0 replies.
- HTTP/1.1 problem - posted by Doğacan Güney <do...@agmlab.com> on 2006/08/24 10:15:53 UTC, 0 replies.
- Single Search Server, Multiple Indexes on Separate Disks - posted by Dennis Kubes <nu...@dragonflymc.com> on 2006/08/24 17:23:55 UTC, 0 replies.
- Re: [Fwd: Re: [Nutch Wiki] Update of "RenaudRichardet" by RenaudRichardet] - posted by Renaud Richardet <re...@wyona.com> on 2006/08/24 17:35:33 UTC, 1 replies.
- Checking if crawl dir exists ... - posted by Michael Wechner <mi...@wyona.com> on 2006/08/25 15:52:03 UTC, 2 replies.
- reading crawl dir from nutch-default.xml - posted by David Podunavac <da...@wyona.com> on 2006/08/25 16:26:29 UTC, 0 replies.
- nutch/lucene question... - posted by bruce <be...@earthlink.net> on 2006/08/25 19:44:31 UTC, 1 replies.
- Re: [Nutch-dev] Checking if crawl dir exists ... - posted by Hasan Diwan <ha...@gmail.com> on 2006/08/26 12:37:08 UTC, 1 replies.
- Missing pages & anchor text - posted by Doug Cook <na...@candiru.com> on 2006/08/28 20:33:15 UTC, 5 replies.
- Hadoop job question - posted by HUYLEBROECK Jeremy RD-ILAB-SSF <je...@orange-ft.com> on 2006/08/29 03:41:00 UTC, 0 replies.
- Nutch internals - posted by Uroš Gruber <ur...@sir-mag.com> on 2006/08/29 14:11:36 UTC, 0 replies.
- Re: Hadoop job question - posted by Dennis Kubes <nu...@dragonflymc.com> on 2006/08/29 16:58:56 UTC, 1 replies.
- books (and articles) about search engine algorithms - posted by Mladen Adamovic <ad...@blic.net> on 2006/08/29 17:26:43 UTC, 2 replies.
- Re: [Nutch Wiki] Update of "RunNutchInEclipse" by UrosG - posted by Stefan Groschupf <sg...@101tec.com> on 2006/08/29 18:38:20 UTC, 1 replies.
- get CrawlDatum - posted by Uroš Gruber <ur...@sir-mag.com> on 2006/08/30 09:52:20 UTC, 5 replies.
- Fetch error - posted by an...@orbita1.ru on 2006/08/30 10:17:21 UTC, 1 replies.
- Should URL normalization iterate? - posted by Doug Cook <na...@candiru.com> on 2006/08/30 16:21:04 UTC, 0 replies.
- fetcher status missing in log file - posted by AJ Chen <ca...@gmail.com> on 2006/08/30 22:37:28 UTC, 0 replies.
- [jira] Closed: (NUTCH-242) Add optional -urlFiltering to updatedb - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/08/31 00:15:23 UTC, 0 replies.
- [jira] Closed: (NUTCH-143) Improper error numbers returned on exit - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/08/31 00:17:23 UTC, 0 replies.
- Why are lib- plugins needed? - posted by Teruhiko Kurosaka <Ku...@basistech.com> on 2006/08/31 22:10:13 UTC, 0 replies.