- Java problem - posted by Pine Cone <pc...@yahoo.com> on 2006/01/01 18:14:47 UTC, 1 replies.
- Re: uses of 'io.sort.mb' and ' io.sort.factor' in nutch-default.xml - posted by Piotr Kosiorowski <pk...@gmail.com> on 2006/01/02 12:53:41 UTC, 2 replies.
- is the nutch shell script only used for initial crawling - posted by Thomas Sondergaard <ts...@sondergaard.cc> on 2006/01/02 14:41:52 UTC, 2 replies.
- Re: Is any one able to successfully run Distributed Crawl? - posted by Doug Cutting <cu...@nutch.org> on 2006/01/02 20:10:02 UTC, 6 replies.
- New index representation in search results - posted by Chetan Sahasrabudhe <Ch...@KPITCummins.com> on 2006/01/02 20:13:57 UTC, 1 replies.
- Re: New Tutorial Needed - posted by Doug Cutting <cu...@nutch.org> on 2006/01/02 20:45:03 UTC, 2 replies.
- Re: Can we search based on two fileds? - posted by Nguyen Ngoc Giang <gi...@gmail.com> on 2006/01/03 03:09:00 UTC, 1 replies.
- Localization bug in web interface?? - posted by Sergio <re...@redsun.homeip.net> on 2006/01/03 06:31:26 UTC, 2 replies.
- Limiting search/crawl to specific language - posted by Byron Miller <by...@yahoo.com> on 2006/01/04 05:46:30 UTC, 2 replies.
- Re: [Nutch-general] Limiting search/crawl to specific language - posted by og...@yahoo.com on 2006/01/04 08:11:23 UTC, 0 replies.
- Fetching in multiple machines setup. - posted by Gal Nitzan <gn...@usa.net> on 2006/01/04 11:42:15 UTC, 0 replies.
- Remove links from index - posted by Aled Jones <Al...@comtec-europe.co.uk> on 2006/01/04 12:50:34 UTC, 2 replies.
- upgrade to version 0.8 - posted by "Insurance Squared Inc." <gc...@insurancesquared.com> on 2006/01/04 14:53:25 UTC, 2 replies.
- java.io.IOException: already exists - posted by Nguyen Ngoc Giang <gi...@gmail.com> on 2006/01/04 15:58:37 UTC, 4 replies.
- Scaling Nutch 0.8 via Map/Reduce - posted by "Goldschmidt, Dave" <dg...@globalspec.com> on 2006/01/04 21:39:33 UTC, 5 replies.
- port :8080 no longer brings up Nutch search page! - posted by Bryan Woliner <br...@gmail.com> on 2006/01/04 22:29:29 UTC, 2 replies.
- 0.7, Trunk, Compatibility Question - posted by Albert Chern <al...@gmail.com> on 2006/01/05 00:33:57 UTC, 1 replies.
- LanguageIdentifierPlugin and CJK - posted by Otis Gospodnetic <ot...@yahoo.com> on 2006/01/05 00:41:21 UTC, 1 replies.
- impossible situation error: score-edit - posted by Sunnyvale Fl <su...@gmail.com> on 2006/01/05 00:43:11 UTC, 1 replies.
- Re: [Nutch-general] Re: LanguageIdentifierPlugin and CJK - posted by og...@yahoo.com on 2006/01/05 01:34:35 UTC, 3 replies.
- Getting java.io.IOException: Couldn't rename \tmp\nutch\mapred\local\map_n68li2\part-0.out with Nutch 0.8 - posted by Arun Kumar Sharma <sh...@yahoo.co.in> on 2006/01/05 08:08:47 UTC, 5 replies.
- Categories - posted by Boštjan <bg...@siol.net> on 2006/01/05 08:38:43 UTC, 1 replies.
- Clustering with clustering-carrot2 - posted by Neal Whitley <ne...@e-travelmedia.com> on 2006/01/05 11:32:15 UTC, 2 replies.
- tech... - posted by Dan Segel <da...@sureglow.com> on 2006/01/05 21:40:00 UTC, 0 replies.
- please disregard last post.... - posted by Dan Segel <da...@sureglow.com> on 2006/01/05 21:41:58 UTC, 3 replies.
- Re: Multiple anchors on same site - what's better than making these unique? - posted by Doug Cutting <cu...@nutch.org> on 2006/01/05 21:44:27 UTC, 0 replies.
- pooling for nutch bean - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/05 21:56:31 UTC, 9 replies.
- Re: Does Search Result Show Similar Pages Like Google? - posted by Doug Cutting <cu...@nutch.org> on 2006/01/05 22:01:53 UTC, 0 replies.
- resource pool for nutchbean - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/05 22:43:02 UTC, 3 replies.
- Google.com Search - posted by Dan Segel <da...@sureglow.com> on 2006/01/05 23:30:50 UTC, 3 replies.
- mapred system dir - posted by Matt Zytaruk <ma...@wavefire.com> on 2006/01/06 01:59:58 UTC, 2 replies.
- Dedup - works on single file - posted by "K.A.Hussain Ali" <Hu...@photoninfotech.com> on 2006/01/06 15:13:52 UTC, 1 replies.
- app server requirement - posted by Teruhiko Kurosaka <Ku...@basistech.com> on 2006/01/06 19:55:47 UTC, 1 replies.
- Urgent help requested regarding Nutch obeying instructions - posted by Ed Whittaker <ed...@ewdw.com> on 2006/01/07 00:19:10 UTC, 3 replies.
- Appropriate MapReduce Hardware - posted by Chris Schneider <Sc...@TransPac.com> on 2006/01/07 00:37:51 UTC, 2 replies.
- nutch task tracker help - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/07 18:22:43 UTC, 1 replies.
- mapred setup - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/07 18:30:12 UTC, 4 replies.
- file sytem content is also saved - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/07 19:28:46 UTC, 2 replies.
- MD5Hash - posted by Thomas Delnoij <di...@gmail.com> on 2006/01/07 22:14:17 UTC, 6 replies.
- small problem? - posted by Gal Nitzan <gn...@usa.net> on 2006/01/07 23:42:02 UTC, 0 replies.
- Re: small problem? IGNORE - posted by Gal Nitzan <gn...@usa.net> on 2006/01/08 01:19:24 UTC, 0 replies.
- newbie question - fetcher - posted by Gal Nitzan <gn...@usa.net> on 2006/01/08 01:24:28 UTC, 0 replies.
- JSON output - posted by Byron Miller <by...@yahoo.com> on 2006/01/08 02:22:39 UTC, 0 replies.
- newbie install...errors help - posted by Andy Morris <an...@woodward.edu> on 2006/01/08 04:38:45 UTC, 4 replies.
- url outlink problem - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/08 09:28:19 UTC, 1 replies.
- About ranking in Nutch - posted by NamNH <n....@gmail.com> on 2006/01/08 09:28:52 UTC, 2 replies.
- Number of links in WebDB - posted by Yousef Ourabi <yo...@gmail.com> on 2006/01/08 09:40:49 UTC, 0 replies.
- Help on language - posted by Sameer Tamsekar <st...@gmail.com> on 2006/01/08 11:39:34 UTC, 6 replies.
- IndexSearcher memory - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/08 16:46:10 UTC, 1 replies.
- using index filters to add a field - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/08 19:42:06 UTC, 8 replies.
- Fetching only the pages in an urlfile - posted by "Vish D." <vi...@gmail.com> on 2006/01/09 01:05:37 UTC, 1 replies.
- Help needed please - posted by Gal Nitzan <gn...@usa.net> on 2006/01/09 01:06:00 UTC, 1 replies.
- integrating lucene query parser - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/09 08:27:34 UTC, 0 replies.
- Re: Help needed please !Please Ignore - posted by Gal Nitzan <gn...@usa.net> on 2006/01/09 09:06:07 UTC, 0 replies.
- Nutch freezing - deflateBytes - posted by "Insurance Squared Inc." <gc...@insurancesquared.com> on 2006/01/09 12:34:08 UTC, 0 replies.
- Search result is an empty site - posted by "Håvard W. Kongsgård" <h....@niap.no> on 2006/01/09 14:52:27 UTC, 3 replies.
- Fedora core 2 install - posted by Andy Morris <an...@woodward.edu> on 2006/01/09 18:07:12 UTC, 0 replies.
- Full Range of Results Not Showing - posted by Neal Whitley <ne...@e-travelmedia.com> on 2006/01/09 18:26:22 UTC, 3 replies.
- fetcher.threads.per.host bug in 0.7.1? - posted by "Håvard W. Kongsgård" <h....@niap.no> on 2006/01/09 18:59:22 UTC, 1 replies.
- Creating Multiple Nutch Beans for Searching - posted by Saravanaraj Duraisamy <sa...@gmail.com> on 2006/01/09 21:19:06 UTC, 0 replies.
- Multi CPU support - posted by Teruhiko Kurosaka <Ku...@basistech.com> on 2006/01/09 21:23:52 UTC, 1 replies.
- No cluster results - posted by "Håvard W. Kongsgård" <h....@niap.no> on 2006/01/09 22:10:30 UTC, 0 replies.
- http status 500? - posted by Andy Morris <an...@woodward.edu> on 2006/01/09 22:53:00 UTC, 4 replies.
- nutch-0.8-dev - posted by "R.Mayoran" <ma...@team-lab.com> on 2006/01/10 10:48:45 UTC, 2 replies.
- java.net.ConnectException: Connection refused: connect at org.apache.nutch.crawl.Injector.inject(Injector.java:102) - posted by Arun Kumar Sharma <sh...@yahoo.co.in> on 2006/01/10 11:33:02 UTC, 0 replies.
- java.io.IOException: Couldn't rename \tmp\nutch\mapred\local\map_tff7vn\part-0.out - posted by Arun Kumar Sharma <sh...@yahoo.co.in> on 2006/01/10 12:08:25 UTC, 0 replies.
- [Article] Introduction to Nutch, Part 1: Crawling - posted by Tom White <to...@gmail.com> on 2006/01/10 16:37:18 UTC, 1 replies.
- multiple nutch-site.xml files possible? - posted by Bryan Woliner <br...@gmail.com> on 2006/01/10 20:12:15 UTC, 0 replies.
- mapred & fetching weirdness - posted by Florent Gluck <fl...@busytonight.com> on 2006/01/11 00:17:24 UTC, 0 replies.
- fresh fedora core4 install tomcat5 nutch .7.0.1 error - posted by Andy Morris <an...@woodward.edu> on 2006/01/11 02:00:46 UTC, 2 replies.
- Nutch and Lucene - posted by Da...@sybase.com on 2006/01/11 02:03:31 UTC, 0 replies.
- other newbies like me - posted by Andy Morris <an...@woodward.edu> on 2006/01/11 07:19:46 UTC, 10 replies.
- About Fetching - posted by "R.Mayoran" <ma...@team-lab.com> on 2006/01/11 10:01:36 UTC, 0 replies.
- Introduction to Nutch, Part 1: Crawling - posted by Tom White <to...@gmail.com> on 2006/01/11 10:21:21 UTC, 12 replies.
- Sprechen Sie Deutsch? - posted by Nick Pisarro <ni...@aperture.com> on 2006/01/11 15:37:19 UTC, 0 replies.
- Document.GetBoost( ) always returns 1 even if I SetBoost( ) = 2 - posted by codejunky codejunky <co...@yahoo.com> on 2006/01/11 19:04:15 UTC, 3 replies.
- Any idea why this could happen? - posted by Gal Nitzan <gn...@usa.net> on 2006/01/11 19:12:24 UTC, 0 replies.
- Re: Background color searched word - posted by carmmello <ca...@globo.com> on 2006/01/11 20:28:22 UTC, 9 replies.
- MapReduce questions - posted by Mike Alulin <mi...@yahoo.com> on 2006/01/12 00:17:14 UTC, 0 replies.
- crawling - posted by Andy Morris <an...@woodward.edu> on 2006/01/12 00:43:46 UTC, 1 replies.
- HELP: Fetch only small number of pages from 4 websites - posted by Chih How Bong <ch...@gmail.com> on 2006/01/12 09:44:24 UTC, 3 replies.
- large filter file, time to update db - posted by "Insurance Squared Inc." <gc...@insurancesquared.com> on 2006/01/12 14:06:56 UTC, 1 replies.
- Only Tomcat? - posted by Mike Markzon <mm...@yahoo.com> on 2006/01/12 15:37:00 UTC, 3 replies.
- unable to get started with tutorials - posted by Mike Markzon <mm...@yahoo.com> on 2006/01/12 21:29:04 UTC, 5 replies.
- come together NYC? - posted by Stefan Groschupf <sg...@media-style.com> on 2006/01/12 21:32:20 UTC, 0 replies.
- Relevance search in Nutch. - posted by "K.A.Hussain Ali" <Hu...@photoninfotech.com> on 2006/01/13 10:57:07 UTC, 0 replies.
- Re: Crawling listing (pagination) pages. - posted by "K.A.Hussain Ali" <Hu...@photoninfotech.com> on 2006/01/13 11:09:07 UTC, 0 replies.
- getting last-modified date from the crawled pages - posted by "K.A.Hussain Ali" <Hu...@photoninfotech.com> on 2006/01/13 12:05:27 UTC, 0 replies.
- Nutch/Lucene as parametric search engine - posted by Jerry Russell <je...@worldwideweave.com> on 2006/01/13 12:10:19 UTC, 1 replies.
- URL filters and outlinks - posted by carmmello <ca...@globo.com> on 2006/01/13 15:40:55 UTC, 1 replies.
- Excluding parked/ad-generator domains - posted by James Jory <ja...@thejorys.net> on 2006/01/13 18:25:44 UTC, 0 replies.
- Access pasword protected sites? - posted by Andy Morris <an...@woodward.edu> on 2006/01/13 20:15:42 UTC, 2 replies.
- url filter prefix - posted by Gal Nitzan <gn...@usa.net> on 2006/01/14 01:46:37 UTC, 0 replies.
- Crawling all links on a page - posted by Steven Yelton <st...@missiondata.com> on 2006/01/14 15:16:52 UTC, 1 replies.
- Error running MapReduce - Host key verification failed - posted by Ken Krugler <kk...@krugle.net> on 2006/01/14 18:33:28 UTC, 1 replies.
- Error running MapReduce - Jetty server & .jsp files - posted by Ken Krugler <kk...@krugle.net> on 2006/01/14 23:50:00 UTC, 2 replies.
- Error at end of MapReduce run with indexing - posted by Ken Krugler <kk...@transpac.com> on 2006/01/15 02:02:59 UTC, 6 replies.
- Improving Nutch throughput w/MapReduce - posted by Ken Krugler <kk...@transpac.com> on 2006/01/15 04:35:36 UTC, 2 replies.
- Re: [Nutch-general] Error running MapReduce - Jetty server & .jsp files - posted by og...@yahoo.com on 2006/01/15 06:51:12 UTC, 2 replies.
- How to catch the exception generated during crawling - posted by "K.A.Hussain Ali" <Hu...@photoninfotech.com> on 2006/01/15 07:13:25 UTC, 0 replies.
- Nutch plugin architecture - posted by Sameer Tamsekar <st...@gmail.com> on 2006/01/15 09:01:10 UTC, 1 replies.
- Closing database causes system crash - posted by Nguyen Ngoc Giang <gi...@gmail.com> on 2006/01/15 17:51:17 UTC, 0 replies.
- So many Unfetched Pages using MapReduce - posted by Mike Smith <mi...@gmail.com> on 2006/01/15 23:18:11 UTC, 11 replies.
- Help pls Nutch & RAM - posted by Michael Sashnikov <ms...@hotmail.com> on 2006/01/16 00:15:24 UTC, 0 replies.
- How can no URLs be fetched until the 11th round of fetching? - posted by Bryan Woliner <br...@gmail.com> on 2006/01/16 04:05:21 UTC, 3 replies.
- Do I understand how this is going to work? - posted by Mike Markzon <mm...@yahoo.com> on 2006/01/16 05:36:15 UTC, 0 replies.
- is it safe to inject into fetchlist directly? - posted by Byron Miller <by...@yahoo.com> on 2006/01/16 16:02:41 UTC, 1 replies.
- filtering content/results - posted by Byron Miller <by...@yahoo.com> on 2006/01/16 16:05:27 UTC, 4 replies.
- Help pls Nutch & RAM - posted by Mike Alulin <mi...@yahoo.com> on 2006/01/16 19:45:11 UTC, 3 replies.
- pagination in search result - posted by Andy Morris <an...@woodward.edu> on 2006/01/16 21:56:56 UTC, 0 replies.
- Common Lucene Queries for PruneIndexTool -- GROUPS of files or folders - posted by Bryan Woliner <br...@gmail.com> on 2006/01/16 23:41:42 UTC, 1 replies.
- throttling bandwidth - posted by "Insurance Squared Inc." <gc...@insurancesquared.com> on 2006/01/17 00:02:55 UTC, 13 replies.
- Nutch system running on multiple servers | fetcher - posted by "Håvard W. Kongsgård" <h....@niap.no> on 2006/01/17 15:57:03 UTC, 1 replies.
- nutch integration - posted by Ennio Tosi <en...@gmail.com> on 2006/01/17 16:04:02 UTC, 0 replies.
- About skipping some sites - posted by Sameer Tamsekar <st...@gmail.com> on 2006/01/17 17:48:12 UTC, 0 replies.
- so many unfetched URLs after depth 4 using mapReduce on 3 three machines - posted by Mike Smith <mi...@gmail.com> on 2006/01/17 17:56:13 UTC, 0 replies.
- XP/Cygwin setup problems - posted by Chris Shepard <ch...@yahoo.com> on 2006/01/17 20:36:57 UTC, 8 replies.
- How do I control log level with MapReduce? - posted by Chris Schneider <Sc...@TransPac.com> on 2006/01/17 20:53:33 UTC, 1 replies.
- adding additional custom documents to nutch-created index - posted by ci...@bloglines.com on 2006/01/17 21:00:00 UTC, 0 replies.
- Problem with fetching:Fail in DataXCeiver - posted by Rafit Izhak_Ratzin <sa...@hotmail.com> on 2006/01/17 22:36:29 UTC, 0 replies.
- HTTP 404 - posted by Mike Alulin <mi...@yahoo.com> on 2006/01/18 04:21:21 UTC, 0 replies.
- Re: HTTP 404 bug? - posted by Mike Alulin <mi...@yahoo.com> on 2006/01/18 09:16:53 UTC, 1 replies.
- Query on specialized site crawling and webdb - posted by Chun Wei Ho <cw...@gmail.com> on 2006/01/18 11:28:31 UTC, 0 replies.
- Running crawl on nutch 0.8 - posted by Sameer Tamsekar <st...@gmail.com> on 2006/01/18 11:56:52 UTC, 1 replies.
- Beginning with the Ontology Plugin ? - posted by philippe eugene <ph...@neuf.fr> on 2006/01/18 12:19:01 UTC, 0 replies.
- Error while generating new segment - posted by "Håvard W. Kongsgård" <h....@niap.no> on 2006/01/18 17:44:54 UTC, 0 replies.
- Can't index some pages - posted by Michael Plax <mi...@mcycorp.com> on 2006/01/19 01:49:04 UTC, 5 replies.
- [off-topic] Web Crawlers Comparison. - posted by Fuad Efendi <fu...@efendi.ca> on 2006/01/19 04:57:43 UTC, 0 replies.
- Searching for pre-defined terms - posted by Miguel A Paraz <mp...@gmail.com> on 2006/01/19 06:18:50 UTC, 0 replies.
- Nutch merge problem after fetch is aborted with hung threads. - posted by Lukas Vlcek <lu...@gmail.com> on 2006/01/19 08:42:41 UTC, 0 replies.
- interesting paper with competing index systems - posted by Byron Miller <by...@yahoo.com> on 2006/01/19 15:01:16 UTC, 7 replies.
- wildcard matches not working? - posted by ci...@bloglines.com on 2006/01/19 17:33:24 UTC, 6 replies.
- Recovered from failed datanode connection - posted by Gal Nitzan <gn...@usa.net> on 2006/01/19 20:04:00 UTC, 0 replies.
- Subj: Cygwin Intranet: 0 Hits - posted by Chris Shepard <ch...@yahoo.com> on 2006/01/19 21:22:36 UTC, 0 replies.
- please help: Recovered from failed datanode connection - posted by Gal Nitzan <gn...@usa.net> on 2006/01/19 22:23:07 UTC, 0 replies.
- getOutlinks doesn't work properly - posted by Nguyen Ngoc Giang <gi...@gmail.com> on 2006/01/20 05:17:58 UTC, 3 replies.
- org.apache.nutch.indexer.IndexMerger (Nutch 0.7) - posted by Chun Wei Ho <cw...@gmail.com> on 2006/01/20 06:03:15 UTC, 1 replies.
- how to manage segment size during fetching - posted by Arun Kaundal <ar...@gmail.com> on 2006/01/20 09:05:41 UTC, 0 replies.
- Do not index seed page? - posted by Franz Werfel <fr...@gmail.com> on 2006/01/20 11:23:23 UTC, 6 replies.
- New server not processing pdf files in asp pages, how to add plugins - posted by Andy Morris <an...@woodward.edu> on 2006/01/20 20:10:51 UTC, 2 replies.
- Restarting a task - posted by Matt Zytaruk <ma...@wavefire.com> on 2006/01/20 22:18:55 UTC, 1 replies.
- Re: [Nutch-general] RE: interesting paper with competing index systems - posted by og...@yahoo.com on 2006/01/20 22:26:06 UTC, 0 replies.
- unable to start nutch - posted by Fabio Biscaro <bi...@webscience.it> on 2006/01/22 14:28:13 UTC, 5 replies.
- Whole-web crawling problem on nutch 0.7.1 - posted by WONG KIONG <ki...@yahoo.com> on 2006/01/23 02:38:25 UTC, 1 replies.
- nutch indexes - posted by Cherian Thomas <Ch...@KPITCummins.com> on 2006/01/23 15:27:13 UTC, 3 replies.
- crawl/update speed - posted by "Insurance Squared Inc." <gc...@insurancesquared.com> on 2006/01/23 15:32:00 UTC, 1 replies.
- exclude bad url in crawling - posted by AJ Chen <ca...@gmail.com> on 2006/01/23 22:08:38 UTC, 0 replies.
- Filtering content before parsing? - posted by Chun Wei Ho <cw...@gmail.com> on 2006/01/24 03:16:13 UTC, 3 replies.
- retrieve data from index file - posted by Wong Ting Kiong <wo...@gmail.com> on 2006/01/24 08:18:04 UTC, 1 replies.
- Filter based on content - posted by Franz Werfel <fr...@gmail.com> on 2006/01/24 10:12:29 UTC, 2 replies.
- Injecting new url - posted by Ennio Tosi <en...@gmail.com> on 2006/01/24 11:59:10 UTC, 3 replies.
- Exception in thread "main" java.io.IOException: Job failed! - posted by Mike Smith <mi...@gmail.com> on 2006/01/24 19:00:41 UTC, 0 replies.
- Mapping score to rank - posted by Karambir Singh <ks...@etouch.net> on 2006/01/24 19:33:12 UTC, 0 replies.
- Tool for pruning based on regular expressions - posted by Thomas Mayfield <Th...@oberlin.edu> on 2006/01/25 02:15:58 UTC, 0 replies.
- Using Nutch indexes to search with Lucene - posted by "Lakshman, Madhusudhan" <ma...@logicacmg.com> on 2006/01/25 12:17:23 UTC, 2 replies.
- Parsing PDF Nutch Achilles heel? - posted by "Håvard W. Kongsgård" <h....@niap.no> on 2006/01/25 16:10:24 UTC, 9 replies.
- indexing image names - posted by Michael Dodson <mg...@mac.com> on 2006/01/25 18:57:16 UTC, 0 replies.
- Showing relevance - posted by Karambir Singh <ks...@etouch.net> on 2006/01/25 21:12:40 UTC, 1 replies.
- A Query about QueryFilters - posted by "Vanderdray, Jacob" <JV...@aarp.org> on 2006/01/26 00:19:42 UTC, 0 replies.
- getting page content for nutch search result - posted by Leslie Rohde <ad...@windrosesoftware.com> on 2006/01/26 04:55:30 UTC, 1 replies.
- After the crawl - posted by Andy Morris <an...@woodward.edu> on 2006/01/26 11:54:41 UTC, 6 replies.
- [Nutch 0.7.1]- Multi-Values for a Lucene Field - posted by philippe eugene <ph...@neuf.fr> on 2006/01/26 14:22:42 UTC, 0 replies.
- how do we have spaces in urlfilter - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/26 16:15:25 UTC, 1 replies.
- does nutch implement page ranking - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/26 20:31:26 UTC, 1 replies.
- Searching - posted by "Vanderdray, Jacob" <JV...@aarp.org> on 2006/01/26 21:19:08 UTC, 2 replies.
- updatedb - posted by Sunnyvale Fl <su...@gmail.com> on 2006/01/27 02:03:43 UTC, 1 replies.
- Updating the search index - posted by Chun Wei Ho <cw...@gmail.com> on 2006/01/27 05:03:13 UTC, 6 replies.
- How do we get the last modified date in a file - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/27 13:28:04 UTC, 2 replies.
- Connecting the search to the Db - posted by Lawrence W Thorne <lt...@thornedigital.com> on 2006/01/28 07:33:39 UTC, 0 replies.
- download/mirror - posted by Michael Dodson <mg...@mac.com> on 2006/01/28 14:19:21 UTC, 4 replies.
- readdb command error, help - posted by 盖世豪侠 <ma...@gmail.com> on 2006/01/28 15:32:58 UTC, 2 replies.
- Problems with MapRed- - posted by Rafit Izhak_Ratzin <sa...@hotmail.com> on 2006/01/28 21:21:10 UTC, 1 replies.
- The parsing is part of the Map or part of the Reduce? - posted by Rafit Izhak_Ratzin <sa...@hotmail.com> on 2006/01/28 21:25:49 UTC, 0 replies.
- Re: The parsing is part of the Map or part of the Reduce? - posted by "Håvard W. Kongsgård" <h....@niap.no> on 2006/01/28 23:05:05 UTC, 3 replies.
- How to restrict the URL patterns of the internet crawl - posted by 盖世豪侠 <ma...@gmail.com> on 2006/01/29 06:53:42 UTC, 0 replies.
- benchmark and performance - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/29 13:35:38 UTC, 6 replies.
- More Documentation - posted by "Vanderdray, Jacob" <JV...@aarp.org> on 2006/01/29 20:20:57 UTC, 0 replies.
- Hung threads - posted by "Håvard W. Kongsgård" <h....@niap.no> on 2006/01/29 21:13:53 UTC, 0 replies.
- Re: Hung threads (no solution :-( - posted by Michael Nebel <mi...@nebel.de> on 2006/01/29 21:38:14 UTC, 0 replies.
- Search setup - posted by Gal Nitzan <gn...@usa.net> on 2006/01/29 22:01:46 UTC, 6 replies.
- Re: Hung threads old problem? - posted by "Håvard W. Kongsgård" <h....@niap.no> on 2006/01/29 23:19:11 UTC, 0 replies.
- Re: Problems with MapRed- - posted by Mike Smith <mi...@gmail.com> on 2006/01/30 00:50:34 UTC, 9 replies.
- distributed computing doubt - posted by Raghavendra Prabhu <rr...@gmail.com> on 2006/01/30 08:10:50 UTC, 3 replies.
- Search Results? - posted by Sameer Tamsekar <st...@gmail.com> on 2006/01/30 12:10:21 UTC, 4 replies.
- Nutch irc channel - posted by Dominik Friedrich <do...@wipe-records.org> on 2006/01/30 14:17:37 UTC, 0 replies.
- puzzle about regx ofurl pattern - posted by 盖世豪侠 <ma...@gmail.com> on 2006/01/30 14:57:37 UTC, 3 replies.
- Differences between intranet crawl and whole-web crawl - posted by 盖世豪侠 <ma...@gmail.com> on 2006/01/30 16:56:43 UTC, 0 replies.
- searcher memory question - posted by Sunnyvale Fl <su...@gmail.com> on 2006/01/30 20:09:35 UTC, 1 replies.
- Configuring for multiple sites indexing - posted by "Lakshman, Madhusudhan" <ma...@logicacmg.com> on 2006/01/30 20:10:57 UTC, 1 replies.
- Nutch nightly build-crawl-search - posted by Andy Morris <an...@woodward.edu> on 2006/01/30 23:58:18 UTC, 0 replies.
- Recovering from Socket closed - posted by Chris Schneider <Sc...@TransPac.com> on 2006/01/31 04:46:33 UTC, 2 replies.
- Use Nutch to collect web statistic - posted by Meryl Silverburgh <si...@gmail.com> on 2006/01/31 06:21:42 UTC, 2 replies.
- Problem with plugins - posted by Enrico Triolo <en...@gmail.com> on 2006/01/31 12:05:32 UTC, 10 replies.
- How many data have you got? - posted by 盖世豪侠 <ma...@gmail.com> on 2006/01/31 14:21:47 UTC, 2 replies.
- Re: [Nutch-general] More Documentation - posted by og...@yahoo.com on 2006/01/31 20:04:04 UTC, 1 replies.
- passing type: or lang: as hidden field (not in query) - posted by Byron Miller <by...@yahoo.com> on 2006/01/31 20:10:33 UTC, 2 replies.
- incremental crawling in nutch-0.8 - posted by Derek Young <dm...@gmail.com> on 2006/01/31 21:30:42 UTC, 0 replies.
- Re: adding meta to domain - posted by Stefan Groschupf <sg...@media-style.com> on 2006/01/31 23:24:52 UTC, 3 replies.