- Re: The ranking is wrong - posted by purpureleaf <pu...@gmail.com> on 2007/09/01 03:24:43 UTC, 1 replies.
- Re: search by field - posted by kevin chen <ke...@bdsing.com> on 2007/09/01 03:40:35 UTC, 0 replies.
- Re: hadoop on single machine - posted by "renaud@apache.org" <re...@apache.org> on 2007/09/01 21:28:12 UTC, 1 replies.
- bin/nutch file problem - posted by Le Mai Tung <ch...@gmail.com> on 2007/09/02 20:47:34 UTC, 1 replies.
- Re: Getting page information given the URL - posted by Carl Cerecke <ca...@nzs.com> on 2007/09/03 06:29:21 UTC, 0 replies.
- Re: Outlinks normalizer - posted by Doğacan Güney <do...@gmail.com> on 2007/09/03 10:53:33 UTC, 1 replies.
- Re: New Hadoop Version - posted by Doğacan Güney <do...@gmail.com> on 2007/09/03 11:07:19 UTC, 1 replies.
- search hits - posted by Thomas Kurzman <re...@gmx.at> on 2007/09/03 11:52:28 UTC, 1 replies.
- Fetching single / choosen URL's - posted by eyal edri <ey...@gmail.com> on 2007/09/03 13:15:34 UTC, 2 replies.
- nutch 0.9 with j2re1.4.2_10 - posted by Thomas Kurzman <re...@gmx.at> on 2007/09/03 17:09:22 UTC, 2 replies.
- pingomatic and pings with nutch - posted by Fabian López <fa...@syameses.com> on 2007/09/03 17:43:03 UTC, 3 replies.
- Fetch2 vs Fetch - posted by eyal edri <ey...@gmail.com> on 2007/09/03 22:45:19 UTC, 1 replies.
- downloading zip/exe files - posted by eyal edri <ey...@gmail.com> on 2007/09/03 22:45:52 UTC, 2 replies.
- Can i use my own analyzer to build index and search instead nutch default analyzer? - posted by martin <ma...@gmail.com> on 2007/09/04 12:25:32 UTC, 1 replies.
- Searching in field "content" doesn't return any hit - posted by Ismael <kr...@gmail.com> on 2007/09/05 04:05:51 UTC, 2 replies.
- Re: Getting page information given the URL (SOLVED, kind-of) - posted by Carl Cerecke <ca...@nzs.com> on 2007/09/05 04:30:09 UTC, 0 replies.
- searching on date field - posted by aditya naga hemanth kumar <ad...@gmail.com> on 2007/09/05 07:53:15 UTC, 3 replies.
- how to fetch the websites with the depth level 2 links - posted by Jenny LIU <je...@yahoo.com> on 2007/09/05 15:32:37 UTC, 0 replies.
- Re: how to fetch the websites with the depth level 2 links - posted by eyal edri <ey...@gmail.com> on 2007/09/05 16:54:50 UTC, 2 replies.
- Nutch and OpenAds - posted by Tec <nu...@tecnica.cc> on 2007/09/05 18:59:49 UTC, 0 replies.
- RE: nutch nightly: IllegalArgumentException: Illegal Capacity: -1 - posted by "Bolle, Jeffrey F." <jb...@mitre.org> on 2007/09/05 23:46:11 UTC, 1 replies.
- Slow search - posted by Lyndon Maydwell <ma...@gmail.com> on 2007/09/06 05:39:47 UTC, 0 replies.
- Re: fetch errors? - posted by Lyndon Maydwell <ma...@gmail.com> on 2007/09/06 06:55:05 UTC, 0 replies.
- Nutch0.9 how get the cached web‘ Charset - posted by crossafire <cr...@gmail.com> on 2007/09/06 12:10:58 UTC, 0 replies.
- Increase ranks of some pages or sites manually? - posted by Smith Norton <sm...@gmail.com> on 2007/09/06 13:13:54 UTC, 3 replies.
- ranking works in topN selection? - posted by Smith Norton <sm...@gmail.com> on 2007/09/06 13:15:41 UTC, 0 replies.
- Problem with fetch reduce phase - posted by Ned Rockson <nr...@stanford.edu> on 2007/09/06 13:28:27 UTC, 8 replies.
- Effect of no topN argument in generate - posted by Smith Norton <sm...@gmail.com> on 2007/09/06 18:28:35 UTC, 7 replies.
- Only one URL per site is selected from the URL file - posted by Smith Norton <sm...@gmail.com> on 2007/09/07 09:53:02 UTC, 3 replies.
- Set number of mappers/reducers from command line - posted by Ned Rockson <nr...@stanford.edu> on 2007/09/07 10:10:56 UTC, 0 replies.
- Changing reduce pull order - posted by Ned Rockson <ne...@gmail.com> on 2007/09/07 10:47:33 UTC, 0 replies.
- slash-delimited segment that repeats 3+ times, an example? - posted by Smith Norton <sm...@gmail.com> on 2007/09/07 15:19:43 UTC, 3 replies.
- How to use query-site plugin? - posted by Smith Norton <sm...@gmail.com> on 2007/09/07 15:50:46 UTC, 0 replies.
- Regarding Lucene & Nutch - posted by Kunal Wku <wk...@yahoo.com> on 2007/09/07 18:49:57 UTC, 0 replies.
- Increase number of tasks on a certain node - posted by Ned Rockson <nr...@stanford.edu> on 2007/09/07 19:55:02 UTC, 0 replies.
- Re: help with hardware requirements - posted by Tomislav Poljak <tp...@gmail.com> on 2007/09/08 00:52:58 UTC, 1 replies.
- Number of reduce tasks per machine - posted by Ned Rockson <nr...@stanford.edu> on 2007/09/08 03:15:07 UTC, 0 replies.
- Daniel Udatny is out of the office. - posted by ru...@rosa.com on 2007/09/08 10:09:24 UTC, 0 replies.
- dual-core cpu usage while parsing and indexing - posted by Tomislav Poljak <tp...@gmail.com> on 2007/09/08 10:12:47 UTC, 1 replies.
- Script execution in cached.jsp may be a security concern - posted by Susam Pal <su...@gmail.com> on 2007/09/08 15:35:33 UTC, 5 replies.
- Apachecon early bird registration extended to September 22, 2007 - posted by Chris Hostetter <ho...@fucit.org> on 2007/09/08 20:40:32 UTC, 0 replies.
- Re: Regarding Lucene & Nutc - posted by aditya naga hemanth kumar <ad...@gmail.com> on 2007/09/09 12:20:30 UTC, 2 replies.
- how to generate seperate segment to have a small list of new urls to be fetched only - posted by Jenny LIU <je...@yahoo.com> on 2007/09/09 22:07:12 UTC, 6 replies.
- hadoop upgrade version mismatch - posted by ramires <uy...@beriltech.com> on 2007/09/10 15:20:01 UTC, 4 replies.
- Fetcher2 politeness? - posted by Emmanuel <jo...@gmail.com> on 2007/09/10 15:22:23 UTC, 7 replies.
- OutOfMemoryError while fetching - posted by Tomislav Poljak <tp...@gmail.com> on 2007/09/10 16:34:01 UTC, 6 replies.
- Injector: java.lang.IllegalStateException (at nutch fetch stage) - posted by eyal edri <ey...@gmail.com> on 2007/09/10 17:17:22 UTC, 1 replies.
- ParseResults - posted by Emmanuel <jo...@gmail.com> on 2007/09/10 17:26:17 UTC, 1 replies.
- Downloading file types to file system - posted by eyal edri <ey...@gmail.com> on 2007/09/11 10:41:14 UTC, 1 replies.
- Clustering - posted by Smith Norton <sm...@gmail.com> on 2007/09/11 11:13:40 UTC, 0 replies.
- UTF-16 problem - posted by Vasja Ocvirk <va...@vizija.si> on 2007/09/11 11:58:49 UTC, 2 replies.
- Why 'nutch generate' is ignoring my argument of -numFetchers - posted by Jenny LIU <je...@yahoo.com> on 2007/09/11 18:37:53 UTC, 1 replies.
- Crawler fetching weird urls - posted by Jeff Van Boxtel <jb...@grpmack.com> on 2007/09/11 21:14:46 UTC, 3 replies.
- Re: Why 'nutch generate' is ignoring my argument of -numFetchers - posted by Doğacan Güney <do...@gmail.com> on 2007/09/11 21:18:59 UTC, 0 replies.
- Nutch can't fetch pages under hadoop - posted by 陈钊 <ch...@gmail.com> on 2007/09/12 09:25:30 UTC, 0 replies.
- Distributed Search - posted by Milan Krendzelak <mk...@mtld.mobi> on 2007/09/12 13:44:37 UTC, 4 replies.
- index time for lucene - posted by Dmitry <dm...@hotmail.com> on 2007/09/12 18:20:38 UTC, 1 replies.
- Problem: Compiling Plugin Using Ant - posted by Kunal Wku <wk...@yahoo.com> on 2007/09/12 20:27:44 UTC, 1 replies.
- Upgrading Hadoop for Nutch - posted by Ned Rockson <nr...@stanford.edu> on 2007/09/12 22:25:20 UTC, 0 replies.
- maybe dumb question about nutch index and segments file - posted by DerFichtl <de...@gmail.com> on 2007/09/12 22:54:13 UTC, 3 replies.
- Fetching - posted by Srinivasarao Vundavalli <sr...@gmail.com> on 2007/09/13 11:03:50 UTC, 0 replies.
- Sample normalize - posted by Smith Norton <sm...@gmail.com> on 2007/09/13 15:40:33 UTC, 2 replies.
- NTLM Authentication - posted by Smith Norton <sm...@gmail.com> on 2007/09/13 15:41:41 UTC, 0 replies.
- NTLM authentication not working in protocol-httpclient - posted by Smith Norton <sm...@gmail.com> on 2007/09/13 20:09:20 UTC, 1 replies.
- Parse pulls strange urls - posted by Ned Rockson <nr...@stanford.edu> on 2007/09/13 23:00:53 UTC, 0 replies.
- Question about filters - posted by Ned Rockson <nr...@stanford.edu> on 2007/09/13 23:13:34 UTC, 0 replies.
- {Dangerous Content?} Fwd: 100 Messaggi Inoltrati - posted by g....@ifc.cnr.it on 2007/09/14 12:38:36 UTC, 1 replies.
- Problems with the crawl database - posted by Tim Gautier <ti...@gmail.com> on 2007/09/14 19:06:26 UTC, 4 replies.
- Indexing HTML Meta Tags - posted by Jeff Van Boxtel <jb...@grpmack.com> on 2007/09/14 23:02:16 UTC, 0 replies.
- Fetch fails after unsuccessful parse of zip file - posted by Manoharam Reddy <ma...@gmail.com> on 2007/09/15 11:14:05 UTC, 0 replies.
- How to change logging level to see trace message? - posted by Alexis Votta <al...@gmail.com> on 2007/09/16 20:55:24 UTC, 1 replies.
- maintain crawl script is failing - posted by Lyndon Maydwell <ma...@gmail.com> on 2007/09/17 04:11:51 UTC, 0 replies.
- free disk space - posted by Lyndon Maydwell <ma...@gmail.com> on 2007/09/17 11:33:13 UTC, 2 replies.
- Nutch vs CURL PHP - posted by varun krishnan <va...@gmail.com> on 2007/09/17 15:06:59 UTC, 0 replies.
- Unknown format version:- 3 with Nutch trunk - posted by Alexis Votta <al...@gmail.com> on 2007/09/17 16:34:48 UTC, 2 replies.
- range of IP's using smb protocol - posted by Dmitry Glussky <gd...@tut.by> on 2007/09/17 18:17:00 UTC, 0 replies.
- protocol-httpclient NTLM authentication fails - posted by Aryan Sahoo <ar...@gmail.com> on 2007/09/17 21:32:36 UTC, 2 replies.
- Recovery possible? - posted by Tim Gautier <ti...@gmail.com> on 2007/09/18 00:48:53 UTC, 4 replies.
- util/CommandRunner - posted by Ned Rockson <nr...@stanford.edu> on 2007/09/18 01:46:25 UTC, 0 replies.
- NullPointerException while fetching - posted by Srinivasarao Vundavalli <sr...@gmail.com> on 2007/09/18 06:42:03 UTC, 1 replies.
- nutch fetch status codes - posted by eyal edri <ey...@gmail.com> on 2007/09/18 16:30:28 UTC, 2 replies.
- nutch scoring - documentation - posted by eyal edri <ey...@gmail.com> on 2007/09/18 16:56:11 UTC, 1 replies.
- freegen handles duplicate (reccurent urls) in crawldb? - posted by eyal edri <ey...@gmail.com> on 2007/09/19 17:46:43 UTC, 1 replies.
- Nutch recrawl script for 0.9 doesn't work with trunk. Help - posted by Alexis Votta <al...@gmail.com> on 2007/09/19 19:34:13 UTC, 8 replies.
- indexing and searching by Nutch - posted by payo <pa...@yahoo.com> on 2007/09/19 20:04:50 UTC, 0 replies.
- Blank result page - posted by balachanthar palanivelu <ba...@gmail.com> on 2007/09/20 09:27:49 UTC, 1 replies.
- Indexing Process - posted by Jeff Maki <cr...@gmail.com> on 2007/09/20 17:34:36 UTC, 1 replies.
- Nutch Dedup Question - posted by karthik085 <ka...@gmail.com> on 2007/09/20 17:36:16 UTC, 2 replies.
- cached page not showing images - posted by "Joseph M." <jo...@gmail.com> on 2007/09/20 18:44:38 UTC, 2 replies.
- Changing HTTP/1.0 to HTTP/1.1 - posted by "Joseph M." <jo...@gmail.com> on 2007/09/20 20:53:02 UTC, 0 replies.
- Newbie questions about filter, bandwidth, NTLM and threads - posted by Bent Hugh <be...@gmail.com> on 2007/09/20 21:04:29 UTC, 0 replies.
- Policy of merging patches - posted by Bent Hugh <be...@gmail.com> on 2007/09/21 07:13:59 UTC, 1 replies.
- Ranking Technology - posted by Kunal Wku <wk...@yahoo.com> on 2007/09/21 22:50:43 UTC, 0 replies.
- Plugin for Metadata - posted by Kunal Wku <wk...@yahoo.com> on 2007/09/21 22:51:07 UTC, 0 replies.
- How the trunk revisions are numbered - posted by Bent Hugh <be...@gmail.com> on 2007/09/22 08:50:24 UTC, 1 replies.
- NekoHTML Parse update ? - posted by Emmanuel <jo...@gmail.com> on 2007/09/22 19:55:15 UTC, 0 replies.
- SegmentMerger - posted by Emmanuel <jo...@gmail.com> on 2007/09/22 19:58:26 UTC, 0 replies.
- nutch trunk filtering URLs in invertlinks even if -noFilter is on? - posted by Brian Whitman <br...@variogr.am> on 2007/09/22 21:37:12 UTC, 1 replies.
- Parse reduce task fails to respond? - posted by Ned Rockson <nr...@stanford.edu> on 2007/09/23 11:17:09 UTC, 1 replies.
- Administration GUI on nutch 0.81 - posted by djames <dj...@supinfo.com> on 2007/09/24 16:27:35 UTC, 1 replies.
- Problems running multiple nutch nodes - posted by vikasran <hi...@hotmail.com> on 2007/09/25 00:55:27 UTC, 2 replies.
- problem with MoreIndexingFilter - posted by Sebastian Schick <sc...@informatik.uni-rostock.de> on 2007/09/25 16:05:45 UTC, 0 replies.
- Re: Last-modified / creation date or time - posted by Sebastian Schick <sc...@informatik.uni-rostock.de> on 2007/09/25 16:47:55 UTC, 3 replies.
- MP3 parser errors - posted by Vinny Gupta <vi...@yahoo.com> on 2007/09/25 17:47:39 UTC, 2 replies.
- Does authentication work? - posted by Alexis Votta <al...@gmail.com> on 2007/09/25 19:01:28 UTC, 4 replies.
- Question about NutchAnalysis#parse. - posted by martin <ma...@gmail.com> on 2007/09/26 11:37:48 UTC, 0 replies.
- distributed search server - posted by charlie w <sp...@gmail.com> on 2007/09/26 14:28:46 UTC, 2 replies.
- problem with summary highlighting - posted by Sebastian Schick <sc...@informatik.uni-rostock.de> on 2007/09/26 19:18:58 UTC, 0 replies.
- No results in cached.jsp ; Why? - posted by Bent Hugh <be...@gmail.com> on 2007/09/27 14:28:53 UTC, 1 replies.
- Is it possible to crawl a site that requires a log in? - posted by Matthew Vickery <vi...@gmail.com> on 2007/09/27 19:47:53 UTC, 0 replies.
- Newbie query: problem indexing pdf files - posted by Gareth Gale <ga...@hp.com> on 2007/09/28 14:26:59 UTC, 0 replies.
- Re: Newbie query: problem indexing pdf files - posted by Susam Pal <su...@gmail.com> on 2007/09/28 14:36:07 UTC, 4 replies.
- Trouble building nutch - posted by Jeff Van Boxtel <jb...@grpmack.com> on 2007/09/28 16:47:21 UTC, 1 replies.
- Cannot get nutch logs - posted by vikasran <hi...@hotmail.com> on 2007/09/28 23:08:50 UTC, 1 replies.
- Nutch logger does not work - posted by vikasran <hi...@hotmail.com> on 2007/09/28 23:11:55 UTC, 0 replies.