Posted to user@nutch.apache.org by "Robb, Sam" <sa...@timesys.com> on 2005/10/17 19:56:02 UTC

Nutch on FC4

Hi,

  I'm experimenting with running Nutch under Fedora Core 4.
I'd like to run Nutch using Tomcat 5 started as a service
via /etc/init.d - the default setup under FC4.  However, the
last part of the Nutch tutorial states:

  "The webapp finds its indexes in ./segments, relative to
   where you start Tomcat, so, if you've done intranet crawling,
   connect to your crawl directory, or, if you've done whole-web
   crawling, don't change directories, and give the command:
   ~/local/tomcat/bin/catalina.sh start"

  I'm new to Tomcat, Nutch, etc. and I'm wondering what I should
do to set things up so that Nutch can find ./segments in the
expected location?

  Thanks,

-Samrobb

Re: Nutch on FC4

Posted by Andy Lee <ag...@earthlink.net>.
On Oct 17, 2005, at 1:56 PM, Robb, Sam wrote:
>   "The webapp finds its indexes in ./segments, relative to
>    where you start Tomcat, so, if you've done intranet crawling,
>    connect to your crawl directory, or, if you've done whole-web
>    crawling, don't change directories, and give the command:
>    ~/local/tomcat/bin/catalina.sh start"
>
>   I'm new to Tomcat, Nutch, etc. and I'm wondering what I should
> do to set things up so that Nutch can find ./segments in the
> expected location?

You can declare the location of the segments by adding a setting to
nutch-site.xml.

Settings in nutch-site.xml override settings in nutch-default.xml.   
Both files are looked for on your classpath.  It might be clearer if  
you see the source where the files are loaded -- I think it's in  
NutchConf.java.

See the searcher.dir property in nutch-default.xml and set it to
what you want in nutch-site.xml.

--Andy


Re: Nutch on FC4

Posted by cf-auto <cf...@folge2.de>.
Hi

Add
<property>
   <name>searcher.dir</name>
   <value>/path/to/your/segments/dir/</value>
</property>

to WEB-INF/classes/nutch-site.xml

I'm using an absolute path in <value>.
The path should point to the directory that contains the
"segments" directory, not to the "segments" directory itself.
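
For reference, a complete nutch-site.xml with this property might look
roughly like the sketch below. The <nutch-conf> root element is an
assumption based on how nutch-default.xml is laid out in releases from
around this time; if your nutch-default.xml uses a different root
element, copy that one instead. The path /home/nutch/crawl is purely
hypothetical and stands for whatever directory holds your crawl.

```xml
<?xml version="1.0"?>
<!-- Sketch of WEB-INF/classes/nutch-site.xml.
     Assumption: the root element matches the one in your nutch-default.xml. -->
<nutch-conf>
  <property>
    <name>searcher.dir</name>
    <!-- Hypothetical absolute path: the PARENT of "segments",
         i.e. the webapp would look in /home/nutch/crawl/segments -->
    <value>/home/nutch/crawl</value>
  </property>
</nutch-conf>
```

With an absolute path here, it should no longer matter which directory
Tomcat is started from, so starting it via /etc/init.d ought to work.
You will probably need to restart Tomcat for the change to be picked up.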



regards
c

On Monday, 17.10.2005, at 13:56 -0400, Robb, Sam wrote:
> Hi,
> 
>   I'm experimenting with running Nutch under Fedora Core 4.
> I'd like to run Nutch using Tomcat 5 started as a service
> via /etc/init.d - the default setup under FC4.  However, the
> last part of the Nutch tutorial states:
> 
>   "The webapp finds its indexes in ./segments, relative to
>    where you start Tomcat, so, if you've done intranet crawling,
>    connect to your crawl directory, or, if you've done whole-web
>    crawling, don't change directories, and give the command:
>    ~/local/tomcat/bin/catalina.sh start"
> 
>   I'm new to Tomcat, Nutch, etc. and I'm wondering what I should
> do to set things up so that Nutch can find ./segments in the
> expected location?
> 
>   Thanks,
> 
> -Samrobb