Posted to dev@nutch.apache.org by Briggs <ac...@gmail.com> on 2006/12/11 16:39:09 UTC

Changing NutchConf params at Runtime.

First off, Hello all.

Second, I am using Nutch 0.7.2.  The 0.8.x branch is not an option.

And now the question:

Is there a way to alter the Nutch configuration parameters at run time?  The
issue I have is the fact that I have several crawlers, deployed on multiple
machines and I need each to share the same config.  I do not want this
configuration bundled within the application's codebase/classpath.   I need
the ability to change the configuration for all deployments via either a
single file (nfs mount or something), database, web services or anything. I
just can't have it relying on a "nutch-site.xml" file lying in the
WEB-INF/classes dir of my webapp.  I am creating a console to control boost
values and similar settings for fields such as title, URL, etc.  So, I need to
feed these to Nutch at run time or at least at initialization time.
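
To make the "single shared file" idea concrete, here is a minimal sketch of
reading property overrides from a file outside the webapp at initialization
time. It assumes the shared file uses the same <property>/<name>/<value>
layout as "nutch-site.xml". The class name ExternalConfLoader, the system
property crawler.conf.path and the default path are hypothetical, and how the
resulting values would be handed to Nutch is deliberately left out, since that
is the open question.

```java
import java.io.File;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch: reads name/value pairs from an
// XML file that lives outside the webapp (for example on an NFS mount).
public class ExternalConfLoader {

    /** Collects every <property><name>..</name><value>..</value></property> entry. */
    public static Properties load(File file) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(file);
        Properties props = new Properties();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element prop = (Element) nodes.item(i);
            NodeList names = prop.getElementsByTagName("name");
            NodeList values = prop.getElementsByTagName("value");
            if (names.getLength() == 0 || values.getLength() == 0) {
                continue; // skip incomplete entries
            }
            props.setProperty(names.item(0).getTextContent().trim(),
                              values.item(0).getTextContent().trim());
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        // Assumed: the location is passed with -Dcrawler.conf.path=...
        String path = System.getProperty("crawler.conf.path",
                                         "/mnt/shared/nutch-site.xml");
        load(new File(path)).list(System.out);
    }
}
```

Because the path comes from a system property rather than the classpath, every
deployment could point at the same file without bundling it in the webapp.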

My crawlers are deployed within Tomcat and I have tried to put the Nutch
conf in a common NFS-mounted directory that all deployments have access to
and that is on the classpath. But, it seems Tomcat and Nutch are behaving funny
because Nutch cannot find the "nutch-site.xml" file on the classpath unless
I put it either in the WEB-INF/classes dir or within a jar within the
WEB-INF/lib directory.  I have explicitly forced the "nutch-site.xml" onto the
Tomcat classpath so that it is on the system classpath, but the web app does
not seem to find it; it always loads the one located within the Nutch jar.
I read up on the Tomcat class loaders and it seems there is a small
contradiction of loading resources, but I won't get into that here.
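
A minimal sketch of how one could check which copy of a resource a class
loader resolves, run from inside the webapp (for example from a servlet or
JSP). The class name ConfLocator is hypothetical. The sketch asks the
thread's context class loader, which is an assumption: it does not confirm
which loader Nutch itself uses for the lookup.

```java
import java.net.URL;

// Hypothetical diagnostic, not part of Nutch or Tomcat.
public class ConfLocator {

    /** Returns the URL the resource resolves to, or null if it is not found. */
    public static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the system class loader if no context loader is set.
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "nutch-site.xml";
        URL url = locate(name);
        System.out.println(name + " -> " + (url == null ? "not found" : url.toString()));
    }
}
```

The printed URL shows whether the file came from a directory, from a jar, or
was not found at all, which narrows down where the lookup goes wrong.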

Am I making any sense?  I don't have massive experience with Nutch, so
forgive me, as I haven't found what I was looking for in the archives.

Thanks for your time,

Briggs