Posted to dev@nutch.apache.org by Ken Krugler <kk...@transpac.com> on 2005/07/18 02:13:19 UTC
Deploying crawl-only development version of Nutch
Hi all,
What's the best way to deploy a customized version of Nutch on a
server, where it only crawls/indexes (no search support)?
The .war Ant build bundles up a bunch of stuff we don't need, and
sticks things in .jsp-specific directories.
But an initial quick attempt at hacking up an Ant build to create a
deploy folder with just the .jars we need has met with numerous
problems, ranging from classpath-related stuff (works in the main
.jar manifest, not on the command line) to head scratching over why
nutch.jar includes only the nutch-default.xml & nutch-site.xml conf
files (e.g. it doesn't have regex-urlfilter.txt, which we need),
while nutch.war has a bigger set (including these three).
So is the best approach to just modify Nutch's .war Ant build for our
purposes, even though we're using Eclipse to build/debug portions of
the code?
Thanks for any advice,
-- Ken
--
Ken Krugler
TransPac Software, Inc.
<http://www.transpac.com>
+1 530-470-9200
Re: Deploying crawl-only development version of Nutch
Posted by Piotr Kosiorowski <pk...@gmail.com>.
Hello Ken,
"ant tar" produces a full installation of Nutch. It also includes the *.war
file, but you do not have to use it if you do not plan to deploy the search
frontend. Most of the other directories it includes are still important:
bin for the nutch shell script, conf for configuration files, and plugins
for the Nutch plugins. I would use the standard Nutch tar file as the
installation in your case (maybe throwing away the nutch*.war file if you
really want to).
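If it helps, the "throw away the .war" step could be scripted once the tar file has been unpacked. The following is only a rough sketch in Python, not part of the Nutch build: the function name make_crawl_only and both paths are hypothetical, and it assumes the unpacked install has the bin, conf and plugins directories mentioned above at its top level.

```python
import shutil
from pathlib import Path

# Directories named above as needed even without the search frontend.
# Assumption: they sit at the top level of the unpacked tar file.
REQUIRED_DIRS = ("bin", "conf", "plugins")


def make_crawl_only(install_dir, deploy_dir):
    """Copy an unpacked Nutch install to deploy_dir, omitting nutch*.war files.

    Returns the names of the top-level .war files that were left out.
    deploy_dir must not exist yet (shutil.copytree creates it).
    """
    src = Path(install_dir)

    # Fail early if the install does not look like a full "ant tar" layout.
    missing = [d for d in REQUIRED_DIRS if not (src / d).is_dir()]
    if missing:
        raise FileNotFoundError(f"missing directories in {src}: {missing}")

    skipped = sorted(p.name for p in src.glob("nutch*.war"))

    # ignore_patterns filters matching names at every directory level.
    shutil.copytree(src, deploy_dir, ignore=shutil.ignore_patterns("nutch*.war"))
    return skipped
```

Calling make_crawl_only("/path/to/unpacked/nutch", "/path/to/deploy") (both paths are placeholders) would leave the conf files, such as regex-urlfilter.txt, in place and only drop the web archive.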
Regards
Piotr
Ken Krugler wrote:
> Hi all,
>
> What's the best way to deploy a customized version of Nutch on a server,
> where it only crawls/indexes (no search support)?
>
> The .war Ant build bundles up a bunch of stuff we don't need, and sticks
> things in .jsp-specific directories.
>
> But an initial quick attempt at hacking up an Ant build to create a
> deploy folder with just the .jars we need has met with numerous
> problems, ranging from classpath-related stuff (works in the main .jar
> manifest, not on the command line) to head scratching over why nutch.jar
> includes only the nutch-default.xml & nutch-site.xml conf files (e.g. it
> doesn't have regex-urlfilter.txt, which we need), while nutch.war has a
> bigger set (including these three).
>
> So is the best approach to just modify Nutch's .war Ant build for our
> purposes, even though we're using Eclipse to build/debug portions of the
> code?
>
> Thanks for any advice,
>
> -- Ken