Posted to user@nutch.apache.org by lonely Feb <lo...@gmail.com> on 2010/09/10 05:37:42 UTC

How to setup Nutch on existing Hadoop

Hello~ I have just started deploying Nutch on my distributed machines, and an
existing Hadoop system is already deployed on them. I wonder how to set up
Nutch on these machines without changing the Hadoop settings. Please give me
some advice. Thx~

Re: How to setup Nutch on existing Hadoop

Posted by lonely Feb <lo...@gmail.com>.
I have successfully used the nutch-version.job to crawl on Hadoop. Thanks
for your help~
But now a new problem has come up:
How can I set up Tomcat for web searching? How should I edit
WEB-INF/classes/nutch-site.xml?
Do I need to put Tomcat on each node for distributed searching? If so, how
do I set searcher.dir (which would have to be an HDFS path)?
Or do I need to copyToLocal onto a single node so that I can use a local path
for searcher.dir?

Re: How to setup Nutch on existing Hadoop

Posted by lonely Feb <lo...@gmail.com>.
> Thank you so much~
>

RE: How to setup Nutch on existing Hadoop

Posted by Brian Tingle <Br...@ucop.edu>.
There is also an ant task to build the job file if you are building from source... it took me weeks to figure that out...


Re: How to setup Nutch on existing Hadoop

Posted by Sonal Goyal <so...@gmail.com>.
No, just use the prebuilt nutch-version.job, which is part of the Nutch
release. It can be submitted to the JobTracker like any other Hadoop job.

Thanks and Regards,
Sonal
www.meghsoft.com
http://in.linkedin.com/in/sonalgoyal


On Fri, Sep 10, 2010 at 10:56 AM, lonely Feb <lo...@gmail.com> wrote:

> Thanks for your advice. Can you describe the whole process?
> Do I need to generate a new jar containing all the jars in nutch/lib? And
> should Nutch be installed only on the master, or on the master and all the
> slaves?
>

Re: How to setup Nutch on existing Hadoop

Posted by lonely Feb <lo...@gmail.com>.
Thanks for your advice. Can you describe the whole process?
Do I need to generate a new jar containing all the jars in nutch/lib? And
should Nutch be installed only on the master, or on the master and all the
slaves?

Re: How to setup Nutch on existing Hadoop

Posted by Sonal Goyal <so...@gmail.com>.
You can use the Nutch job file with an existing Hadoop cluster like any other
Hadoop job jar. You will have to call the inject, generate, etc. jobs
yourself. Have a look at bin/nutch and you should be able to figure out the
job classes.
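
As a rough illustration of calling the jobs yourself, here is a minimal Python
sketch that only builds and prints the `hadoop jar` command lines for one crawl
round. It is not taken from this thread: the class names are the ones the
Nutch 1.x bin/nutch script maps its commands to, and the argument order, the
crawl/ directory layout, the urls seed directory and the SEGMENT placeholder
are all assumptions that should be checked against your own copy of bin/nutch.

```python
import shlex

# Placeholder file name as used in the thread; substitute your release's job file.
NUTCH_JOB = "nutch-version.job"

# Command -> class, as listed in the Nutch 1.x bin/nutch script.
# Assumption: verify these against the bin/nutch shipped with your release.
TOOLS = {
    "inject": "org.apache.nutch.crawl.Injector",
    "generate": "org.apache.nutch.crawl.Generator",
    "fetch": "org.apache.nutch.fetcher.Fetcher",
    "parse": "org.apache.nutch.parse.ParseSegment",
    "updatedb": "org.apache.nutch.crawl.CrawlDb",
    "invertlinks": "org.apache.nutch.crawl.LinkDb",
}


def hadoop_command(step, args, job_file=NUTCH_JOB):
    """Return the argv list for running one Nutch step through `hadoop jar`."""
    return ["hadoop", "jar", job_file, TOOLS[step], *args]


def crawl_round(crawl_dir, url_dir, segment):
    """Build the commands for one inject/generate/fetch/update round.

    `segment` is a hypothetical placeholder: the generate step creates a
    timestamped segment directory whose real name you look up afterwards.
    """
    crawldb = f"{crawl_dir}/crawldb"
    linkdb = f"{crawl_dir}/linkdb"
    segments = f"{crawl_dir}/segments"
    seg = f"{segments}/{segment}"
    return [
        hadoop_command("inject", [crawldb, url_dir]),
        hadoop_command("generate", [crawldb, segments]),
        hadoop_command("fetch", [seg]),
        hadoop_command("parse", [seg]),
        hadoop_command("updatedb", [crawldb, seg]),
        hadoop_command("invertlinks", [linkdb, seg]),
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing touches the cluster.
    for argv in crawl_round("crawl", "urls", "SEGMENT"):
        print(shlex.join(argv))
```

The sketch prints rather than executes the commands so that each one can be
reviewed and run by hand against the existing cluster; all paths are HDFS
paths relative to the submitting user's home directory.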

Thanks and Regards,
Sonal
www.meghsoft.com
http://in.linkedin.com/in/sonalgoyal


On Fri, Sep 10, 2010 at 9:07 AM, lonely Feb <lo...@gmail.com> wrote:

> Hello~ I have just started deploying Nutch on my distributed machines, and
> an existing Hadoop system is already deployed on them. I wonder how to set
> up Nutch on these machines without changing the Hadoop settings. Please
> give me some advice. Thx~
>