Posted to user@nutch.apache.org by Amit Sela <am...@infolinks.com> on 2013/02/21 11:00:29 UTC

Deploy nutch on existing Hadoop cluster

Does anyone have a good tutorial on deploying Nutch (1.6) on a pre-existing
Hadoop cluster?

Thanks.

Re: Deploy nutch on existing Hadoop cluster

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Welcome to the world of post 1.3 Nutch ;)

On Thursday, February 21, 2013, Amit Sela <am...@infolinks.com> wrote:
> I basically just built with ant and copied the contents of deploy (job file
> + nutch and crawl scripts) to "nutch" folder in my hadoop-user directory on
> the master.
>
> I changed the crawl script to work only in distributed mode and it seems to
> work... though I am getting a lot of Child Error exceptions in one of the
> nodes (not the master)
> while another node seems to work fine (total 1 master + 2 slaves).
>
> Could it be so simple ? am I missing something ?
>
>
> Thanks

-- 
*Lewis*

Re: Deploy nutch on existing Hadoop cluster

Posted by Amit Sela <am...@infolinks.com>.
I basically just built with ant and copied the contents of deploy (job file
+ nutch and crawl scripts) to "nutch" folder in my hadoop-user directory on
the master.
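
The build-and-copy step described above could look roughly like the sketch
below. It is not taken from this thread: NUTCH_SRC is a hypothetical path to
the Nutch 1.6 source tree, "ant runtime" is assumed to be the build target
that was used (the message only says "built with ant"), and DEST mirrors the
"nutch" folder mentioned above. By default the script only prints the
commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only. NUTCH_SRC is a hypothetical path to the Nutch 1.6 source tree;
# DEST mirrors the "nutch" folder in the hadoop user's home on the master.
NUTCH_SRC="${NUTCH_SRC:-$HOME/apache-nutch-1.6}"
DEST="${DEST:-$HOME/nutch}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_and_copy() {
    # Build the runtime (assumed target: "runtime"), which is expected to
    # produce runtime/local and runtime/deploy.
    run ant -f "$NUTCH_SRC/build.xml" runtime
    # Copy the job file and the bin/ scripts to the target folder.
    run mkdir -p "$DEST"
    run cp -r "$NUTCH_SRC/runtime/deploy/." "$DEST"
}

build_and_copy
```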

I changed the crawl script to work only in distributed mode and it seems to
work... though I am getting a lot of Child Error exceptions on one of the
nodes (not the master), while another node seems to work fine (1 master +
2 slaves in total).

Could it be that simple? Am I missing something?


Thanks


On Thu, Feb 21, 2013 at 6:21 PM, Julien Nioche <
lists.digitalpebble@gmail.com> wrote:

> https://wiki.apache.org/nutch/NutchHadoopTutorial
>
> basically follow the steps in
> http://hadoop.apache.org/docs/stable/cluster_setup.html then install Nutch
> on the master node of your cluster, 'cd runtime/deploy/bin' and use the
> nutch scripts as usual. You can then use the standard Mapreduce webapp to
> monitor the progress of your crawl
>
> Julien

Re: Deploy nutch on existing Hadoop cluster

Posted by Julien Nioche <li...@gmail.com>.
https://wiki.apache.org/nutch/NutchHadoopTutorial

Basically, follow the steps in
http://hadoop.apache.org/docs/stable/cluster_setup.html, then install Nutch
on the master node of your cluster, 'cd runtime/deploy/bin' and use the
nutch scripts as usual. You can then use the standard MapReduce webapp to
monitor the progress of your crawl.
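
As a rough illustration of "use the nutch scripts as usual" in deploy mode,
here is a minimal sketch of the first crawl step. NUTCH_HOME, SEED_DIR and
CRAWL_DIR are hypothetical names chosen for the example, and the sketch
assumes the hadoop command is on the PATH of the master node. By default it
only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only. NUTCH_HOME, SEED_DIR and CRAWL_DIR are hypothetical names.
NUTCH_HOME="${NUTCH_HOME:-$HOME/apache-nutch-1.6}"
SEED_DIR="${SEED_DIR:-urls}"      # local directory holding a seed URL list
CRAWL_DIR="${CRAWL_DIR:-crawl}"   # crawl data directory on HDFS
DRY_RUN="${DRY_RUN:-1}"           # 1 = print the commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

first_crawl_step() {
    # The jobs run on the cluster, so the seed list has to be on HDFS.
    run hadoop fs -put "$SEED_DIR" "$SEED_DIR"
    # From runtime/deploy/bin the nutch script submits the .job file to Hadoop.
    run cd "$NUTCH_HOME/runtime/deploy/bin"
    run ./nutch inject "$CRAWL_DIR/crawldb" "$SEED_DIR"
}

first_crawl_step
```

The remaining steps (generate, fetch, parse, updatedb) are run from the same
directory in the same way.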

Julien

On 21 February 2013 10:00, Amit Sela <am...@infolinks.com> wrote:

> Anyone have a good tutorial about deploying nutch (1.6) on a pre-existing
> Hadoop cluster ?
>
> Thanks.
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

Re: Deploy nutch on existing Hadoop cluster

Posted by Jorge Luis Betancourt Gonzalez <jl...@uci.cu>.
Perhaps this could help:

http://www.rui-yang.com/develop/build-nutch-1-4-cluster-with-hadoop/

----- Original message -----
From: "Amit Sela" <am...@infolinks.com>
To: user@nutch.apache.org
Sent: Thursday, February 21, 2013 5:00:29
Subject: Deploy nutch on existing Hadoop cluster

Anyone have a good tutorial about deploying nutch (1.6) on a pre-existing
Hadoop cluster ?

Thanks.