Posted to user@nutch.apache.org by jeffersonzhou <je...@gmail.com> on 2011/08/10 11:39:36 UTC

Nutch & Hadoop

Hi,

 

I know Nutch itself includes Hadoop. My question is whether I can put Nutch
in a Hadoop framework; that is, Hadoop sits outside Nutch, and
Nutch calls the mapper/reducer from outside.

 

Thanks!

Re: Nutch & Hadoop

Posted by Markus Jelsma <ma...@openindex.io>.

Well, it's bundled there; I don't know. Maybe check the old Nutch+Hadoop
tutorial from the pre-1.3 days.

http://wiki.apache.org/nutch/NutchHadoopTutorial

Should work

On Wednesday 10 August 2011 13:49:02 jeffersonzhou wrote:
> What if I use 1.2?

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

RE: Nutch & Hadoop

Posted by jeffersonzhou <je...@gmail.com>.

What if I use 1.2?

Re: Nutch & Hadoop

Posted by Markus Jelsma <ma...@openindex.io>.

Use the job file and the bin script in runtime/deploy in 1.3 and 1.4-dev. Set
the appropriate HADOOP_HOME variable and all is good. If you don't, it will
complain about the env var anyway. You can execute it on any node, although I
tend to use the namenode for it.
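
A minimal sketch of that setup, written in Python purely for illustration. It
only assembles the call to the bin script under runtime/deploy and puts
HADOOP_HOME into the environment. The install paths, the helper names
(build_deploy_invocation, run_on_cluster) and the example inject arguments are
assumptions for the sketch, not something stated in this thread.

```python
import os
import subprocess
from pathlib import Path


def build_deploy_invocation(nutch_root, hadoop_home, nutch_args):
    """Return (command, environment) for running bin/nutch from runtime/deploy.

    nutch_root and hadoop_home are hypothetical install locations; adjust
    them to wherever Nutch and the external Hadoop actually live.
    """
    if not hadoop_home:
        # Fail early here instead of letting the script complain later.
        raise ValueError("HADOOP_HOME must point at the external Hadoop install")
    deploy_dir = Path(nutch_root) / "runtime" / "deploy"
    command = [str(deploy_dir / "bin" / "nutch"), *nutch_args]
    env = dict(os.environ)  # copy, so the caller's environment is untouched
    env["HADOOP_HOME"] = str(hadoop_home)
    return command, env


def run_on_cluster(nutch_root, hadoop_home, nutch_args):
    """Run the bin script with HADOOP_HOME set; raises if it exits non-zero."""
    command, env = build_deploy_invocation(nutch_root, hadoop_home, nutch_args)
    return subprocess.run(command, env=env, check=True)


if __name__ == "__main__":
    # Example arguments only: an inject step with illustrative paths.
    cmd, _ = build_deploy_invocation(
        "/opt/apache-nutch-1.3", "/opt/hadoop",
        ["inject", "crawl/crawldb", "urls"],
    )
    print(" ".join(cmd))
```

Building the command separately from running it makes it easy to check what
would be executed on a node before touching the cluster.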

On Wednesday 10 August 2011 13:36:42 jeffersonzhou wrote:
> Markus, could you please be more specific?
> 
> Do you have some links or materials?

RE: Nutch & Hadoop

Posted by jeffersonzhou <je...@gmail.com>.

Markus, could you please be more specific?

Do you have some links or materials?

Re: Nutch & Hadoop

Posted by Markus Jelsma <ma...@openindex.io>.

Yes, you can simply load a job file and run it on an existing cluster.
