Posted to common-user@hadoop.apache.org by Vjeran Marcinko <vj...@email.t-com.hr> on 2013/04/14 07:18:35 UTC

Best Hadoop dev environment [WAS: RE: Few noob MR questions]

Hi again,

 

You actually touched on what I'm trying to do here - set up the best Hadoop
development environment.

Moreover, don't ask me why, my development machine runs Windows, so I don't
have Hadoop on it. Instead I use a Linux virtual machine with Hadoop running
in it. I would mostly like to develop my job code in my favourite IDE, deploy
my jobs from there, and watch them run on this "remote" virtual Hadoop
platform. Build scripts can help a lot: each time I change some job code, I
could use them to package it and transfer it to the Hadoop machine, where I
can deploy it via the "hadoop jar." command. I will certainly do that
*in production*, but *in the development environment* I would like to avoid
it. Also, when I say "Run" in the IDE, it uses "java -classpath .", not even
"java -jar .", so the job class is not found in any packaged form (at least
by default - any proper IDE can add additional build steps to it).

So are there any more hints for setting up this environment?

Hadoop can really be intimidating for a newbie - there are so many versions
out there, so many examples using different APIs, and so many ways to deploy
a job, for example, that I don't know where to start. And my Windows OS brings
even more problems in the beginning, when I don't know much.

Regards,

Vjeran 

 

From: Bjorn Jonsson [mailto:bjornjon@gmail.com] 
Sent: Sunday, April 14, 2013 5:27 AM
To: user@hadoop.apache.org
Subject: Re: Few noob MR questions

 

Correct, you can use java -jar to submit a job, with the "driver" code in a
plain static main method. I do it all the time. You can of course also run a
Job straight from your Java code in the IDE. To see essentially what the
"hadoop jar" command does, I think you can check out the RunJar class
(org.apache.hadoop.util.RunJar) in the Hadoop API Javadoc.

Cheers,

Bj

 

On Sat, Apr 13, 2013 at 3:59 PM, Jens Scheidtmann
<je...@gmail.com> wrote:

Dear Vjeran,

Your own jobs should implement the Tool interface and be launched through
ToolRunner. This gives you additional standard options on the command line.
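
To make "standard options" concrete: ToolRunner strips generic options such
as "-D key=value" from the command line and stores them in the job
configuration before it passes the remaining arguments to the Tool's run
method. The sketch below is not Hadoop code. It is a plain-JDK imitation of
that split so it can run without Hadoop on the classpath. The class name
GenericOptionsSketch, its fields, and the property used in main are
hypothetical examples.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-JDK stand-in (NOT Hadoop code): separates -D options from job arguments. */
final class GenericOptionsSketch {
    /** Properties collected from -D options, in command-line order. */
    final Map<String, String> properties = new LinkedHashMap<>();
    /** Arguments left over for the job itself. */
    final List<String> remaining = new ArrayList<>();

    static GenericOptionsSketch parse(String[] args) {
        GenericOptionsSketch parsed = new GenericOptionsSketch();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[++i];            // form: -D key=value
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                pair = arg.substring(2);     // form: -Dkey=value
            }
            if (pair == null) {
                parsed.remaining.add(arg);   // not a -D option: belongs to the job
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq > 0) {
                parsed.properties.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
            // Simplification of this sketch: a -D option without "key=value" is ignored.
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Example command line; the property name is only an illustration.
        GenericOptionsSketch parsed =
                parse(new String[] {"-D", "mapred.reduce.tasks=2", "in", "out"});
        System.out.println(parsed.properties + " " + parsed.remaining);
        // prints: {mapred.reduce.tasks=2} [in, out]
    }
}
```

In a real job you would not write this parsing yourself. You would implement
Tool and start the job with ToolRunner.run(...), which does the option
handling for you.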

Also have a look at the ProgramDriver class as used here:
https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/ExampleDriver.java

which further simplifies executing your MR jobs.
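
The idea behind such a driver is a single entry point that picks a job by
name and passes it the rest of the command line. The following is only a
plain-JDK imitation of that dispatch, not Hadoop's ProgramDriver. The names
DriverSketch, add and run, and the "wordcount" entry, are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.ToIntFunction;

/** Plain-JDK imitation of a program driver; NOT Hadoop's ProgramDriver. */
final class DriverSketch {
    // Program name -> entry point taking the remaining arguments, returning an exit code.
    private final Map<String, ToIntFunction<String[]>> programs = new TreeMap<>();

    void add(String name, ToIntFunction<String[]> program) {
        programs.put(name, program);
    }

    /** Runs the program named by args[0]; returns -1 if the name is missing or unknown. */
    int run(String[] args) {
        if (args.length == 0 || !programs.containsKey(args[0])) {
            System.err.println("Valid program names: " + programs.keySet());
            return -1;
        }
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        return programs.get(args[0]).applyAsInt(rest);
    }

    public static void main(String[] args) {
        DriverSketch driver = new DriverSketch();
        // Hypothetical entry: a real one would start an actual MapReduce job here.
        driver.add("wordcount", rest -> {
            System.out.println("would run wordcount with " + Arrays.toString(rest));
            return 0;
        });
        int code = driver.run(new String[] {"wordcount", "in", "out"});
        System.out.println("exit code " + code);
    }
}
```

With one driver class as the main class of the jar, every job in the jar can
be started the same way, with only the first argument changing.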

 

Best regards,

Jens

 


Re: Best Hadoop dev environment [WAS: RE: Few noob MR questions]

Posted by Michel Segel <mi...@hotmail.com>.
I tend to use a real cluster so that I can test at a reasonable fraction of scale.
I've seen some instances where code that ran 'okay' in a VM failed to perform adequately at scale.


Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 14, 2013, at 2:19 AM, Jens Scheidtmann <je...@gmail.com> wrote:

> Dear Vjeran,
> 
> 
> 2013/4/14 Vjeran Marcinko <vj...@email.t-com.hr>
>> Hi again,
>> 
>>  
>> 
>> You actually touched what I'm trying to do here - setup best Hadoop development environment.
>> [...] 
>> 
>>  So are there any more hints for me to setup this environment?
>> 
> 
> In Eclipse you can use the Maven plugin. See here: http://www.philippeadjiman.com/blog/2009/12/07/hadoop-tutorial-part-1-setting-up-your-mapreduce-learning-playground/
> Works like a charm for me.
> 
> I also have installed cygwin with Xwindows on my windows computer, so that I can remotely access the linux box and have my dev environment there.
> 
> In Hadoop's subversion, you can also find an Eclipse plugin, but I didn't try this yet. http://wiki.apache.org/hadoop/EclipsePlugIn
> 
> Best regards,
> 
> Jens
> 
> 
> 
> 

Re: Best Hadoop dev environment [WAS: RE: Few noob MR questions]

Posted by Jens Scheidtmann <je...@gmail.com>.
Dear Vjeran,


2013/4/14 Vjeran Marcinko <vj...@email.t-com.hr>

> Hi again,
>
> You actually touched what I'm trying to do here - setup best Hadoop
> development environment.
> [...]
>
> So are there any more hints for me to setup this environment?
>

In Eclipse you can use the Maven plugin. See here:
http://www.philippeadjiman.com/blog/2009/12/07/hadoop-tutorial-part-1-setting-up-your-mapreduce-learning-playground/
Works like a charm for me.

I have also installed Cygwin with an X server on my Windows computer, so that
I can remotely access the Linux box and have my dev environment there.

In Hadoop's Subversion repository you can also find an Eclipse plugin, but I
haven't tried it yet. http://wiki.apache.org/hadoop/EclipsePlugIn

Best regards,

Jens
