Posted to user@hbase.apache.org by Jérôme Verstrynge <jv...@gmail.com> on 2011/01/17 20:47:46 UTC

Newbie question about application development

Hi,

Trying to understand HBase/Hadoop here. Let's imagine I write some 
MapReduce Java code in my application. Let's imagine I deploy my 
application on 1 node of a cluster of 20. Let's imagine the data is 
distributed across the 20 nodes.

Do I need to distribute my JAR to all 20 nodes to run the MapReduce Java 
code? Or can I launch this code from my node and have HBase/Hadoop 
distribute it automatically to the other nodes for local processing? How 
does the magic happen?

Thanks,

JVerstry


Re: Newbie question about application development

Posted by Chris Tarnas <cf...@email.com>.
When you submit a job via Hadoop, you include the jar with the submission. Hadoop takes care of distributing the jar to each of the nodes and of cleaning up when all processing is done. You can also include other needed files and archives with the job. You would use the 'hadoop' command for this:

http://hadoop.apache.org/common/docs/r0.20.2/commands_manual.html
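For example (the jar name, driver class, and paths below are made up), a submission from the single node where your application lives might look like:

```shell
# Submit the job from one node; Hadoop ships the jar (plus any files
# and archives passed via the generic options) to the task nodes and
# removes them when the job finishes.
hadoop jar my-analysis.jar com.example.MyDriver \
    -files lookup.txt \
    -libjars hbase-client.jar \
    /input/path /output/path
```

Note that the generic -files/-libjars options are only parsed if your driver runs through ToolRunner (or GenericOptionsParser); otherwise only your own arguments are passed to the main class.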

-chris
