Posted to common-user@hadoop.apache.org by "Natarajan, Senthil" <se...@pitt.edu> on 2008/01/11 17:00:50 UTC

Hadoop Job Submission

Hi,
I have some basic questions about Hadoop job submission. Could you please let me know:


1)      Once the Hadoop daemons (DFS, JobTracker, etc.) have been started by the hadoop user:

2)      Can any user submit jobs to Hadoop?

3)      Or does each user have to start the Hadoop daemons and then submit jobs?

4)      Is there a queue available, like Condor, for submitting multiple jobs?

Thanks,
Senthil

Re: Hadoop Job Submission

Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Jan 14, 2008, at 12:10 PM, Natarajan, Senthil wrote:

> Hi,
> Thanks for the reply.
>
> Actually, what I am trying to ask is: if the jobtracker+namenode are
> started by user hadoop, can user "senthil" submit a job without
> starting their own jobtracker+namenode?

It works fine, but you need to define a map/reduce system directory  
(mapred.system.dir) that is constant. (The default defines a  
directory name that depends on the user...)
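
For example, something like this in hadoop-site.xml would do (the path
below is just an illustration; any fixed path shared by all users works):

    <property>
      <name>mapred.system.dir</name>
      <value>/hadoop/mapred/system</value>
    </property>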

-- Owen

RE: Hadoop Job Submission

Posted by "Natarajan, Senthil" <se...@pitt.edu>.
Hi,
Thanks for the reply.

Actually, what I am trying to ask is: if the jobtracker+namenode are started by user hadoop, can user "senthil" submit a job without starting their own jobtracker+namenode?

The test Hadoop cluster I set up uses individual Red Hat Linux machines. So for the user hadoop, I had to copy the SSH key to all the machines so that user hadoop can SSH to all the nodes without a password.
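
(Roughly, assuming OpenSSH, the setup was something like the following;
"node1" is a placeholder for each slave's hostname:)

    # on the master, as user hadoop: generate a passphrase-less key pair
    ssh-keygen -t rsa -P ""
    # append the public key to authorized_keys on every node
    cat ~/.ssh/id_rsa.pub | ssh hadoop@node1 'cat >> ~/.ssh/authorized_keys'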

Do I need to do this (generating an SSH key and copying it to all the nodes) for every user who is going to use Hadoop and MapReduce?

Thanks,
Senthil

-----Original Message-----
From: Khalil Honsali [mailto:k.honsali@gmail.com]
Sent: Saturday, January 12, 2008 7:01 PM
To: hadoop-user@lucene.apache.org
Subject: Re: Hadoop Job Submission

Hi,

Once you have started the jobtracker+namenode on your cluster, you can
launch a job from any node of the cluster.

AFAIK, to submit multiple jobs you need to handle that yourself, either:
 - by writing a bash script that launches the several job jars one after
the other, or
 - by bundling several jobs in a single job.jar and calling
JobClient.runJob(job) repeatedly, once per job; for this you have to
create a new JobConf for each job.
Sketches of both approaches follow below.
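
For the script approach, a minimal sketch (jar names, class names, and
paths are made up):

    #!/bin/sh
    # run two job jars one after the other; each command runs the
    # driver's main() and returns when it exits
    bin/hadoop jar first-job.jar org.example.FirstJob in1 out1
    bin/hadoop jar second-job.jar org.example.SecondJob in2 out2

And for the single-jar approach, a sketch against the old mapred API
(the driver class and job names are made up):

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class MyDriver {
      public static void main(String[] args) throws Exception {
        JobConf first = new JobConf(MyDriver.class);
        first.setJobName("first");
        // ... set input/output paths, mapper and reducer classes ...
        JobClient.runJob(first);   // blocks until the first job completes

        JobConf second = new JobConf(MyDriver.class);
        second.setJobName("second");
        // ... configure the second job the same way ...
        JobClient.runJob(second);  // starts only after the first finishes
      }
    }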

On 12/01/2008, Natarajan, Senthil <se...@pitt.edu> wrote:
>
> Hi,
> I have some basic questions about Hadoop job submission. Could you please
> let me know:
>
>
> 1)      Once the Hadoop daemons (DFS, JobTracker, etc.) have been started
> by the hadoop user:
>
> 2)      Can any user submit jobs to Hadoop?
>
> 3)      Or does each user have to start the Hadoop daemons and then
> submit jobs?
>
> 4)      Is there a queue available, like Condor, for submitting multiple
> jobs?
>
> Thanks,
> Senthil
>
