Posted to general@hadoop.apache.org by C J <c....@yahoo.com> on 2010/12/09 23:20:14 UTC

Multiple hadoop configurations

Hi,

I have 2 Hadoop clusters. Each cluster has its own set of jobs running. I
also have some distcp jobs which copy data from one cluster to the other.

I want to be able to control the jobs on both clusters through one scheduler
(so I can coordinate the jobs).

When I need to trigger a job through the scheduler, how can I send it to a
specific one of the 2 clusters?


2 clusters means 2 sets of configuration files. Is there any way to get a
handle to one of these clusters, by specifying the configuration file name or
something?

Will appreciate any help or clues.

Thanks.
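
One way to pick a cluster from a single scheduler, sketched under assumptions: the hadoop launcher script accepts a --config <confdir> option, so keeping one configuration directory per cluster and passing the matching directory for each job is enough to choose the target. In the Python sketch below, the directory paths, host names, jar name and class name are hypothetical placeholders, and hadoop_command is an illustrative helper, not part of Hadoop.

```python
import shlex

# Hypothetical mapping from a cluster name to the directory holding that
# cluster's site configuration files (core-site.xml, mapred-site.xml, ...).
CLUSTER_CONF_DIRS = {
    "cluster-a": "/etc/hadoop/conf.cluster-a",
    "cluster-b": "/etc/hadoop/conf.cluster-b",
}


def hadoop_command(cluster, *args):
    """Build a `hadoop` command line that targets the named cluster.

    The launcher script takes `--config <confdir>` before the subcommand,
    so choosing a cluster comes down to choosing a directory.
    """
    try:
        conf_dir = CLUSTER_CONF_DIRS[cluster]
    except KeyError:
        raise ValueError("unknown cluster: %r" % (cluster,))
    return ["hadoop", "--config", conf_dir] + list(args)


if __name__ == "__main__":
    # A MapReduce job on cluster A (jar, class and paths are placeholders).
    job = hadoop_command(
        "cluster-a", "jar", "my-job.jar", "com.example.MyJob", "/in", "/out"
    )
    # A distcp run that names both NameNodes explicitly in the URIs.
    copy = hadoop_command(
        "cluster-b", "distcp",
        "hdfs://nn-a.example.com:8020/data",
        "hdfs://nn-b.example.com:8020/data",
    )
    for cmd in (job, copy):
        print(" ".join(shlex.quote(part) for part in cmd))
```

A scheduler could pass the returned list to subprocess.run(cmd, check=True), so that a single scheduler instance decides per job which cluster receives it.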



      

Re: Multiple hadoop configurations

Posted by C J <c....@yahoo.com>.
Thanks everyone. I do have the same version of Hadoop on both clusters, and
Oozie does seem to be the best option to get what I need.

- C




________________________________
From: Venkatesh S <sv...@yahoo-inc.com>
To: "general@hadoop.apache.org" <ge...@hadoop.apache.org>
Sent: Thu, December 9, 2010 8:36:29 PM
Subject: Re: Multiple hadoop configurations

If you are not running the same version, you could use the Reverse Class Loader
trick to launch jobs on multiple clusters.

-Venkatesh




      

Re: Multiple hadoop configurations

Posted by Venkatesh S <sv...@yahoo-inc.com>.
If you are not running the same version, you could use the Reverse Class Loader trick to launch jobs on multiple clusters.

-Venkatesh


On 12/10/10 6:46 AM, "Alejandro Abdelnur" <tu...@cloudera.com> wrote:

Deepika,

If both of your clusters run the same version of Hadoop, then, as
Konstantin suggested, you could use Oozie. Oozie does not rely on
local Hadoop configuration files to determine the JT/NN to use; you
specify them in the Oozie workflow application XML for each job.

Cheers.

Alejandro



Re: Multiple hadoop configurations

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Deepika,

If both of your clusters run the same version of Hadoop, then, as
Konstantin suggested, you could use Oozie. Oozie does not rely on
local Hadoop configuration files to determine the JobTracker/NameNode
(JT/NN) to use; you specify them in the Oozie workflow application XML
for each job.
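
To illustrate the point about per-job JT/NN settings, here is a minimal Python sketch that builds a workflow definition in which each map-reduce action carries its own <job-tracker> and <name-node> elements. The host names, action names and the build_workflow/map_reduce_action helpers are hypothetical, the schema namespace is assumed to be uri:oozie:workflow:0.1, and the per-job <configuration> (mapper, reducer, input and output paths) is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical endpoints; substitute each cluster's real JobTracker and NameNode.
CLUSTERS = {
    "a": {"job_tracker": "jt-a.example.com:8021",
          "name_node": "hdfs://nn-a.example.com:8020"},
    "b": {"job_tracker": "jt-b.example.com:8021",
          "name_node": "hdfs://nn-b.example.com:8020"},
}


def map_reduce_action(name, cluster, ok_to, error_to="fail"):
    """Return an <action> whose map-reduce job is bound to one cluster."""
    target = CLUSTERS[cluster]
    action = ET.Element("action", {"name": name})
    mr = ET.SubElement(action, "map-reduce")
    ET.SubElement(mr, "job-tracker").text = target["job_tracker"]
    ET.SubElement(mr, "name-node").text = target["name_node"]
    # A real action also needs a <configuration> block describing the job.
    ET.SubElement(action, "ok", {"to": ok_to})
    ET.SubElement(action, "error", {"to": error_to})
    return action


def build_workflow(name):
    """Chain one job on cluster A and then one job on cluster B."""
    root = ET.Element("workflow-app",
                      {"name": name, "xmlns": "uri:oozie:workflow:0.1"})
    ET.SubElement(root, "start", {"to": "job-on-a"})
    root.append(map_reduce_action("job-on-a", "a", ok_to="job-on-b"))
    root.append(map_reduce_action("job-on-b", "b", ok_to="end"))
    kill = ET.SubElement(root, "kill", {"name": "fail"})
    ET.SubElement(kill, "message").text = "A job failed"
    ET.SubElement(root, "end", {"name": "end"})
    return root


if __name__ == "__main__":
    print(ET.tostring(build_workflow("two-clusters"), encoding="unicode"))
```

Because the cluster endpoints live in each action rather than in local configuration files, one workflow can run consecutive jobs on different clusters.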

Cheers.

Alejandro

On Fri, Dec 10, 2010 at 7:01 AM, C J <c....@yahoo.com> wrote:
> Thanks Konstantin. My issue is more with supporting multiple clusters. I am
> using Quartz for scheduling the jobs (currently it is 2 separate schedulers for
> the 2 clusters).
>
>
> If, while submitting a job, I am able to specify which cluster it should
> trigger the job on (by giving the handle to the appropriate cluster), I think
> I will be able to manage it.
>
> Thanks,
> Deepika

Re: Multiple hadoop configurations

Posted by C J <c....@yahoo.com>.
Thanks Konstantin. My issue is more with supporting multiple clusters. I am
using Quartz for scheduling the jobs (currently it is 2 separate schedulers for
the 2 clusters).


If, while submitting a job, I am able to specify which cluster it should
trigger the job on (by giving the handle to the appropriate cluster), I think
I will be able to manage it.

Thanks,
Deepika




________________________________
From: Konstantin Boudnik <co...@apache.org>
To: general@hadoop.apache.org
Sent: Thu, December 9, 2010 2:24:34 PM
Subject: Re: Multiple hadoop configurations

I believe the answer you are looking for is the Oozie coordinator, but I
am not sure which version of it supports multiple clusters.




      

Re: Multiple hadoop configurations

Posted by Konstantin Boudnik <co...@apache.org>.
I believe the answer you are looking for is the Oozie coordinator, but I
am not sure which version of it supports multiple clusters.

On Thu, Dec 9, 2010 at 14:20, C J <c....@yahoo.com> wrote:
> Hi,
>
> I have 2 Hadoop clusters. Each cluster has its own set of jobs running. I
> also have some distcp jobs which copy data from one cluster to the other.
>
> I want to be able to control the jobs on both clusters through one scheduler
> (so I can coordinate the jobs).
>
> When I need to trigger a job through the scheduler, how can I send it to a
> specific one of the 2 clusters?
>
>
> 2 clusters means 2 sets of configuration files. Is there any way to get a
> handle to one of these clusters, by specifying the configuration file name or
> something?
>
> Will appreciate any help or clues.
>
> Thanks.