Posted to general@hadoop.apache.org by C J <c....@yahoo.com> on 2010/12/09 23:20:14 UTC
Multiple hadoop configurations
Hi,
I have 2 Hadoop clusters. Each cluster has its own set of jobs
running. I also have some distcp jobs which copy data from one cluster to
the other.
I want to control the jobs on both clusters through one scheduler
(so I can coordinate the jobs).
When I trigger a job through the scheduler, how can I send it
to a specific one of the 2 clusters?
2 clusters means 2 sets of configuration files. Is there any way to get a handle
to one of these clusters by specifying the configuration file name or
something?
I would appreciate any help or clues.
Thanks.
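A minimal sketch of the "handle by configuration file" idea, using only the Java standard library. It assumes each cluster's settings live in a Hadoop-style site XML file (configuration / property / name / value) and keeps one parsed set per cluster name. The class ClusterConfigs, its method names, and the example host names are hypothetical and not part of Hadoop. With a Hadoop client on the classpath, the settings would more likely be loaded into a Hadoop Configuration object, one per cluster, rather than into a plain map.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: one set of settings per cluster, keyed by a name we choose. */
final class ClusterConfigs {
    private final Map<String, Map<String, String>> byCluster = new LinkedHashMap<>();

    /** Parses Hadoop-style site XML text into name/value pairs. */
    static Map<String, String> parseSiteXml(String xmlText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmlText)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            // Covers malformed XML and properties missing a name or value.
            throw new IllegalArgumentException("not a readable site XML document", e);
        }
    }

    /** Registers a cluster from XML text that is already in memory. */
    void register(String cluster, String xmlText) {
        byCluster.put(cluster, parseSiteXml(xmlText));
    }

    /** Registers a cluster from a configuration file on disk. */
    void registerFile(String cluster, Path siteXml) {
        try {
            register(cluster, new String(Files.readAllBytes(siteXml), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the "handle": the settings a submission to that cluster would use. */
    Map<String, String> handle(String cluster) {
        Map<String, String> props = byCluster.get(cluster);
        if (props == null) {
            throw new IllegalArgumentException("unknown cluster: " + cluster);
        }
        return props;
    }
}
```

A scheduler job would then carry only the cluster name and look up handle("clusterA") or handle("clusterB") at trigger time. The keys it would read, such as fs.default.name and mapred.job.tracker, are the property names used by Hadoop releases of this period.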
Re: Multiple hadoop configurations
Posted by C J <c....@yahoo.com>.
Thanks everyone. I do have the same version of Hadoop on both clusters, and
Oozie does seem to be the best option to get what I need.
- C
________________________________
From: Venkatesh S <sv...@yahoo-inc.com>
To: "general@hadoop.apache.org" <ge...@hadoop.apache.org>
Sent: Thu, December 9, 2010 8:36:29 PM
Subject: Re: Multiple hadoop configurations
If you are not running the same version, you could use the reverse class loader
trick to launch jobs on multiple clusters.
-Venkatesh
On 12/10/10 6:46 AM, "Alejandro Abdelnur" <tu...@cloudera.com> wrote:
Deepika,
If both of your clusters run the same version of Hadoop, then, as
Konstantin suggested, you could use Oozie. Oozie does not rely on
local Hadoop configuration files to determine the JobTracker (JT) and
NameNode (NN) to use; you specify them in the Oozie workflow
application XML for each job.
Cheers.
Alejandro
On Fri, Dec 10, 2010 at 7:01 AM, C J <c....@yahoo.com> wrote:
> Thanks Konstantin. My issue is more with supporting the multiple clusters. I am
> using quartz for scheduling the jobs (currently it is 2 separate schedulers for
> the 2 clusters).
>
>
> If while submitting a job I am able to specify which cluster it should trigger
> the job on (by giving the handle to the appropriate cluster), I think I will be
> able to manage it.
>
> Thanks,
> Deepika
>
>
>
>
> ________________________________
> From: Konstantin Boudnik <co...@apache.org>
> To: general@hadoop.apache.org
> Sent: Thu, December 9, 2010 2:24:34 PM
> Subject: Re: Multiple hadoop configurations
>
> I believe the answer you are looking for is Oozie coordinator, but I
> am not sure which version of it supports multiple clusters.
>
> On Thu, Dec 9, 2010 at 14:20, C J <c....@yahoo.com> wrote:
>> Hi,
>>
>> I have 2 hadoop clusters. Both these clusters have their own set of jobs
>> running. I also have some distcp jobs which copy over data from one cluster to
>> another.
>>
>> I want to be able to control the jobs on both the clusters through one scheduler
>> (so I can coordinate the jobs).
>>
>> I am wondering when I need to trigger a job through a scheduler, how can I send
>> to one of the 2 clusters.
>>
>>
>> 2 clusters means 2 sets of configuration files. Is there any way to get a handle
>> to one of these clusters, by specifying the configuration file name or
>> something?
>>
>> Will appreciate any help or clues.
>>
>> Thanks.
Re: Multiple hadoop configurations
Posted by Venkatesh S <sv...@yahoo-inc.com>.
If you are not running the same version, you could use the reverse class loader trick to launch jobs on multiple clusters.
-Venkatesh
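The thread does not spell out the trick. One common reading, assumed here, is a child-first (parent-last) class loader: each cluster's client jars are loaded through their own loader, which looks in its own jars before asking its parent, so two different Hadoop versions can coexist in one scheduler JVM. The sketch below uses only the Java standard library; the class name ChildFirstClassLoader is hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical sketch of a child-first class loader over a set of jar URLs. */
final class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            // Never try to redefine core JDK classes; those always come from the parent.
            if (c == null && !name.startsWith("java.")) {
                try {
                    c = findClass(name); // look in our own jars first
                } catch (ClassNotFoundException e) {
                    // Not in our jars: fall through to the parent below.
                }
            }
            if (c == null) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

One loader would be created per cluster, pointed at that cluster's client jars, and the job-submission code would be invoked through classes obtained from that loader. Any types shared between the scheduler and the loaded code have to come from the parent loader, otherwise casts between them fail.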
Re: Multiple hadoop configurations
Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Deepika,
If both of your clusters run the same version of Hadoop, then, as
Konstantin suggested, you could use Oozie. Oozie does not rely on
local Hadoop configuration files to determine the JobTracker (JT) and
NameNode (NN) to use; you specify them in the Oozie workflow
application XML for each job.
Cheers.
Alejandro
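As a rough illustration of the point above: an Oozie workflow action names its JobTracker and NameNode in its job-tracker and name-node elements, and those are commonly written as parameters such as ${jobTracker} and ${nameNode} that are filled in from the properties supplied at submission time. The sketch below builds one such property set per cluster. The parameter names jobTracker and nameNode are a convention rather than a requirement, and the class name, hosts, ports, and paths are hypothetical.

```java
import java.util.Properties;

/** Hypothetical helper: builds the per-submission properties for one target cluster. */
final class OozieJobProperties {

    /**
     * @param jobTracker host:port of the target cluster's JobTracker
     * @param nameNode   hdfs:// URI of the target cluster's NameNode
     * @param appPath    HDFS path of the workflow application, starting with "/"
     */
    static Properties forCluster(String jobTracker, String nameNode, String appPath) {
        Properties p = new Properties();
        // Referenced from the workflow XML as ${jobTracker} and ${nameNode}.
        p.setProperty("jobTracker", jobTracker);
        p.setProperty("nameNode", nameNode);
        // Tells Oozie where the workflow application lives.
        p.setProperty("oozie.wf.application.path", nameNode + appPath);
        return p;
    }
}
```

The same workflow definition can then be submitted twice with two different property sets, one per cluster, without touching any local Hadoop configuration files on the submitting machine.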
Re: Multiple hadoop configurations
Posted by C J <c....@yahoo.com>.
Thanks Konstantin. My issue is more with supporting multiple clusters. I am
using Quartz for scheduling the jobs (currently there are 2 separate schedulers for
the 2 clusters).
If, while submitting a job, I can specify which cluster it should trigger
the job on (by giving a handle to the appropriate cluster), I think I will be
able to manage it.
Thanks,
Deepika
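The idea of passing a cluster handle along with each job can be sketched as a small dispatcher that a single scheduler calls when a trigger fires. This is an illustration under assumptions, not Quartz code: ClusterDispatcher and Submitter are hypothetical names, and Submitter stands in for whatever actually submits the job to a given cluster. In Quartz, the cluster name could plausibly be stored in the job's JobDataMap and read back when the job executes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: routes each scheduled job to the cluster named in its definition. */
final class ClusterDispatcher {

    /** Stand-in for the code that submits a job to one particular cluster. */
    interface Submitter {
        void submit(String jobName);
    }

    private final Map<String, Submitter> submitters = new HashMap<>();

    /** Registers the submitter to use for a cluster name. */
    void addCluster(String name, Submitter submitter) {
        submitters.put(name, submitter);
    }

    /** Called when a trigger fires; the cluster name travels with the job definition. */
    void fire(String cluster, String jobName) {
        Submitter s = submitters.get(cluster);
        if (s == null) {
            throw new IllegalArgumentException("unknown cluster: " + cluster);
        }
        s.submit(jobName);
    }
}
```

With this shape, one scheduler instance owns all triggers, and ordering between jobs on different clusters (for example, running a distcp only after a job on the source cluster has finished) can be expressed in one place.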
Re: Multiple hadoop configurations
Posted by Konstantin Boudnik <co...@apache.org>.
I believe the answer you are looking for is the Oozie coordinator, but I
am not sure which version of it supports multiple clusters.