Posted to user@hive.apache.org by Ronak Bhatt <ro...@gmail.com> on 2012/05/26 16:48:01 UTC

Job Scheduling in Hadoop-Hive

Hello -

For those of you running something close to a production setup, what do you
use for job scheduling and dependency management?

*thanks, ronak*

Re: Job Scheduling in Hadoop-Hive

Posted by Tim <ti...@gmail.com>.
We use Oozie for workflow pipelines. They are triggered manually at the moment, but we would use cron to schedule them. We use Maven for everything, including packaging the Oozie artifacts. It is all open source, so I can point you at it if you want to poke around.

Tim,
Sent from my iPhone (which makes terrible auto-correct spelling mistakes)

On 26 May 2012, at 16:48, Ronak Bhatt <ro...@gmail.com> wrote:

> Hello - 
> 
> For those users whose setup is somewhat production, what do you use for job scheduling and dependency management? 
> 
> thanks, ronak
> 
> 
> 

Re: Job Scheduling in Hadoop-Hive

Posted by Abhishek Pratap Singh <ma...@gmail.com>.
Try Oozie.

Regards,
Abhishek

On Sat, May 26, 2012 at 7:48 AM, Ronak Bhatt <ro...@gmail.com> wrote:

> Hello -
>
> For those users whose setup is somewhat production, what do you use for
> job scheduling and dependency management?
>
> *thanks, ronak*
>
>

RE: Job Scheduling in Hadoop-Hive

Posted by Ruben de Vries <ru...@hyves.nl>.
Hey, 

We use Hadoop/Hive for processing our access logs, and we run a daily cron job (a Python script) which does the parsing jobs, some partitioning, etc.
The results from those jobs are then queried by other jobs, which generate the results the management team wants to see :-)
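
As a rough illustration of the kind of daily cron-driven script described above, here is a minimal sketch that builds a `hive -e` command to register the previous day's partition. The table name `access_logs`, the partition column `dt`, and the `/logs/access/` path are assumptions made up for the example, not details of the setup described in this message.

```python
import datetime
import shlex


def build_hive_command(day, table="access_logs"):
    """Return the argv for a `hive -e` call that adds one daily partition.

    The table name, the `dt` partition column, and the HDFS location are
    hypothetical placeholders for illustration only.
    """
    partition = day.strftime("%Y-%m-%d")
    query = (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (dt='{partition}') "
        f"LOCATION '/logs/access/{partition}'"
    )
    return ["hive", "-e", query]


if __name__ == "__main__":
    # A daily cron entry would typically process the previous day's logs.
    yesterday = datetime.date.today() - datetime.timedelta(days=1)
    argv = build_hive_command(yesterday)
    # Print the command instead of running it; to execute it for real,
    # pass argv to subprocess.run(argv, check=True).
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without Hive installed; downstream reporting queries would follow the same pattern.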


From: Ronak Bhatt [mailto:ronakbaps@gmail.com] 
Sent: Saturday, May 26, 2012 4:48 PM
To: hive-user@hadoop.apache.org
Subject: Job Scheduling in Hadoop-Hive

Hello - 

For those users whose setup is somewhat production, what do you use for job scheduling and dependency management? 

thanks, ronak




Re: Job Scheduling in Hadoop-Hive

Posted by Florin Diaconeasa <fl...@gmail.com>.
Spring Batch with a basic tasklet for querying the Hive database should
help. :)

On 26.05.2012, at 17:48, Ronak Bhatt <ro...@gmail.com> wrote:

Hello -

For those users whose setup is somewhat production, what do you use for job
scheduling and dependency management?

*thanks, ronak*

Re: Job Scheduling in Hadoop-Hive

Posted by Pedro Figueiredo <pf...@89clouds.com>.
On 26 May 2012, at 18:52, Senthilvel Rangaswamy wrote:

> On Sat, May 26, 2012 at 7:48 AM, Ronak Bhatt <ro...@gmail.com> wrote:
> Hello - 
> 
> For those users whose setup is somewhat production, what do you use for job scheduling and dependency management? 
> 
> 
> 
> Oozie has both job scheduling and dependency management.   You can also use something like rundeck
> if you want to do scheduling outside of Oozie. I would avoid cron at all costs.

What are people's experiences with Oozie running against EMR clusters? This (scheduling) is definitely *the* pain point I've seen the most, but I've never tried Oozie with EMR (although I can't think of a reason it wouldn't work, once the cluster is up).

Cheers,

Pedro

Pedro Figueiredo
Skype: pfig.89clouds
http://89clouds.com/ - Big Data Consulting





Re: Job Scheduling in Hadoop-Hive

Posted by Senthilvel Rangaswamy <se...@gmail.com>.
On Sat, May 26, 2012 at 7:48 AM, Ronak Bhatt <ro...@gmail.com> wrote:

> Hello -
>
> For those users whose setup is somewhat production, what do you use for
> job scheduling and dependency management?
>
>

Oozie handles both job scheduling and dependency management. You can also use
something like Rundeck if you want to do scheduling outside of Oozie. I would
avoid cron at all costs.
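
To make "dependency management" concrete, here is a small sketch of the underlying idea: each job declares the jobs it depends on, and jobs are run in an order that respects those declarations. This is not Oozie's or Rundeck's API; it uses Python's standard `graphlib` module (Python 3.9+), and the job names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job names; each job maps to the set of jobs it depends on.
JOBS = {
    "parse_logs": set(),
    "partition_logs": {"parse_logs"},
    "daily_report": {"partition_logs"},
}


def run_order(jobs):
    """Return job names ordered so every job follows its dependencies."""
    return list(TopologicalSorter(jobs).static_order())


if __name__ == "__main__":
    for name in run_order(JOBS):
        # A real scheduler would launch the job here and wait for success
        # before moving on to the jobs that depend on it.
        print("would run:", name)
```

Plain cron has no notion of this ordering, so a downstream job can start before its inputs are ready, which is the main risk a workflow tool is meant to remove.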

-- 
..Senthil

"If there's anything more important than my ego around, I want it
 caught and shot now."
                                                    - Douglas Adams.

Re: Job Scheduling in Hadoop-Hive

Posted by "Hambleton, Jordan" <Jo...@netapp.com>.
Hi Ronak,

Pentaho has been our scheduling/workflow management tool of choice. Previously we used Cloudera's bundled Oozie in CDH3u2.

Jordan Hambleton
ASUP Data Processing & Services [CFT]

From: Ronak Bhatt <ro...@gmail.com>
Reply-To: "user@hive.apache.org" <us...@hive.apache.org>
Date: Saturday, May 26, 2012 7:48 AM
To: "hive-user@hadoop.apache.org" <hi...@hadoop.apache.org>
Subject: Job Scheduling in Hadoop-Hive

Hello -

For those users whose setup is somewhat production, what do you use for job scheduling and dependency management?

thanks, ronak