Posted to user@hive.apache.org by Ronak Bhatt <ro...@gmail.com> on 2012/05/26 16:48:01 UTC
Job Scheduling in Hadoop-Hive
Hello -
For those users whose setup is somewhat production, what do you use for job
scheduling and dependency management?
*thanks, ronak*
Re: Job Scheduling in Hadoop-Hive
Posted by Tim <ti...@gmail.com>.
Oozie for workflow pipelines, triggered manually for now, though we would use cron for that. Maven for everything, including packaging the Oozie stuff. It's all open source, so I can point you at it if you want to poke around.
Tim,
Sent from my iPhone (which makes terrible auto-correct spelling mistakes)
On 26 May 2012, at 16:48, Ronak Bhatt <ro...@gmail.com> wrote:
> Hello -
>
> For those users whose setup is somewhat production, what do you use for job scheduling and dependency management?
>
> thanks, ronak
>
>
>
Re: Job Scheduling in Hadoop-Hive
Posted by Abhishek Pratap Singh <ma...@gmail.com>.
Try Oozie.
Regards,
Abhishek
On Sat, May 26, 2012 at 7:48 AM, Ronak Bhatt <ro...@gmail.com> wrote:
> Hello -
>
> For those users whose setup is somewhat production, what do you use for
> job scheduling and dependency management?
>
> *thanks, ronak*
>
>
RE: Job Scheduling in Hadoop-Hive
Posted by Ruben de Vries <ru...@hyves.nl>.
Hey,
We use Hadoop/Hive to process our access logs, and we run a daily cron job (a Python script) that does the parsing jobs, some partitioning, etc.
The results of those jobs are then queried by other jobs, which generate the results the management team wants to see :-)
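A minimal sketch of what a daily driver script of this kind could look like, assuming the steps are plain HiveQL statements run through the Hive CLI (`hive -e`). The step names, table names, and queries below are hypothetical and only illustrate the idea; this is not the actual script described above. Dependencies are resolved with Python's standard `graphlib` module, so each step runs only after the steps it depends on.

```python
import subprocess
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_hive(query):
    """Run one HiveQL statement through the Hive CLI; raises on a non-zero exit."""
    subprocess.run(["hive", "-e", query], check=True)


def run_pipeline(queries, depends_on, runner=run_hive):
    """Run every step after the steps it depends on; stop at the first failure.

    queries:    step name -> HiveQL string
    depends_on: step name -> set of step names that must finish first
    Returns the step names in the order they were run.
    """
    graph = {name: depends_on.get(name, set()) for name in queries}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        # An exception from the runner propagates, so later steps are skipped.
        runner(queries[name])
    return order


# Hypothetical steps and tables, for illustration only.
QUERIES = {
    "add_partition": (
        "ALTER TABLE raw_access_logs "
        "ADD IF NOT EXISTS PARTITION (dt='2012-05-26')"
    ),
    "parse_logs": (
        "INSERT OVERWRITE TABLE parsed_logs PARTITION (dt='2012-05-26') "
        "SELECT ip, url FROM raw_access_logs WHERE dt='2012-05-26'"
    ),
    "daily_report": (
        "INSERT OVERWRITE TABLE daily_report PARTITION (dt='2012-05-26') "
        "SELECT url, COUNT(*) FROM parsed_logs "
        "WHERE dt='2012-05-26' GROUP BY url"
    ),
}

DEPENDS_ON = {
    "parse_logs": {"add_partition"},
    "daily_report": {"parse_logs"},
}
```

A script that cron starts once a day would call `run_pipeline(QUERIES, DEPENDS_ON)`. Because `runner` is a parameter, the ordering logic can be tried out without a cluster by passing a stand-in function instead of `run_hive`.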
From: Ronak Bhatt [mailto:ronakbaps@gmail.com]
Sent: Saturday, May 26, 2012 4:48 PM
To: hive-user@hadoop.apache.org
Subject: Job Scheduling in Hadoop-Hive
Hello -
For those users whose setup is somewhat production, what do you use for job scheduling and dependency management?
thanks, ronak
Re: Job Scheduling in Hadoop-Hive
Posted by Florin Diaconeasa <fl...@gmail.com>.
Spring Batch with a basic tasklet for querying Hive should be of
help. :)
On 26.05.2012, at 17:48, Ronak Bhatt <ro...@gmail.com> wrote:
Hello -
For those users whose setup is somewhat production, what do you use for job
scheduling and dependency management?
*thanks, ronak*
Re: Job Scheduling in Hadoop-Hive
Posted by Pedro Figueiredo <pf...@89clouds.com>.
On 26 May 2012, at 18:52, Senthilvel Rangaswamy wrote:
> On Sat, May 26, 2012 at 7:48 AM, Ronak Bhatt <ro...@gmail.com> wrote:
> Hello -
>
> For those users whose setup is somewhat production, what do you use for job scheduling and dependency management?
>
>
>
> Oozie has both job scheduling and dependency management. You can also use something like Rundeck
> if you want to do scheduling outside of Oozie. I would avoid cron at all costs.
What are people's experiences with Oozie running against EMR clusters? This (scheduling) is definitely *the* pain point I've seen the most, but I've never tried Oozie with EMR (although I can't think of a reason it wouldn't work, once the cluster is up).
Cheers,
Pedro
Pedro Figueiredo
Skype: pfig.89clouds
http://89clouds.com/ - Big Data Consulting
Re: Job Scheduling in Hadoop-Hive
Posted by Senthilvel Rangaswamy <se...@gmail.com>.
On Sat, May 26, 2012 at 7:48 AM, Ronak Bhatt <ro...@gmail.com> wrote:
> Hello -
>
> For those users whose setup is somewhat production, what do you use for
> job scheduling and dependency management?
>
>
Oozie has both job scheduling and dependency management. You can also use
something like Rundeck if you want to do scheduling outside of Oozie. I would
avoid cron at all costs.
--
..Senthil
"If there's anything more important than my ego around, I want it
caught and shot now."
- Douglas Adams.
Re: Job Scheduling in Hadoop-Hive
Posted by "Hambleton, Jordan" <Jo...@netapp.com>.
Hi Ronak,
Pentaho has been our scheduling/workflow management tool of choice. Previously we used Cloudera's bundled Oozie in CDH3u2.
Jordan Hambleton
ASUP Data Processing & Services [CFT]
From: Ronak Bhatt <ro...@gmail.com>
Reply-To: "user@hive.apache.org" <us...@hive.apache.org>
Date: Saturday, May 26, 2012 7:48 AM
To: "hive-user@hadoop.apache.org" <hi...@hadoop.apache.org>
Subject: Job Scheduling in Hadoop-Hive
Hello -
For those users whose setup is somewhat production, what do you use for job scheduling and dependency management?
thanks, ronak