Posted to user@oozie.apache.org by Gautam Singaraju <ga...@gmail.com> on 2012/03/23 19:33:06 UTC

AWS and Hive support questions

We are building a new data platform and are considering Oozie for our
workflow system. Would it be possible to share some thoughts on using Oozie
with Hive and running Oozie on AWS?
---
Gautam

Re: AWS and Hive support questions

Posted by Gautam Singaraju <ga...@gmail.com>.
Hi Samir,

Thanks for the heads-up. We did notice that the Cloudera version has
integration with Hive. I will let you know how it goes!

---
Gautam


On Mon, Mar 26, 2012 at 5:03 AM, Samir Eljazovic
<sa...@gmail.com> wrote:

> Hi Gautam
>
> Last week I tried to get Oozie up and running on EC2 as a workflow
> engine for an Elastic MapReduce cluster. My primary goal was to set up
> Oozie to run Hive scripts on the EMR cluster. Unfortunately, I wasn't
> able to make it work.
>
> The version of Oozie that supports running Hive actions is from Cloudera's
> CDH3 distribution. See this link:
> http://archive.cloudera.com/cdh/3/oozie/DG_HiveActionExtension.html
> According to the CDH version and packaging info, the version is
> oozie-2.3.2+27.12. The Hadoop version running on the EMR cluster is
> 0.20.205, while CDH3U3 ships with hadoop-0.20.2+923.197.
>
> In the CDH3U3 package, Oozie comes with Hadoop 0.20.2 jars. I tried to
> manually replace the Hadoop jars with those from EMR (0.20.205), but found
> that other dependencies were missing. I did not try to resolve these
> dependencies, as this setup would most likely be unstable.
>
> On the other hand, I did manage to set up the latest stable version of
> Oozie, 3.1.3 <http://incubator.apache.org/oozie/>, and successfully submit
> MapReduce jobs to EMR. The problem with this Apache version of Oozie is
> that it does not support Hive actions.
>
> I believe these are the options that are currently available:
> - wait for the CDH4 Beta 2 release, which should bring Oozie back to CDH
> - wait for Oozie 3.2.0, which supports Hive actions
>   (OOZIE-68 <https://issues.apache.org/jira/browse/OOZIE-68>)
> - manually patch Oozie from trunk with the fix for OOZIE-68 (if possible
>   at all)
>
> I hope this helps.
>
> Regards,
> Samir
>
>
> On 23 March 2012 19:33, Gautam Singaraju <ga...@gmail.com>
> wrote:
>
> > We are building a new data platform and are considering Oozie for our
> > workflow system. Would it be possible to share some thoughts on using
> > Oozie with Hive and running Oozie on AWS?
> > ---
> > Gautam
> >
>

Re: AWS and Hive support questions

Posted by Samir Eljazovic <sa...@gmail.com>.
Hi Gautam

Last week I tried to get Oozie up and running on EC2 as a workflow
engine for an Elastic MapReduce cluster. My primary goal was to set up
Oozie to run Hive scripts on the EMR cluster. Unfortunately, I wasn't
able to make it work.

The version of Oozie that supports running Hive actions is from Cloudera's
CDH3 distribution. See this link:
http://archive.cloudera.com/cdh/3/oozie/DG_HiveActionExtension.html
According to the CDH version and packaging info, the version is
oozie-2.3.2+27.12. The Hadoop version running on the EMR cluster is
0.20.205, while CDH3U3 ships with hadoop-0.20.2+923.197.

In the CDH3U3 package, Oozie comes with Hadoop 0.20.2 jars. I tried to
manually replace the Hadoop jars with those from EMR (0.20.205), but found
that other dependencies were missing. I did not try to resolve these
dependencies, as this setup would most likely be unstable.
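
For anyone attempting the same jar swap, a rough way to see what differs
between the two jar sets is to compare file names. This is only a minimal
sketch, not something from the setup described above: the function names and
the example file names are hypothetical, and it assumes jars follow the
usual <artifact>-<version>.jar naming.

```python
import re

# Matches "<artifact>-<version>.jar", where the version starts with a digit.
JAR_PATTERN = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.+-]*)\.jar$")


def parse_jar(filename):
    """Split a jar file name into (artifact, version), or return None."""
    match = JAR_PATTERN.match(filename)
    if match is None:
        return None
    return match.group("name"), match.group("version")


def compare_jar_sets(bundled, cluster):
    """Return (mismatched versions, artifacts only bundled, artifacts only on cluster)."""
    # Names that do not fit the naming pattern are skipped.
    bundled_versions = dict(filter(None, map(parse_jar, bundled)))
    cluster_versions = dict(filter(None, map(parse_jar, cluster)))
    mismatched = {
        name: (bundled_versions[name], cluster_versions[name])
        for name in bundled_versions.keys() & cluster_versions.keys()
        if bundled_versions[name] != cluster_versions[name]
    }
    only_bundled = sorted(bundled_versions.keys() - cluster_versions.keys())
    only_cluster = sorted(cluster_versions.keys() - bundled_versions.keys())
    return mismatched, only_bundled, only_cluster


# Hypothetical file names, for illustration only.
oozie_bundled = ["hadoop-core-0.20.2.jar", "commons-cli-1.2.jar"]
emr_cluster = [
    "hadoop-core-0.20.205.jar",
    "commons-cli-1.2.jar",
    "commons-configuration-1.6.jar",
]
mismatched, only_bundled, only_cluster = compare_jar_sets(oozie_bundled, emr_cluster)
```

Artifacts that show up only on the cluster side are candidates for the
missing dependencies; a name comparison like this says nothing about whether
the resulting setup would actually be stable.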

On the other hand, I did manage to set up the latest stable version of
Oozie, 3.1.3 <http://incubator.apache.org/oozie/>, and successfully submit
MapReduce jobs to EMR. The problem with this Apache version of Oozie is
that it does not support Hive actions.

I believe these are the options that are currently available:
- wait for the CDH4 Beta 2 release, which should bring Oozie back to CDH
- wait for Oozie 3.2.0, which supports Hive actions
  (OOZIE-68 <https://issues.apache.org/jira/browse/OOZIE-68>)
- manually patch Oozie from trunk with the fix for OOZIE-68 (if possible
  at all)

I hope this helps.

Regards,
Samir


On 23 March 2012 19:33, Gautam Singaraju <ga...@gmail.com> wrote:

> We are building a new data platform and are considering Oozie for our
> workflow system. Would it be possible to share some thoughts on using Oozie
> with Hive and running Oozie on AWS?
> ---
> Gautam
>