Posted to user@hive.apache.org by James Yu <jy...@gmail.com> on 2014/09/03 00:06:03 UTC

can I run cdh5, with Apache Hive 0.13.1 ? I would like to try out the Hive-on-Tez

Hi everyone,

I have a question regarding CDH5 and Hive 0.13.

It looks like there is no plan to include Hive-on-Tez in CDH, but I like CDH
in general. So my question is: can I run Apache Hive 0.13 on top of CDH5
(HDFS/YARN)? Has anyone tried this before?

I'm not sure whether Apache Hive 0.13 depends on anything in HDFS/YARN that
is not included in CDH5.

Thanks,
James

Re: can I run cdh5, with Apache Hive 0.13.1 ? I would like to try out the Hive-on-Tez

Posted by James Yu <jy...@gmail.com>.
Update:

I have Tez running on CDH (I have only run some very simple tests so far).

HDFS: CDH 5.1.2
YARN: CDH 5.1.2

Hive: Apache 0.13.1
Tez: Apache 0.4.1
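
For anyone wanting to repeat a simple test of this kind: Hive 0.13 selects
its execution engine through the hive.execution.engine property, which can be
passed per invocation with --hiveconf. The sketch below only builds the
command line and does not run anything. The query and the table name
(sample_table) are hypothetical, and it assumes the Tez libraries are already
installed and configured on the cluster, which this thread does not cover.

```python
import shlex


def hive_smoke_test_command(query, engine="tez", hive_bin="hive"):
    """Build the argv for a one-off Hive CLI query on the given engine."""
    # Hive 0.13 accepts "mr" (MapReduce) or "tez" for this property.
    if engine not in ("mr", "tez"):
        raise ValueError("engine must be 'mr' or 'tez'")
    return [hive_bin, "--hiveconf", f"hive.execution.engine={engine}", "-e", query]


if __name__ == "__main__":
    # sample_table is a placeholder; substitute a table that exists.
    argv = hive_smoke_test_command("SELECT COUNT(*) FROM sample_table")
    # Print a shell-safe version of the command for copy and paste.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Running the same query once with engine="mr" and once with engine="tez" is a
cheap way to confirm that both paths work on the cluster.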






Re: can I run cdh5, with Apache Hive 0.13.1 ? I would like to try out the Hive-on-Tez

Posted by James Yu <jy...@gmail.com>.
Thanks Edward and Andrew!

From my experience, YARN/MR2 is faster (2-5x) than the old MR1, so I'm going
to check whether Hive 0.13 works with CDH5 on YARN. This will be a new
installation, so I won't need to worry about a metastore upgrade.

I will post further results when I get there.





Re: can I run cdh5, with Apache Hive 0.13.1 ? I would like to try out the Hive-on-Tez

Posted by Edward Capriolo <ed...@gmail.com>.
Speaking of which, has anyone ever compared MRv1 and YARN side by side? My
feeling with YARN is that there is probably an extra 15-30 second delay
compared with the old days, when you just had a JobTracker. I am considering
keeping specialty MRv1 nodes for that reason.

Or is the perceived start lag just in my imagination?



Re: can I run cdh5, with Apache Hive 0.13.1 ? I would like to try out the Hive-on-Tez

Posted by Andrew Mains <an...@kontagent.com>.
We've been running Hive 0.13 over CDH5 with MRv1. Things have worked
pretty much out of the box thus far; we haven't run into any
HDFS/MapReduce compatibility issues, for instance. The caveats Edward
mentioned are definitely worth noting: we haven't tried running 0.12 and
0.13 concurrently, and I imagine there might be issues with such a setup
for the reasons he described.

So, anecdotally, the answer is yes, at least for Hive 0.13 against CDH5
with MRv1 :)

Andrew



Re: can I run cdh5, with Apache Hive 0.13.1 ? I would like to try out the Hive-on-Tez

Posted by Edward Capriolo <ed...@gmail.com>.
It can work, but there are things to watch out for:

Hive requires a metastore, usually backed by MySQL. Some versions of Hive
make no changes to the metastore schema.

For example, between versions 0.7 and 0.8 Hive added views. When you launch
the Hive CLI, by default Hive notices that the schema is missing and attempts
to add it.

BUT Hive can alter the metastore (there are variables to disable this) in
ways that are not backward compatible. So if you were to, say, upgrade to
0.13.1, it MIGHT make a change that causes 0.12.x (which comes with CDH) to
stop working properly.

You can have two Hive installs and two metastores launching jobs on the
same Hadoop cluster, which is not a huge hassle to set up, but in my case I
will just wait.
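
A minimal sketch of the two-install, two-metastore setup described above:
each Hive install gets its own hive-site.xml pointing at its own metastore
database, with automatic schema changes switched off. The message does not
name the variables it refers to; hive.metastore.schema.verification,
datanucleus.autoCreateSchema and datanucleus.fixedDatastore are an assumption
about which ones are meant. The database host and database names are
hypothetical.

```python
import xml.etree.ElementTree as ET


def metastore_props(db_name, db_host="metastore-db.example.com"):
    """Properties for one Hive install with its own metastore database."""
    return {
        # Hypothetical host and database; one database per Hive version.
        "javax.jdo.option.ConnectionURL": f"jdbc:mysql://{db_host}/{db_name}",
        # Assumed guards against implicit schema changes at startup.
        "hive.metastore.schema.verification": "true",
        "datanucleus.autoCreateSchema": "false",
        "datanucleus.fixedDatastore": "true",
    }


def render_hive_site(props):
    """Render a property dict as a hive-site.xml style document string."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # One config for the Hive shipped with CDH, one for Apache Hive 0.13.
    for db_name in ("hive12_metastore", "hive13_metastore"):
        print(render_hive_site(metastore_props(db_name)))
```

Keeping the databases separate means that a schema change made for 0.13 can
never touch the metastore that the CDH-supplied Hive reads.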







