Posted to user@hive.apache.org by 彭鱼宴 <46...@qq.com> on 2018/07/19 13:34:45 UTC

Re: Does Hive 3.0 only works with hadoop3.x.y?

Hi Sungwoo,


Just to confirm: does that mean I only need to upgrade Hive, without upgrading Hadoop?


Thanks!


Best,
Zhefu Peng




------------------ Original Message ------------------
From: "Sungwoo Park"<gl...@gmail.com>;
Sent: Thursday, July 19, 2018, 8:20 PM
To: "user"<us...@hive.apache.org>;

Subject: Re: Does Hive 3.0 only works with hadoop3.x.y?



Hive 3.0 makes a few function calls that depend on Hadoop 3.x, but they are easy to replace with code that compiles on Hadoop 2.8+. I am currently running Hive 3.0 on Hadoop 2.7.6 and HDP 2.6.4 to test with the TPC-DS benchmark, and have not encountered any compatibility issues yet. I previously posted a diff file that lets us compile Hive 3.0 on Hadoop 2.8+.

http://mail-archives.apache.org/mod_mbox/hive-user/201806.mbox/%3CCAKHFPXDDFn52buKetHzSXTtjzX3UMHf%3DQvxm9QNNkv9r5xBs-Q%40mail.gmail.com%3E 

--- Sungwoo Park





On Thu, Jul 19, 2018 at 8:21 PM, 彭鱼宴 <46...@qq.com> wrote:
Hi,


I have already deployed Hive 2.2.0 on our Hadoop cluster. Recently we also deployed a Spark 2.3.0 cluster, aiming to use the Hive on Spark engine. However, when I checked the Hive releases page, I found the text below:

21 May 2018 : release 3.0.0 available
<https://hive.apache.org/downloads.html#21-may-2018-release-300-available>

This release works with Hadoop 3.x.y.

The Hadoop version we have deployed is 2.7.6. Does Hive 3.0 work only with Hadoop 3.x.y? In other words, if we want to use Hive 3.0, do we have to upgrade Hadoop to 3.x.y?

Looking forward to your reply and help.

Best,

Zhefu Peng

Re: Re: Does Hive 3.0 only works with hadoop3.x.y?

Posted by Jörn Franke <jo...@gmail.com>.
Hadoop 3.0 brings some interesting benefits anyway, such as reduced storage needs (with erasure coding you no longer need to replicate data three times for reliability), so that may be a convincing reason to upgrade.
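
To put rough numbers on the storage point: the saving presumably comes from HDFS erasure coding, introduced in Hadoop 3. The sketch below is only an illustration, assuming a Reed-Solomon layout with 6 data units and 3 parity units (the thread does not name a policy) and ignoring block padding and metadata. The function names are made up for this example.

```python
def raw_bytes_replicated(logical_bytes, replicas=3):
    """Raw disk usage when every block is stored `replicas` times."""
    return logical_bytes * replicas


def raw_bytes_erasure_coded(logical_bytes, data_units=6, parity_units=3):
    """Raw disk usage when each group of `data_units` data cells
    is protected by `parity_units` parity cells (assumed layout)."""
    return logical_bytes * (data_units + parity_units) / data_units


if __name__ == "__main__":
    logical_tb = 100  # hypothetical data set size in TB
    print("3x replication:", raw_bytes_replicated(logical_tb), "TB raw")
    print("6+3 erasure coding:", raw_bytes_erasure_coded(logical_tb), "TB raw")
```

Under these assumptions, 100 TB of data needs about 300 TB of raw disk with 3x replication and about 150 TB with the 6+3 layout.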

> On 22. Jul 2018, at 08:28, 彭鱼宴 <46...@qq.com> wrote:
> 
> Hi Tanvi,
> 
> Thanks! I will check that and talk with my colleagues about upgrading.
> 
> Best,
> Zhefu Peng

Re: Does Hive 3.0 only works with hadoop3.x.y?

Posted by 彭鱼宴 <46...@qq.com>.
Hi Tanvi,


Thanks! I will check that and talk with my colleagues about upgrading.


Best,
Zhefu Peng




------------------ Original Message ------------------
From: "Tanvi Thacker"<ta...@gmail.com>;
Sent: Saturday, July 21, 2018, 3:24 PM
To: "user"<us...@hive.apache.org>;

Subject: Re: Does Hive 3.0 only works with hadoop3.x.y?



I would recommend upgrading to Hadoop 3.0 or 3.1 for the following reasons:

   - Hadoop 2.x may transitively bring in dependencies that conflict with libraries used by Hive (for example, unpredictable libraries such as Google Guava), which would affect your runtime environment.
   - Hive may use some of the new public APIs exposed in the Hadoop 3.x line, so with Hadoop 2.x you may see ClassNotFoundException or NoSuchMethodError at runtime if your query exercises such a code path.

In production, you must use the same dependencies against which Hive was compiled and tested.

https://github.com/apache/hive/blob/rel/release-3.0.0/pom.xml#L149



Thanks,
Tanvi Thacker




On Thu, Jul 19, 2018 at 8:15 PM, Sungwoo Park <gl...@gmail.com> wrote:
I would say yes (because I am actually running Hive 3.0 on Hadoop 2.7.6 and HDP 2.7.5), provided that you make small changes to the source code of Hive 3.0. However, I have not tested Hive 3.0 on Spark.

--- Sungwoo 



Re: Does Hive 3.0 only works with hadoop3.x.y?

Posted by Tanvi Thacker <ta...@gmail.com>.
I would recommend upgrading to Hadoop 3.0 or 3.1 for the following
reasons:


   - Hadoop 2.x may transitively bring in dependencies that conflict with
   libraries used by Hive (for example, unpredictable libraries such as
   Google Guava), which would affect your runtime environment.
   - Hive may use some of the new public APIs exposed in the Hadoop 3.x
   line, so with Hadoop 2.x you may see ClassNotFoundException or
   NoSuchMethodError at runtime if your query exercises such a code
   path.

In production, you must use the same dependencies against which Hive was
compiled and tested.
https://github.com/apache/hive/blob/rel/release-3.0.0/pom.xml#L149
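
A quick sanity check along these lines is to compare the Hadoop version on the cluster with the one Hive was built against. The sketch below is an illustration under stated assumptions rather than anything from Hive itself: the function names are hypothetical, version strings are assumed to contain a major.minor.patch triple, and the build version passed in is a placeholder to be read from the linked pom.xml. Matching major versions does not prove compatibility; a mismatch is simply a warning sign.

```python
import re


def parse_version(text):
    """Return the first major.minor.patch triple found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no major.minor.patch version found in %r" % (text,))
    return tuple(int(part) for part in match.groups())


def same_major_line(runtime_version, built_against):
    """True if both version strings share the same major version number."""
    return parse_version(runtime_version)[0] == parse_version(built_against)[0]


if __name__ == "__main__":
    # "Hadoop 2.7.6" is the cluster version mentioned in the thread.
    # "3.0.0" is a placeholder for the version Hive was built against.
    if not same_major_line("Hadoop 2.7.6", "3.0.0"):
        print("Warning: cluster Hadoop and Hive's build-time Hadoop differ in major version")
```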

Thanks,
Tanvi Thacker



Re: Does Hive 3.0 only works with hadoop3.x.y?

Posted by Sungwoo Park <gl...@gmail.com>.
I would say yes (because I am actually running Hive 3.0 on Hadoop 2.7.6 and
HDP 2.7.5), provided that you make small changes to the source code of Hive
3.0. However, I have not tested Hive 3.0 on Spark.

--- Sungwoo
