
Posted to user@hive.apache.org by Mich Talebzadeh <mi...@gmail.com> on 2016/03/08 18:48:00 UTC

Hive 2 and versions of Spark as the execution engine for Hive

Hi,

As I recall, Hive 2 now officially recommends using Spark or Tez as the
execution engine instead of MapReduce (MR):

hive> set hive.execution.engine=mr;
Hive-on-MR is deprecated in Hive 2 and may not be available in the future
versions. Consider using a different execution engine (i.e. tez, spark) or
using Hive 1.X releases.
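For comparison, switching a session to Spark uses the same property. This is
only a minimal sketch, assuming Hive on Spark is already installed and
configured on the cluster. The spark.master value and the table name are
illustrative placeholders, not taken from any real setup:

```sql
-- Select the execution engine for this session (mr, tez or spark)
set hive.execution.engine=spark;

-- Placeholder: point this at your own Spark master URL or YARN mode
set spark.master=yarn-client;

-- Print the current value to confirm the switch
set hive.execution.engine;

-- Hypothetical table name, used only to trigger a job on the new engine
select count(*) from some_large_table;
```

These settings apply to the current session only. To make them permanent they
would normally go into hive-site.xml instead.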

I still run Hive 2 on Spark 1.3.1 (I know some report that it works with
1.4.1, but that is still an older version of Spark), and even with
Spark 1.3.1 as the engine, stability and performance have improved.

As a user I can confirm that on larger tables, Hive 1.2.1 queries (even
simple ones like COUNT(*) or INSERT/SELECT) used to crash, and I had to
switch to MR as the execution engine. On Hive 2.0 it all works fine.

However, I guess the 60K question is what has been done to make Hive 2 work
with newer versions of Spark like 1.5.2 and 1.6, given that Hive 2
encourages using Hive on Spark and/or Tez.

Thanks,

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com