Posted to issues@spark.apache.org by "William Handy (JIRA)" <ji...@apache.org> on 2017/05/18 19:23:04 UTC
[jira] [Commented] (SPARK-19076) Upgrade Hive dependence to Hive 2.x
[ https://issues.apache.org/jira/browse/SPARK-19076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016309#comment-16016309 ]
William Handy commented on SPARK-19076:
---------------------------------------
It seems this was judged too difficult, but I wanted to point out that Hive 2.1 supports multithreaded writes through the settings hive.mv.files.thread and hive.metastore.fshandler.threads. If you happen to be using Spark on S3, these settings would be a significant performance boost.
There are several articles about using these settings in the context of "Hive on Spark", whereas I want to see them in "Hive _in_ Spark" instead :-/
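For reference, a minimal sketch of how the two properties named above might be written into a hive-site.xml for a Hive 2.1 deployment. The thread counts are placeholder values rather than tuned recommendations, and render_hive_site is a hypothetical helper introduced only for illustration. Nothing here makes the settings available to the Hive 1.2.1 that Spark currently builds against.

```python
import xml.etree.ElementTree as ET

# Property names are the ones mentioned in the comment above; the values
# are placeholders chosen for illustration, not tuned recommendations.
HIVE_THREAD_SETTINGS = {
    "hive.mv.files.thread": 20,
    "hive.metastore.fshandler.threads": 20,
}


def render_hive_site(settings):
    """Render a dict of settings as a hive-site.xml style <configuration> string."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_hive_site(HIVE_THREAD_SETTINGS))
```

Whether a given thread count helps depends on the object store and workload, so any value should be measured rather than assumed.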
> Upgrade Hive dependence to Hive 2.x
> -----------------------------------
>
> Key: SPARK-19076
> URL: https://issues.apache.org/jira/browse/SPARK-19076
> Project: Spark
> Issue Type: Improvement
> Reporter: Dapeng Sun
>
> Currently, upstream Spark depends on Hive 1.2.1 to build its package. Hive 2.0 was released in February 2016, and Hive 2.0.1 and 2.1.0 have also been out for a long time, so on the Spark side it would be better to support Hive 2.0 and above.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)