Posted to dev@hive.apache.org by "Chengxiang Li (JIRA)" <ji...@apache.org> on 2014/07/11 07:29:04 UTC

[jira] [Commented] (HIVE-7371) Identify a minimum set of JARs needed to ship to Spark cluster [Spark Branch]

    [ https://issues.apache.org/jira/browse/HIVE-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058406#comment-14058406 ] 

Chengxiang Li commented on HIVE-7371:
-------------------------------------

Similar to the MR and Tez engine implementations, four kinds of library dependencies should be shipped to the Spark cluster:
# the hive-exec JAR; according to the hive-exec module's build file, it is a fat JAR containing a minimal set of dependencies for Hive execution.
# auxiliary JARs defined by the user through 'hive.aux.jars.path'.
# added JARs; users can add JARs in the Hive CLI, and Hive should ship these JARs to the Spark cluster as well.
# plugin module dependencies, added on demand. For example, HBase dependencies are not shipped to the Spark cluster by default, but if the data source is stored in HBase and HBaseStorageHandler is used, Hive should ship the HBase-related JARs to the Spark cluster.
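The four categories above could be assembled into one ordered, de-duplicated set before shipping. A minimal sketch in Java of that assembly step (the class and parameter names here are hypothetical illustrations, not actual Hive APIs):

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class SparkJarSet {
    // Collect the minimal set of JAR paths to ship, in the order of the
    // four categories described above. LinkedHashSet preserves insertion
    // order and drops duplicates (e.g. a user re-adding an aux JAR).
    public static Set<String> collectJars(String hiveExecJar,
                                          Set<String> auxJars,      // 2. hive.aux.jars.path
                                          Set<String> addedJars,    // 3. ADD JAR in the CLI
                                          Set<String> pluginJars) { // 4. on-demand plugins (e.g. HBase)
        Set<String> jars = new LinkedHashSet<>();
        jars.add(hiveExecJar);   // 1. hive-exec fat JAR, always shipped
        jars.addAll(auxJars);
        jars.addAll(addedJars);
        jars.addAll(pluginJars); // empty unless the query actually needs the plugin
        return jars;
    }
}
```

The plugin set would be populated only when query planning detects the relevant storage handler, so a plain query ships nothing beyond categories 1-3.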

> Identify a minimum set of JARs needed to ship to Spark cluster [Spark Branch]
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-7371
>                 URL: https://issues.apache.org/jira/browse/HIVE-7371
>             Project: Hive
>          Issue Type: Task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Chengxiang Li
>
> Currently, the Spark client ships all Hive JARs, including those that Hive depends on, to the Spark cluster when a query is executed by Spark. This is inefficient and can cause library conflicts. Ideally, only a minimal set of JARs should be shipped. This task is to identify such a set.
> We should learn from the current MR cluster, for which I assume only the hive-exec JAR is shipped.
> We also need to ensure that user-supplied JARs are shipped to the Spark cluster as well, in a similar fashion to MR.
> NO PRECOMMIT TESTS. This is for spark-branch only.



--
This message was sent by Atlassian JIRA
(v6.2#6252)