Posted to dev@hive.apache.org by "Chengxiang Li (JIRA)" <ji...@apache.org> on 2014/11/18 07:48:34 UTC

[jira] [Updated] (HIVE-8835) identify dependency scope for Remote Spark Context.[Spark Branch]

     [ https://issues.apache.org/jira/browse/HIVE-8835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengxiang Li updated HIVE-8835:
--------------------------------
    Attachment: HIVE-8835.1-spark.patch

Remote Spark Context needs at least 3 dependencies on its classpath to run a query:
# spark assembly jar
# hive exec bundle jar
# spark-client jar

The Hive ql module depends on the spark-client module, so we should add spark-client into the hive exec bundle jar.
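
A minimal sketch of what that inclusion could look like in the hive-exec pom, assuming the bundle jar is built with the maven-shade-plugin; the org.apache.hive:spark-client coordinates are an assumption, not taken from the attached patch:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <artifactSet>
              <includes>
                <!-- ... existing includes for the hive exec bundle jar ... -->
                <!-- assumed coordinates for the spark-client module -->
                <include>org.apache.hive:spark-client</include>
              </includes>
            </artifactSet>
          </configuration>
        </execution>
      </executions>
    </plugin>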

> identify dependency scope for Remote Spark Context.[Spark Branch]
> -----------------------------------------------------------------
>
>                 Key: HIVE-8835
>                 URL: https://issues.apache.org/jira/browse/HIVE-8835
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>              Labels: Spark-M3
>         Attachments: HIVE-8835.1-spark.patch
>
>
> While submitting a job through the Remote Spark Context, Spark RDD graph generation and job submission are executed on the remote side, so we have to add Hive-related dependencies to its classpath via spark.driver.extraClassPath. Instead of adding all Hive/Hadoop dependencies, we should narrow the scope and identify which dependencies the Remote Spark Context requires.
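
For illustration, narrowing the remote driver's classpath to just those jars could look like the following spark.driver.extraClassPath entry in spark-defaults.conf; the jar paths are hypothetical placeholders, not from the patch:

    # Assumed install locations; substitute the paths from the actual build.
    spark.driver.extraClassPath   /opt/hive/lib/hive-exec-bundle.jar:/opt/hive/lib/spark-client.jar

The spark assembly jar would typically already be on the remote driver's classpath, so only the Hive-side jars would need to be listed here.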



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)